AUTHOR=Xie Bingbing , Zhu Chenliang , Zhao Liang , Zhang Jun TITLE=A gradient boosting machine-based framework for electricity energy knowledge discovery JOURNAL=Frontiers in Environmental Science VOLUME=10 YEAR=2022 URL=https://www.frontiersin.org/journals/environmental-science/articles/10.3389/fenvs.2022.1031095 DOI=10.3389/fenvs.2022.1031095 ISSN=2296-665X ABSTRACT=

Knowledge discovery in databases (KDD) has an important effect on various fields with the development of information science. Electricity energy forecasting (EEF), a primary application of KDD, aims to explore the inner potential rule of electrical data for the purpose to serve electricity-related organizations or groups. Meanwhile, the advent of the information society attracts more and more scholars to pay attention to EEF. The existing methods for EEF focus on using high-techs to improve the experimental results but fail to construct an applicable electricity energy KDD framework. To complement the research gap, our study aims to propose a gradient boosting machine-based KDD framework for electricity energy prediction and enrich knowledge discovery applications. To be specific, we draw on the traditional knowledge discovery process and techniques to make the framework reliable and extensible. Additionally, we leverage Gradient Boosting Machine (GBM) to improve the efficiency and accuracy of our approach. We also devise three metrics for the evaluation of the proposed framework including R-square (R2), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). Besides, we collect the electricity energy consumption (EEC) as well as meteorological data from 2013 to 2016 in New York state and take the EEC prediction of New York State as an example. Finally, we conduct extensive experiments to verify the superior performance of our framework and the results show that our model achieves outstanding results for the three metrics (around 0.87 for R2, 60.15 for MAE, and 4.79 for MAPE). Compared with real value and the official prediction model, our approach also has a remarkable prediction ability. Therefore, we find that the proposed framework is feasible and reliable for EEF and could provide practical references for other types of energy KDD.