Abstract:
|
Energy demand forecasting has become a relevant subject in the energy management field.
Different techniques are currently being applied to forecast the energy demand for different
time horizons and for diverse types of loads. Some of them are based on complex Machine
Learning (ML) algorithms, which map the energy consumption to a set of influence
parameters or inputs, such as historical consumption data, the weather or other variables,
making it possible to predict the energy demand.
Important management decisions from different stakeholders in the Energy sector are based
on these predictions and, therefore, it is important to rigorously assess the performance of
these predictive models. A specific methodology is presented in this dissertation through its
application to a real-building case study in which energy demand predictions are being
carried out by an ML model. All the steps in the evaluation process are explained and
exemplified, including data gathering, evaluation period selection, data preprocessing with
special emphasis on data abnormalities and their relation to the process dynamics and, finally,
the data processing itself. The accuracy of the model and the main parameters of influence are
evaluated through four different metrics and data visualizations, based mainly on box-and-whisker
plots.
Several anomalies have been found in the study when predicting energy consumption for a
disaggregated load (single building). After removing them, the stability of the case-study model
is around 88%. The metrics yield a MAPE (Mean Absolute Percentage Error) of 18.05% and
an MBPE (Mean Biased Percentage Error) of -4.67%. Although these values are within the range
reported in the literature, they indicate poor accuracy. Nevertheless, there is room for
improvement: by retraining, refining and calibrating the model it will be possible to improve
its performance. The day of the week, the working calendar and the hour of the day were shown
to have a strong influence on the error metrics analyzed.
Other alternative Machine Learning methodologies have been applied to the same dataset
and their performance has been analyzed. Artificial Neural Network, k-Nearest Neighbors
and Random Forest based models have been compared after training with more than one year of
hourly energy consumption data and other influence variables. The Random Forest achieved
the best accuracy when re-trained, showing a MAPE below 10%. Passing
a detailed working calendar to the model, using accurate weather variable forecasts and
defining an adequate re-training strategy have been shown to improve model accuracy. |