Modelling Energy Consumption of the Republic of Serbia using Linear Regression and Artificial Neural Network Technique

The objectives of the study are twofold. First, we aim to identify the socio-economic indicators that are most influential in explaining energy consumption in the Republic of Serbia. Second, we aim to develop models that can predict future energy consumption in the Republic of Serbia. This could be an important first step towards proper energy management in the country. Several potential socio-economic indicators are selected as independent variables. Regression analysis is conducted to select the most relevant independent variables and to build the multiple linear regression (MLR) model. In addition, an artificial neural network (ANN) model is developed for comparison. Finally, energy demand is projected to the year 2022. Both models show a declining trend relative to the current level of energy consumption.


INTRODUCTION
The study aims to investigate the most influential socio-economic factors for energy consumption in Serbia, and to model and project future energy consumption to the year 2022. Given the current development of the energy sector in the policy context, such a study is important for guiding policy makers in deciding future energy policies (such as capacity expansion planning, conservation programs and power infrastructure expansion) [1]. In addition, an accurate projection of energy demand is very useful for implementing capital-intensive investments [2].
The latest official projection of energy consumption was carried out by the Serbian Ministry of Mining and Energy and is set out in the Energy Sector Development Strategy document [3]. The projection assumes that energy consumption per unit of GDP remains constant until the end of the projection period. Changes in energy consumption are then estimated from projected GDP growth. This simple approach tends to generate a higher error and could thus provide an inaccurate prediction [4]. To the best of our knowledge, this study is the first of its kind to predict energy consumption in the Republic of Serbia.
The study employs regression analysis and an Artificial Neural Network (ANN) to model and project energy consumption in Serbia. Regression analysis is a common and popular modeling technique that has been used in a wide range of energy modeling studies. ANN, by contrast, is a more recent technique that has been gaining popularity in energy forecasting applications over the past decade.

REVIEW OF LITERATURE
A considerable amount of literature on energy forecasting models is available. The studies differ in terms of the forecasted indicators, temporal and spatial scale, methodology, and the predictor variables used.
Based on their approach, Bhattacharyya [4] classified energy forecasting models into two types: simple approaches and sophisticated models. An example of the simple approach is the study by Gilland [5], in which world energy demand is projected for the years 2000 and 2020 on the basis of assumptions about population growth, economic growth and a proposed relation between the elasticity of energy demand and the growth of GDP per capita. Codoni et al. [6] report that the income elasticity of demand was used for an energy analysis of Korea. In addition, as discussed in the previous section, the Serbian government's energy projections also apply the simple approach. Grover & Chandra [7] state that simple approaches are also applied by Indian state agencies, where the income elasticity is used to forecast primary energy and electricity demand. Although the simple approach is appealing due to its ease of use, it cannot explain the drivers of change and relies heavily on the subjective judgement of the modeller [4,29].
The more sophisticated models recognize that energy consumption is explained not by income alone but by other variables as well. A number of energy consumption forecasting models have been developed using social, economic, demographic, geographic and climatic factors. Mohamed & Bodger [10] use economic and demographic variables to forecast electricity consumption in New Zealand. Kankal, Akpinar, Kömürcü, & Özşahin [2] forecast energy consumption in Turkey using socio-economic and demographic variables such as GDP, population, import, export and employment.
By contrast, the well-known PRIMES model, which is used in Europe, employs a different methodology and has different modeling goals. One of the aims of this paper is to present an energy forecasting model similar to those applied in other countries.
In addition, several studies also include climatic variables (such as ambient temperature and degree days) mainly to model the electricity consumption [12,14].
In terms of the methodological approach, time series models, regression models, econometric models and neural networks are reported to be the most common techniques in energy forecasting studies [2]. Sözen [8] employs ANN to project the energy dependency of Turkey using predictor variables such as total production of primary energy per capita, total gross electricity generation per capita and final energy consumption per capita. Pao [9] uses and compares linear and non-linear models, including ANN, to model and project electricity consumption in Taiwan using economic and demographic variables. Other comparable studies employing ANN in energy forecasting include Ekonomou [12], Nasr et al. [14] and Kankal et al. [2]. In other studies, time series models are used. Saab, Badr, & Nasr [13] use and compare several time series models, such as the autoregressive model, the autoregressive integrated moving average (ARIMA) model and a novel configuration combining a first-order autoregressive model AR(1) with a high-pass filter, to forecast the electricity consumption of Lebanon. Wang & Meng [11] applied a combination of the ARIMA and ANN techniques to model the energy consumption of Hebei province. They conclude that combining the methods improves the accuracy of the model.
Regression is a technique for describing, summarizing and relating a set of variables. In energy forecasting studies, regression analysis is one of the most popular and common models in the literature [15,2]. For instance, Mohamed & Bodger [10] applied a multiple linear regression model to identify the relevant variables for electricity consumption in New Zealand. The authors conclude that the developed regression model is comparable to other models. Al-Garni, Zubair, & Nizami [16] modeled the electricity consumption of Saudi Arabia using multiple linear regression. The authors employ weather data, global solar radiation and population as predictor variables. Egelioglu, Mohamad, & Guven [17] use multiple linear regression for electricity consumption and conclude that the number of customers, the number of tourists and electricity prices are relevant predictor variables. In another study employing a regression model, Tso & Yau [18] found that the weather and cooking style influence energy consumption and the daily energy loading patterns in Hong Kong.
Although the regression method is coherent with statistical theory and can produce good estimates, other approaches might be more precise for developing predictive models [19]. Thus, in this study, we attempt to model energy consumption using both a linear regression model and an ANN. Several socio-economic variables are selected and applied to the models. These models exploit the strength of the relationship between energy consumption and the selected predictor variables, which is expected to yield more accurate projections.
In Serbia, studies on energy consumption forecasting are scarce. Among the few examples, Sadorsky [20] applied econometric techniques to develop models relating energy consumption and financial development in Central and Eastern European countries (Serbia included). Marinkovic, Popovic, Orlovic, & Ristic [21] predict motor fuel consumption in Serbia from 2010 to 2025 using linear regression models. The main predictor variable is gross domestic product per capita, which is also corrected by introducing five influencing factors.
The study most closely related to ours is the projection of energy consumption in the energy strategy document [3]. Thus, our study can provide additional insight for policy makers and academics in terms of energy system analysis in Serbia.

MODELING SERBIAN ENERGY CONSUMPTION
Energy consumption denotes the total final energy used in a given time period. The term is sometimes used interchangeably with energy demand. Many factors can influence energy consumption, such as the growth of the economy, the industry framework, people's income levels, the weather and government policy [11]. Thus, as with any other modeling problem, it is important to choose the most relevant input variables, those that have a significant influence on the output variable.
There are at least two difficulties in building an energy model for Serbia: first, the political dynamics in the country, and second, data availability. The country has experienced considerable turbulence since the early 1990s. Serbia has gone through several civil wars, which changed the structure of its government, regime and regions. This turbulence makes the data appear irregular. Our early hypothesis is that conflict and political dynamics exert a great influence on the economic and social structure, and thus on the amount of energy demanded by the population. However, since this study tries to estimate energy demand based on socio-economic indicators, we do not include any indicators related to the political dynamics of the country.
The second problem concerns the availability of data. While data for energy consumption have been available since 1990, most economic and social indicators are only available from 1995 (some only from 1997). Therefore, in order to assess the potential predictors over the full period, the missing values would have to be filled in. However, as pointed out earlier, the dynamics in the country, especially at the beginning of the 1990s, played a significant role in its economic and social condition, so a simple estimate of the missing values (such as linear interpolation or a trend) would tend to introduce a large uncertainty (error) into the dataset.
In order to minimize the bias, we chose 1997 as the starting year. It is the first year for which all variables have observed values.

Model Building Steps
The model building steps are shown in Fig. 1. The first step is to collect the data for all variables. In general, the model building process includes two main approaches: linear regression and ANN. The linear regression analysis is conducted not only to generate a linear model, but also to select the most appropriate variables to be used as input variables.
The data for the dependent variable (i.e. energy consumption) are taken from the International Energy Agency. The candidate predictor variables are GDP (current price) (X1), total investment (X2), population (X3), unemployment rate (X4), inflation rate (X5), urban population (X6) and GDP per capita (X7). GDP, population and urban population data are taken from the World Bank World Development Indicators (WB-WDI), while the others are taken from the International Monetary Fund (IMF) database.
The data pre-processing step includes transforming the dependent variable to its natural logarithm. This transformation brings the data closer to a normal distribution, so that the normality assumption of linear regression is not violated [22].

Figure 1 Modelling steps
Linear regression analysis is applied to carry out the selection of independent variables. In this step, the important goals are to pick an adequate number of predictor variables, to capture the linear relationship between the dependent and independent variables, and to bring the most relevant independent variables into the model [23]. The correlations between variables are calculated to obtain an early estimate of how the variables are related.
The correlation calculations show that energy consumption (Y) is significantly correlated with X2 and X6. In addition, the highest correlations are between GDP and GDP per capita (r = 0,99) and between population and inflation (r = 0,98). Such high correlations between predictors indicate multicollinearity, which means that one variable of the pair can potentially be dropped. Hence, GDP per capita is dropped.
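The correlation screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`pearson`, `collinear_pairs`), the 0.95 threshold and the toy numbers are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def collinear_pairs(columns, threshold=0.95):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    The threshold is an assumed cut-off, not a value reported in the paper.
    """
    flagged = []
    for (na, xa), (nb, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:
            flagged.append((na, nb, round(r, 3)))
    return flagged

# Made-up illustrative numbers, NOT the study's data.
toy = {
    "gdp": [10.0, 12.0, 15.0, 19.0, 24.0, 30.0],
    "gdp_per_capita": [1.3, 1.6, 2.0, 2.6, 3.3, 4.2],
    "unemployment": [14.0, 18.0, 13.0, 21.0, 16.0, 19.0],
}
flagged = collinear_pairs(toy)
```

In this toy case only the GDP and GDP per capita pair is flagged, which mirrors the kind of pair from which one variable is dropped.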
At this point, there are 6 potential predictors. When dealing with several independent variables, it is important to determine the combination of these variables that best predicts the dependent variable [22]. In this step, the combinations of predictors that form a linear relation with Y are evaluated. There are 64 possible combinations of variables for the models (2^6, counting the model with no predictors). The criteria used to select the best models are: the coefficient of determination (R²), the adjusted coefficient of determination (adj. R²), the Akaike information criterion (AIC), the Amemiya prediction criterion, Mallows' prediction criterion and the Schwarz Bayesian criterion (SBC).
Since one of the best candidate models includes all predictors, we applied an additional selection test, the backward selection method. The backward selection process works as follows: first, all variables are included in the model. Then, the least significant individual variable is dropped. The process continues until all variables remaining in the model are significantly related to the dependent variable.
The variables remaining in the model after backward selection are X1, X2, X3 and X4. Among the combinations evaluated earlier, this one has the best value according to the Amemiya and Schwarz criteria. On the other criteria (such as R²), it still scores considerably well. Therefore, we selected these variables as our predictors.
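As a rough illustration of the all-subsets step, the sketch below enumerates every combination of six predictors and defines two of the listed criteria. The function names are hypothetical, and the AIC expression is one common least-squares form, assumed here because the paper does not state which variant was used.

```python
import math
from itertools import combinations

PREDICTORS = ["X1", "X2", "X3", "X4", "X5", "X6"]

def all_subsets(names):
    """Every subset of the predictor names, including the empty one."""
    return [c for k in range(len(names) + 1) for c in combinations(names, k)]

def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (model with intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def aic(sse, n, k):
    """Assumed least-squares form of AIC: n * ln(SSE / n) + 2 * (k + 1)."""
    return n * math.log(sse / n) + 2 * (k + 1)

subsets = all_subsets(PREDICTORS)
print(len(subsets))  # number of candidate predictor combinations
```

Each subset would then be fitted by least squares and ranked on the criteria; lower AIC and higher adjusted R² indicate a better trade-off between fit and model size.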
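The backward selection loop can be sketched as follows. This is a minimal outline, not the authors' implementation: `p_values_for` is a hypothetical callback that would refit the regression on the current subset and return a p-value per predictor, the 0.05 threshold is an assumption, and the fixed p-values in the stub are made up purely to exercise the loop.

```python
def backward_eliminate(names, p_values_for, alpha=0.05):
    """Drop the least significant predictor until all p-values are <= alpha.

    p_values_for(subset) must return {name: p_value} for a model fitted on
    that subset; it is a hypothetical callback supplied by the caller.
    """
    kept = list(names)
    while kept:
        p = p_values_for(kept)
        worst = max(kept, key=lambda name: p[name])
        if p[worst] <= alpha:
            break  # every remaining predictor is significant
        kept.remove(worst)
    return kept

def stub_p_values(subset):
    """Stand-in for a real refit: made-up, fixed p-values (NOT study results)."""
    table = {"X1": 0.01, "X2": 0.02, "X3": 0.001,
             "X4": 0.04, "X5": 0.60, "X6": 0.30}
    return {name: table[name] for name in subset}

selected = backward_eliminate(["X1", "X2", "X3", "X4", "X5", "X6"], stub_p_values)
```

In a real analysis the p-values change after every refit, so the callback would wrap an actual least-squares fit rather than a lookup table.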

Multiple Linear Regression (MLR) model
Linear regression analysis aims to find a relationship between the dependent variable (in this case energy consumption) and the independent variables (the predictors) in the form:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon \qquad (1)$$

In the equation, $\beta_0, \dots, \beta_n$ are the regression coefficients, which need to be estimated from the observed data, and $\varepsilon$ is the error term. This can be done by curve fitting with the least squares method, which minimizes the sum of squared differences between the observed and predicted values [24].
Based on our selected predictor variables, the linear regression function takes the form:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon \qquad (2)$$

where $Y$ = energy consumption; $X_1$ = GDP; $X_2$ = total investment; $X_3$ = population; $X_4$ = unemployment rate; $\beta_0$ to $\beta_4$ are the coefficients of the linear relation and $\varepsilon$ is the error term. To produce an appropriate regression model, several assumptions of linear regression must not be violated: the error terms have to be normally distributed with a mean of zero and constant variance [22]. Residual analysis is used to assess the appropriateness of the model.
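To make the least squares estimation concrete, the sketch below solves the normal equations in plain Python. It is an illustrative assumption about how such a fit could be coded, not the software used in the study; the helper names `solve` and `ols` and the noise-free toy data are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y, intercept=True):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    rows is a list of predictor tuples; with intercept=True the first
    returned coefficient is the constant term.
    """
    design = [([1.0] if intercept else []) + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Toy data generated from y = 1 + 2*x1 - 3*x2 without noise, NOT the study's data.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
y = [1 + 2 * a - 3 * b for a, b in rows]
beta = ols(rows, y)
```

Because the toy data are noise-free, the fit recovers the generating coefficients up to rounding; with real data the residuals would then be inspected against the assumptions listed above.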

Artificial Neural Networks (ANN) model
Artificial neural networks are computational models inspired by the functioning of the biological nervous system. Each consists of a number of processing elements called neurons, which correspond to biological neurons. The function of these neurons is also analogous to that of real neurons. The neurons receive and send signals to each other over weighted connections and through activation functions. Weights act as a 'long-term memory' in an ANN, expressing the strength of each neuron input. Neural networks adjust these weights using the knowledge accumulated during the training process [12,4].
The activation function works in two steps: first, the weighted inputs are summed using the summation function. Second, the transfer function converts this sum to an output. This output can then either become an input to another neuron or be the final output. A graphical representation of these processes is shown in Fig. 2. Mathematically, they can be described by the following functions:

$$g_j = \sum_{i=1}^{n} w_i x_i \qquad (3)$$

and

$$y_j = \Phi(g_j) \qquad (4)$$

A neuron $j$ processes the signals/inputs through these functions, where $n$ is the total number of inputs. The artificial processing element/neuron receives inputs ($x_i$) with weights ($w_i$) and uses the summation function to calculate their weighted sum ($g_j$). Finally, the output value ($y_j$) is produced by the transfer function ($\Phi$). The most common and practical transfer functions are the step, sign, linear and sigmoid functions [25].
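The two-step processing of a single neuron can be written out directly. The sketch below is illustrative only: the function names are hypothetical, the logistic sigmoid is chosen as one of the transfer functions named in the text, and no bias term is included because the description above does not mention one.

```python
import math

def logistic(g):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-g))

def neuron_output(inputs, weights, transfer=logistic):
    """Sum the weighted inputs, then pass the sum through the transfer function."""
    g = sum(w * x for w, x in zip(weights, inputs))  # summation function
    return transfer(g)                               # transfer function

# Toy inputs and weights chosen so that the weighted sum is exactly zero.
y_sigmoid = neuron_output([1.0, 2.0], [0.5, -0.25])
# The same neuron with a linear (identity) transfer function.
y_linear = neuron_output([1.0, 2.0], [3.0, 4.0], transfer=lambda g: g)
```

Swapping the `transfer` argument is all that distinguishes a sigmoid hidden neuron from a linear output neuron in this formulation.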

Figure 2 Typical processing element/neuron in ANN
There are various types of ANN, depending on the connections, the neuron arrangement and the training algorithm used. The multilayer perceptron (MLP), which is the most common type, is used in this study. As the training algorithm, we opted for the Levenberg-Marquardt variant of the backpropagation algorithm, which is described in detail in Hagan & Menhaj [26]. An MLP trained with the backpropagation algorithm is reported to be useful in most engineering applications [2].

Figure 3 Neural network used in the study
The MLP is a type of feed-forward neural network that typically consists of an input layer, at least one hidden layer and an output layer. The neural network used in the study is shown in Fig. 3. The backpropagation learning algorithm works as follows [28]: when the output value differs from the observed/desired output, an error is calculated. The error is then propagated backward from the output layer to the input layer. As the error is back-propagated, the weights are modified. With the new weights, a new iteration of computing the output value is started. The iteration process continues until some stop criterion is satisfied (e.g. the number of epochs or the total sum of squared errors).
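The backpropagation loop described above can be sketched for a small one-hidden-layer MLP. Note the substitutions: this sketch uses plain gradient descent rather than the Levenberg-Marquardt variant used in the study, and the bias terms, learning rate, hidden-layer size, stop values and toy data are all assumptions made for illustration.

```python
import math
import random

def train_mlp(samples, targets, hidden=3, lr=0.1, max_epochs=2000,
              mse_goal=1e-3, seed=0):
    """Train an MLP (logistic hidden units, linear output) with plain
    gradient-descent backpropagation. Returns (predict, mse_history)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # w1[j]: weights from the inputs to hidden neuron j; last entry is its bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    # w2: weights from the hidden neurons to the output; last entry is the bias.
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(wj, x)) + wj[-1])))
             for wj in w1]
        out = sum(w * hj for w, hj in zip(w2, h)) + w2[-1]
        return h, out

    history = []
    for _ in range(max_epochs):
        sq_err = 0.0
        for x, t in zip(samples, targets):
            h, out = forward(x)
            err = out - t  # error at the output layer
            sq_err += err * err
            # Propagate the error backward; h * (1 - h) is the logistic derivative.
            deltas = [err * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            for j in range(hidden):  # update output-layer weights
                w2[j] -= lr * err * h[j]
            w2[-1] -= lr * err
            for j in range(hidden):  # update hidden-layer weights
                for i in range(n_in):
                    w1[j][i] -= lr * deltas[j] * x[i]
                w1[j][-1] -= lr * deltas[j]
        history.append(sq_err / len(samples))
        if history[-1] <= mse_goal:  # stop criterion on the error
            break
    return (lambda x: forward(x)[1]), history

# Toy problem (y = x^2 on [0, 1]), NOT the study's data.
samples = [(0.0,), (0.25,), (0.5,), (0.75,), (1.0,)]
targets = [x[0] ** 2 for x in samples]
predict, history = train_mlp(samples, targets)
```

The `history` list records the mean squared error per epoch, so it can be used to check that training is converging and to see which stop criterion ended the loop.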

Performance Evaluation
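The prediction capability of both models is evaluated based on three statistical indices: the root mean square error (RMSE), the mean absolute error (MAE) and the mean absolute percentage error (MAPE). The mathematical descriptions of the three measures are as follows [9]:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (\hat{Y}_i - Y_i)^2} \qquad (5)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |\hat{Y}_i - Y_i| \qquad (6)$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n} \left|\frac{\hat{Y}_i - Y_i}{Y_i}\right| \qquad (7)$$

where $\hat{Y}_i$ and $Y_i$ are the $i$-th predicted and observed values and $n$ is the total number of observations.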
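The three indices are straightforward to compute. The following sketch uses the standard definitions of RMSE, MAE and MAPE; the function names and the two-point toy example are illustrative assumptions, not values from the study.

```python
import math

def rmse(pred, obs):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def mape(pred, obs):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((p - o) / o) for p, o in zip(pred, obs)) / len(obs)

# Toy values, NOT the study's data.
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
scores = (rmse(predicted, observed), mae(predicted, observed), mape(predicted, observed))
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the observed level, which makes it comparable across years with different consumption.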

RESULTS AND DISCUSSIONS
This section presents the results and discussion of energy consumption modelling using both the linear regression and ANN techniques. As discussed in section 3, there are 4 selected predictor variables for the models: GDP (X1), total investment (X2), population (X3) and unemployment rate (X4). These variables show a significant linear relationship with energy consumption in Serbia.

MLR Model
The regression coefficients in Eq. (2) are estimated using the ordinary least squares method. The results show that all coefficients except the intercept are statistically significant for energy consumption (the dependent variable). Since the intercept does not have any intrinsic meaning in the model, we dropped it from the model. The resulting linear regression model has the form:

$$\ln Y = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4$$

with the coefficient estimates given in Tab. 1. The natural logarithm of energy consumption in Serbia (ln Y) has a positive linear relationship with GDP (X1), total investment (X2), population (X3) and unemployment rate (X4). From the standardized coefficients in Tab. 1, we can also see that the most influential predictor of energy consumption is population (X3). The coefficient of determination (R²) for this model is 0,78, which means that 78% of the variance in energy consumption is explained by the regression model.
In order to assess the appropriateness of the model, residual diagnostics are carried out. The residual analysis checks whether any of the assumptions of linear regression (such as linearity, independence of the error terms and homoscedasticity) are violated [24]. A plot of the residuals (error terms) against the values predicted by the model is generated for this purpose (Fig. 4). It can be observed visually that the residuals fall within a horizontal band centered on 0, displaying no systematic tendency to be positive or negative. This supports the linearity of the regression function. Also, the errors are randomly dispersed within the graph, suggesting that the error terms are independent. Finally, regarding homoscedasticity, the graph shows roughly constant variance, as the points are spread within about the same range. On this basis, we conclude that the linear regression model is appropriate for the data.

ANN Model
In order to construct an appropriate ANN, it is important to choose a proper network size. A network that is too small may not be able to represent the system adequately, while a network that is too large may be overtrained and give inaccurate results [2]. Recall that the network architecture is shown in Fig. 3. A three-layer network is employed in the study. One hidden layer is said to be sufficient for most common applications [27]. The same input and output variables as in the regression model are used for the input and output layers of the ANN.
A trial and error procedure is applied to determine the number of neurons in the hidden layer (varying from 2 to 10). The MATLAB® neural network toolbox [28] is used to develop the ANN model. For training, 12 randomly selected samples (about 70% of the data) are used, while validation and testing of the network each use 3 samples (about 15% of the data).
The maximum number of epochs used in training is 1000. Training stops when the mean square error (MSE) goal of 0,001 is reached. The Levenberg-Marquardt (LM) training algorithm is used. In addition, the transfer function of the hidden layer (from the input layer to the hidden layer) is the logistic sigmoid, while for the output (from the hidden layer to the output layer) a linear transfer function is used.
After several trials, the desired ANN is selected. It is the one that has a compact structure, a fast training process and lower memory consumption, in short the one with the best generalization [12]. The final ANN model has the following structure: 1 input layer with 4 nodes, 1 hidden layer with 5 neurons and 1 output layer with 1 neuron.
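The random division of the samples into training, validation and test sets can be sketched as below. This is an illustrative stand-in written in Python, not the toolbox routine used in the study; the function name, the fixed seed and the use of year indices are assumptions.

```python
import random

def split_indices(n, n_train, n_val, seed=0):
    """Randomly partition range(n) into train, validation and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded so the split is reproducible
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

years = list(range(1997, 2015))  # 18 annual observations, 1997 to 2014
train, val, test = split_indices(len(years), n_train=12, n_val=3)
```

The three index lists are disjoint and together cover all observations, so no year is used for both training and testing.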

Performance Evaluation
The models' performance is evaluated based on their predictive ability. We assess it using the relative errors, RMSE, MAE, MAPE and R². The errors indicate the variance that cannot be described by the model. That is, there are other factors/variables (outside the model) that influence energy consumption in Serbia.
Fig. 5 shows the results of the prediction based on MLR and ANN. Both models show relatively modest errors, in the range of 1,0% to 15,9% for the MLR model and 0,2% to 13,4% for the ANN model. The largest error of the MLR model is -1140 ktoe (15,9%) for the year 1999, while the smallest is -94 ktoe (1,3%) in the year 2000. For the ANN model, the largest error is -1317 ktoe (13,4%) for the year 1997 and the smallest is 17 ktoe (0,2%) for 1999. Based on these results, the ANN model in general predicts more accurately in most years.

Projection to 2022
Projections of the socio-economic indicators from the latest IMF economic outlook database are used as the input estimates for the future years. The variable estimates and the numerical results of the projection are shown in Tab. 3. According to the results, both the MLR and ANN models suggest a declining trend for energy consumption in the Republic of Serbia to the year 2022. The models, however, differ quite significantly in the degree of the decline. The difference between the two models ranges from 741 to 1388 ktoe, with the ANN model giving the higher energy consumption estimates.
For the ANN model, energy consumption in 2015 drops to 7601 ktoe (from 8118 ktoe in 2014). In the following years up to 2022, energy consumption remains relatively stable at around 6875 to 6941 ktoe. For the MLR model, energy consumption declines to 6860 ktoe in 2015. A further decline is expected, reaching 5509 ktoe in 2022.
The difference between the projections is expected, as the two models work in different ways. While MLR captures linear relationships between the dependent and independent variables, the ANN model captures nonlinearities in the relationships between variables. Given this result, one can use the ANN result as the upper estimate and the MLR result as the lower estimate.

CONCLUSIONS
Energy forecasts are useful and important for shaping the energy policies of a country. In the case of Serbia, a country that is undergoing various reform processes, accurate energy consumption forecasts can be a good starting point for applying effective policies. The aims of the study are to examine the most influential socio-economic indicators that explain energy consumption and to predict future energy consumption in the Republic of Serbia.
Four socio-economic variables are used to model energy consumption in Serbia: GDP, total investment, population and unemployment rate. From the variable selection by regression analysis, it is found that population is the most influential indicator explaining energy consumption in Serbia.
Energy consumption is modelled using data from 1997 to 2014. In terms of the results, both MLR and ANN are able to model energy consumption quite well. However, ANN in general performs better, as shown by its lower model errors and higher R². The projection is made for the years 2015 to 2022. Based on the two models, energy consumption is projected to decline from 8118 ktoe in 2014 to 5509 ktoe (MLR model) and 6898 ktoe (ANN model) in 2022. Finally, it is worth noting that the accuracy of the forecasts made by the models depends significantly on the accuracy of the forecasts of the explanatory variables.

Figure 4 Plot of residual against predicted value

Figure 5 Graphical representation of model fitting

In terms of overall model performance, a comparison of RMSE, MAE, MAPE and R² is made (Tab. 2). Clearly, ANN shows better performance on all 4 indicators. It has lower overall model errors based on RMSE, MAE and MAPE. In addition, the variance is also explained better by ANN, as shown by its higher R² value.

Table 1 Coefficient estimates for the regression model (* relationship is significant at the 5% level)

Table 2 Error comparison of MLR and ANN models

Table 3 Variable estimates and projection results