Short-Term Traffic Prediction Based on Genetic Algorithm Improved Neural Network

This paper takes the time series of short-term traffic flow as its research object. The delay time and embedding dimension are calculated by the C-C algorithm, and the chaotic characteristics of the time series are verified by the small data sets method. Then, based on the neural network prediction model and the chaotic phase space reconstruction theory, the network topology is determined, and prediction is conducted with the wavelet neural network and the RBF neural network using experimental data from the Lan-Hai expressway. The results show that the RBF neural network predicts better. Because the randomness of the initial parameters makes the networks unstable, the genetic algorithm is used to optimize the initial parameters. The results show that the prediction error of both the optimized wavelet neural network and the optimized RBF neural network is reduced by more than 10%, and that the prediction accuracy of the latter is better.


INTRODUCTION
At present, the operational capacity of China's expressways cannot meet the increasing traffic demand, resulting in frequent traffic jams, traffic accidents and environmental pollution. To keep these problems from worsening and to improve traffic efficiency and safety, short-term traffic prediction, as an important means of real-time traffic guidance, has been widely studied. However, because of the rapid development of expressways, the time variation and complexity of traffic flow, and the gradually shortening prediction interval, accurate prediction is becoming more and more difficult. For the transportation industry to develop better and more comprehensively, traffic guidance and management must be more reasonable and effective, and so a large number of traffic forecasting models and corresponding improved forecasting methods have been proposed.
In 1976, the autoregressive moving average model was first proposed, which abstracts the traffic at a certain time point into a random non-linear time series. Pawan and Sharma et al. applied the time series of traffic flow to a short-term traffic prediction model according to different travel purposes and distances [1]. This model was also used to study the spatio-temporal relationship of traffic flow and other types of short-term traffic prediction models [2]. In 1984, Kalman filter theory was used for traffic flow prediction modelling by Okutani and Stephanedes [3], and later it was used for short-term traffic flow prediction by Polak and Vythoulkas. It was then used for a travel time prediction model by Zhu and Yang, which predicted the flow several time periods ahead well based on the actual flow data [4]. Similarly, this theory was used to deal with the boundary and unknown parameters of macro traffic flow, and the impact of different equipment and parameters on the model was discussed in detail [5]. A new Kalman filter method was then proposed and used to predict vehicle speed and more specific traffic information [6]. The challenge in predicting traffic flows is the sharp nonlinearity caused by transitions between free flow, breakdown, recovery and congestion. New research shows that deep learning architectures can capture the nonlinear spatio-temporal effects of traffic flow and obtain precise short-term traffic flow predictions [7]. In addition, some scholars have studied near-future traffic flow forecasting under sparse data and high flow density and put forward corresponding methods [8]. In 1989, chaos and fractal theory were introduced into traffic flow research [9]. Subsequently, it was analysed whether the urban traffic flow time series has chaotic characteristics [10].
Further, a Bayesian theory-based multiple measures chaotic traffic flow time series prediction algorithm was proposed, and numerical experiments verified its effectiveness [11]. Later, some scholars combined chaos theory with other methods to predict short-term traffic and obtained better results [12].
However, the above-mentioned models and methods have many shortcomings in short-term traffic prediction. In particular, due to the strong nonlinearity and instability of short-term traffic flow, the Kalman filter model cannot achieve high prediction accuracy and therefore cannot be used to guide actual traffic. Although the autoregressive moving average model can predict continuously collected data well, its initial parameters are difficult to estimate and are not transferable, which easily results in data loss. The model is also suitable only for steady traffic flow, not for traffic flow with emergencies, so its forecasting effect cannot meet the need. Meanwhile, the model requires more computation than other models, so the prediction results cannot be obtained in time.
At the end of the 20th century, the artificial neural network was first applied to the prediction of short-term traffic flow [13]. Because of its good performance, it was quickly accepted by many scholars, and some new neural network models were proposed. In particular, calibrating neural networks for each hour to perform short-term traffic forecasting at specific expressway sites proved to be a better method, with an error rate of less than 10% [14]. By integrating an innovative algorithm with particle swarm optimization and artificial neural networks, the application range of short-term traffic flow predictors was extended [15]. Besides, genetic algorithms are often used to design time delay neural network (TDNN) models as well as locally weighted regression models to predict short-term traffic flow, and can obtain highly accurate prediction results [16]. The Elman neural network has been shown to be superior to the back propagation (BP) neural network in forecasting the traffic flow of multiple sections in a road network [17]. The validity of the short-term traffic forecasting model based on the feed forward neural network was verified by Hayward's empirical research [18]. In addition, research shows that many roads in a network may have no local connection but may still share some common law; based on this fact, a novel approach to predicting UK-wide daily traffic counts on all roads in England and Wales was proposed and its effectiveness verified [19]. Based on an advanced time delay neural network (TDNN) model, a new short-term traffic flow prediction system was presented, and its performance was validated using both simulated and real traffic flow data obtained from California [20]. In general, the neural network avoids some disadvantages of mathematical modelling and has advantages of its own in short-term traffic prediction. However, for the original artificial neural network, the modelling process is rather complicated.
The sample size of the training set has a great influence on the stability of the network, and overfitting is prone to occur. In addition, because of the complex modelling process and slow fitting speed, a simple neural network easily produces significant deviations in the prediction results; it also needs a large amount of training data for self-learning, and its training process is relatively slow.
From the research on short-term traffic flow forecasting, it can be seen that combining various forecasting models and methods can remedy the inherent defects of traditional forecasting models, such as simplicity, lack of applicability and large forecasting errors, and achieve better forecasting results. How to combine forecasting models and theories according to their characteristics, so that they make up for each other's deficiencies in a new forecasting model, needs further study. In addition, fully mining the collected historical data to find its internal laws and combining this with the neural network model has gradually become a new trend in short-term traffic forecasting [21]. Therefore, considering that chaos theory has a natural advantage in processing nonlinear data [22,23], this paper starts from the collected traffic flow time series data, combines the neural network prediction model with the chaotic phase space reconstruction theory to determine the network topology, and presents two short-term traffic prediction models improved by the genetic algorithm for short-term traffic forecasting on expressways.

PHASE SPACE RECONSTRUCTION OF TRAFFIC FLOW UNDER THE CHAOTIC CHARACTERISTICS
Expressway traffic flow is a complex system with high time variation and nonlinearity. By judging whether there are chaotic characteristics in the time series of the collected data, the internal rules of expressway traffic flow can be better understood, and it is also the prerequisite of applying chaos theory to forecast short-term traffic. In practical applications, in order to find out the inherent information contained in the finite data for studying the dynamic characteristics of the system, we need to reconstruct the phase space of the collected data so as to better analyse and study its inherent rules.

Phase Space Reconstruction of Data
The derivative reconstruction method and the coordinate delay reconstruction method are often used to reconstruct the phase space [24]. Since the reconstructed phase space is diffeomorphic to the original phase space, we use the coordinate delay method to construct m-dimensional phase space vectors from delayed copies of the one-dimensional time series $\{x_i\}$, $i = 1, 2, \ldots, N$, to describe the original state space. The reconstructed phase space of the time series is:

$$X_i = \left(x_i,\, x_{i+\tau},\, \ldots,\, x_{i+(m-1)\tau}\right), \quad i = 1, 2, \ldots, M \tag{1}$$

in which $M = N - (m-1)\tau$ is the number of phase points in the reconstructed phase space, m is the embedding dimension, which satisfies $m \geq 2D + 1$, where D is the dimension of the strange attractor of the state space, and τ is the delay time.
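The coordinate delay construction can be illustrated with a minimal Python sketch. The function name `delay_embed` is hypothetical, and the series length of 576 in the usage note is an assumption derived from 5-minute counts over 12 hours on 4 days; only the embedding rule itself comes from the text.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Build the M x m matrix of delay vectors (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    x = np.asarray(x, dtype=float)
    n_points = len(x) - (m - 1) * tau  # M, the number of phase points
    if n_points < 1:
        raise ValueError("time series too short for this m and tau")
    # Column k holds the series shifted by k * tau samples.
    return np.column_stack([x[k * tau : k * tau + n_points] for k in range(m)])
```

With a series of 576 samples, m = 8 and τ = 4, `delay_embed` would return a matrix of 548 phase points with 8 coordinates each.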
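For the above parameters, the delay time and embedding dimension can be calculated using the C-C algorithm proposed in 1999 [25]. Letting $\{x_i\}$, $i = 1, 2, \ldots, N$, be a one-dimensional chaotic time series, and using delay time t and embedding dimension m for phase space reconstruction, we can get the correlation integral:

$$C(m, N, r, t) = \frac{2}{M(M-1)} \sum_{1 \le k < l \le M} \theta\left(r - \lVert X_k - X_l \rVert\right) \tag{2}$$

where $M = N - (m-1)t$, r is the search radius and θ(·) is the Heaviside step function. At the same time, the corresponding test statistic is defined as:

$$S_1(m, N, r, t) = C(m, N, r, t) - C^m(1, N, r, t) \tag{3}$$

where N is an integer multiple of t. After evenly dividing the time series into t disjoint subseries and averaging the test statistic defined in Eq. (3) over them, we get:

$$S_2(m, N, r, t) = \frac{1}{t} \sum_{s=1}^{t} \left[ C_s\left(m, \tfrac{N}{t}, r, t\right) - C_s^m\left(1, \tfrac{N}{t}, r, t\right) \right] \tag{4}$$

and letting N → ∞, we get:

$$S_2(m, r, t) = \frac{1}{t} \sum_{s=1}^{t} \left[ C_s(m, r, t) - C_s^m(1, r, t) \right] \tag{5}$$

In practical operation, because all data in the time series are related and the length of the series is limited, the $S_2(m, r, t)$ obtained is generally not zero. Its maximum deviation with respect to the radius r is:

$$\Delta S_2(m, t) = \max_i S_2(m, r_i, t) - \min_i S_2(m, r_i, t) \tag{6}$$

Hence, the value of the optimal delay time $\tau_d$ is the first local minimum point of $S_2(m, r, t) \sim t$ or $\Delta S_2(m, t) \sim t$. According to the conclusions of the Brock-Dechert-Scheinkman (BDS) statistical test, we can get the estimated values of N, m and r; here, let N = 3000, m = 2, 3, 4, 5, $r_i = i \times 0.5\sigma$, $\sigma = \mathrm{std}(x)$ (σ is the standard deviation of the time series), i = 1, 2, 3, 4. Then,

$$S_3(t) = \frac{1}{16} \sum_{m=2}^{5} \sum_{i=1}^{4} S_2(m, r_i, t) \tag{7}$$

$$\Delta S_3(t) = \frac{1}{4} \sum_{m=2}^{5} \Delta S_2(m, t) \tag{8}$$

The first local minimum point of $\Delta S_3(t)$ is the optimal delay time $\tau_d$. Since Eq. (5) adopts the subsection average method, for a time series with period T, when t = kT (k is an integer greater than zero), both $S_3(t)$ and $\Delta S_3(t)$ are zero, so the index $S_4(t)$ is defined as:

$$S_4(t) = \Delta S_3(t) + \lvert S_3(t) \rvert \tag{9}$$

The global minimum point of $S_4(t)$ is the value of the optimal embedding window $\tau_\omega$. Both Eq. (8) and Eq. (9) are used to measure the deviation.

In this paper, the data were collected at the Lanzhou West toll station of the Lan-Hai (Lanzhou-Haikou) expressway, with traffic counts taken every 5 minutes from 7:00 a.m. to 7:00 p.m. on November 7, 8, 9 and 10, 2017. The C-C algorithm is applied to these traffic flow data, and the results are shown in Fig. 1.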
As can be seen from Fig. 1, for the Lan-Hai expressway data the first local minimum point of $\Delta S_3(t)$ is at t = 4 and $S_4(t)$ takes its global minimum at t = 27, so the delay time is τ = 4 and the embedding window width is $\tau_\omega = 27$. According to the relation $\tau_\omega = (m - 1)\tau$, this gives m = 27/4 + 1 = 7.75, which is rounded to the embedding dimension m = 8.
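The way the C-C curves lead to τ and m can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the sup norm is assumed for distances, and each subseries is embedded with unit delay, which corresponds to delay t in the original series.

```python
import numpy as np

def correlation_integral(series, m, r):
    """Fraction of pairs of m-dimensional unit-delay vectors closer than r."""
    n_points = len(series) - (m - 1)
    if n_points < 2:
        return 0.0
    emb = np.column_stack([series[k : k + n_points] for k in range(m)])
    dist = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)  # sup norm (assumed)
    upper = np.triu_indices(n_points, k=1)
    return float(np.mean(dist[upper] < r))

def s2(x, m, r, t):
    """Average of C_s(m, r) - C_s(1, r)**m over the t disjoint subseries."""
    total = 0.0
    for s in range(t):
        sub = x[s::t]
        total += correlation_integral(sub, m, r) - correlation_integral(sub, 1, r) ** m
    return total / t

def cc_curves(x, t_max, dims=(2, 3, 4, 5)):
    """Return the curves S3(t), dS3(t) and S4(t) for t = 1..t_max."""
    x = np.asarray(x, dtype=float)
    radii = [0.5 * i * np.std(x) for i in (1, 2, 3, 4)]
    s3, ds3 = [], []
    for t in range(1, t_max + 1):
        vals = np.array([[s2(x, m, r, t) for r in radii] for m in dims])
        s3.append(vals.mean())
        ds3.append((vals.max(axis=1) - vals.min(axis=1)).mean())
    s3, ds3 = np.array(s3), np.array(ds3)
    return s3, ds3, ds3 + np.abs(s3)

def first_local_minimum(values):
    """1-based delay t of the first interior local minimum (global minimum as fallback)."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k + 1
    return int(np.argmin(values)) + 1

def embedding_dimension(tau_w, tau):
    """Solve tau_w = (m - 1) * tau for m, rounding to the nearest integer."""
    return int(round(tau_w / tau)) + 1
```

With the values read from Fig. 1, `embedding_dimension(27, 4)` gives 8. The delay would be taken as `first_local_minimum(ds3)` and the embedding window as `int(np.argmin(s4)) + 1`.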

Short-Term Traffic Flow Chaos Discrimination
In order to judge whether the short-term traffic data time series exhibits chaotic motion, we choose the Lyapunov exponent method, which is easy to calculate and gives highly precise results, to judge the chaotic characteristics of the collected data sets.
The Lyapunov exponent is the average exponential rate at which two adjacent trajectories in a chaotic system separate over time. After reconstructing the phase space of the one-dimensional time series, let λ denote the largest Lyapunov exponent. For the center point $X(t)$ in this space, denote its nearest neighbor point by $X(\hat{t})$ and the distance between them by $d_t(0) = \lVert X(t) - X(\hat{t}) \rVert$, $t = 1, 2, \ldots, n$, where n is the total number of phase points in the space. After one iteration, the point $X(t)$ becomes $X(t+1)$ and the point $X(\hat{t})$ becomes $X(\hat{t}+1)$. According to the physical meaning of the largest Lyapunov exponent, the distance between the two points grows exponentially at the rate λ. In this paper, we use the small data sets method to calculate the largest Lyapunov exponent of the short-term traffic volume time series, and from its definition we obtain:

$$d_t(i) \approx d_t(0)\, e^{\lambda (i q)} \tag{11}$$

where $d_t(i)$ is the distance between the two points after i discrete time steps and q is the sample period of the time series. Taking the logarithm of both sides of Eq. (11) gives $\ln d_t(i) \approx \ln d_t(0) + \lambda (i q)$. For each discrete time i, the values of $\ln d_t(i)$ are averaged over all pairs of adjacent points; the regression line of these averages is then drawn by the least squares method, and the slope of the line is the largest Lyapunov exponent. After calculation, the largest Lyapunov exponent of the Lan-Hai expressway traffic volume time series is 0.0439, which is positive, so the series satisfies the chaotic condition and can be analyzed by chaos theory.
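The small data sets procedure can be sketched in Python as below. This is an illustrative version under stated assumptions: the function name, the temporal separation `min_sep` used to exclude trivially close neighbors, and the number of tracked steps `n_steps` are all hypothetical choices that the text does not specify.

```python
import numpy as np

def largest_lyapunov(x, m, tau, q=1.0, min_sep=10, n_steps=20):
    """Estimate the largest Lyapunov exponent as the slope of mean ln d_t(i) against i*q."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    emb = np.column_stack([x[k * tau : k * tau + n] for k in range(m)])
    usable = n - n_steps  # points that can still be followed for n_steps iterations
    ref = emb[:usable]
    dist = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=2)
    idx = np.arange(usable)
    # Exclude temporally close points so that neighbors lie on different orbit segments.
    dist[np.abs(idx[:, None] - idx[None, :]) < min_sep] = np.inf
    nn = np.argmin(dist, axis=1)  # nearest neighbor of every phase point
    mean_log = []
    for i in range(n_steps):
        d = np.linalg.norm(emb[idx + i] - emb[nn + i], axis=1)
        mean_log.append(np.mean(np.log(d[d > 0])))
    slope, _ = np.polyfit(np.arange(n_steps) * q, mean_log, 1)  # least squares line
    return float(slope)
```

A positive return value indicates exponential divergence of nearby trajectories. For the traffic series in this paper the call would use m = 8 and τ = 4, while the fitted range of i would have to be chosen from the near-linear part of the curve.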

SHORT-TERM TRAFFIC PREDICTION BASED ON PHASE SPACE RECONSTRUCTION NEURAL NETWORK
Neural network is applied to short-term traffic prediction by scholars for its strong self-learning ability. After reconstructing phase space of original time series, the network topology can be better determined so as to achieve better forecasting results.

Original Wavelet Neural Network Prediction Model

Determination of the Number of Neurons in the Input Layer
According to the phase space reconstruction theory, we need to construct a sample time series that uses the known value of the time series at time t to predict the value at the future time t + τ, sampling every τ time units in the sample time series. Here, τ is the delay time parameter of the reconstructed phase space.

Parameter Control
In the wavelet neural network, by using the gradient descent algorithm to continuously adjust and calculate the weights and wavelet basis functions parameters, the network's prediction results are closer and closer to the desired target output. The parameters and weights correction algorithm process is as follows: calculating the network prediction error firstly, continuously adjusting and perfecting the network weights and function parameters according to the prediction error, and then calculating the range of the number of nodes in hidden layer.
After repeated experiments, it is concluded that for the data of the Lan-Hai expressway, the prediction effect is optimal when the number of nodes in the hidden layer is 10. The result of the wavelet neural network prediction model is shown in Fig. 2. It can be seen in Fig. 3 that the mean absolute percentage error (MAPE) of the wavelet neural network for the Lan-Hai expressway is 0.1207.
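The gradient descent correction of the weights and wavelet basis function parameters can be sketched as follows. This is a minimal illustration, not the authors' code: the Morlet-type mother wavelet, the initialization, the learning rate and the class name `WaveletNet` are assumptions, while the 8 input nodes, 10 hidden nodes and 1 output node follow the text.

```python
import numpy as np

def morlet(u):
    """Assumed mother wavelet: cos(1.75 u) * exp(-u^2 / 2)."""
    return np.cos(1.75 * u) * np.exp(-0.5 * u ** 2)

def morlet_grad(u):
    """Derivative of the assumed mother wavelet with respect to u."""
    return -(1.75 * np.sin(1.75 * u) + u * np.cos(1.75 * u)) * np.exp(-0.5 * u ** 2)

class WaveletNet:
    def __init__(self, n_in=8, n_hidden=10, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0.0, 0.3, size=(n_hidden, n_in))  # input-to-hidden weights
        self.v = rng.normal(0.0, 0.3, size=n_hidden)          # hidden-to-output weights
        self.a = np.ones(n_hidden)                            # dilation parameters
        self.b = np.zeros(n_hidden)                           # translation parameters

    def forward(self, x):
        u = (self.w @ x - self.b) / self.a
        return float(self.v @ morlet(u)), u

    def train_step(self, x, target, lr=0.01):
        """One gradient descent step on the squared error; returns the loss before the step."""
        y, u = self.forward(x)
        err = y - target
        g = err * self.v * morlet_grad(u)  # derivative of the loss with respect to u
        grad_v = err * morlet(u)
        grad_w = np.outer(g / self.a, x)
        grad_b = -g / self.a
        grad_a = -g * u / self.a
        self.v -= lr * grad_v
        self.w -= lr * grad_w
        self.b -= lr * grad_b
        self.a -= lr * grad_a
        return 0.5 * err ** 2
```

Each call to `train_step` computes the prediction error first and then corrects the weights and the wavelet parameters along the negative gradient, which mirrors the correction process described above.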

Original RBF Neural Network Prediction Model
In the RBF neural network, different methods differ greatly in how they select the centers of the basis functions. In this paper, the random center selection method is used, in which data points are randomly selected from the input training samples as the data centers. Once the data centers no longer change, the expansion constant, which is closely related to them, also stops changing. The connection weights from the hidden layer to the output layer can then be calculated directly by the least squares method, namely:

$$w = \exp\left(\frac{h}{c_{\max}^{2}} \lVert x_p - c_i \rVert^{2}\right), \quad p = 1, 2, \ldots, P; \; i = 1, 2, \ldots, h$$

In this section, we use the original RBF neural network to make short-term traffic predictions for the Lan-Hai expressway. For comparison, the data normalization and the design of the input and output layers use the same parameters as in the wavelet neural network: the number of input layer neurons is 8 and the number of output layer neurons is 1. The prediction results and relative error are shown in Fig. 4 and Fig. 5, respectively. By calculation, the MAPE of the RBF neural network for the Lan-Hai expressway is 0.0984.

Genetic Algorithm Improved Wavelet Neural Network (GAWNN)
As the wavelet neural network fits slowly and easily falls into a local optimum, the genetic algorithm, with its strong global search ability, can be used to optimize the initial weights and thresholds of the wavelet neural network; the resulting global optimal weights and thresholds are then assigned to the wavelet neural network to train the model. Combined with the advantages of the wavelet neural network model itself, this can greatly improve the prediction accuracy.
In essence, using the genetic algorithm to optimize the wavelet neural network means driving the initial weights and thresholds of the network toward the global optimum, so as to improve the prediction accuracy. The network topology is kept the same as in the previous wavelet neural network, which determines the length of the chromosome encoding; genetic operations are then applied to find the best individuals, with reasonable and effective operating parameters set to reach the optimum. The algorithm process is illustrated in Fig. 6.
In the GAWNN short-term traffic prediction model, the initial population size is set to 40, the generation gap to 0.9, the crossover probability to 0.7, the mutation probability to 0.001, and the maximum number of generations to 50. Using the previously determined network topology, the prediction is conducted by the GAWNN. The MAPE of the original wavelet neural network is 0.1207. By calculation, the MAPE of the GAWNN is 0.1078, as can be seen in Fig. 8, so the error is reduced by about 11% compared with the original network.
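A genetic search with these settings can be sketched as below. Only the population size, generation gap, crossover probability, mutation probability and number of generations come from the text. The real-valued encoding, rank-based selection, arithmetic crossover, uniform mutation, elitist reinsertion and the reading of the generation gap as the fraction of the population replaced per generation are assumptions, and `genetic_minimize` is a hypothetical name.

```python
import numpy as np

def genetic_minimize(fitness, dim, pop_size=40, generations=50, gap=0.9,
                     p_cross=0.7, p_mut=0.001, bounds=(-1.0, 1.0), seed=0):
    """Minimize fitness(chromosome) over real-valued chromosomes of length dim."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    n_off = int(round(gap * pop_size))
    n_off -= n_off % 2  # offspring are produced in pairs
    history = []
    for _ in range(generations):
        cost = np.array([fitness(ind) for ind in pop])
        order = np.argsort(cost)
        pop, cost = pop[order], cost[order]
        history.append(float(cost[0]))
        # Rank-based selection: a lower cost gives a higher selection probability.
        ranks = np.arange(pop_size, 0, -1, dtype=float)
        parents = pop[rng.choice(pop_size, size=n_off, p=ranks / ranks.sum())]
        children = parents.copy()
        for k in range(0, n_off, 2):
            if rng.random() < p_cross:  # arithmetic crossover of one parent pair
                alpha = rng.random()
                children[k] = alpha * parents[k] + (1 - alpha) * parents[k + 1]
                children[k + 1] = alpha * parents[k + 1] + (1 - alpha) * parents[k]
        mask = rng.random(children.shape) < p_mut  # gene-wise uniform mutation
        children[mask] = rng.uniform(lo, hi, size=int(mask.sum()))
        # Elitist reinsertion: the best individuals survive unchanged.
        pop = np.vstack([pop[: pop_size - n_off], children])
    cost = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(cost)], float(cost.min()), history
```

In the GAWNN setting, `fitness` would decode a chromosome into the initial weights and thresholds of the wavelet neural network and return its prediction error on the training samples; the best chromosome then initializes the network before ordinary training.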

Genetic Algorithm Improved RBF Neural Network (GARBF)
In the RBF neural network, the best approximation can be achieved only after the most accurate hidden layer data centers, expansion constants and weights between the hidden layer and the output layer have been selected. Therefore, we use the genetic algorithm to optimize these three groups of parameters. The difference between the optimization processes of the two neural networks is that the wavelet neural network obtains the weights and thresholds by decoding, while the RBF neural network obtains the hidden layer data centers, the widths and the output weights by decoding, assigns them to the new neural network, and then uses the training samples to train the network. The overall process is similar.
Using genetic algorithm to improve the parameters of RBF neural network, the initial population size is set as 30, the crossover probability is 0.6, the mutation probability is 0.001, and the maximal genetic generation is 15. Taking the number of input layer neurons and output layer neurons as in the previous original RBF neural network, we conduct the prediction by the GARBF neural network. The prediction results and relative error comparison curve are given in Fig. 9 and Fig. 10 respectively. It can be seen that for the data of Lan-Hai expressway, the MAPE of RBF neural network and GARBF neural network are respectively 0.0984 and 0.0815. Furthermore, it can be concluded that after the improvement by the genetic algorithm, the error between the prediction result and the expected output is reduced by 17%, which shows that the improved network can obtain more accurate results and can be applied to short-term traffic prediction.
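How a chromosome could be decoded into the three groups of RBF parameters and scored is sketched below. The layout of the chromosome, the Gaussian basis function and all function names are assumptions made for illustration; the number of hidden nodes of the RBF network is not stated in the text, so it is left as a parameter.

```python
import numpy as np

def chromosome_length(n_in, n_hidden):
    """Genes for the centers (n_hidden x n_in), the widths and the output weights."""
    return n_hidden * n_in + n_hidden + n_hidden

def decode_rbf(chrom, n_in, n_hidden):
    """Split a flat chromosome into centers, widths and output weights."""
    chrom = np.asarray(chrom, dtype=float)
    split = n_hidden * n_in
    centers = chrom[:split].reshape(n_hidden, n_in)
    widths = np.abs(chrom[split : split + n_hidden]) + 1e-6  # keep widths positive
    weights = chrom[split + n_hidden :]
    return centers, widths, weights

def rbf_predict(X, centers, widths, weights):
    """Output of a Gaussian RBF network with one linear output node."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths ** 2)) @ weights

def mape_fitness(chrom, X, y, n_hidden):
    """Fitness of one chromosome: MAPE of the decoded network on the samples (X, y)."""
    centers, widths, weights = decode_rbf(chrom, X.shape[1], n_hidden)
    pred = rbf_predict(X, centers, widths, weights)
    return float(np.mean(np.abs((y - pred) / y)))
```

A genetic search would minimize `mape_fitness` over chromosomes of length `chromosome_length(8, n_hidden)`, and the best decoded parameters would initialize the RBF network before it is trained on the training samples.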

ERROR COMPARISON AND ANALYSIS
Three error measures of the above neural network models, the mean absolute percentage error (MAPE), the root mean square error (RMSE) and the mean absolute error (MAE), are shown in Tab. 1.
As can be seen from Tab. 1, before the improvement, the MAPE of the wavelet neural network and the RBF neural network is 0.1207 and 0.0984, respectively. After the improvement, the MAPE of the two networks is reduced to 0.1078 and 0.0815, decreases of 11% and 17%, respectively. This shows that improving the two kinds of neural networks by the genetic algorithm is effective and that the improved networks obtain more accurate prediction results. Hence, they can be applied to short-term traffic volume prediction. Of the two improved neural networks, the improved RBF neural network predicts better than the improved wavelet neural network. The mean absolute error of the improved RBF neural network is 4.68, which is clearly less than the 5.32 before the improvement.
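The three error measures and the quoted percentage reductions can be checked with a short Python sketch. The function names are hypothetical, and MAPE is expressed as a fraction, matching the values reported above.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction of the true value."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def relative_reduction(before, after):
    """Fraction by which an error measure decreases after the improvement."""
    return (before - after) / before

# MAPE values reported in the text for the wavelet and RBF networks.
wnn_gain = relative_reduction(0.1207, 0.1078)  # about 0.107
rbf_gain = relative_reduction(0.0984, 0.0815)  # about 0.172
```

Rounded to whole percentages, the two reductions are 11% and 17%, in agreement with the figures quoted from Tab. 1.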
It can also be found that RBF neural network has natural advantages for short-term traffic forecasting. In all the models, the error values of the improved RBF neural network are the smallest, for each error is smaller than that of the wavelet neural network. Even the prediction error of the RBF neural network without genetic algorithm is smaller than that of the improved wavelet neural network. Therefore, the improved RBF neural network is more suitable for short-term traffic prediction under the same training data.

CONCLUSIONS
With the rapid development of the social economy, the process of urban integration has accelerated and car ownership has increased sharply, bringing convenience to people's lives along with a series of problems such as traffic congestion and environmental pollution. At present, providing traffic guidance for vehicles on the road based on real-time traffic data has become an effective way to solve urban traffic problems. It can support people's travel route selection, effectively reduce traffic congestion and save travel time, and it has practical significance. However, to realize traffic guidance and further control the operation status of the traffic system, it is necessary not only to obtain real-time traffic flow data, but also to forecast the short-term trend of traffic flow accurately and reliably based on the collected real-time data. The main work of this paper is as follows.
Due to the nonlinearity, time-varying and uncertainty characteristics of short-term traffic data, combined with the self-learning and adaptive characteristics of neural network, wavelet neural network and RBF neural network are selected as short-term traffic flow prediction models. The chaotic property of the collected data is discriminated, and the phase space reconstruction method is used to process the collected data, so that the inherent information contained in the limited data is fully obtained. The interference of noise and other factors is avoided, and the topological structure of the neural network can be better determined, so as to achieve better prediction effect.
To address the poor network stability caused by the randomness of the initial parameters, the genetic algorithm is used to optimize the initial parameters and weights of the two networks, so that the output of the network tends to the target value and the network stability is enhanced. The results of the empirical analysis show that the genetic algorithm achieves the expected optimization goal for the prediction effect of the neural network to a certain extent, and the improved models have higher prediction accuracy and smaller error than the models without improvement. The prediction errors of the wavelet neural network and the RBF neural network optimized by the genetic algorithm are reduced by 11% and 17%, respectively. At the same time, the RBF neural network predicts better than the wavelet neural network both before and after the improvement. Therefore, the RBF neural network is more suitable for short-term traffic prediction on expressways.
Based on chaos theory, this paper uses the phase space reconstruction method to process the collected traffic data and the genetic algorithm to improve the parameters of the two neural networks, and the validity of the proposed models is verified by empirical analysis. However, the research of this paper can be improved in the following aspects. Firstly, in addition to the phase space reconstruction method, there are other methods for processing the collected data, such as the K-Nearest Neighbor (K-NN) method, which can extract similar traffic flow patterns from a historical traffic flow database to reconstruct the training data. The scope, advantages and disadvantages of the different methods need further research. Besides, the genetic algorithm still has some inherent defects. For example, different settings of the genetic operation parameters affect the optimization result differently. Therefore, combining various algorithms to improve the data and the neural network structure, optimizing the initial parameters, reducing the prediction time and further improving the prediction effect are the next steps of our research.