A Novel Support Vector Machine Model of Traffic State Identification of Urban Expressway Integrating Parallel Genetic and C-Means Clustering Algorithm

Abstract: The real-time discrimination of the urban expressway traffic state is an important reference for the decisions of traffic management departments. In this paper, a parallel genetic fuzzy clustering algorithm is proposed to overcome the shortcomings of the fuzzy c-means clustering algorithm. A traffic state discrimination model is established with a support vector machine (SVM), whose parameters are optimized by particle swarm optimization, grid search and a genetic algorithm to obtain the parameter group that maximizes the accuracy of the trained model. Finally, the model is verified with measured data. The convergence speed and clustering efficiency of parallel genetic fuzzy clustering and the original fuzzy c-means clustering are compared. The results show that parallel genetic fuzzy clustering converges to the global minimum in every run, needs few iterations, and clusters efficiently, which lays a foundation for the subsequent training of the SVM.


INTRODUCTION
Urban road traffic state recognition is an important part of the modern intelligent traffic system and can help to relieve traffic congestion in the city. The realization of traffic state recognition is of great significance to the development of the intelligent traffic system. It not only allows the traffic department to understand the specific road traffic conditions and take control measures in congested areas, but also provides timely feedback on the traffic conditions of each road section through the intelligent transportation system, so as to offer reference route choices for travellers. At the same time, the identification of road traffic status supports the analysis of changes in traffic conditions in time and space, guides the urban planning department in constructing the road network, and promotes the improvement of the urban road network.
At present, there are many algorithms for traffic state recognition, such as the standard deviation method [1], double exponential smoothing [2], Bayesian methods [3] and so on. With the development of artificial intelligence technology, more and more artificial intelligence algorithms are applied to traffic state recognition. Neural networks, fuzzy algorithms and support vector machines are widely used [4][5][6]. Hawas [7] established a traffic incident detection model based on fuzzy theory, which can provide traffic information for travellers in practical applications and effectively relieve traffic pressure. Ritchie et al. [8] applied the artificial neural network algorithm to the traffic parameter model for the first time. Stephanedes et al. [9] used a feedforward neural network model for traffic state recognition, which improved the recognition accuracy to a certain extent. Borkar and Malik [10] processed the traffic characteristic parameters and divided the state of the road into three levels. Their SVM-based model is used to describe and predict the traffic state, with high discrimination accuracy and low algorithm complexity.
In China, Huang et al. [11] divided the traffic state into four categories (smooth, steady, congested and blocked) and put forward an algorithm to distinguish the traffic state of urban roads based on fuzzy c-means clustering. Dong et al. [12] analysed the traffic situation of the road network in a specific road section and used the FCM algorithm for cluster analysis to realize real-time identification of the road traffic status. Li et al. [13] proposed a method of highway traffic condition discrimination based on an RBF neural network. The experimental results show that the RBF neural network predicts the average travel speed more accurately than the BP neural network and is more suitable for highway traffic condition discrimination. Wang et al. [14] extracted the traffic flow characteristics of road sections at different time periods and clustered them with the K-means algorithm to judge the degree of matching between the actual traffic conditions and the road grades of various types of roads. With the development of science and technology, classification methods have evolved, and SVM is the most common method. Dong et al. [15] divided the collected traffic flow parameters into three categories, corresponding to the congested, crowded and free-flow states, and established a traffic state discrimination model using SVM. By training with three kinds of kernel function, they concluded that the radial basis kernel function model has high accuracy and good stability. Yu et al. [16] used three kernel functions of SVM to distinguish the status of urban traffic, pointed out the importance of normalizing the data, compared the classification results of the three kernel functions, and found that the radial basis kernel function classified best. For the selection of categorical feature quantities, most researchers tend to choose vehicle speed, traffic flow, and occupancy rate. The ITS Research Center of the Department of Transportation Engineering of Tongji University analysed the basic traffic flow data with the cluster analysis method, selected the traffic flow, average speed and occupancy rate as the classification feature quantities, and divided the road status into four categories: congestion flow, congestion flow, stable flow, smooth flow. Li et al. [17] used a fuzzy support vector machine to classify the urban traffic state with speed as the only classification characteristic parameter, and divided the traffic state into five levels: smooth, basically smooth, crowded, congested, and blocked.
The above studies have provided a foundation for expressway traffic status recognition, but most of them use only clustering or only classification on data that are large in volume and high in parameter dimension. If the parameters are not pre-processed, problems such as heavy computation, long program running time, or inaccurate classification results easily arise.
Therefore, this paper adopts a strategy of clustering before classification to build a novel support vector machine model of traffic state identification of urban expressway integrating parallel genetic and c-means clustering algorithm. In view of the shortcomings of fuzzy c-means clustering, this paper inserts the fuzzy c-means clustering into the process of the genetic algorithm and proposes a parallel genetic fuzzy c-means clustering algorithm, which effectively solves the problem that the fuzzy c-means clustering algorithm is sensitive to the initial value. Because traffic state discrimination based on Euclidean distance requires a large amount of calculation, the classification model of traffic state is established with a support vector machine, and the parameters of the support vector machine are optimized in three ways (grid search method, genetic algorithm and particle swarm optimization algorithm) to obtain the parameter group that maximizes the accuracy of the trained model. Finally, the model is verified with measured data. The parallel genetic fuzzy c-means clustering and the original fuzzy c-means clustering are compared in terms of convergence speed, sensitivity to the initial value and clustering effectiveness. The results show that the selection of the initial value has no effect on parallel genetic fuzzy clustering, and every run converges to the global minimum. The number of iterations is very small. The clustering efficiency is also higher than that of the original fuzzy c-means clustering. This method provides a good foundation for the subsequent training of the SVM and saves a lot of time.

PARALLEL GENETIC FUZZY CLUSTERING ALGORITHM

Genetic Algorithm
Genetic algorithm (GA) [18] is a random search algorithm proposed by J. H. Holland in 1962, which simulates the evolution process of organisms and follows a mechanism of heredity, variation and survival. It is a global optimization algorithm widely used among evolutionary algorithms and provides an effective way to solve optimization problems. When using a genetic algorithm, a coding must be designed for the specific problem so that chromosomes represent solutions in the solution space, and the optimal solution is then searched for under the principle of survival of the fittest by using selection, crossover and mutation to simulate biological evolution. The specific process of the genetic algorithm is as follows:
Step 1: Code the chromosomes;
Step 2: Set the generation counter to 0 and initialize the population;
Step 3: Calculate the fitness values of all individuals;
Step 4: Select individuals with higher fitness for crossover and mutation to produce new individuals and form a new population;
Step 5: Stop iterating if the termination condition is satisfied; otherwise, increase the generation counter and repeat Steps 3 to 5.
There are two main factors that affect the performance of genetic algorithm: on the one hand, the fitness function will affect the convergence direction of the algorithm; on the other hand, the crossover and mutation operators, to a certain extent, will expand the target solution set, increase the search time, and affect the rapid optimization of the algorithm.
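The five steps above can be sketched as a small real-coded genetic algorithm. This is only an illustrative sketch under stated assumptions: the function name `genetic_minimize`, the fitness form 1 / (1 + objective), the arithmetic crossover, the uniform-resampling mutation and the elitism rule are choices made here for illustration, not the operators specified in this paper, and the objective is assumed to be non-negative.

```python
import random


def genetic_minimize(objective, lower, upper, pop_size=30, generations=60,
                     p_cross=0.6, p_mut=0.1, seed=0):
    """Toy real-coded GA for a non-negative 1-D objective; returns the best individual."""
    rng = random.Random(seed)
    # Steps 1-2: real-valued coding, generation 0, random initial population.
    population = [rng.uniform(lower, upper) for _ in range(pop_size)]
    best = min(population, key=objective)
    # Step 5 (assumed here): stop after a fixed number of generations.
    for _ in range(generations):
        # Step 3: fitness grows as the objective shrinks (assumed form).
        fitness = [1.0 / (1.0 + objective(x)) for x in population]
        # Step 4a: fitness-proportional selection of parents.
        parents = rng.choices(population, weights=fitness, k=pop_size)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < p_cross:
                # Arithmetic crossover: children are convex combinations.
                alpha = rng.random()
                a, b = alpha * a + (1 - alpha) * b, alpha * b + (1 - alpha) * a
            children.extend([a, b])
        # Step 4b: mutation resamples an individual inside the bounds.
        children = [rng.uniform(lower, upper) if rng.random() < p_mut else x
                    for x in children]
        # Elitism: the best individual seen so far always survives.
        children[0] = best
        population = children
        best = min(population, key=objective)
    return best
```

Calling `genetic_minimize(lambda x: (x - 3.0) ** 2, -10.0, 10.0)` searches for the minimizer of a simple quadratic. The crossover and mutation probabilities control how widely the search spreads, which is the trade-off between exploration and search time noted above.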

Fuzzy C-Means Clustering Algorithm
Cluster analysis is an unsupervised classification method that can divide a dataset without classification labels into clusters [19]. Fuzzy c-means clustering (FCM) is a clustering algorithm based on an objective function. Dunn [20] proposed the c-means algorithm for the first time, and Bezdek [21] improved it later. In fuzzy clustering, each data point belongs to a certain category to some extent, and the membership degree expresses the degree to which each data point belongs to a certain cluster [22].
$X = \{x_1, x_2, \dots, x_n\}$: sample set of the feature space; $C = \{c_1, c_2, \dots, c_c\}$: vector set of the feature space consisting of c cluster center vectors; $u_{ij}$: membership degree of the sample j belonging to the centre i; $U = [u_{ij}]$: c×n matrix [23]. The objective function of the FCM algorithm is:

$$J(U, C) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, \| x_j - c_i \|^2$$

constraint condition:

$$\sum_{i=1}^{c} u_{ij} = 1, \quad j = 1, 2, \dots, n$$

where: J: the objective function; m: the weighted index, also called the smoothing index, $m \in (1, \infty)$; $c_i$: the i-th cluster center point; $x_j$: the j-th sample point in the sample set.
The objective function is the sum of the squared distances from all sample points in the sample set to each cluster center, weighted by the membership degrees of each cluster center. The criterion of clustering is to minimize the objective function, which is a constrained optimization problem in the variables (U, C), and the Lagrange multiplier method can be used to solve it.
With the Lagrangian $L = J - \sum_{j=1}^{n} \lambda_j \left( \sum_{i=1}^{c} u_{ij} - 1 \right)$, the first order necessary condition for obtaining the minimum value of the objective function is:

$$\frac{\partial L}{\partial u_{ij}} = m \, u_{ij}^{m-1} \| x_j - c_i \|^2 - \lambda_j = 0, \qquad \frac{\partial L}{\partial c_i} = -2 \sum_{j=1}^{n} u_{ij}^{m} (x_j - c_i) = 0$$

It can be simplified as follows:
- Iterative formula of membership matrix:

$$u_{ij} = \left[ \sum_{k=1}^{c} \left( \frac{\| x_j - c_i \|}{\| x_j - c_k \|} \right)^{\frac{2}{m-1}} \right]^{-1} \quad (6)$$

- Iterative formula of clustering center:

$$c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}} \quad (7)$$

If the data set X, the number of clustering categories c and the fuzzy coefficient m are known, the optimal membership matrix and clustering center can be calculated by the iterative formulas of the membership matrix and clustering center. According to a large number of previous studies, there are three factors that affect the performance of the FCM algorithm: the selection of the fuzzy coefficient, the selection of the initial clustering center, and the solution method of the objective function.
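The alternating use of the two iterative formulas can be sketched as follows. This is a minimal pure-Python sketch, not the implementation used in this paper: the function name `fcm`, the explicit `init_centers` argument, the stopping tolerance and the small floor on the squared distance are assumptions introduced for illustration.

```python
def fcm(samples, init_centers, m=2.0, max_iter=100, tol=1e-9):
    """Plain FCM: alternate the membership update and the center update."""
    centers = [list(c) for c in init_centers]
    n, dim, k = len(samples), len(samples[0]), len(centers)
    power = 1.0 / (m - 1.0)  # exponent applied to ratios of squared distances
    previous = float("inf")
    for _ in range(max_iter):
        # Squared distances, floored so a sample on a center cannot divide by zero.
        d2 = [[max(sum((a - b) ** 2 for a, b in zip(x, c)), 1e-12)
               for x in samples] for c in centers]
        # Membership update: u_ij = 1 / sum_r (d_ij / d_rj)^(2 / (m - 1)).
        u = [[1.0 / sum((d2[i][j] / d2[r][j]) ** power for r in range(k))
              for j in range(n)] for i in range(k)]
        # Objective J for the current memberships and centers.
        objective = sum(u[i][j] ** m * d2[i][j]
                        for i in range(k) for j in range(n))
        # Center update: c_i = sum_j u_ij^m x_j / sum_j u_ij^m.
        for i in range(k):
            weights = [u[i][j] ** m for j in range(n)]
            total = sum(weights)
            centers[i] = [sum(w * x[a] for w, x in zip(weights, samples)) / total
                          for a in range(dim)]
        if abs(previous - objective) < tol:
            break
        previous = objective
    # u was computed from the centers before the final update; at convergence
    # the difference is negligible.
    return u, centers, objective
```

Because the result depends on `init_centers`, running this sketch from different starting points is a simple way to observe the sensitivity to the initial clustering center mentioned above.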

Parallel Genetic Fuzzy Clustering Algorithm
In view of the slow searching speed of the genetic algorithm and the tendency of FCM to fall into a local optimum, this paper proposes the parallel genetic fuzzy clustering algorithm (Parallel Genetic Fuzzy C-Means, PGFCM). Its basic idea is to interleave the iterative formula of the FCM cluster center into the genetic algorithm. Parallel genetic fuzzy clustering merges the global search ability of the genetic algorithm with the fast iteration of FCM, so the global optimal value can be found by fast convergence with a large probability. The specific flow of the algorithm is as follows:
Step 1: Input the data set A; set the population size N, crossover probability $P_c$, mutation probability $P_m$, maximum number of generations $T_{max}$, fuzzy coefficient m and number of cluster centers c; initialize the cluster centers;
Step 2: Encode the cluster centers to obtain the current population $P(t)$ and evaluate its fitness;
Step 3: Use the selection operator to select individuals, and apply the crossover and mutation operations to generate $P(t+1)$;
Step 4: Decode $P(t+1)$, use Eq. (6) and Eq. (7) to update (U, C), and obtain the updated individuals $P_i(t+1)$, $i = 1, 2, \dots, N$;
Step 5: If the maximum number of iterations is reached, or the difference of the average fitness of individuals in successive generations of the population is less than a certain threshold value, the algorithm stops; otherwise, set $t = t + 1$ and repeat Steps 2 to 5.
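The interleaving of genetic operators with the FCM update can be sketched as below. This is a hedged illustration, not the authors' implementation: the names `pgfcm`, `memberships`, `fcm_objective` and `fcm_update`, the fitness form 1 / (1 + J), the plain arithmetic crossover (without the gene matching used in the case study) and the fixed generation count as the only stopping rule are all assumptions. Individuals are kept as lists of centers, so encoding and decoding are trivial here.

```python
import random


def sq_dist(x, c):
    return max(sum((a - b) ** 2 for a, b in zip(x, c)), 1e-12)


def memberships(samples, centers, m):
    """Membership update; result[j][i] is u_ij for sample j and center i."""
    power = 1.0 / (m - 1.0)
    cols = []
    for x in samples:
        d2 = [sq_dist(x, c) for c in centers]
        cols.append([1.0 / sum((di / dk) ** power for dk in d2) for di in d2])
    return cols


def fcm_objective(samples, centers, m):
    cols = memberships(samples, centers, m)
    return sum(u ** m * sq_dist(x, c)
               for x, col in zip(samples, cols) for u, c in zip(col, centers))


def fcm_update(samples, centers, m):
    """One FCM pass: memberships from the centers, then new centers."""
    cols = memberships(samples, centers, m)
    new_centers = []
    for i in range(len(centers)):
        weights = [col[i] ** m for col in cols]
        total = sum(weights)
        new_centers.append([sum(w * x[a] for w, x in zip(weights, samples)) / total
                            for a in range(len(samples[0]))])
    return new_centers


def pgfcm(samples, c, m=1.6, pop_size=20, t_max=30, p_c=0.6, p_m=0.1, seed=0):
    rng = random.Random(seed)
    dim = len(samples[0])
    lo = [min(x[a] for x in samples) for a in range(dim)]
    hi = [max(x[a] for x in samples) for a in range(dim)]

    def random_center():
        return [rng.uniform(lo[a], hi[a]) for a in range(dim)]

    population = [[random_center() for _ in range(c)] for _ in range(pop_size)]
    best, best_j = None, float("inf")
    for _ in range(t_max):
        # FCM step on every individual, then evaluate the objective J.
        population = [fcm_update(samples, ind, m) for ind in population]
        costs = [fcm_objective(samples, ind, m) for ind in population]
        for ind, j in zip(population, costs):
            if j < best_j:
                best, best_j = ind, j
        # Smaller J gives larger fitness (assumed form).
        fitness = [1.0 / (1.0 + j) for j in costs]
        parents = rng.choices(population, weights=fitness, k=pop_size)
        children = [best]  # elite preservation
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [row[:] for row in a]
            if rng.random() < p_c:
                alpha = rng.random()
                child = [[alpha * p + (1 - alpha) * q for p, q in zip(ra, rb)]
                         for ra, rb in zip(a, b)]
            if rng.random() < p_m:
                child[rng.randrange(c)] = random_center()
            children.append(child)
        population = children
    return best, best_j
```

The design intent mirrored here is that the genetic operators spread candidates over the search space while the FCM update pulls each candidate quickly towards a nearby stationary point, so the population is less dependent on any single initial value.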

MODELING OF TRAFFIC STATUS RECOGNITION BY PARALLEL GENETIC FUZZY CLUSTERING ALGORITHM
The most common parameters used to study traffic flow characteristics are traffic flow, speed, occupancy, traffic density, queue length, headway and so on [24,25]. The classic traffic flow model can be represented by flow, speed and density, but traffic density data are difficult to obtain. Therefore, following the parameters used in previous studies, this paper selects three commonly used parameters: speed, flow and occupancy.
Combining the global search ability of the genetic algorithm and the fast convergence of fuzzy clustering, the best clustering center is quickly obtained. According to the membership matrix output by the FCM objective function, the degree to which each sample point belongs to a traffic state is judged. The data set is divided into four corresponding traffic states: smooth, steady, congested and blocked.
Each set of traffic flow parameters (speed, flow, occupancy) is expressed as a point on the spatial coordinate axes. Since the data set is divided into four categories, four points are selected randomly as the initial clustering centers of the four categories:

$$C = \begin{bmatrix} S_1 & F_1 & O_1 \\ S_2 & F_2 & O_2 \\ S_3 & F_3 & O_3 \\ S_4 & F_4 & O_4 \end{bmatrix}$$

where: S: speed; F: flow; O: occupancy. Each row represents a cluster center. The selected cluster centers are coded with real values, and the expression form of the chromosome is:

$$[\, S_1 \; F_1 \; O_1 \; S_2 \; F_2 \; O_2 \; S_3 \; F_3 \; O_3 \; S_4 \; F_4 \; O_4 \,]$$

The smaller the objective function value of FCM is, the larger the fitness value of the individual will be. The individuals with higher fitness are retained for crossover and mutation, the new generation of the clustering center matrix is decoded, and the next generation of the clustering center matrix is then calculated by the clustering center iteration formula of FCM. The above operations are repeated until the difference of the average fitness is less than a certain threshold or the maximum number of iterations is reached, and the optimal clustering center and membership matrix are output. According to the degree to which each sample belongs to each type of traffic state, the data set is divided into four categories.

DECISION ANALYSIS MODEL OF TRAFFIC STATE IDENTIFICATION

Support Vector Machine Theory
Support vector machine (SVM) [26] is based on statistical learning theory and can solve classification and regression problems with a small number of samples quickly and accurately. In essence, SVM is a two-class classifier. In reality, however, most problems are multi-class. In this paper, the "one-to-one" classification method is adopted to extend the two-class SVM to the multi-class case, and six classifiers are constructed for the four traffic state categories.
SVM is developed from the optimal classification hyperplane in the linearly separable case. For the linearly separable case, finding the optimal hyperplane requires solving for the optimal combination of the parameters (w, b). The optimization problem constructed is as follows:

$$\min_{w, b} \; \frac{1}{2} \| w \|^2 \quad \text{s.t.} \quad y_i (w^{T} x_i + b) \ge 1, \quad i = 1, 2, \dots, n$$

where: w is the normal vector of the hyperplane; b is the constant term of the hyperplane; $x_i$ is the i-th sample of the input pattern; $y_i \in \{-1, 1\}$ is its class label; $\| w \|$ is the modulus of the vector w; n is the number of samples. In this way, the original classification problem is transformed into a quadratic programming problem.
For the linearly inseparable case, Vapnik put forward the concept of the soft margin, introducing the penalty factor C and the slack variable ξ. C adjusts the relative weight of the classification error and the classification margin, and ξ represents the degree of misclassification, so that a compromise between the two is reached. Therefore, the optimization problem is modified as follows:

$$\min_{w, b, \xi} \; \frac{1}{2} \| w \|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i (w^{T} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \dots, n$$

In the nonlinear case, SVM maps the input samples into a high-dimensional space. To avoid computing in that space explicitly, a kernel function can be introduced into SVM, so that data that are not linearly separable in the original space become linearly separable.
In the identification of the urban traffic state, combined with the actual traffic situation, the observation matrix x = [S F O] is defined, an appropriate kernel function is selected, and the observation matrix x is substituted into the SVM discriminant function to achieve state classification.
The kernel functions commonly used at present are:
- Linear kernel function: $K(x_i, x_j) = x_i^{T} x_j$. This function has no kernel parameter, but its application is very narrow.
- Polynomial kernel function: $K(x_i, x_j) = (s \, x_i^{T} x_j + c)^{d}$, where s, c and d are all parameters.
- Radial basis kernel function: $K(x_i, x_j) = \exp\left( -g \| x_i - x_j \|^2 \right)$. There is only one parameter g in this function.
- Sigmoid kernel function: $K(x_i, x_j) = \tanh(s \, x_i^{T} x_j + c)$. It includes two parameters: s and c.
Among them, the radial basis kernel function has a wide range of applications and relatively few parameters, and its solution is less restricted by constraints, which makes the optimization of parameters simpler [27]. Therefore, this paper uses the radial basis kernel function in the support vector machine.
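The four kernel functions listed above can be written directly as small functions. This is an illustrative sketch only; the function names and the default parameter values are assumptions, and the parameter names s, c, d and g follow the notation of the text.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # No kernel parameter.
    return dot(x, y)


def polynomial_kernel(x, y, s=1.0, c=0.0, d=3):
    # Three parameters: scale s, offset c and degree d.
    return (s * dot(x, y) + c) ** d


def rbf_kernel(x, y, g=0.1):
    # Single parameter g; equals 1 when x == y and decays with distance.
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, y)))


def sigmoid_kernel(x, y, s=1.0, c=0.0):
    # Two parameters: scale s and offset c.
    return math.tanh(s * dot(x, y) + c)
```

With the radial basis kernel, only g and the penalty factor C remain to be tuned, which is why the later parameter optimization searches over the pair (C, g).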

Modelling of Traffic State Classification Model Based on SVM
Essentially, support vector machines are mainly used to solve binary classification problems. However, most real situations are multi-class, and the mathematical principles of binary classification cannot directly support them [28]. Multi-classification methods can be divided into two types: "one-to-one" and "one-to-many".
(1) one-to-one
With this classification method, a classifier is built between every pair of categories. That is to say, each pair of categories is used to construct one classifier, so that $n(n-1)/2$ two-class classifiers are generated [29], where n is the number of categories. When a sample is analysed, every two-class classifier votes for one of its two categories, and the votes are counted for each sample. The category with the most votes is the category of the sample.
(2) one-to-many
With this method, the samples are divided into two categories, the first category and all other categories, and the remaining samples are then classified in the same way [30]. This operation is repeated until the sorting is complete. From another point of view, the number of support vector machines is the same as the number of categories. When classifying unknown samples, the category with the largest output value of the decision function is the category to which the sample belongs.
Using the "one-to-one" classification method, this paper needs to construct 6 classifiers, whereas the "one-to-many" method needs only 4 classifiers. However, the "one-to-many" method requires training on all 1440 samples each time, and the negative class samples in each training run far outnumber the positive class samples. This imbalance may lead to inaccurate classification. In addition, all models need to be retrained to discriminate new sample points. The "one-to-one" method only needs to train on a few hundred samples at a time. Although the "one-to-one" method constructs more classifiers, its overall training time is shorter, and it does not leave some sample points inseparable as the "one-to-many" method does. In view of the research question, this paper adopts the "one-to-one" approach to classify and model traffic states.
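The voting rule of the "one-to-one" method can be sketched as follows. The pairwise classifiers here are hypothetical nearest-centroid rules on a single normalized feature, standing in for trained binary SVMs; the names `build_pairwise` and `one_vs_one_predict`, the centroid values and the tie-breaking rule (lowest label wins) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_pairwise(centroids):
    """Hypothetical stand-in for trained binary SVMs: one rule per class pair."""
    def make(a, b):
        # Vote for whichever of the two classes has the nearer centroid.
        return lambda x: a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b
    return {(a, b): make(a, b) for a, b in combinations(sorted(centroids), 2)}


def one_vs_one_predict(x, pairwise):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter(clf(x) for clf in pairwise.values())
    top = max(votes.values())
    # Ties are broken in favour of the lowest label (an assumption).
    return min(label for label, count in votes.items() if count == top)


# Four traffic state labels give 4 * 3 / 2 = 6 pairwise classifiers.
centroids = {1: 0.9, 2: 0.6, 3: 0.3, 4: 0.1}  # illustrative values only
pairwise = build_pairwise(centroids)
```

For example, `one_vs_one_predict(0.55, pairwise)` collects three votes for label 2, two for label 3 and one for label 1, so label 2 is returned.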

Traffic State Discrimination Model Based on Improved PGFCM and SVM
The FCM algorithm needs a lot of historical data as its basis, so if only this algorithm is used, the results will lack timeliness. The SVM model requires data with corresponding state labels to ensure the accuracy of model classification, so using the SVM algorithm alone lacks labelled data support. Therefore, the two algorithms should be combined in practical application. Based on the multi-classification model of SVM and the improved fuzzy c-means clustering algorithm, the real-time traffic state discrimination model of the urban expressway is constructed. The flow chart is shown in Fig. 2.

Figure 2 Flow chart of traffic state discrimination model based on PGFCM and SVM
The accuracy of SVM algorithm is mainly affected by algorithm parameters, so it is necessary to select reasonable parameters to give full play to the role of SVM.
In this paper, the values of the penalty factor C, the kernel parameter g, the number of support vectors L and the bias term b should be defined. Among them, C and g can only be obtained by an optimization algorithm. With the development of intelligent algorithms, parameter optimization based on intelligent algorithms has also been applied to support vector machines. The frequently used optimization algorithms mainly include particle swarm optimization, grid search, the genetic algorithm and so on. However, each optimization method has obvious advantages and disadvantages. In this paper, the above methods are each applied to parameter optimization, and the combination of parameters with the highest classification accuracy after optimization is selected and applied to the model.

CASE STUDY

Data Preprocessing
The data used in this study are 1440 sets of traffic parameters (flow, speed, occupancy) collected over the 24 hours of August 19, 2017 by a detector on an urban expressway section in Shanghai, with a collection interval of one minute (see Tab. 1).
In order to ensure the reliability of the clustering results, it is necessary to guarantee the quality of the obtained data and preprocess the traffic flow parameters. The main processing steps are outlier detection and data normalization [20]: first, all the data are checked for outliers (missing values); then the data are normalized so that the three detected indicators fall within the range [0, 1], which improves the accuracy of SVM classification and of the fuzzy c-means clustering algorithm (see Tab. 2). After data normalization, the relationship between the three parameters (flow, speed and occupancy) is shown in Fig. 3. After smoothing, a three-parameter smoothing graph is obtained, as shown in Fig. 4. Fig. 3 and Fig. 4 intuitively show the changing trend of flow, speed and occupancy. The change of flow during complex traffic operation periods does not clearly show the impact on speed and occupancy, while the relationship between speed and occupancy is obviously very different. The analysis of the change of traffic flow with time provides a basis for studying the change of traffic state, and also confirms the feasibility of studying the problem from the perspective of flow, speed and occupancy.
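The two preprocessing steps can be sketched as below, assuming min-max scaling as the normalization (the text states only that the indicators are mapped into [0, 1]). The function names and the sample rows are hypothetical and are not taken from Tab. 1.

```python
def rows_with_missing(rows):
    """Indices of records that contain a missing value (represented as None)."""
    return [i for i, row in enumerate(rows) if any(v is None for v in row)]


def min_max_normalize(rows):
    """Scale every column of `rows` linearly into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - low for col, low in zip(columns, lows)]
    # A constant column has span 0; map it to 0.0 instead of dividing by zero.
    return [[(v - low) / span if span else 0.0
             for v, low, span in zip(row, lows, spans)]
            for row in rows]


# Hypothetical (flow, speed, occupancy) records, for illustration only.
raw = [[10, 60, 5], [20, 30, 15], [30, 45, 10]]
normalized = min_max_normalize(raw)
```

Scaling the three indicators to a common range keeps the parameter with the largest numeric range from dominating the Euclidean distances used in clustering.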

Comparison of Clustering Results Between FCM and PGFCM

FCM Clustering
The specific operation process consists of parameter initialization, iterative updating of the cluster centers and membership matrix, and a termination check. Each row of the resulting cluster center matrix represents the clustering center of one of the smooth, steady, congested and blocked states, and the columns of the matrix are traffic flow (veh/min), speed (mph) and occupancy (%). The specific distribution of these samples in state space is shown in Fig. 5.

PGFCM Clustering
The specific operation process is as follows:
- Parameters of the genetic algorithm: population size N = 50, number of generations $T_{max}$ = 30, crossover probability $p_c$ = 0.6, mutation probability $p_m$ = 0.1; the number of clusters c = 4; the fuzzy coefficient m = 1.6.
- Code the initial value with real values.
- Population initialization $P(t)$: determine the upper and lower bounds of the three parameters, and generate three random numbers within the bounds of the three parameters respectively as an initial cluster centre. In this paper, the clustering number is four, so this is repeated four times to generate four cluster centers. The four randomly generated initial cluster centers are coded by real numbers to form a chromosome, and fifty chromosomes are randomly generated.
- Fitness function: defined so that a smaller FCM objective function value gives a larger fitness value.
- Design of genetic operators: the fitness ratio method and elite preservation are applied together for selection; the arithmetic crossover operator based on shortest distance gene matching is selected; the mutation operator adopts basic bit mutation.
- The second generation cluster center matrix is obtained by decoding, and the new generation cluster center matrix is solved by the FCM iterative formula.
- When the number of iterations reaches the maximum, or the fitness changes little or not at all, the algorithm ends. Otherwise, the coding, fitness evaluation and genetic operations continue.
Thus, the clustering centers of the four traffic states are obtained. The specific distribution of the four categories of samples in state space is shown in Fig. 6. According to the clustering center matrix, the difference between the classes obtained by the algorithm is obvious, which means the clustering effect is excellent.

Comparative Analysis of Convergence Ability and Misjudged Rate Between PGFCM and FCM

Convergence Analysis
Fig. 7 and Fig. 8 show that the improved PGFCM algorithm approaches the optimal value after 5 iterations, with a minimum objective value of 50911.263. The original FCM algorithm, by contrast, needs up to 20 iterations to approach its target value of 50912.649. Its convergence is slow, and the objective function converges before reaching the minimum value. The minimum value obtained by the parallel genetic fuzzy clustering algorithm is smaller than that obtained by FCM alone, and the algorithm does not fall into a local minimum. Compared with the FCM algorithm, the PGFCM algorithm therefore has clear advantages in convergence speed and optimization ability.

Analysis of Misjudged Rate
The cross-estimation method of the misjudged rate is used to compare and analyse the misjudged rates of PGFCM and FCM. Set the sample size as N, use PGFCM and FCM to divide the data into four categories, and record the total number of samples of each category. The steps mainly include:
- Select a sample from all samples and remove it, then use the above two methods to cluster the remaining samples and record the results respectively. The removed sample is then assigned to a category according to the resulting cluster centers and membership degrees;
- Repeat the first step for each sample in turn. Compare the result of clustering after removal with the original result. If the result is different, the sample is regarded as a misjudged sample and the number of such samples is recorded. Calculate the error rate with the following formula:

$$P = \frac{n_1^{*} + n_2^{*} + n_3^{*} + n_4^{*}}{n_1 + n_2 + n_3 + n_4} \times 100\% \quad (12)$$

where: $n_1, n_2, n_3, n_4$ are the sample sizes of the four types of original samples; $n_1^{*}, n_2^{*}, n_3^{*}, n_4^{*}$ are the sample sizes of the misjudged samples. According to Eq. (12), the error rate of FCM is 11.2%, while the error rate of PGFCM is 5.3%, roughly half that of FCM. It shows that the improved algorithm provides a good data base for the classification model. The numbers 1, 2, 3 and 4 represent states one to four, respectively, which are called labels.

Comparative Analysis of Training and Testing of Two Clustering Results
Based on the clustering results of the two methods obtained in the previous section, the SVM without parameter optimization was used to classify the two sets of clustering results, and the training accuracy, test accuracy and program running time were recorded. Support vector machine setting: radial basis kernel function, C = 5, g = 0.1. The average results of 10 trials are shown in Tab. 5 and Tab. 6. From Tab. 5 and Tab. 6 it can be seen that the improved PGFCM clustering results separate the classes more clearly, so the classification boundary is easier to obtain when SVM is used for classification, which gives advantages in test time and test accuracy.

Parameter Optimization of SVM Model
(1) Grid search method
It can be seen from Fig. 9 that the result of optimization by the grid search method is as follows: when C = 5.2780000, g = 0.035897, the optimal classification accuracy is 98.2638%. Because the grid search method evaluates all parameter combinations in the search range, the computation is huge and is affected by the search step. When the search step size is small, high precision can be obtained. If the search step size is increased, the optimal parameter combination may be skipped and a sub-optimal result obtained, so the classification accuracy is reduced. It is not particularly suitable for the problem of urban expressway traffic state discrimination, which has a large amount of data and needs to get results quickly.
(2) Particle swarm optimization
Fig. 10 shows the results of particle swarm optimization. When C = 0.5172000, g = 0.0100000, the optimal classification accuracy is 97.7431%. When particle swarm optimization is used to optimize the parameters, the optimal fitness value can be achieved quickly.

Figure 10 Iterative fitness curve of particle swarm optimization
The convergence speed is very fast in the early stage of evolution and slow in the late stage. At the same time, the convergence accuracy of the algorithm is relatively low. However, the algorithm is suitable for the problems studied in this paper.
(3) Genetic algorithm
Fig. 11 shows the optimized results of the genetic algorithm. When C = 0.962920, g = 0.0038147, the best classification accuracy is 98.7269%. Due to the global search characteristics of the genetic algorithm, the fitness of the initial evolution has declined. The classification accuracy of this method is the highest, and although its running time is longer than that of PSO, it is suitable for the problems studied in this paper. It can be seen from Tab. 7 that the optimization effect of the genetic algorithm is the best and its time is relatively short, so the parameters C = 0.962920, g = 0.0038147 are adopted.
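The step-size trade-off described for the grid search method can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name `grid_search`, the power-of-two grid over (C, g), and the toy scoring function `toy_score`, which merely stands in for the cross-validated SVM accuracy and peaks at C = 4, g = 0.25.

```python
import itertools
import math


def grid_search(score, c_exponents, g_exponents):
    """Evaluate score(C, g) on a grid of powers of two and keep the best pair."""
    best_c, best_g, best_score = None, None, -math.inf
    for ce, ge in itertools.product(c_exponents, g_exponents):
        c, g = 2.0 ** ce, 2.0 ** ge
        value = score(c, g)
        if value > best_score:
            best_c, best_g, best_score = c, g, value
    return best_c, best_g, best_score


def toy_score(c, g):
    # Hypothetical stand-in for classification accuracy; maximal at C = 4, g = 0.25.
    return -((math.log2(c) - 2.0) ** 2 + (math.log2(g) + 2.0) ** 2)


fine = grid_search(toy_score, range(-5, 6), range(-5, 6))          # step 1: 121 evaluations
coarse = grid_search(toy_score, range(-5, 6, 3), range(-5, 6, 3))  # step 3: 16 evaluations
```

In this toy setting the fine grid lands on the optimum, while the coarse grid needs far fewer evaluations but steps over the best value of C and returns a sub-optimal pair, which is the behaviour the text attributes to a large search step.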

Training and Testing of Optimized SVM Model
Set the number of categories n = 4, C = 0.962920, g = 0.0038147, and use the SVM to read the training data and training labels. The test set data are then input into the trained support vector machine for prediction, and the corresponding predicted traffic state labels are obtained. Tab. 8 shows part of the test labels and predicted labels.
Only one test label deviates from the prediction label according to the result of classification, and the classification accuracy reaches 98.61%. The experimental results show that the model based on parallel genetic fuzzy clustering and SVM has high accuracy. By inputting the traffic flow parameter matrix into the SVM model, real-time discrimination can be completed, so that the real-time traffic state can be understood. Therefore, the real-time traffic state discrimination method of urban expressway established in this paper is feasible.

CONCLUSIONS
Based on the measured data, a parallel genetic fuzzy clustering algorithm is established in this paper to divide the traffic state, and the SVM model is then used to classify the traffic flow data, which yields the real-time traffic situation on a given section and provides a reference for travellers. On the one hand, the PGFCM-SVM model uses the genetic algorithm to optimize FCM and obtains a clustering method that converges quickly and finds the best value with high probability. On the other hand, using the grid search method, genetic algorithm and particle swarm algorithm for SVM optimization gives more accurate traffic state discrimination results. However, there are still many deficiencies in the research process. Firstly, in the fuzzy c-means clustering algorithm, the value of the fuzzy coefficient m has a great influence on the clustering effect. The choice of m in this paper relies only on previous research experience, according to which the clustering effect is optimal when m lies in [1.5, 2.5]. The value range of m needs to be further explored. Secondly, the weight of the three traffic flow parameters in clustering is not considered. Lastly, this paper studies the traffic status on a single road section. In follow-up research, the traffic status of multiple road sections can be analysed to explore the traffic status relationship between adjacent road sections.
$[\, S_1 \; F_1 \; O_1 \; S_2 \; F_2 \; O_2 \; S_3 \; F_3 \; O_3 \; S_4 \; F_4 \; O_4 \,]$. According to the population size, several of the above chromosomes are generated. The fitness of each chromosome is then evaluated; the fitness function is:

Technical Gazette 29, 3(2022), 731-741

Figure 1 SVM multi-classification process diagram

Four categories of the traffic status are set in this paper. First, a part of the four categories of traffic status data sets obtained by clustering is selected as categories 1, 2, 3, and 4 for training to obtain the parameters of the support vector machine model. New samples are then tested using these parameters. For example, a newly added sample A is classified by the 6 classifiers, and the output of each classifier is recorded. Each time a classifier assigns the sample to a class, one vote is counted for that class, and the traffic status of the newly added sample point A is then judged by Max vote {1 2 3 4}. The process is shown in Fig. 1.

center and membership matrix; - Termination condition: t = 30 or the difference between the centers of two generations is less than ε. The clustering centers of four traffic states are obtained:

Figure 5 Spatial distribution of traffic state based on FCM

Figure 6 Spatial distribution of traffic state based on PGFCM

Figure 7 Convergence curve of FCM algorithm

Figure 8

$n_1, n_2, n_3, n_4$ are the sample sizes of the four types of original samples; $n_1^{*}, n_2^{*}, n_3^{*}, n_4^{*}$ are the sample sizes of the misjudged samples. The results of the two clustering methods are shown in Tab. 3.
Model Based on SVM

Data Set Partition
60% of the data set was randomly assigned to the training set and the other 40% to the test set (see Tab. 4).

Figure 9 C, g parameter contour diagram

Figure 11 Iterative fitness curve of genetic algorithm

(4) Comparison of accuracy and test time of three optimization methods

Table 1 Original sample table of traffic flow data

Table 2 Data normalization results

Table 3 Comparison of misjudged cross estimates

Table 4 Training set and test set of support vector machine model

Table 7 Comparison of operation time and classification accuracy of the three algorithms

Table 8 Prediction label and test label