Calculation Method of Probability Integration Method Parameters Based on MIV-GP-BP Model

In order to guarantee the precision of the parameters of the probability integral method (PIM), starting from optimizing input and improving algorithm an algorithm integrating the genetic algorithm (GA) and particle swarm optimization (PSO) was put forward to optimize the prediction model of BP neural network and the mean impact value algorithm (MIV) was applied to optimize the input of BP neural network. The mean impact value algorithm (MIV) was applied to optimize the input of BP neural network. The measured data of 50 working faces were chosen as the training and testing sets to build the MIV-GP-BP model. The results showed that among the five parameters, the RMSE was between 0.0058 and 1.1575, the MaxRE of q, tanβ, b and θ was less than 5.42%, and the MeaRE was less than 2.81%. The RMSE of s/H did not exceed 0.0058, the MaxRE was less than 9.66% and the MeaRE was less than 4.31% (the parameters themselves were small). The optimized neural network model had higher prediction accuracy and stability.


INTRODUCTION
With China's rapid economic development, there is a growing demand for coal resources. Recently, lots of underground coal resources have been exploited, which has caused a series of environmental problems, such as surface subsidence, cracks, dust, solid waste and etc., thus posing serious threats to the production and life of mining areas.
In order to maximize the production of coal resources and reduce large-scale surface subsidence, scholars have carried out extensive research on the prediction theories of mining subsidence. PIM based on stochastic medium mechanics theory has been widely applied for mining subsidence prediction [1]. Therefore, the acquisition of parameters is very essential, and the commonly-used methods were given. The first method is to use the intelligent optimization algorithm to fit the parameters based on the measured data, such as genetic algorithm, quantum annealing method, etc. [2,3]. The second method is to apply the machine learning method, such as neural network, support vector machine, etc. [4]. The third method is to use the similarity theory to calculate the lithology of overlying rock [5]. These prediction methods have improved the accuracy of parameter determination to a certain extent.
There are complex non-linear relationships between the basic parameters and geological and mining conditions, so it is hard to describe them by using a specific mathematical model. Both methods 1 and 3 are based on the measured deformation to obtain the parameters. This often requires a large number of surface movement observation stations, which consumes a lot of human and material resources, and the observation period is long, so it is impossible to obtain the parameters of the new mining area. The aforementioned neural network technology has good autonomic learning, self-adaptive and fault-tolerant performance, and can be used to establish complex nonlinear mapping relations. It has been extensively applied in pattern classification, clustering analysis, regression fitting analysis and optimization calculation [6,7].
In order to fully use the existing mining data and establish the relations between the geological and mining conditions and the parameters of PIM, parameter acquisition method based on neural network learning has been applied extensively. Bian combines PSO algorithm and BP neural network to establish a parameter-seeking model [8]. Zhao established a combined parameter acquisition model based on grey relational analysis and BP neural network [9]. Hu builds prediction method based on improved genetic algorithm [10]. These models have achieved good prediction results. The relevant research has made full use of the measured data, which could provide ideas for the calculation of the PIM parameters in mining areas without measured data.
The prediction accuracy of BP neural network is closely related to the amount of input layers. The more the input layers, the larger the network is needed and the closer to the correct results, which reduces the prediction accuracy to some extent. Therefore, it is useful to decrease the dimensionality of the sample. The average impact value method (MIV method) is used for data dimensionality reduction based on BP neural network, currently widely used. Therefore, in this paper, the MIV method was introduced into the model to analyze the geological and mining conditions, simplify the input layer, reduce the complexity of the neural network and improve the prediction accuracy.
For BP neural network, its weight and threshold selection are random, which makes the standard BP neural network converge slowly and is easy to fall into local optimum. The training efficiency could be improved by using the intelligent optimization algorithm to optimize the parameter optimization process in the BP neural network training process. These algorithms include simulated annealing (SA), GA algorithm, PSO algorithm, and etc. The PSO algorithm is simple and efficient in these optimization algorithms, but it has poor accuracy and easily falls into local optimum. To address the issue, Benvidi proposed a GA-PSO hybrid algorithm to optimize the ANN, and used the algorithm to deal with the relationship between the absorbance data and the concentration of the analyte. The algorithm is simple, the convergence speed is fast, and the convergence precision is high [11]. Mousavi uses fuzzy GA-PSO algorithm to solve the problem of automatic guidance vehicle scheduling optimization. Through simulation experiments, it is proved that the hybrid algorithm outperforms single GA and PSO algorithms in all aspects [12]. However, there is no relevant research in the field of mining subsidence. In view of the above problems, this paper uses GP algorithm to optimize BP neural network, which is introduced into the mining subsidence field. This paper collects data from 50 working faces. Firstly, MIV method was applied to screen input variables of the BP neural network, and sort factors affecting the parameters of PIM. It could eliminate lower-order factors, simplify the input of the network and reduce the network complexity. Then the GP algorithm was adopted to optimize the weight and threshold of the BP neural network, thus establishing the MIV-GP-BP prediction model.

CONSTRUCTION OF MIV-GP-BP PREDICTION MODEL 2.1 MIV Algorithm
Proposed by Dombi et al., MIV is applied to show the variation of the weight matrix in the neural network. Later, it is considered to be one of the best indicators for evaluating the most relevant correlation in the neural network [13]. It could be applied to quantitatively evaluate importance of the respective variables for the dependent variable. The algorithm steps are given: (1) After termination of the neural network training, the single independent variable sample will be increased and decreased by a certain ratio based on the original value to form two sets of new samples P1 and P2. In this paper, the regulation rate was 10%, that is, the original sample was increased and decreased by 10% to form two new samples.
(2) The new samples P1 and P2 are taken as input layers, and their output layers T1 and T2 are trained. (3) The Impact Value (IV) of the change of the independent variable on the output could be obtained by calculating the difference between the two groups of predicted data T1 and T2. (4) The IV of the obtained sample value is averaged according to the number of observed samples, and the MIV value of the independent variable to the dependent variable is obtained (Fig. 2).
MIV values of each independent variable are sequentially calculated according to the aforementioned steps. The obtained MIV value has positive and negative points. Its symbols represent the direction of influence [14]. In this paper, the training sample data was tested several times to obtain MIV values. To improve the calculation efficiency, the number of operations for each group was 50 times, and the outliers in each group were eliminated. The MIV values of the same independent variable in each group were taken as absolute values and then added and averaged. Finally, the filtering of variables can be achieved by sorting according to the size of the obtained MIV values.

GP Algorithm Optimization BP Neural Network 2.2.1 GP-BP Model
GP algorithm optimization BP neural network is to minimize the output error to enhance the weight and threshold parameters.
Structural parameters of the BP neural network need to be encoded into the individual information of the particle swarm. The number of neurons in the input layer, the hidden layer and the output layer is x, y and z, respectively, then each neuron in the hidden layer includes x weights and one threshold information. Each neuron in the output layer includes y weights and one threshold information. The BP neural network needs to optimize I = (x + 1)y + (y + 1)z parameters. All the parameters to be optimized are expressed as D-dimensional vectors and they are used as the coding information of the particle swarm individuals.  (1) Initializing topology of the neural network; that is, setting the number of input layers, hidden layers and output layer nodes according to the problem; (2) Initializing the population; encoding the weight and threshold information to be optimized according to a specific coding rule, and initializing the population according to the set population size and initialization method; (3) Conducting iterative operation of the GP algorithm; the population obtained in Step (2) is iteratively operated according to GP algorithm rule, until the algorithm satisfies the condition. Finally, the global optimal individual is obtained, and the weight and threshold information are obtained by decoding; (4) Initializing the weights and thresholds; bringing decoded weights and threshold information to obtain the initial weights and thresholds; (5) GP-BP model operation; the repeated learning training of the network is carried out until the condition is met, and finally the output of the optimized BP neural network is obtained.

GP Algorithm
The genetic operation of GA algorithm was introduced into the PSO algorithm in this study. In the iterative process of particle swarm, the position and velocity vector of the particle were respectively cross-existed and mutated. The algorithm firstly evolved a certain algebra by PSO. Then according to the fitness value, particles were sorted from large too small. The population was divided into two parts. M particles with large fitness values were retained, and the remaining N − M particles were removed. The remained M particles were taken as the reference, and N − M individuals were obtained by copying. Next, the genetic algorithm operator operations such as cross mutation was carried out. Finally, the positional values of the M particles retained by the PSO and the N − M particles evolved by the GA algorithm were combined into a new particle swarm for the next generation of operations. Fig. 4 shows the flow chart of GP algorithm. The algorithm is given, as follows: (1) Setting of relevant parameters, including the number of particles N, the number of particles after evolution M, the probability of variation Pm , the probability of crossover Pc, the maximum velocity of particles in the particle swarm algorithm V, the inertia constant ω, the learning factors c 1 , c 2 and the number of iterations epoch, the accuracy of the solution E and etc.
(2) Initialization of the population; setting the size N of the population and the dimension n of the particle, using real-number random coding to encode the particle's velocity and position, using fitness function to obtain the individual fitness value of the particle swarm, and improving the individual extremum of the particle and the global extremum. The fitness function used was the Eq. (11).
(3) Iteration of the particle swarm; updating the particle's velocity and position information based on Eq. (1) Eq. (2), and improving individual extremum and the global extremum of the population. 1 1 1 22 where m 1 , m 2 is the accelerating factor, t is the inertia coefficient, r 1 , r 2 is a random number between [0, 1], v i , v i+1 is the current speed of the particle and the updated speed, x i , x i+1 is the current and updated position, pb i is the current optimal position of the individual extreme value, gb i is the global optimal extremum. The crossover operation of the u chromosome θ k and v chromosome θ v at the j position is given: where λ is a random number between [0, 1], θ uj , θ vj is chromosomes obtained using crossover. The operation method of mutating the j th gene of the i th individual is: the current number of iterations, H max is the largest number of evolutions, λ is a random number between [0, 1]. (6) Combining 2N/3 individuals generated by GA algorithm with N/3 particles to form N new particles, calculating the fitness of particle swarm, and updating individual extremum and global extremum.

APPLICATION EXAMPLES AND RESULTS ANALYSIS 3.1 Variable Selection by MIV Algorithm 3.1.1 Experimental Data
The literature [15] summarized the geological and mining factors of China's 408 working faces and the corresponding probability integral method parameters. In this paper, 50 working faces were selected as experimental data, in which the first 40 working face data were used as training set, and the last 10 working face data were taken as testing set to test the model's prediction accuracy.
Commonly-used parameters in the probability integration method selected in this paper include q, b, tanβ, s (can be expressed by s/H), θ. The geological and mining conditions include f, H, M,T, n,W, X (first mining uses 0 as the input variable, repeated mining adopts 1 as the input variable). Tab. 1 shows some data.

Variable Selection
Based on MATLAB program, the MIV algorithm program was programmed in this study. The adjustment rate of the MIV algorithm was set as 10%. Each probability integration method parameter was tested by 50 MIV measurement experiments. After the absolute value of the MIV value obtained in each experiment, the weighted average was performed. Then, the MIV values of the influence factors corresponding to each probability integration method parameter could be obtained, as follows: The MIV values of each influencing factor were obtained and sorted from large to small. See Tab. 3.

Table 3 Sorting results of MIV values of various influencing factors
According to the sorting results, the top five influence factors with large MIV values were selected as the input layer of the model, thereby achieving the goal of simplifying the network input and improving the prediction accuracy.

De-Noise Processing
Due to the complexity of the observation environment, the acquisition of experimental data also had certain errors, and the data fluctuation range was large. The error of the data itself was also introduced to further expand errors. Therefore, the data should be denoised before prediction to minimize errors.
The Savitzky-Golay algorithm was applied to denoise the data, which is a filtering method based on local polynomial least squares fitting. It could ensure width and shape of the signal during the denoising process, making the data relatively smooth. The comparison between the processed results and the measured results are shown in Fig.  5a to Fig. 5f. It can be seen from Fig. 5a to Fig. 5f and Tab. 4 that correlation coefficients after smoothing were all above 0.547, and the correlation coefficient was high. The smoothed experimental data was less up and down, and the curve tended to be smoother. Some points with more fluctuating were smoothed to achieve the effect of reducing the error. According to the data processing results, the correlation coefficient R between the denoised data and the measured data can be obtained, as shown in Tab The processed data and the measured data Figure 6 Comparison of predicted values before and after BP neural network optimization

Prediction Results and Analysis 3.3.1 The Results of the Model
To reduce the complex structure of the neural network, a neural network was established for each probability integration method parameter, that is, a many-to-one topology (multiple input layer and single output layer) was adopted.
The parameters were set as follows: the training times were 5000, the learning rate was 0.01 and the training target error was 0.000001. The neural network in Fig.1 used a 3-layer structure. The number of the hidden layer was determined according to the formula J = 2m + 1, where m is the number of input layers. The sigmoid function was selected as the transfer function between the input layer and the hidden layer, and the purelin function was the transfer function between the hidden layer and the output layer. In PSO algorithm, the population of the particles was 20, the maximum number of allowed iterations was 20, the acceleration factors c 1 , c 2 were 1.49445, and the maximum speed limit was 5. In the GA algorithm, the population number was 40, the crossover probability was 0.4, the mutation probability was 0.06, and the maximum number of iterations was 50.
To better verify the MIV-GP-BP model, the measured data of the first 40 working faces in Tab. 1 were taken as training set and the 41-50 working faces were taken as the testing set, meanwhile using the MIV-BP model as a predictive comparison. The results are shown in Fig. 6a to Fig. 6e. It could be seen that after optimization by GP combination algorithm, BP neural network had better generalization ability between geological and mining conditions and parameters of probability integration method. The prediction accuracy was higher than that of non-optimized BP neural network at most points.
To quantitatively analyze the prediction performance differences between the optimized BP neural network prediction model and the non-optimized BP neural network model, the accuracy evaluation indexes of the prediction values of MIV-GP-BP prediction model and MIV-BP prediction model were calculated. In this paper, the maximum relative error (MaxRE), the mean relative error (MeaRE) and the root mean square error (RMSE) of the prediction results were taken as the accuracy evaluation indexes.
In Eqs. (5), (6) and (7), x' i is the predicted value, x i is the actual value, n is the number of predicted samples.
The calculation results of the evaluation index are shown in Tab. 5. Tab. 5 and Fig. 6a to Fig. 6e showed that the MIV-GP-BP prediction model had much smaller MaxRE and MeaRE values than those in MIV-BP prediction model in the five probability integration method parameters. The RMSE could effectively reflect the degree of deviation between predicted values and true values of the model, so the predicted value of the MIV-GP-BP model with the smallest RMSE value was closer to the true value. Considering that some predicted parameter values were small, small changes would cause a large change in relative error. However, the absolute error was small, so the accuracy was still high and could meet the engineering needs.

Correctness Validation of MIV Algorithm
In order to prove the effectiveness of MIV algorithm, five factors were randomly selected from the influencing factors as the input layer of the neural network to establish the GP-BP prediction model, and then compared with the MIV-GP-BP prediction model established by the MIV algorithm as the input layer. (At least one factor of the five random factors was not screened by the MIV algorithm). In this paper, three groups of geological and mining conditions were randomly selected. The subsidence factor prediction model and the horizontal movement factor prediction model were selected as the case analysis. The grouping scheme is shown in Tab. 6.
For the above 8 sets of different input factor, the GP-BP prediction model was established by the data shown in Table 1. The prediction results are shown in Tab. 7, Tab. 8, Fig. 7 and Fig. 8.
From the prediction values of the four groups in Fig. 7 and Fig. 8

Verify the Superiority of the GP Combination Algorithm
To verify that the established GP algorithm had better computing performance, GA algorithm and PSO algorithm were applied to optimize the BP neural network prediction model (MIV-GA-BP model and MIV-PSO-BP model). The input layer data was obtained from the MIV algorithm (the input layer data was the same). The training data and the testing data were derived from Tab. 1. The parameters of the relevant algorithms were the same as those described above.
This paper only establishes the prediction model of the subsidence factor and the horizontal movement factor. The prediction results of other models were similar. To analyze the reliability of the results more intuitively, the training set and the testing set were compared and analyzed according to the built models. The results are shown in Tab. 9, Tab. 10, and Fig. 9 and Fig. 10. From Tab. 9, Tab. 10, Fig. 9 and Fig. 10, we can see that MIV-GP-BP prediction model is superior to MIV-GA-BP prediction model and MIV-PSO-BP prediction model in prediction accuracy. It proves that MIV-GP-BP neural network prediction probability integration method has better accuracy and can meet the actual requirements of engineering technology.

CONCLUSIONS
The acquisition of parameters of PIM is a hotspot and a difficult point in the data processing of surface deformation monitoring in coal mines. To solve the problem that the parameter accuracy of the traditional BP neural network prediction probability integration method is not high, this paper has proposed a MIV-GP-BP prediction model. This model used a combination of MIV algorithm and GP combinational algorithm to optimize BP neural network. The main conclusions are as follows: (1) Taking the measured data of 50 typical mining areas as an example, the denoising process and MIV algorithm were applied to optimize the input of BP neural network. The MIV algorithm was used to sort the geological and mining factors affecting the parameters of the PIM. The factors with small influence factors were eliminated and the remaining factors were taken as the input. The smooth denoising method was adopted to eliminate the accidental errors that may be contained in the measured data to replace the original input layers. The prediction accuracy was improved by analysis.
(2) The integration of the algorithm of PSO and GA was applied to optimize the weight and threshold of BP neural network. The MIV algorithm was used to decrease the topology and establish MIV-GP-BP PIM model. The prediction results showed that among the five parameters, the RMSE was between 0.0058 and 1.1575, the MaxRE of q, tanβ, b and θ was less than 5.42%, and the MeaRE was less than 2.81%. The RMSE of s/H did not exceed 0.0058, the MaxRE was less than 9.66% and the MeaRE was less than 4.31%. MIV-BP model, MIV-GA model and MIV-PSO model has better accuracy than that of BP model. Due to limited resources, the amount of data collected for modeling in this article is not enough, and more data will be added for verification in the subsequent application process.