Identification of Extreme Temperature Fluctuation in Blast Furnace Based on Fractal Time Series Analysis

In this study, we aim to estimate the density distribution for the return intervals of extreme temperature fluctuation in blast furnace during iron making process. We first identified the fractal feature of the data based on R/S analysis and also calculated the Hurst coefficient. Secondly, based on the fractal feature of the data, we estimated a stretched exponential distribution of the return intervals of extreme temperature fluctuation. Finally, in order to test the result, we applied this method to the data of two blast furnaces, and compared with the traditional kernel density estimation method. The comparison was based on 100,000 times K-S test. The comparison results showed that the fractal time series estimation provides a greater fitness than traditional estimation method since it has no rejection in K-S test. With this method, the density of return intervals of unexpected temperature fluctuation can be estimated. This can be applied as a tool of temperature control and also can be applied as a tool to evaluate the efficiency of the control system.


INTRODUCTION
During the blast furnace iron making process, the unexpected temperature fluctuation (extreme events) is universal phenomenon. Since its effects are always devastating, it is very significant to analyse its fluctuation characteristics. Recently, researchers had already researched on return intervals of extreme events in many fields. R. W. Katz and B. G. Brown applied the return intervals of extreme events analysis in climate research [1]. G. D. Luca, Z. Paola, and R. Stelzer applied this method in financial market [2]. Even in 2002, A. Bunde, J. Kropp, and H. J. Schellnbuber had already successfully applied extreme events analysis in the medical field [3]. Most of these researches are based on the long-term correlations and density distributions. If long-term correlation can be found in return intervals, the extreme events will be able to be predicted by analysing their return intervals. Being nonlinear time series, it is difficult to estimate their autocorrelation or high-order moment relations. However, in the review of J. F. Eichner, J. W. Kantelhardt, A. Bunde, and S. Havlin, it is stated that the density distribution is a very useful tool to describe non-linear system [4].
Normally, it is difficult to measure the temperature of a blast; however, it is proved that the silicon content of blast furnace (known as chemical temperature) is positively correlated to physical temperature of blast furnace. Thus, the analysis of silicon content will provide the evidence for the temperature control of blast furnace. The previous research had proved that during the iron making process, the silicon content series is non-normal, non-linear and long-term negative correlated, based on these features, C. H. Gao and his team applied a lot of models to predict the silicon content during the ironmaking process, such as: data-driven model based on Volterra series [5], data-driven time discrete models [6], multivariate phase space reconstruction and neural networks model [8], and modelling of the thermal state change of blast furnace hearth with support vector machines [10]. A. Nurkkala, F. Pettersson, and H. Saxén estimated the analyse model based on the non-linear feature [7], while S. Ueda, S. Natsui, and H. Nogami reviewed all the possible mathematical models that may provide help in this research [9]. All these researches are mostly based on the non-linear features of the data. However, considering the features that are identified by these researches, it is interesting to point out that this also shows a very notable fractal feature as the long-term negative correlation had been identified. Since the longterm correlation of return interval series is highly correlated to its original series, the extreme events of unexpected temperature fluctuation will be able to be predicted by predicting the return intervals of extreme events, Such idea had been stated by A. Bunde, J. F. Eichner, J. W. Kantelhardt, and S. Havlin [11] and the group of E. G. Altmann & H. Kantz [12] at the beginning of 20 th century. Estimating the density function of return intervals, we will be capable to speculate about the frequency of unexpected fluctuation. This is essential in smooth control of blast furnace system.
Based on the original research of 1969 by B. B. Mandelbrot and J. R. Wallis in the research of water resources [13], during 20 th century, J. W. Kantelhardt and his team had successfully improved the fractal time series analysis and tested them in many different fields such as: river run-off record research [14], heart beat research in sleep [15], breathing research in deep sleep [16]. Based on these researches, a statistical system of fractal analysis had been established [17].
In this research, we firstly measured the data by R/S analysis and identified it to be fractal. Then based on the feature of fractal time series, the silicon content series is analysed. Under different threshold, different return interval series were obtained, and then their density distribution was estimated. Meanwhile the traditional kernel density estimation method was also applied to compare with the method. Using the density distribution, we could provide certain theoretical advices to blast furnace temperature control system, which might finally contribute to the close-loop control of blast furnace system. The rest of the paper is organized as follows. Section 2 introduces the fractal time analysis, and presents the estimation of parameters. In Section 3 the method is applied into the real data from two different blast furnaces, and compared with traditional kernel density estimation method. Section 4 concludes the paper.

FRACTAL TIME SERIES ANALYSIS
For a long-term correlated series (x i ), i = 1, 2, …,N, mean of the series can be expressed as x �, standard deviation can be expressed as σ x , its auto-correlation function attenuator should follow a power law form which is illustrated below [18]: When the sample size N is limited to infinity, the average related time can be expressed as Assuming q is the threshold and the events out of the range of q are the extreme events, and the return interval between two extreme events is r q . This can be expressed as: With different thresholds, the density of r q is altered. From Fig. 1 below, it is obviously that with different threshold q 1 = 1 and q 2 = 2, the density of return intervals is altered, as the greater the threshold, the sparser the density.

Figure 1 Description of different return intervals under different threshold
With the original series (x i ), i = 1, 2,…, N, the sample mean and variance can be calculated as below: In this article the threshold q is identified as: The return interval series of extreme events under a threshold q is expressed as (r q, j ), j = 1, 2,…, N q where N q stands for the total number of the return intervals. The average return interval can be calculated: For non-auto correlated series (white noise), its return intervals of extreme events are not correlated, and follow the exponential distribution: For fractal time series, its auto-correlated function and its lag s have power-law relation as shown in (1) where γ is the auto-correlation parameter: Generating the above, the stretched exponential relationship could be estimated as: and: where H is the Hurst Coefficient. a γ and b γ are functions related to γ, what is more a γ and b γ are independent from threshold q. These two parameters can be calculated by the equations below: Based on Eq. (7): As long as b γ , R q , γ are all greater than zero, let t = b γ (r/R q ) γ so r/R q = (t/b γ ) 1/γ . Thus: Therefore, the original equation can be transferred to This equation is equal to The differential part is a gamma distribution Γ(1/γ). Thus the equation turns to: (1 ) ( Based on Eq. (8): As long as b γ , R q , γ are all greater than zero, let t = b γ (r/R q ) γ so r/R q = (t/b γ ) 1/γ . Then it can be transferred into: Then in the similar way it can be transformed into: (2 ) (2 ) 1 Solving the Eqs. (12) and (15) the two parameters can be estimated as: (1 )

SIMULATION AND DATA ANALYSIS
Based on fractal time series method, two series of raw data were analysed by applying Matlab and R. The two series of raw data were from Jiangxi Nanchang iron and steel factory blast NO.1 (short as NO.1) during the period October 2009 to November 2009 which has 1049 data and Jiangxi Xinyu steel factory blast NO.9 (short as NO.9) during the period September 2012 to July 2013 which has 1956 data.

The Estimation of Return Intervals and Parameters.
The threshold selection will determine the return interval series. In this article, the threshold q was selected as mean return interval plus or minus 1, 1.25 and 1.5 times standard deviation. For NO.1 blast furnace, the autocorrelation formula was calculated by using the cf tool function in Matlab, the result is shown below: The goodness of fit is shown below in Tab. 1. This shows that γ = 0.7887 is rational. Then the parameter of the stretched exponential distribution can be calculated by: (2 ) (2 ) (1 ) (1 ) This gives that a γ = 1.1912 and b γ = 1.3843, thus the distribution of return intervals of extreme events of NO.1 blast furnace is: For NO.9 blast furnace, the auto-correlation formula was calculated by using the cf tool function in Matlab, the result is shown below:  This shows that γ = 0.573 is rational. Then the parameter of the stretched exponential distribution can be calculated by: (2 ) (2 ) (1 ) (1 ) This gives that a γ = 2.0526 and b γ = 2.0774, thus the distribution of return intervals of extreme events of NO.1 blast furnace is:

The Comparison with Kernel Density Estimation
The fractal time series method that was applied above made an assumption that the return intervals are following a stretched exponential distribution. In order to prove the rationality of this method, a more objective method: kernel density estimation was applied to compare with the fractal time series method.
The kernel density estimation was directly done by R, in this method the bandwidths were determined by Direct Plug In method.
The estimation result shows that if the threshold is too small, the kernel density estimated density is not smooth enough. Therefore, the series with the threshold of 1.5 time standard deviation were selected.
Based on the result of kernel density estimation, a fitted distribution was estimated.
For NO.1 blast furnace the result is: About 100 data were selected from the distribution randomly, and another 100 data were selected from the raw data randomly, applying the K-S test for 100,000 times.
The test results show that there are nearly 20,000 times of rejection of null hypothesis, while the fractal time series methods show non rejection in the 100,000 times K-S test.
For NO.1 blast furnace the result is: . .
. f x x .
About 100 data were selected from the distribution randomly, and also another 100 data from the raw data were selected randomly, applying the K-S test for 100,000 times.
The test results show that there are nearly 8000 times of rejection of null hypothesis, while the fractal time series methods show non rejection in the 100,000 times K-S test.
Therefore, it can be concluded that the fractal time series method performed much better than traditional nonparameter method in illustrating the blast furnace system.

The Analysis under Log-Log Scale Coordinates System
In common coordinate system, the series of return intervals cannot be evaluated fullest. The reason is that when r→∞, theoretically Pq(r) should close to 0, however, in reality even calculated P q (r) is very small too, the differences between actual value and theoretical value cannot be seen from the common coordinate system. Therefore, in order to inspect the effectiveness of the density distribution, the log-log scale coordinates system was applied.
(1) For NO.1 blast furnace, Figs. 2, 3, 4 illustrate the density distributions under different thresholds. Blue stars are the actual value while the black curve is the density. It is obviously that the tail part of the curve has a very bad fitness, especially when threshold is calculated by mean plus or minus 1.5 times standard deviation. When r/Rq reaches a certain level, the tail part of R q P q (r) will be a fixed value. The reason is that R q (average return interval) is fixed, and P q (r) is fixed (depends on the threshold) as well. If the raw data is unlimited, the tail part of the curve should coincide with the actual value. However, in real life, the number of data is always limited; this determined that the effectiveness of the estimated density distribution will depend on the number of data and the threshold.  Log-log coordinates system for NO.9 blast furnace threshold = mean ±1.25σ

Figure 7
Log-log coordinates system for NO.9 blast furnace threshold = mean ±1.5σ (2) For NO.9 blast furnace, Figs. 5, 6 and 7 illustrate the density distributions. Compared with NO.1 blast furnace, it can be found that with more data, the fitting results will be better.

CONCLUSIONS
Fractal analysis is a very useful tool of complex system. In this study, the temperature fluctuation was found having fractal features. Based on the fractal features, we are able to estimate the density distribution of the return intervals of extreme temperature fluctuation in the iron making process. During the iron making process, the distribution could be applied as the tool to evaluate the current control system, and also could be used to predict the unexpected temperature fluctuation.
In this article, the analysis was based on two groups of data gathered from Jiangxi Nanchang iron and steel factory NO.1 blast furnace during October 2009 to November 2009 and Jiangxi Xinyu steel factory NO.9 blast furnace during September 2012 to July 2013. The results can be concluded as follows: (1) Compared with traditional kernel density estimation, the stretched exponential distribution based on fractal time series method performs much better in fitness. (2) The effectiveness of the distribution generally depends on three factors. First of all the number of raw data, more raw data would provide much better results. Secondly the threshold, if the threshold is too small, the model will provide very bad smoothness. If the threshold is too large, the length of the time series would be very short which may cause bad fitness. The third element is the range of r/Rq, in this particular situation, when 10 −0.5 < r/R q < 10 0.5 the model shows a rational fitting result.
Even this study made a progress in the fitness, however, it only considered the situation of single dimension data. Nowadays, with the development of technology, the data that are captured from the process will be greater and greater, the analysis may turn to high frequency and multidimensional data analysis in the future. We will consider the situation that the data turning will be of high frequency and multi-dimensional, such as new method to identify the data features and new models to analyse the data with these features. For instance, ANOVA method, which was introduced by M. C. Tang, D. Gong, S. Liu, and X. Lu, can be applied to find the key factors that are affecting the temperature fluctuation in the process [20] and the method applied by P. He to simulate the process under the circumstance of uncertainty [21].