Application of Soft Computing Methodologies to Predict the 28-day Compressive Strength of Shotcrete: A Comparative Study of Individual and Hybrid Models

Shotcreting is a popular construction technique with wide-ranging applications in mining and civil engineering. Compressive strength is a primary mechanical property of shotcrete with particular importance for project safety, which highly depends on its mix design. However, in practice, there is no reliable and accurate method to predict this strength. In this study, existing experimental data related to shotcretes with 59 different mix designs are used to develop a series of soft computing methodologies, including individual artificial neural network, support vector regression, and M5P model tree and their hybrids with the fuzzy c-means clustering algorithm so as to predict the 28-day compressive strength of shotcrete. Analysis of the results shows the superiority of the hybrid model over the individual models in predicting the compressive strength of shotcrete. Overall, data clustering prior to use of machine learning techniques leads to certain improvement in their performance and reliability and generalizability of their results. In particular, the M5P model tree exhibits excellent capability in anticipating the compressive strength of shotcrete.


Introduction
The importance of concrete as a construction material is indisputable. Sprayed concrete, or shotcrete, is a variant of concrete with a history of over 50 years of use in rock/soil support and stabilization (Moffat et al., 2017). Shotcrete can be described as a concrete or mortar that is sprayed through a high-pressure nozzle onto a surface at high velocity (ACI, 2005). Due to the compressed air used in the shotcreting process, hardened shotcrete has slightly different properties than ordinary concrete. Thanks to properties such as high initial strength, flexibility, and good durability, shotcrete has found extensive application in mining and construction activities (Franzen et al., 2001; Thomas, 2008; Watanabe et al., 2010). Compressive strength is an important property of shotcrete that largely depends on its mix design. Nevertheless, the quality of the mix design often depends on the experience of the shotcreting crew, who rely on costly and time-consuming trial and error to adjust the mix (ACI, 2005). Shotcrete strength is a function of many parameters, including the water-cement ratio, the quantities of fine and coarse aggregates, admixtures, etc. Because of the large number of such parameters, shotcrete strength is difficult to predict. There are several empirical relations for predicting the strength of ordinary concrete, but the validity of these models for predicting the compressive strength of shotcrete is uncertain (Abrams, 1919; Janković et al., 2011). Predicting shotcrete strength before shotcreting can save time, reduce operating costs, and improve operational planning and quality control.
Nowadays, data mining techniques are growing rapidly for data analysis in many fields of science. In line with this trend, these techniques have found widespread application in some branches of civil and mining engineering. One of these applications is the prediction of the mechanical and physical properties of cement-based materials. The ability of machine learning methods to predict concrete properties such as the compressive strength of high-performance concrete, the tensile strength of steel fiber reinforced concrete, the elastic modulus of self-compacting concrete, etc. has been extensively researched (Behnood et al., 2015; Cheng et al., 2014; Golafshani & Ashour, 2016; Yücel & Özel, 2012). In the case of compressive strength, the data mining techniques used for prediction include artificial neural networks, support vector machines, classification and regression trees, neuro-fuzzy inference, and genetic programming. A summarized list of major publications in this particular line of research is provided in Table 1. In the case of shotcrete, however, the progress of research has been limited by the difference in properties and re-
Rudarsko-geološko-naftni zbornik i autori (The Mining-Geology-Petroleum Engineering Bulletin and the authors) ©, 2021, pp. 33-48, DOI: 10.17794/rgn.2021.5.4
Table 1 notes: water-total material ratio, w/c: water to cement ratio, and mc: micro-silica.
Clearly, given the differences in the constituent parts of cement-based materials in different parts of the world, these techniques are not powerful enough to be extended to all kinds of concrete.
Using a preliminary phase of data clustering based on the greatest similarity between records, prior to data processing with artificial intelligence techniques, may improve the prediction power and precision of data mining methods. Despite the widespread use of shotcrete in mining and construction activities during the past decades, there is still no strong quantitative method for realistically predicting the compressive strength of shotcrete based on its mix design. Therefore, this study aimed to develop a probabilistic model for predicting the 28-day compressive strength of shotcrete. This aim was pursued by the use of three artificial intelligence techniques, namely artificial neural network, support vector regression, and M5P model tree, as well as their hybrids with a clustering method called fuzzy c-means (FCM). In the hybrid models, the aforementioned artificial intelligence techniques are applied to the clusters constructed by FCM rather than to the entire dataset. In the end, a comparison is made between the performance of the individual models and their hybrids in predicting the compressive strength of shotcrete. The experimental data used in this research is a dataset compiled from the records of 59 laboratory tests conducted during the construction of the Karun-3 dam project in Iran.

Methodology
Machine learning is a major branch of artificial intelligence that utilizes learning methods to recognize complex patterns in experimental data (Taffese & Sistonen, 2017). It has been successfully used for the simulation of material behavior in a variety of fields (J.-S. Chou et al., 2014). In this study, the compressive strength of shotcrete is predicted with three machine learning methodologies: artificial neural network, support vector regression, and M5P model tree, with FCM used in advance to cluster the data. A brief description of the methods utilized for prediction is provided in the following sections.

Artificial neural network (ANN)
An artificial neural network is an information processing system with functional characteristics similar to biological neural networks. Being generalizations of mathematical models of the human brain or neurobiology, artificial neural networks are based on the following assumptions: (1) information is processed in a large number of simple elements called neurons; (2) signals are transmitted between neurons along communication links; (3) each communication link has a given weight, which in a general neural network is multiplied by the transmitted signal; and (4) every neuron applies an activation function (usually a nonlinear function) to its input (the sum of weighted input signals) in order to determine its output signal (Friedman & Kandel, 1999). ANNs are characterized by (1) the pattern of the relationship between neurons (architecture), (2) how connections are weighted (training method or algorithm), and (3) the activation (transmission) function. Neural networks are particularly useful for modeling phenomena for which there is no specific definition or clear understanding of the internal processes (Beale & Jackson, 1990; Fausett, 1994). The feed-forward neural network is the best-known variant of ANN and has widespread application in many applied sciences. Generally, feed-forward networks, which are also known as multilayer perceptrons (MLP), consist of one input layer, one or more hidden layers, and one output layer (Adhikary & Mutsuyoshi, 2006).
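The assumptions listed above can be illustrated with a forward pass through a feed-forward network with one hidden layer. This is only a minimal sketch: the sigmoid activation, the layer sizes, the weight values, and the names `sigmoid` and `mlp_forward` are illustrative assumptions and are not taken from the models trained in this study.

```python
import math


def sigmoid(z):
    """Logistic activation applied to a neuron's weighted input sum."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: sigmoid hidden layer followed by one linear output neuron."""
    hidden = [
        # Each hidden neuron sums its weighted inputs, adds a bias,
        # and passes the result through the activation function.
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    # The output neuron is linear: a weighted sum of hidden outputs plus bias.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical toy network: 2 inputs, 2 hidden neurons, 1 output.
w_hidden = [[0.5, -0.5], [1.0, 1.0]]
b_hidden = [0.0, 0.0]
w_out = [2.0, -2.0]
b_out = 1.0
y = mlp_forward([0.0, 0.0], w_hidden, b_hidden, w_out, b_out)
```

Training (for example by backpropagation) would adjust `w_hidden`, `b_hidden`, `w_out`, and `b_out`; the sketch only shows how a trained network turns inputs into a prediction.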

Support vector regression (SVR)
First developed in 1995 by Vapnik, the support vector machine is a supervised learning model with two variations: support vector regression (SVR) and support vector classification (SVC) (Cortes & Vapnik, 1995). SVR is known for its substantial ability to solve nonlinear problems and has been successfully used for this purpose in various fields (Ghasemi, Kalhori, & Bagherpour, 2016). The core concept of SVR is to map the input data to an n-dimensional feature space by means of a nonlinear mapping, which is usually defined through a kernel function (Golafshani & Behnood, 2018). As a result, a nonlinear solution in the lower-dimensional input space corresponds to a linear solution in the higher-dimensional feature space. The kernel function used for mapping can be, for example, a linear kernel function, a polynomial kernel function, a radial basis function (RBF), etc. (Gunn, 1998). In highly nonlinear spaces, RBF usually yields better results than other kernel functions, so it is also more suitable for the purpose of this study. The generalizability of SVR results depends strongly on its learning parameters, such as the penalty factor (C) and the width of the radial basis function kernel (g).
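The three kernel families named above can be written in a few lines. The sketch below uses common textbook forms; the exact parametrization used in the study is not given in the text, so the `degree`, `coef0`, and `gamma` arguments and their defaults are assumptions. The RBF form exp(-gamma * ||x - y||^2) is the one used by LIBSVM, where `gamma` plays the role of the width parameter g.

```python
import math


def linear_kernel(x, y):
    """Plain inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree; parameters are assumed."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

The RBF kernel equals 1 for identical points and decays toward 0 as the points move apart; a larger `gamma` makes this decay faster, which is why the kernel width has to be tuned together with C.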

M5P model tree algorithm
The concept of a model tree called M5 was first introduced in 1992 by Quinlan as a new learning model for prediction problems (Quinlan, 1992). Model trees provide a structural representation of the data and a piecewise linear fit of the class. They are in fact a generalized form of the decision tree or regression tree in which discrete class labels or numerical values in the leaves are replaced by linear regression functions. These models are particularly suitable for handling large data sets with a high number of features and dimensions. The prediction accuracy of model trees is comparable to that of other data mining techniques, such as ANN and CART. However, the real advantage of model trees is their ability to describe the inherent patterns of relationships in the data with the help of rules and regression equations; this ability is absent in other intelligent models, such as ANN and SVM, where those relations remain hidden. Even though model trees are simple, they are a robust and accurate method for simulating the patterns and relationships in large data sets. The splitting criterion is the minimization of the variation within each subset, which is measured by the standard deviation of the values of the instances that reach a node through the branches, starting from the root. This is achieved by calculating the expected reduction in error from testing each attribute at the node and selecting the attribute that maximizes the expected error reduction. The splitting process stops when the output values of the instances that reach a node vary by less than 5% of the standard deviation of the original dataset, or when only a few instances remain.
The standard deviation reduction (SDR) is calculated by Equation 1 (Khoshnoudian et al., 2013):

$$\mathrm{SDR} = sd(T) - \sum_{i} \frac{|T_i|}{|T|}\, sd(T_i) \qquad (1)$$

where $T$ is the set of records that reach the node, $T_i$ are the sets that result from splitting the node according to the chosen attribute, and $sd$ denotes the standard deviation (Wang & Witten, 1997). In the next step, M5P calculates, for every interior node, a multiple linear regression model based on the values pertaining to that node and all the attributes that participate in tests in the subtree rooted at that node. Then, the linear regression models are simplified by removing attributes if this results in a lower expected error for future data. After this simplification, a pruning technique is used to overcome the over-training problem. The tree is pruned from the leaves if SDR for the linear model in the root of a sub-tree is smaller than or equal to the expected error for the sub-tree. In the final step, a smoothing process is performed to compensate for sharp discontinuities that may occur between the adjacent linear models in the leaves of the pruned tree. This smoothing process often improves the prediction, especially for models based on training sets containing a small number of instances (Bonakdar & Etemad-Shahidi, 2011).
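The splitting step can be made concrete with a short sketch of the standard deviation reduction, SDR = sd(T) - sum(|T_i|/|T| * sd(T_i)), and a search for the best threshold on a single numeric attribute. This is an illustration under stated assumptions, not the WEKA implementation: the population standard deviation is used, candidate thresholds are taken midway between consecutive sorted values, and the function names and toy data are hypothetical.

```python
import statistics


def sdr(parent, subsets):
    """Standard deviation reduction of splitting `parent` into `subsets`."""
    n = len(parent)
    weighted = sum(len(s) / n * statistics.pstdev(s) for s in subsets)
    return statistics.pstdev(parent) - weighted


def best_split(values, targets):
    """Return (threshold, sdr) of the best binary split on one attribute."""
    pairs = sorted(zip(values, targets))
    best = (None, 0.0)
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate equal attribute values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [t for _, t in pairs[:k]]
        right = [t for _, t in pairs[k:]]
        gain = sdr(targets, [left, right])
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Toy data: the target jumps between attribute values 2 and 3.
threshold, gain = best_split([1, 2, 3, 4], [10, 10, 30, 30])
```

In a full model tree this search would be repeated over every attribute at every node, and the attribute and threshold with the largest SDR would define the split.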

Fuzzy c-means (FCM)
The fuzzy c-means (FCM) algorithm, first developed by Dunn and then improved by Bezdek, is one of the best-known and most widely used fuzzy clustering techniques (Bezdek, 1981; Dunn, 1973). The primary motive for the development of FCM was to address the deficiency of the hard k-means algorithm in working with overlapping groups (Silva Filho et al., 2015). Therefore, in accordance with fuzzy logic, each data point has a membership value in the interval [0, 1] for each cluster and can belong to two or more clusters (Ren et al., 2019). In FCM, each cluster is described by its center, and the distance between a point and a cluster is measured by the Euclidean distance. FCM relies on three basic elements: a set of prototypes V, a fuzzy partition matrix U, and an objective function J(U, V). The method operates by minimizing the objective function in Equation 2:

$$J(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, \lVert x_j - v_i \rVert^{2} \qquad (2)$$

where $c$ is the number of clusters, $n$ is the number of data points, $x_j$ is the jth measured data point or object, $v_i$ is the center of cluster i, $u_{ij}$ is the membership degree of $x_j$ with respect to cluster i, $m$ is a weight exponent controlling the degree of fuzzification, and $\lVert \cdot \rVert$ is the Euclidean norm, which represents the similarity between any measured data point and the center. In FCM, the minimization is performed by an iterative algorithm. In each iteration, the values of $u_{ij}$ and $v_i$ are updated by Equations 3 and 4:

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert} \right)^{2/(m-1)}} \qquad (3)$$

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m}\, x_j}{\sum_{j=1}^{n} u_{ij}^{m}} \qquad (4)$$

Once FCM processing is complete, the membership degrees decide which data point belongs to which cluster. Each point belongs to every cluster with a certain membership degree, but the cluster with the highest membership degree is taken as the actual cluster of that point (Esme & Karlik, 2016).
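The iterative procedure described above can be sketched in plain Python. The sketch alternates the standard FCM updates (centers as membership-weighted means, memberships from relative distances to the centers) until the memberships stop changing. It is not the MATLAB toolbox used in the study: the random initialization, the stopping tolerance, the small distance floor, and the toy data are all assumptions made for illustration.

```python
import math
import random


def fcm(points, c=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means; returns (centers, memberships u[i][j], hard labels)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each point's degrees sum to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for j in range(n):
        total = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= total
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            sw = sum(w)
            centers.append(
                [sum(w[j] * points[j][d] for j in range(n)) / sw for d in range(dim)]
            )
        # Membership update from the ratios of distances to all centers.
        shift = 0.0
        for j in range(n):
            # Floor the distance to avoid division by zero at a center.
            dists = [max(math.dist(points[j], ctr), 1e-12) for ctr in centers]
            for i in range(c):
                new = 1.0 / sum((dists[i] / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                shift = max(shift, abs(new - u[i][j]))
                u[i][j] = new
        if shift < tol:
            break
    # Hard assignment: the cluster with the highest membership degree.
    labels = [max(range(c), key=lambda i: u[i][j]) for j in range(n)]
    return centers, u, labels


# Toy data: two well-separated groups of three points each.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centers, u, labels = fcm(points, c=2, m=2.0)
```

With `m` close to 1 the memberships approach a hard partition, while larger values make the clusters overlap more.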

Data Preparation and Description
As mentioned earlier, this study uses a dataset compiled from the records of 59 laboratory tests, each with three replicates, conducted on different shotcrete mix designs during the construction of the Karun-3 dam project in Iran. In these data, five mix design parameters, namely the quantities of cement, water, fine aggregates, coarse aggregates, and micro silica, were used as input variables, and the 28-day compressive strength of shotcrete was considered as the output variable. The statistical description of these variables is provided in Table 2, and their histograms are shown in Figure 1.
To investigate the prediction ability of the developed models, the data were randomly divided into two separate groups, one for training and another for testing. The training dataset consisted of 47 (80%) input-output pairs, which were used as training instances. The testing dataset consisted of 12 (20%) input-output pairs, which were withheld from the training process and were used only at the testing stage to gauge the prediction ability of the models.
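A random 80/20 partition of 59 records into 47 training and 12 testing instances can be sketched as follows. The function name, the seed, and the rounding rule are assumptions; the study does not report how the random division was generated.

```python
import random


def train_test_split(n_records, train_fraction=0.8, seed=42):
    """Shuffle record indices and cut them into training and testing parts."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    n_train = round(train_fraction * n_records)
    return indices[:n_train], indices[n_train:]


# 59 mix designs, as in the dataset described above.
train_idx, test_idx = train_test_split(59)
```

Keeping the seed fixed lets the same partition be reused for every model, so that differences in the results reflect the models rather than the split.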

Performance Evaluation Criteria
To estimate the prediction accuracy of the models, their outputs needed to be compared with the actual values measured in the tests. This comparison was made based on three statistical measures: the coefficient of determination ($R^2$), the Mean Absolute Percentage Error (MAPE), and the Root Mean Squared Error (RMSE). The coefficient of determination represents the degree of similarity between the predicted and measured values. The closer $R^2$ is to one, the better the prediction power. The MAPE is another measure of prediction accuracy. In their standard forms, these measures are:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (T_i - P_i)^2}{\sum_{i=1}^{N} (T_i - \bar{T})^2}$$

$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{T_i - P_i}{T_i} \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (T_i - P_i)^2}$$

where $T_i$ denotes the measured values, $P_i$ denotes the predicted values, $\bar{T}$ is the mean of the measured values, and $N$ is the total number of input data.

Modelling and results
The compressive strength of shotcrete is a function of several parameters, including the amount of different components, curing conditions, type of cement, type and quality of aggregates, admixtures and plasticizers, etc., which all need to be incorporated into the model for it to make accurate predictions. However, this is almost impossible because of the diversity and variability of these parameters, which would require a large number of inputs and in turn lead to overlearning and excessive complexity of the model. Therefore, for modelling, we selected some of the primary determinants of the compressive strength of shotcrete, namely the quantities of cement, water, fine aggregates, coarse aggregates, and micro silica. Given our objectives, which were to examine whether intelligent methods can predict the compressive strength of shotcrete and to assess the effect of a preliminary data clustering phase on prediction accuracy, the development of the models is described in two sections: one dedicated to individual models and another to hybrid models. Finally, all models are compared based on the aforementioned statistical measures.

Development of individual models
This section describes the development of three machine learning models used as benchmark models, namely artificial neural network (ANN), support vector regression (SVR) and M5P model tree.

Artificial neural network
ANN models can be developed with various algorithms and topologies. An ANN architecture consists of an input layer, an output layer, and hidden layers, each containing a number of neurons linked through weighted connections. The number of neurons in the input and output layers is equal to the number of input and output variables. However, the number of hidden layers and the number of neurons in each hidden layer are variable and greatly affect the performance of the model. Using a single hidden layer reduces the complexity of the model. Determining the number of hidden layer neurons is an important and sensitive part of the development of an ANN model. This issue has been investigated by numerous researchers, who have proposed several methods and equations for this purpose (Caudill, 1988; Hecht-Nielsen, 1989; Kaastra & Boyd, 1996; Kanellopoulos & Wilkinson, 1997; Ripley, 1993). Considering 2Ni+1 as the maximum number of neurons required in the hidden layer, where Ni is the number of input variables, we tested the model with different numbers of neurons and chose the one with the best performance. The schematic structure and general characteristics of the ANN model used in this study are presented in Figure 2 and Table 3, respectively. As can be seen, the single hidden layer of this ANN model contains 7 neurons. The ANN model was implemented in MATLAB. As noted above, approximately 80% of the data was used to train the model and the remaining 20% was reserved for testing. In Figure 3, the outputs of this ANN model are compared with the measured compressive strength values.

Support vector regression
Like the ANN model, the SVR model was developed with five input parameters (quantities of cement, water, fine aggregates, coarse aggregates, and micro silica) and one output parameter (28-day compressive strength). The LIBSVM toolbox (Chang & Lin, 2011) was used to develop the SVR model in the MATLAB environment. Given the superiority of the RBF kernel over the alternatives, the model was developed with this kernel. The learning parameters, namely the penalty factor and the width of the RBF kernel function, were determined via a grid search coupled with cross-validation. The model was trained with 47 data instances and evaluated using 12 data instances. In Figure 4, the outputs of this SVR model are compared with the measured compressive strength values.
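The grid search with cross-validation mentioned above can be outlined as follows. The sketch only shows the search logic: the number of folds, the power-of-two grids, and the `cv_error` callable (which would train an SVR on the training folds and return the average validation error for a given pair of parameters) are hypothetical and are not reported in the study. A toy error surface stands in for the real cross-validation error so that the example runs on its own.

```python
import itertools
import math


def k_fold_indices(n, k):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


def grid_search(c_values, g_values, cv_error):
    """Return the (C, g) pair with the lowest cross-validated error."""
    return min(itertools.product(c_values, g_values), key=lambda p: cv_error(*p))


# Hypothetical exponential grids for the penalty factor C and kernel width g.
c_values = [2 ** e for e in range(-2, 9, 2)]
g_values = [2 ** e for e in range(-8, 3, 2)]


def toy_error(c, g):
    """Stand-in for a real cross-validation error; minimal at C=16, g=0.25."""
    return (math.log2(c) - 4) ** 2 + (math.log2(g) + 2) ** 2


best_c, best_g = grid_search(c_values, g_values, toy_error)
folds = list(k_fold_indices(47, 5))
```

In practice `toy_error` would be replaced by a function that loops over `k_fold_indices`, fits the SVR on each training part, and averages the error on the held-out part.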

M5P model tree
The M5P model tree for predicting the compressive strength of shotcrete was run in the machine learning software WEKA. The optimum value of the threshold called minNumInstances, which represents the minimum number of instances allowed at each leaf and has a significant impact on the model performance, was obtained by trial and error. The M5P model tree was developed in two modes, pruned and unpruned, with minNumInstances set to 6. Like the other models, the M5P model tree was trained with the training dataset and then evaluated with the test dataset. Diagrams of the pruned and unpruned M5P model trees for prediction of the compressive strength of shotcrete are plotted in Figures 5 and 6. As can be seen, the unpruned and pruned trees consist of 15 and 3 linear regression models, respectively (see Table 4). Unpruned model trees often have an excessive number of leaves, which complicates the analysis and may result in overlearning and reduced generalizability. In this condition, pruning the tree by merging some of the sub-trees simplifies the model, making it more generalizable, but may slightly reduce the prediction accuracy. Scatter diagrams of the meas-

Development of hybrid models
Generally, most studies in the field of data mining have applied individual learning techniques with minor modifications to construct single models. However, hybrid models, or combinations of two or more techniques, have proven superior to many individual models (J.-S. Chou et al., 2014; Frosyniotis et al., 2003). The strategy adopted in this study is to combine unsupervised FCM with supervised learning methods, namely ANN, SVR, and M5P Tree, in a parallel setup to obtain a new group of hybrid models. The block diagram of this hybrid model is shown in Figure 9. As can be seen, the dataset is first partitioned by the FCM algorithm into several clusters with similar characteristics, and then ANN, SVR, and M5P Tree are applied separately to each cluster. In view of the volume of data, and after evaluating the performance of the learning models with different numbers of clusters, the number of clusters was set to 2. Data clustering was performed using the FCM toolbox in MATLAB. The train-and-test technique, which is one of the most common approaches to establishing learning algorithms for a given database, is also used to develop the hybrid models (Ghasemi et al., 2017). Accordingly, the dataset was divided into two clusters, the first consisting of 35 records (28 assigned to the training set and 7 to the testing set), and the second consisting of 24 records (19 assigned to the training set and 5 to the testing set). It should be noted that the training and testing datasets used in this phase are similar to the training and testing datasets used during the development of the individual models.
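The parallel setup described above, in which one learner is trained per cluster and each record is handled by the model of its own cluster, can be sketched as follows. Several parts are assumptions made for illustration: the cluster centers are taken as given, records are routed to the nearest center (which, for FCM memberships, is the cluster with the highest membership degree), and `MeanRegressor` is only a placeholder learner standing in for ANN, SVR, or M5P.

```python
import math


class MeanRegressor:
    """Placeholder learner that predicts the mean of its training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def nearest_cluster(x, centers):
    """Index of the cluster whose center is closest to record x."""
    return min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))


def fit_hybrid(X, y, centers, make_model):
    """Train one model per cluster on the records assigned to that cluster."""
    models = []
    for i in range(len(centers)):
        idx = [j for j, x in enumerate(X) if nearest_cluster(x, centers) == i]
        models.append(make_model().fit([X[j] for j in idx], [y[j] for j in idx]))
    return models


def predict_hybrid(X, centers, models):
    """Route each record to the model of its cluster and collect predictions."""
    return [models[nearest_cluster(x, centers)].predict([x])[0] for x in X]


# Toy one-dimensional data with two obvious groups and hypothetical centers.
X = [[0.0], [1.0], [10.0], [11.0]]
y = [1.0, 3.0, 20.0, 22.0]
centers = [[0.5], [10.5]]
models = fit_hybrid(X, y, centers, MeanRegressor)
predictions = predict_hybrid([[0.2], [9.0]], centers, models)
```

Because each model only sees records from one cluster, it can fit a simpler local relationship; the price is that each model is trained on fewer records.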

FCM-ANN
Given the split of the dataset into two clusters, a feed-forward backpropagation ANN model was constructed for each cluster separately. The architecture of these networks is similar to that of the individual ANN model, consisting of one hidden layer composed of sigmoid neurons and a linear output layer. The number of hidden neurons used in the models of both clusters was 8. The model constructed for each cluster was tested with the testing dataset of the same cluster. In Figure 10, the final results of the hybrid model are compared with the measured compressive strength values.

FCM-SVR
For each cluster, the intelligent SVR model was developed using the training dataset of that cluster. Each model was then evaluated using the testing dataset of the same cluster. The results obtained from these models are presented and compared in Figure 11.

FCM-M5P
For each cluster, the FCM-M5P hybrid model was developed in two modes, pruned and unpruned. First, the unpruned M5P tree (see Figure 12) was implemented for both first and second clusters with minNumInstances set to 4 and 6, respectively. The results of this model are presented in Figure 13. Then, the models were pruned for more simplicity. The model trees obtained in this step are shown in Figure 14. The results of evaluation of these models with the testing dataset are presented in Figure 15.

Analysis of results
Performance of the aforementioned models was evaluated based on the criteria described in section 2.6. The values of each statistical measure for the above models are presented in Table 5.
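The three criteria (R², MAPE, and RMSE) can be computed with a few lines of code. The sketch below assumes the common definitions: R² as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the measured values, MAPE as the mean absolute relative error in percent, and RMSE as the square root of the mean squared error. The function names and the example numbers are illustrative and are not results from Table 5.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(measured) / len(measured)
    ss_res = sum((t - p) ** 2 for t, p in zip(measured, predicted))
    ss_tot = sum((t - mean_t) ** 2 for t in measured)
    return 1.0 - ss_res / ss_tot


def mape(measured, predicted):
    """Mean absolute percentage error (measured values must be non-zero)."""
    total = sum(abs((t - p) / t) for t, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


def rmse(measured, predicted):
    """Root mean squared error, in the units of the measured quantity."""
    total = sum((t - p) ** 2 for t, p in zip(measured, predicted))
    return math.sqrt(total / len(measured))


# Made-up strengths, only to show the calls.
measured = [20.0, 40.0]
predicted = [22.0, 38.0]
scores = (r_squared(measured, predicted), mape(measured, predicted), rmse(measured, predicted))
```

A good model combines an R² close to one with low MAPE and RMSE values; RMSE penalizes large individual errors more strongly than MAPE does.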
The results show that the ANN models have a weaker performance than the other two. The SVR model outperforms the ANN model, not only in accuracy, but also in execution time and memory consumption. However, the best performance among the models has been achieved by the M5P model trees. In these trees, pruning has slightly reduced the accuracy, but has also significantly reduced the tree size and the number of operations necessary to predict the compressive strength. The most important advantage of the M5P model tree over the other two methods is its ability to create a simple tree structure with linear models in the leaves, which can explicitly describe the relationship between the input and output parameters. In other intelligent data mining techniques, once the model is constructed, the relative importance of its inputs needs to be determined through a sensitivity analysis. In decision trees, however, the top-down structure of the tree also reveals the importance of the parameters, as the parameters placed at a higher position participate in the final prediction of a larger portion of the input instances. A comparison of the hybrid models with their corresponding individual models shows the higher accuracy of the hybrids. In general, it can be claimed that a phase of data clustering improves not only the prediction performance of these models, but also their generalizability, and thus their applicability to a wider range of projects.

Summary and conclusion
An accurate estimation of the compressive strength of shotcrete is important for construction and mining projects. In this study, we investigated the ability of various data mining techniques, including artificial neural network, support vector regression, and M5P model tree, to predict the 28-day compressive strength of shotcrete, as well as the effect of a data clustering phase prior to the data mining technique on the prediction accuracy. To achieve this purpose, the mentioned techniques were used individually to develop a series of standalone prediction models. Then, these techniques were combined with FCM clustering to construct a series of hybrid models. The results of the hybrid models were compared with the outputs of the corresponding individual models. The comparisons showed that, among the tested individual models, the M5P model tree has the highest accuracy in predicting the compressive strength of shotcrete. Apart from the superior accuracy, the most important advantage of this model over the other two is its ability to derive the linear regression relations between input and output data. Analysis of the results also showed the better prediction power of the SVR model as compared to the ANN model.
This study showed that the application of a data clustering phase prior to soft computing techniques can significantly improve the performance of models in anticipating the compressive strength of shotcrete. In general, the hybrid models managed to outperform the individual models. This superiority of hybrid models was particularly significant in the case of FCM-ANN and FCM-SVR in comparison with their individual counterparts. In addition to performance improvement, other benefits of data clustering phase as we described in this study include better generalizability and applicability to other projects.

Data Availability Statement
All data and models that support the findings of this study are available from the corresponding author upon reasonable request.