Algerian northwestern seismic hazard evaluation based on the Markov model

The main aim of this paper is to develop a markovienne model for the evaluation of seismic hazard in the north-western part of Algeria. A region that accommodates from moderate to strong seismic activity (ML ≥ 2.5). This work is an attempt to conceive a stochastic model of the earthquake occurrences in order to assess the seismic hazard based on the use of a discrete time Markov chain with a fi nite state model. The presented model is applied on a complete data sample comprising most of the earthquakes that occurred in the Algerian northwestern area located between latitudes (34°N, 37°N) and longitudes (2°W, 3°E) since 1928 up to now (2018). The Markov chain is built over a homogeneous and completed catalogue, then the transition probability matrix of the chain is used to simulate the occurrences of the earthquakes in the coming decades. The results are compared to a classic Poisson model.


Introduction
Algeria is located in the northern part of the African plate facing the Eurasian plate. Thus, the northwestern region of Algeria is stuck between the Nubian and Eurasian tectonic plates with a very complex boundary between them, which is converging along a N-S to NNW-SSE direction since at least the early Quaternary period (McKenzie, 1972), (Philip, 1987). Based on paleo-seismological data (Meghraoui, 1988) and on the study of source mechanisms (Meghraoui et al., 1996) the motion rate is evaluated 4-6 mm per year. This convergence geodynamic process produces a large band of about 100-150 km, composed of Neogene deposits and deformed quaternary, comes into sight in the northern region of Algeria as the Tellian chain (Philip and Thomas, 1977), (Philip and Meghraoui, 1983), (Thomas, 1985). More precisely, the Tellian chain is formed by folds lying NE-SW which are organized in an echelon system, probably caused by the presence of deep E-W strikes slip (Bouhadad and Laouami, 2002). When compared to other countries in North Africa, the seismicity of Algeria is moderate to strong, scattered and concentrated in its northern part. Seismic activity is induced by Nubian-Eurasian convergence accumulated particularly in the Tellian chain, where the coastal part of this chain is the most active (Déverchère et al., 2005). According to historical documents and instrumental measurements, sev-eral strong earthquakes took place within the north border of the country.
Thus, a considerable amount of seismic activity observed in the northwest of Algeria is due to the convergence area between the Eurasian and African plates. An important number of shakes recorded during the last decades have been highly felt, and a number of them have been damaging, for instance the Oran earthquake of October 9, 1790 of intensity X (MSK scale), where about 3,000 human lives were claimed (Buforn et al., 2017), and the Beni-Chougrane August 18, 1994 earthquake of intensity VIII (M s = 5.6), which caused 171 deaths, and the collapse of about 1,000 constructions (CGS, 1995). Recall also the 1980 El-Asnam earthquake which occurred on October 10 at 13:25:25 local time with a surface wave magnitude of M s = 7.3. It was the largest earthquake in Algeria, and was followed three hours later by an aftershock of magnitude 6.2. Both events caused considerable damage with at least 2,600 killed and 8,300 injured (Dewey, 1991). This history shows that the northwestern region of Algeria is exposed to signifi cant seismic risk and is considered as an active area in Algeria. Especially the fact that this region includes the city of Oran, the largest and the most populated city in the northwestern region of Algeria.
This paper presents a discrete-time Markov chain to model the seismic activity in northwestern region of Algeria in order to infer the past seismic events to evaluate probabilistic earthquake forecasting results. The transition probabilities of the chain are estimated along with its relevant measures, resulting in the calculation of earthquake occurrence probabilities. The model is based on the local magnitude, date, time, and epicenter location of each past seismic event.
The rest of this paper is organized as follows. Section 2 presents the methodology and details of all the elements of the suggested model. After that, Section 3 describes and analyses the data set that will be used. It includes the homogenization and the completeness of the data. Then, Section 4 explains how the studied region is divided into two principal zones. Next, Section 5 justifi es the choice of a time unit of 248 days to sample the seismic event data of the compiled catalog. Finally, Section 6 discusses the obtained results based on the Markov chain.

Methodology
The use of probabilistic methods in the study of seismic activity has been greatly considered in literature. Several studies show that two main methods are used: the Poisson process that is more adapted to frequent events (Cornell, 1968), (Cornell and Vanmarcke, 1969), (Schenková and Kárník, 1970), (Liu and Fagel, 1972), (Merz and Cornell, 1973), (Caputo, 1974), (Der Kiureghian and Ang, 1977), and the Markov process which is adapted for the analysis of less frequent random events (Vagliente, 1973), (Veneziano and Cornell, 1974), (Lomnitz-Adler, 1983), and more recently (Beyreuther and Wassermann, 2008), (Beyreuther et al., 2012), (Votsi et al., 2013), (Quang et al., 2015). The choice of the Markov model for the analysis of seismic risk is based on the fact that, in general, an earthquake is a direct result of the earthquake that precedes it (Vere-Jones, 1966), (Knopoff, 1971). Additionally, as enough data is available nowadays, methods based on patternrecognition are recently very popular among research work (Peresan et al., 2005).
Moreover, scholars active in this fi eld, as exact prediction is extremely diffi cult, are only trying to estimate a probable time, a probable location, and a probable magnitude in which a future event is likely to happen. Therefore, research on methods of prediction focus on empirical data analysis, with two principal approaches: 1. Identifying distinctive precursors to earthquakes: with potential utility for short-term earthquake prediction or forecasting. These methods include the analysis of: animal behaviour (  . The method described here is based mainly on the use of the Markov approach for the analysis of seismic events. This method essentially uses the closest event to estimate when the event will happen in the future. One of the goals of seismic activity analysis of a region is to predict future earthquakes. Thus, building a Markov chain based on the seismic events of the past allows a probabilistic prediction of future events (Serpil and Celebioglu, 2011), (Chambers et al., 2012), (Chen and Liu, 2013), (Votsi et al., 2013).
The following subsections detail the elements and the necessary parameters used to construct the Markov chain associated with the studied region.
This section introduces the theoretical concept of the Markov chain. Markov processes are named after their discoverer, Andreï Markov (Markov, 1906). A Markov process is a stochastic process having the property to predict the future only by knowing the present. The choice of this model for the analysis of seismic risk is based on the fact that, in general, an earthquake is a direct result of the earthquake that precedes it (Vere-Jones, 1966), (Knopoff, 1971). The following sections focus on the Markov chain in discrete time on a discrete state space.
A discrete time Markov process is a sequence X 0 , X 1 , … X t of random variables with values in the state space E. The characteristic property of a Markov chain is: predicting the future from the present cannot be more precise by considering the past, because all the useful information for the prediction of the future is contained in the present state of the process. In general, this property is expressed by the following formula: t ≥ 0,  (e 0 , …, e t , e)  E t+2 : (1) Where Pr denotes the probability of an event.
It is assumed that most often Markov chains are homogeneous, i.e. the transition mechanism does not change over time. The homogeneity property is expressed as follows: Notice that only homogeneous Markov chains are considered in this work. Let ε be the magnitude threshold from which a seismic event is taken into account or not. Let H be a seismic catalogue for a given region R.
And H is the list of all the seismic events observed in the region R during a period let's say T. Each seismic event v is represented by its position p v , its magnitude M v , and the date of its occurrence t v .
The period T should be divided into identical units of time, during which the seismic activity is observed. The time unit is denoted as ∆t. Assume that the studied region is divided into n zones z 1 , z 2 , …, z n . Thus the region R can be seen as a system of n zones. During the time unit ∆t, each zone has two possible states, either 0 or 1: • z i = 0 if the i-th zone is seismically quiet during the time interval ∆t: all the magnitudes of seismic events observed during ∆t are strictly lower than the threshold magnitude ε. • z i = 1 if the i-th zone is seismically active during the time interval ∆t: at least one seismic event with a magnitude greater than or equal to the threshold magnitude ε was observed during the time unit ∆t. Thus, the overall state of the region R during ∆t is a combination of the states of the n zones: e = [z 1 , z 2 , ..., z n ]. This implies that e Î E = {0,1} n and the number of possible global states of the region R is |E |= 2 n . In this case, the model is a Markov chain C composed of 2 n states. Each state of the Markov chain represents an overall state of the region R. Therefore, E is the state space of the Markov chain. The transitions between these states are probabilities that must be estimated from the seismic history H. As well, θ eé denotes how often during the period T, the global state e has been observed during a unit of time, followed by the observation of the global state é during the following unit of time: This allows to calculate the probability of transition from the global state e to the global state é: The set of all transition probabilities is gathered in a matrix A of size 2 n X 2 n called the transition matrix which is defi ned as follows: The Markov chain is well defi ned by its state space and its matrix of transitions: Given a seismic catalogue H for a region R over a period T, and R need to be partitioned into n zones. It is necessary also to defi ne a threshold magnitude ε from which a seismic event is considered as important. And as this work considers discrete Markov chains, a time interval ∆t to sample the period T should be set.

Data set
Despite the fact that earthquake catalogs cover a much shorter period of time compared to paleoseismological periods, the earthquake records are indispensable

Homogenization
A homogeneous earthquake catalog with a uniform magnitude scale for measuring the size of past earthquakes is a prerequisite for an accurate evaluation of seismic hazard. Table 1 shows a total of 7636 records where more than 53% of the records are expressed in terms of the local magnitude M L . Thus, in this work, all other magnitudes will be converted to the local magnitude. To do so, events of the catalog where several magnitude scale measurements were given for the same event need to be considered. For instance, there are 764 events for which, at the same time, the body wave magnitude M b and the local magnitude M L were given. The relationship between M b and M L given by (Benouar et al., 1994) (for the Ibero-Maghreb region) and by (Kramer, 1996) tend to underestimate local magnitude for small body wave magnitudes and tend to overestimate local magnitude for big body wave magnitudes. In fact, a linear regression (minimizing the least squares method) was applied to quantify an empirical relationship between M b and M L for the region of study. The same procedure was applied to convert other magnitude scales into the local magnitude when avoiding less accurate conversion relationships given in literature as the

Completeness
A look at catalog records shows that the magnitude distribution of events is not homogeneous over time. Thus, to determine the mean rates of occurrence λ, from the entire period  leads to serious underestimations of λ for the middle and low magnitudes. However, if the sample is shortened to the time interval in which the lowest magnitudes included in the computation of λ are completely reported, mean rates of occurrence cannot be established for the largest observed earthquakes because of lack of data. To overcome this problem, the approach suggested by (Stepp, 1972) was applied to determine the interval in a magnitude class over which the class is complete.
The earthquake data is grouped into four magnitude classes such as: M L < 3, 3 ≤ M L < 4, 4 ≤ M L < 5, and M L ≥ 5. With a time interval of one year, the average number of events per year in each magnitude range is determined. If k 1 , k 2 , ..., k n are the number of events per year in a magnitude range, then the mean rate for this sample is: λ = where n is the number of unit time intervals. The variance is given by: σ λ = where T is the duration of the sample. If λ is constant, σ λ would vary as . Thus the standard deviations of the mean rate for the four magnitude intervals as a function of sample length are plotted along with nearly tangent lines with slope .
The deviation of standard deviation of the estimate of the mean from the tangent line indicates the length up to which a particular magnitude range may be taken as complete. The standard deviation shows stability in shorter windows for smaller earthquakes and in longer time windows for large-magnitude earthquakes. The last graph of Figure 7 shows a typical completeness test, with the standard deviation of the estimate of the mean of the annual number of events as a function of sample length for the catalogue.
As presented in Table 2, the analysis shows that data is complete for the slices M L < 3, 3 ≤ M L < 4, 4 ≤ M L < 5 and M L ≥ 5 for the past 15, 20, 50, and 90 years, respectively.
tude above which all events in a given region are detected, is found from the complete part of the catalog.
The M c is obtained through regression analysis from the frequency magnitude distribution as the value where the data departs from a straight line (Wiemer and Wyss, 2000). Figure 8 expresses the relationship between the magnitude and frequency of earthquakes in the considered region.
The threshold magnitude of completeness obtained from the complete part is seen to be around 2.5. Thus,

Threshold magnitude of completeness
After dividing the seismic data, the threshold magnitude of completeness M c , defi ned as the lowest magni-

Spatial analysis
The studied region in this paper, noted R, is the northwest of Algeria located between latitudes (34 • N, 37 • N) and longitudes (2 • W, 3 • E). As explained above, the region R should be divided into several seismic zones. To do so, for each point p on the region, the punctual annual frequency of earthquakes is computed. For each point p, it is necessary to determine for every event v in the catalog if that event v could be perceived or not by human form point p. Thus the used formula is the one developed in (McCue, 1980) which provides an empirical relationship between earthquakes of various magnitudes and the radial distance over which the effects of that earthquake should be felt by people: For example, an earthquake of magnitude M L = 3.3 could be felt by people in a radius of 23 km. The appli- cation of this method in the considered region allows us to distinguish mainly two active seismic zones (see Figure 9): • z 1 : the city of Oran (35° 41' 27''N,0° 38' 30''W) and its surroundings, and • z 2 : the city of Chlef (36° 9 ' 54''N, 1° 20' 4''E) and its surroundings. Based on this zoning, the Markov chain state space will be composed of four states described in Table 3.

Temporal analysis
In general, e φ which is the state defi ned by e φ = [z 1 = 0, z 2 = 0,..., z n = 0] represents the state in which no zone in region R presents the occurrence of an earthquake. As well, e All which is the state defi ned by e All = [z 1 = 1, z 2 = 1,..., z n = 1] represents the state in which all zones of region R presents an occurrence of an earthquake. Time unit ∆t was defi ned as the time interval used to sample the total period T. When ∆t is too high, the probability of having earthquakes in all the zones z i increases. When ∆t is too small, the probability of having no earthquakes in  [z 1 = 0, z 2 = 0] The whole region is seismically quiet eOran [z 1 = 1, z 2 = 0] The zone of Oran is active and that of Chlef is calm eChle f [z 1 = 0, z 2 = 1] The zone of Oran is calm and that of Chlef is active eAll [z 1 = 1, z 2 = 1] Both zones are seismically active Figure 10: The relationship between ∆t, Pr (X t+1 = e φ | X t = e φ ) and Pr (X t+1 = e All | X t = e All ). any zones increases. Therefore, if ∆t decreases Pr (X t+1 = e φ | X t = e φ ) increases, and if ∆t increases Pr (X t+1 = e All | X t = e All ) increases. Figure 10 shows the values of Pr (X t+1 = e φ | X t = e φ ) and Pr (X t+1 = e All | X t = e All ) for several values of ∆t when building a Markov chain over the homogeneous catalogue for each value of ∆t. According to a study conducted in (Nava et al., 2005), the most suitable value of ∆t is the one that makes the two probabilities Pr (X t+1 = e φ | X t = e φ ) and Pr (X t+1 = e All | X t = e All ) as close as possible. This is represented by the point of intersection between the two curves in Figure 10. Thus, in conclusion ∆t is chosen to be equal to 248 days.

Application & Results
This section uses all the data and parameters defi ned in the previous sections to assess the probability of the occurrence of earthquakes in the future decades in each of the zones when using two models: the Poisson model and the Markov model.

Poisson model
The temporal occurrence of earthquakes is commonly described by a Poisson model, which is a simple model that assumes an independent event between the different earthquake occurrences. In this paper, the Poisson model is combined with the (Gutenberg and Richter, 1956) law to predict the probability of at least one exceedance of a particular earthquake of magnitude m in a period of t years by the expression: where N is the number of earthquakes of magnitude larger than m, and λ m is estimated from the a-value and the b-value evaluated from using the empirical application of Gutenberg-Richter law on the homogeneous and completed catalogue:

Markov model
Based on the methodology described in the previous sections a Markov chain is built according to the parameters showed in Table 4.
Indeed, Figure 13 presents the transition graph of the obtained Markov chain. To understand the graph given in Figure 13, some explanations must be given: • Each vertex of the graph represents a possible state of the system. • The loop of the fi rst vertex e All labelled with the value 0.79 expresses the probability of the occur-   all the regions during the next unit of time where some earthquakes were observed in the unit of time before only in Oran. The following matrix shows the transition matrix of the obtained Markov chain. It includes the various possible transition probabilities and it implies an equilibrium equation system that allows us to evaluate the stationary probabilities: The transition matrix will be used to predict the probability of the occurrence of important earthquakes in each of the studied zones by simulating the Markov chain over 10 5 path.
For the zone of Oran (respectively Chlef) Figure 14 (respectively Figure 15) shows the evolution of the probability of the occurrence of an earthquake of a local magnitude greater than 5 when assuming that no earthquakes (of magnitude greater than 5) were observed during the year of 2018 (Pr(e φ ) = 1). The results also show that the region of Chlef is more risky than the region of Oran. For both models, in the region of Oran the model described in this paper predicts a chance bigger than 80% to have an earthquake of a local magnitude greater than 5 after the year of 2033. This chance is more than 99% after the year of 2063. In the region of Chlef, after the year of 2026, an estimation to have an earthquake of a local magnitude greater than 5 is estimated to a likelihood bigger than 80%. This likelihood is more than 99% after the year of 2042.

Conclusion
Geological, and particularly, seismic hazards in Northwestern Algeria remain an area of research that needs to be studied by other approaches and methods in order to enrich the results. Indeed, in this paper, this work is based on the concept of the Markov chain to estimate seismic activity in the coming years by using the seismic history of the last century. An important part of the work was devoted to the homogenization and the completeness of the data.
The use of Markov chain allowed us to make probabilistic predictions of earthquakes for the next decade. The presence of two seismically active areas is noticed: one is located in Oran and another one in Chlef. It has also been noticed that the recent seismic activity is more concentrated in Chlef. Thus, this region has a high vulnerability compared to the rest of Northwest Algeria. Notice The reported estimation of hazards shows that the region of Chlef presents a high to a very high hazard, however the region of Oran presents a moderate hazard.