Autocorrelation and cross-correlation flow analysis along the confluence of the Kupa and Sava Rivers

In this paper, an autocorrelation and cross-correlation analysis of the flow of the Kupa and Sava rivers was performed. The analysis was performed at hydrological stations close to the confluence of these two rivers near the city of Sisak, based on data of mean daily flows and daily precipitation. The analysed time period is from 2008 to 2017, with the series being divided into two parts of five years each, from 2008 to 2012 and 2013 to 2017. Daily flow data were measured at the hydrological stations Farkašić on the Kupa River and Crnac on the Sava River, and data on precipitation at the main meteorological station and the automatic meteorological station Sisak. The maximum value of the cross-correlation function between the hydrological stations at the Kupa and Sava rivers is very high, but at a time lag of zero days. The value of the cross-correlation function remains high, up to 0.6 and up to a 4 day lag. The cross-correlation function between precipitation and hydrological data has a very low maximum value.


Introduction
In this paper, an autocorrelation and cross-correlation analysis of the flow of the Sava River and the Kupa River near their confluence was made. The confluence of the Kupa and Sava rivers is located in Sisak. Sisak is a city on three rivers, the Sava, Kupa and Odra. The river Odra flows into the Kupa River, also near Sisak. Of the three mentioned rivers, the Sava and Kupa rivers have the most impressive hydrological regime, since the size of the basin next to the Farkašić station on the Kupa River is 8,992 km 2 , and the basin next to the Crnac station on the Sava River is 22,852 km 2 . Compared to the Kupa and the Sava, the Odra River is a very small river, only 83 km long with a basin size of 604 km 2 . The upper course of the Odra River has changed greatly because the Sava-Odra-Sava Canal has been in operation since 1979 as part of the flood protection of the Sava River. Due to this, but also the fact that the Croatian Hydrometeorological Institute (DHMZ) does not perform constant flow measurements on the Odra River, the influence of the Odra River in this analysis is neglected. The city of Sisak and its surroundings, which include smaller towns and villages through which rivers flow, are hydrologically very interesting. Also, a lot of areas near these three rivers are floodplains and floods are not uncommon in the area. Depending on the part of the year and the amount of precipitation, river water levels and their flow can change significantly.
A comparison and analysis of the flow of the Sava and Kupa rivers was performed by autocorrelation and crosscorrelation analysis. Autocorrelation analysis can be defined as the relationship of a variable and the lag time of that variable to itself. Cross-correlation analysis is used to compare two sets of data over time. The cross-correlation method is, among others, one of the most commonly used methods in modeling time series data in hydrology.
The aim of this paper is to practically analyze the potential interaction between the Sava and Kupa flows through autocorrelation and cross-correlation analysis. It is likely to be expected that due to such basin sizes in the hydrological profiles Farkašić and Crnac, the impact of precipitation measured at the considered meteorological station will be small, or even non-existent on their flows.

Research area
The investigated area is located in the central part of the Republic of Croatia, in the town of Sisak and its surroundings (see Figure 1). The research area stretches from the hydrological station Farkašić, which is located about twenty kilometres upstream of Sisak, all the way to the hydrological station Crnac, which is located a few kilometres downstream of the confluence of the Kupa and Sava rivers. The possibility of backwater effect at both considered stations is negligible because the Farkašić station on the river Kupa is located almost 50 km upstream from the confluence with the Sava River and the station Crnac on the Sava River is located downstream from that confluence. Hydrological analysis of trends in the Kupa River Basin was conducted by  Figure 2 shows the research area on a basic geological map. It can be seen that along with the hydrological stations themselves, flood sediments, siltstones and sands predominate. The investigated area belongs to the extreme southwestern part of the Pannonian Basin. The oldest rocks in this area are metamorphic rocks formed during the Hercynian orogeny and have not been discov-ered on the surface. The oldest rocks present on the surface are sediments and eruptive products that belong to the Upper Senone. The dominant sediments are micritic and marly limestones. Limestones are rich in fossils in places. In parallel with the sedimentation, volcanic activity took place, so tuff rocks and spilites can also be found. Through the Paleocene and Eocene, different transport mechanisms lead to the deposition of clasts, which make up most of the sediments. Coarser clasts formed by sedimentation in the Helvetic zone can also be found in the study area. These clasts are mainly of proluvial and alluvial origin. In Baden, there was a marine transgression that affected the entire research area. Terrigen clasts are sedimented. Baden sediments contain macro and microfauna. The sediments deposited in the Lower Sarmatian are mainly fine-grained and mixed clasts and contain Bracic fauna. Sedimentation continued through the Pannonian and Pontus. Limestones and calcareous marls are mainly deposited, and the upper pontoon is characterized by marls, sands and sandstones. Paludine deposits are deposited in the Pliocene, which are characterized by clays, sands and gravels, and also contain layers of coal. In the Pleistocene, sediments are deposited, and in the Sava area, coarse clasts of mainly alluvial origin and wetland flag are deposited. Proluvial and deluvial sediments were deposited in the Holocene, and sediments of alluvial and marsh origin were deposited in the areas of watercourses (Pikija, 1986).

Geological characteristics
Looking at the Kupa River Basin as a whole, a large part of the upper part of the Kupa Basin and the tributaries Korana, Mrežnica and Dobra, from the source to the town of Karlovac is in the karstic area, while downstream from Karlovac, the river Kupa flows through the non-karstic area. The size of its karstic basin is 5,450 km 2 , so that more than half of the considered basin to the Farkašić station is in the karstic area. A detailed description of the karstic part of the Kupa Basin was given by Pavlić et al., 2017. Since the altitude difference of the river Kupa in the karstic area is about 200 m, where it is a mountain river, in the non-karstic area between Karlovac and Sisak the altitude difference is only 10 m, so the water regime downstream from Karlovac is much more moderate and corresponds to rivers whose basin slope is moderately steep.
The Sava River Basin is diverse. The mountain relief dominates in the northern, western and southern part of the basin where the Alps and the Dinarides are located. The height difference of the whole basin is about 2,800 m, and the mean altitude of the basin is 545 m a.s.l. The mean slope of the Sava River Basin is 15.8%, which corresponds to a moderately steep basin (International Sava River Basin Commission, 2014). The basins of the considered area of this paper are located in the Pannonian plain, so it can be said that both of these rivers have similar relief characteristics.

Hydrogeological characteristics
The aquifers of the investigated area according to the lithostratigraphic division of the rocks belong to the Lonja Formation. The formation mainly consists of weakly bound sands and gray-green or bluish clays, and in the final layers there are gravels, flags and loose surface cover. The share of permeable layers is dominant. Two units within the formation are distinguished. The upper unit is characterized by a smaller thickness of several hundred meters. It mainly consists of unbound materials and has a high proportion of permeable layers and saturation with fresh water. The lower unit reaches up to 1,000 m, the degree of consolidation is higher, and the share of permeable layers is smaller. The first and second hydrogeological zones also differ. Zone I includes aquifers with water temperatures below 20°C, and in zone II, the water temperature is over 20°C. The cover layer is of different thickness. In some places it reaches 20 m, and in the immediate vicinity of Sisak, the thickness of the cover layer is less than 2 m. The cover layer mainly consists of powder particles with different pro- portions of clay and sand. The cover deposits mainly belong to the floodplain of the Sava River. The average values of the coefficient of hydraulic conductivity vary from 1x10 -1 m/day to 5.63x10 -1 m/day. The aquifer mainly consists of gravel and sand. Groundwater with favorable physicochemical properties has accumulated in shallow deposits. The value of the hydraulic conductivity of aquifers varies over a wide range. In Lekenik and Greda, values of 40.35 m/day were recorded, in Dubrovčak 20 m/day and in the area of Martinska Ves 5.0 m/day. By reducing the value of hydraulic conductivity and due to the smaller thickness of the gravellysandy aquifer, the transmissibility is gradually reduced. In Sisak, the average transmissibility is 500 m 2 /day (Larva, 2002).
The research area covers part of the Sava Depression, which is composed of Quaternary sediments that form groundwater reserves. Groundwater in this area is located at great depths and has relatively low yields. In the area of Sisak, a neotectonic uplift was formed and the aquifer is only about 5 meters thick. The aquifer mainly consists of sands with clay and powder. The thicknesses of the cover deposits above the alluvial aquifer are mainly composed of powder, dusty sand, and clay, and are highly variable. They can be from 5 m to 20 m, and somewhere they can even reach 60 m. The values of vertical hydraulic conductivity range from 3x10 -3 to 5x10 -3 m/day, and the values of effective porosity vary from 3x10 -2 to 9x10 -2 m/day (Brkić et al., 2009).

Climatological characteristics
Croatia is located in the northern temperate zone. The influence of the Atlantic Ocean is felt from the northwest, and the so-called zonal westerly winds blow from that direction. In the south, the Mediterranean Sea mitigates the adverse effects of dry and hot North Africa. The warm air originating from the Sahara is humidified by crossing the Mediterranean Sea, and to a lesser extent by the Adriatic Sea, so this air current makes the winter in Croatia (especially in Coastal Croatia) pleasantly warm, but occasionally very humid. Land influence also comes from Europe, north of Croatia. In winter, cold and dry air flows from there. The investigated area is located in the continental part of Croatia, where according to Köppen's classification of climate, a moderately warm humid (rainy) climate with warm summers, Cfb, prevails, as in most of Croatia. The influence of continental Croatia (Pannonian Bay) is strongest in winter when it is filled with cold air in the entire area between the Alps, the Dinarides and the Carpathians. This cold air occasionally spills over into the Adriatic and Mediterranean seas. In summer, continental Croatia warms up quickly and strongly, which leads to increased convection, i.e. to a stronger share of precipitation in the warm half of the year (Šegota & Filipčić, 2003).
It is characteristic of a moderately warm humid climate that the average temperature of the coldest month is not lower than -3°C, and at least one month has an average temperature higher than 10°C. In this area, the state of the atmosphere is very variable (a variety of weather situations with frequent and intense changes during the year). Winter is characterized by stationary anticyclonic weather types with frequent fog or low clouds and low currents, which favors the formation of frost. Spring is characterized by cyclonic weather types which cause frequent and sudden weather changes. There is a change in precipitation and non-precipitation periods, quiet and windy, and colder and warmer periods. Summers are characterized by baric fields with a small pressure gradient and a refreshing night breeze. In autumn, there are calm anticyclonic periods, but also rainy days in cyclones. The average annual air temperature in the study area is about 11°C. The annual amount of precipitation on the continental part of Croatia decreases from west to east. The annual rainfall in the city of Sisak is 865 mm. There are about 123 cloudy days in a year, and about 45 are completely clear days. The highest number of cloudy days is in December, and the lowest in July. The sun shines about 1,900 hours a year, which is about 5 hours a day. July and August have the highest amount of sunny hours, when the days have about 9 hours of sunshine, and December has the lowest amount of hours of sunshine where the sun shines about 1.5 hours a day. Relative humidity also decreases from west to east. In the study area, the humidity is relatively high since it is located along the rivers themselves, which are a constant source of water vapor. Relative humidity is about 80%, the highest in the winter months, and the lowest in April and May (Zaninović et al., 2008).

Methods
Correlation is the relationship or interdependence of two measurable variables. Correlation analyses provide a statistical dependence between random variables at different points, i.e. the degree of correlation between the two variables is examined. To describe correlation, the most commonly used measure is the correlation coefficient. The correlation coefficient measures the strength of the relationship between two variables and expresses their linear relationship. Equation 1 shows the most commonly used correlation coefficient, the Pearson correlation coefficient. (1) Where: r jk -correlation coefficient, cov jk -the covariance of two variables, s j , s k -standard deviations of individual variables. The correlation coefficient is r ≤ | 1 |. When the value of the correlation coefficient is close to +1, we say that the correlation is positive and then the variables j and k have a strong positive correlation. In this case, an increase in the value of the variable j results in an increase in the variable k. When the value of the correlation coefficient is close to -1, the correlation is negative, i.e. the variables j and k have a strong negative correlation. Then as the value of the variable j increases, the values of the variable k decrease. A correlation coefficient of value r > | 0.8 | indicates a strong correlation, and coefficients r < | 0.5 | indicate a weak correlation. When the correlation coefficient is equal to zero, there is no linear correlation between the variables (Posavec and Škudar, 2016).

Autocorrelation
Autocorrelation, also known as serial correlation, can be defined as the relationship between a variable and the lag time of that variable over certain time intervals (Box  et al., 2008). Thus, autocorrelation is the correlation of time series with itself depending on the lag time. We look at the lag of a variable and how each data point has an impact on each subsequent one. Autocorrelation is not a numerical value, but a function and is viewed in the range from 0 to 1. A graphical representation of the calculated autocorrelation coefficient for different lag times is the autocorrelogram.
"K" is very important for comparing the observed points in the time series. It indicates a lag, i.e. a shift in the observed interval. The autocorrelation function can be used to define the linear dependence of successive data values within a time series depending on their time lag. The time series is compared to itself with a discrete increase in time lag and then the autocorrelation coefficient is calculated according to Equations 2, 3 and 4: (2) (3) Where: C -autocovariance, r (k) -autocorrelation coefficient,

Cross-correlation
Cross-correlation is a statistical method that determines the degree of correlation of two time series data. By determining the degree of correlation between two time series, we obtain information about the strength of the connection between these two series as well as the time lag. Cross-correlation analysis is performed by calculating the cross-correlation coefficient for each time shift. The coefficients are plotted on a cross-correlogram. The offset or lag time is plotted on the x-axis, and the correlation coefficient is plotted on the y-axis. From the cross-correlogram, we can see for which time shift the highest cross-correlation coefficient was obtained and its value. The time shift with the highest cross-correlation coefficient is actually the reaction lag time when the time series are maximally aligned.
Cross-correlation can be defined as the comparison of two series over a period of time. It is defined as the degree to which two arrays correlate with respect to the lag of one array after another. For cross-correlation, it is important to compare time series that correspond to each other. This means that measurements should be performed at the same time intervals (every day, every hour, etc.). The strength of the connection between two time series is determined by the correlation coefficient. The correlation coefficient is calculated for each consecutive lag. The value of the time lag between the two series was determined by the highest calculated correlation coefficient. Zero lag represents the alignment of two time series to the same start time. The comparison is made for zero lag, but also for each subsequent consecutive lag, including positive and negative time lag positions. The results of cross-correlation analysis are presented on a cross-correlogram (Posavec and Škudar, 2016). The cross-correlation coefficient is calculated according to Equations 5, 6 and 7: Where: r xy (k) -cross-correlation function, k -time lag, σ x , σ y -standard deviations of input and output sequence, x i , y i -time series, N -number of data, C -cross covariance. When the input sequences are not connected, the cross-correlation function represents the system's response to the input pulse, and in the second case, when the input sequences are connected, the cross-correlation function reflects the connection between the input and output sequences. The graphical representation of the calculated cross-correlation coefficient is a cross-correlogram. The cross-correlogram is not symmetric when its center is observed, i.e. the cross-correlation function is not symmetric (r xy (k) ≠ r yx (k)). If k < 0, in expressions (5) and (6) x replaces y and vice versa. If r xy (k) > 0 for k > 0 the output string is dependent on the input, and if r xy (k) < 0, for k > 0 the input string is dependent on the output. A symmetric cross-correlation function with the center of symmetry in k = 0 indicates that the input and output series are dependent on some third series. The delay time of the output function after the input defines the time offset k at the maximum value of r yx (k).

Data used
In this paper, hydrological and meteorological data obtained from the Croatian Meteorological and Hydrologi-cal Service were used. Data of daily flows were taken from the hydrological station Farkašić at the Kupa River and the hydrological station Crnac at the Sava River, and the daily precipitation data from the main meteorological station (GMP) and the automatic meteorological station (AMP) Sisak, which are at the same location.
The hydrological station Farkašić (station ID 4010) (see Figure 3), is located on the left bank of the river Kupa in Stari Farkašić, upstream from Sisak, 47.150 km from the confluence with the Sava River. The size of the basin is 8,992 km 2 . The station is located in the area belonging to the Sava River Basin and the Black Sea Basin. The Farkašić station started operating on September 17, 1964. Water levels were measured from 1965 to 2019, flows were measured from 1965 to 1990, and from 2000 to 2019. The temperature was measured from 1969 to 1986. The minimum flow was recorded on August 29, 2000 and amounted to 13.61 m 3 /s, and the maximum was recorded on October 8, 1974 and amounted to 1,631 m 3 /s. In this paper, the data of measured daily flows from 2008 to 2018 will be used. Figure 4 shows a graph with mean and maximum annual flows for the period from 2008 to 2018.
The mean value of the mean annual flows at the Farkašić station of the first series (2008-2012) is 166 m 3 /s, and the second series (2013-2018) is 237 m 3 /s. This relationship of values between these two series is also visible in Figure 4, at mean and maximum annual flows.
The hydrological station Crnac (station ID 3020) (see Figure 5), is located on the right bank of the Sava River downstream from the confluence of the Kupa and Sava rivers. It is located 575.000 km from the mouth. The size of the basin is 22,852 km 2 . The area where the station is located belongs to the Sava River Basin and the Black

Results and discussion
The autocorrelation function was obtained using a spreadsheet program in Excel (Posavec et al., 2017) by entering data on the daily flows of the Kupa and Sava rivers over a period of 10 years, divided into two periods of 5 years each. The considered period 2008-2017 was chosen primarily because during that period at both hydrological stations (Farkašić on the Kupa and Crnac on the Sava River), there were no interruptions of measurements. So, in that time period, the series of mean daily flows are complete. The selected period was divided into two time series, 2008-2012 and 2013-2017, due to the mentioned difference in average annual flows. However, in addition, this analysis will provide an answer to whether, apart from the flow rates, something has changed in the speed of influence between the Kupa and the Sava rivers in these two periods at that confluence. Autocorrelation analysis was performed up to a maximum interval of 30 days. Figure 7 shows the autocorrelogram obtained from the daily flows on the Kupa River during the periods from 2008 to 2012 and from 2013 to 2017.
From the autocorrelogram, it can be seen that the autocorrelation function for period from 2008 to 2012 drops sharply from zero to the fifth day, after which it has a milder decline, but does not fall below the value 0.2, below which value, according to Mangin (1984), the system loses memory. As with the previous autocorrelogram, the autocorrelation function for the period from 2012 to 2017 drops sharply from zero to the fifth day, after which it has a milder decline, but after day 20, the correlation value falls below the value 0.2, indicating a shorter memory effect of this time series (Mangin, 1984). Figure 8 shows the autocorrelogram obtained from the daily flows on the Sava River during the periods from 2008 to 2012 and from 2013 to 2017. From the autocorrelogram, it can be seen that the autocorrelation function for the period from 2008 to 2012 drops sharply from zero to the fifth day, after which it has a mild decline, but does not fall below the value of 0.2, the same as on the Kupa River in the earlier period. The autocor-relation function for the period from 2012 to 2017 drops sharply from zero to the fifth day, after which it has a milder decline, but after day 21, the correlation value falls below 0.2, indicating a shorter memory effect of this time series.
The cross-correlation function was also obtained using a spreadsheet program in Excel (Posavec et al., 2017) by entering data on daily precipitation and flow of the Sava and Kupa rivers over a period of 10 years, divided into two periods of 5 years each. Figure 9 shows the cross-correlation function between the hydrological stations on the Kupa and Sava rivers during the periods from 2008 to 2012 and from 2013 to 2017. From the cross-correlogram of the earlier period, it can be seen that the maximum value of this function is r = 0.837 at a time lag of 0 days, which indicates that these two time series are very well correlated with this time lag (flow maxima coincide at both hydrological stations). The Kupa River affects flows on the Sava River for up to four days, as the value of the cross-correlation function drops to 0.615 at a four-day interval.
The cross-correlogram of the later period shows that the maximum value of this function is slightly lower than in the previous period r = 0.729 at a time lag of 0 days, which indicates that these two time series are very well correlated with this time lag (flow maxima coincide at both hydrological stations). The Kupa River affects the flows on the Sava River somewhat shorter than in the previous period, because after the third day the value of the cross-correlation function drops to 0.623, and after that, the value decreases much faster than in the previous period.
The values of the cross-correlation function at zero lag are not drastically different, however, the lower value and steeper decline of the cross-correlation function in the recent period show that the flow change between the two considered periods is not proportional between the two stations. In other words, the increase on the Sava River is relatively higher than on the Kupa River.  Figure 10 shows the cross-correlation function between the daily precipitation from the Sisak meteorological station and the daily flows of the Kupa River during the periods from 2008 to 2012 and from 2013 to 2017. The maximum value of the cross-correlation function of the earlier period is at a time lag of two days and is r = 0.249, which suggests that precipitation measured at that meteorological station has very little effect on the maximum measured flows at the hydrological station on Kupa after two days. As in the previous period, the maximum value of the cross-correlation function is r = 0.312, after two days. The value is slightly higher, but still small enough to be valid for the same conclusion as in the previous period. Figure 11 shows the cross-correlation functions between the daily precipitation from the Sisak meteorological station and the daily flows of the Sava River during the period from 2008 to 2012 and from 2013 and 2017.
The cross-correlation functions shown in Figure 11 (r = 0.206 and r = 0.264) have a very similar shape to those shown in Figure 10, so the same conclusion applies to them as for the functions in Figure 11. Precipitation measured at that meteorological station in these two periods have an even smaller impact on the maximum measured flows at the hydrological station on the Kupa River after two days due to lower maximum values of the cross-correlation function.

Conclusions
Autocorrelation functions obtained from data from the hydrological stations on the Kupa and Sava rivers show a very long memory effect in the first analysed period, while in the second period, this effect is somewhat shorter, and is about 20 days. Cross-correlation functions between sequences from the hydrological stations on the Kupa and Sava rivers indicate a very good correlation of these sequences at a time lag of zero days. This actually shows that the maxima at these hydrological stations coincide with each other, but that there is an impact of the Kupa River on the Sava River for some time to come. In the first considered period, this influence is still four days, until that time when the value of the cross-correlation function is higher than 0.6, while in the more recent considered period, this influence is somewhat shorter and amounts to three days.
Based on such a short time series, it can be said that this is not a drastic difference, but if a physical explanation were to be sought, then it would probably be a stronger relative increase in flow on the Sava River compared to the Kupa River in recent times. Therefore, the impact of the Kupa River on the Sava River is weaker in this recent period. Very low values of the maximum cross-correlation function between precipitation and flow on the Kupa and Sava rivers indicate a small impact of this precipitation on flows. This is explained by the fact that the sizes of the basins near the considered hydrological stations (Farkašić at Kupa 8,992 km 2 , Crnac at Sava 22,852 km 2 ) are very large and precipitation measured only in this area may have little impact on flows at these hydrological stations whose area is so large.