Application of the Polynomial Function in the Analysis of Statistical Indicators of Risk and Safety in Shipping

Navigation safety is of common interest to all maritime countries, because shipping operates in a complex and risky environment. Serious ship accidents can cause major economic losses, loss of life, and severe pollution of the sea. The data included in reliability and safety databases are usually collected from different sources. The accuracy of the collected data may vary from source to source due to a number of external and internal factors, such as different manufacturers and different working and environmental conditions. The quality of the data may also vary due to incompleteness of details and the subjectivity and experience of the data collector. Databases on the frequency of failures, which can be estimated from recorded failure events, are difficult to access. Therefore, the aim of the research is to demonstrate, on the basis of the available statistical indicators of maritime risk and safety, the application of mathematical statistics to the analysis and assessment of unwanted events. The statistical methods proposed in the paper point out the importance of prediction and the possibility of improving safety. To this end, a polynomial function was applied. Excel was used to visualize the results obtained during the research.


Introduction
The international shipping industry is responsible for transporting about 90% of world trade, so the safety of vessels is essential [1]. During the early 1990s, the global fleet lost more than 200 ships a year. South China, Indochina, Indonesia, and the Philippines are the main hotspots of global losses. The Southeast Asian area also had its biggest losses in a decade, Figure 1.
This research analyzes total losses in the period from 2015 to 2021. The type of vessel and the cause of loss were also taken into consideration for the period from 2011 to 2020. Equipment malfunction and sinking are among the causes, as are fire, explosion, and collision. People interact with different types of equipment to perform work tasks, so different types of violations can occur while these tasks are performed. In this regard, Allianz Global Corporate & Specialty (AGCS) offers maritime risk advice for loss prevention [2]. In this research, estimates based on statistical observations aim to highlight the possibility of risk management [3,4]. When a system failure is the cause, this assessment is possible by calculating the time to failure, as well as the number of failures per system component [5,6].

Research
The paper analyzes the available statistical data on total losses in the period from 2015 to 2021. The data are displayed in Table 1 and presented graphically in an Excel chart. The average number of losses and the average rate of change were determined, and a trend forecast of the observed phenomenon in shipping was calculated [7,8]. The number of losses by vessel type and cause in the period from 2011 to 2020 was also analyzed, Table 3. A polynomial function was applied in this analysis because the data do not show constant monotonic growth or decline.

Methods (statistical and computational, application of Excel)
Empirical methods evaluate risk on the basis of existing data on the frequency of occurrence of an adverse event. The available data show the total losses between 2015 and 2021, Table 1. For clarity, the data are also presented graphically (Figure 2).
Monitoring of various phenomena (economic and other) is important, as is the statistical processing of the resulting data. When there is a set of chronologically ordered values of a phenomenon, i.e. a time series $Y_1, Y_2, \dots, Y_N$, the average number of losses in the observed period can be calculated by the expression [7]:
$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$$
It can be concluded that the number of losses in 2021, 54, represents a significant improvement over the seven-year average of 81 losses. Assuming that total losses in shipping will continue to move in the same way in the future, the average rate of change in the observed period can be calculated using the geometric mean and, starting from the last element ($Y_N$) in the series, a prediction of its movement can also be made. The average annual rate of change is the average relative change in the value of a phenomenon (in %) during one time unit of the observed period. It is calculated according to the expression:
$$S = (G - 1)\cdot 100, \qquad G = \sqrt[N-1]{\frac{Y_N}{Y_1}}$$
where $G$ is the geometric mean of the chain indices, $Y_N$ is the last and $Y_1$ is the first element in the observed period.
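The two calculations described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample series are hypothetical and are not the data of Table 1, and the geometric mean is assumed to be taken over the N − 1 chain indices, so that G = (Y_N / Y_1)^(1/(N−1)).

```python
def average(series):
    """Arithmetic mean of the observed values."""
    return sum(series) / len(series)


def geometric_mean_index(series):
    """Geometric mean G of the N-1 chain indices Y_i / Y_(i-1).

    The product of the chain indices telescopes to Y_N / Y_1,
    so only the first and the last value are needed.
    """
    n = len(series)
    if n < 2:
        raise ValueError("at least two observations are required")
    if series[0] <= 0 or series[-1] <= 0:
        raise ValueError("first and last value must be positive")
    return (series[-1] / series[0]) ** (1.0 / (n - 1))


def average_rate_of_change(series):
    """Average rate of change per period, in percent: (G - 1) * 100."""
    return (geometric_mean_index(series) - 1.0) * 100.0


# Hypothetical series (NOT the data of Table 1): a 10 % decline per period.
sample = [100.0, 90.0, 81.0]
print(round(average(sample), 2))                 # mean of the sample
print(round(average_rate_of_change(sample), 2))  # rate in % per period
```

A negative result indicates an average decline per period, a positive one an average increase.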
For the data in Table 1, the rate is negative, which means that in the observed seven-year period the annual number of losses decreased by an average of 10.45% per year. Assuming that the values of the phenomenon will continue to move according to the average rate of change calculated for the observed period, a prediction of its movement can be made using the geometric mean, starting from the last element ($Y_N$) in the series:
$$\hat{Y}_{N+t} = Y_N \cdot G^{t} \qquad (6)$$
where $\hat{Y}_{N+t}$ is the predicted value of the phenomenon in period $(N+t)$ assuming an unchanged $G$, $Y_N$ is the last value in the series, $G$ is the calculated or assumed geometric mean of the chain indices, and $t$ is the number of time periods after the last one in the series for which the prediction is made. Table 1 covers a period of 7 years, so $N = 7$. If we are interested in the number of losses in 2030, i.e. a prediction 9 years ahead, then $t = 9$.
The southern China region remains the focal point of the biggest losses in the past decade, driven by factors including high levels of local and international trade, congested ports, older fleets, and extreme weather.
We also analyze the number of losses by type of vessel and by cause in the period from 2011 to 2020, Table 2.
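The forecast by the average rate of change for the total losses of Table 1, described earlier in this section, can be checked numerically. The sketch below uses only the two figures quoted in the text (54 losses in 2021 and an average rate of −10.45% per year); the function name `forecast` is hypothetical, and the result holds only under the assumption that the rate stays unchanged.

```python
def forecast(last_value, rate_percent, periods_ahead):
    """Project a series forward assuming an unchanged average rate of change.

    last_value:    Y_N, the last observed value of the series
    rate_percent:  average rate of change per period, in percent
    periods_ahead: t, the number of periods after the last observation
    """
    g = 1.0 + rate_percent / 100.0  # geometric mean of the chain indices
    return last_value * g ** periods_ahead


# Figures quoted in the text: 54 losses in 2021, rate of -10.45 % per year.
losses_2030 = forecast(54, -10.45, 9)  # 2030 is t = 9 years after 2021
print(round(losses_2030, 1))
```

Under these assumptions the projection for 2030 comes out at roughly 20 total losses.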
As can be seen from the data, as well as from the graphical representation, the total shipping losses by vessel type do not show a steady monotonic increase or decrease. In this case, therefore, the average rate of change cannot be used to describe or predict the annual change in shipping losses, and fitting a trend line is a good basic option for the analysis. In this paper, we focus on fishery vessels. The trend line itself can take many forms depending on the shape of the data: straight, curved, etc. Fitting such a line is common practice when using statistical techniques to understand and forecast data [10, 11].
If we take a closer look at the graph showing the total losses of fishery vessels, we can see that it resembles a polynomial function. So our trend function, or polynomial regression function, will be:
$$\hat{Y} = a_0 + a_1 X + a_2 X^2 + \dots + a_n X^n \qquad (8)$$
We can see from the graph that there are five "extreme values", so it is appropriate to choose a polynomial of at least the 5th order; we start with $n = 5$.
We begin with an adequate choice of the origin of the coordinate system. It is common to place the origin (that is, $X = 0$) at the beginning of the time interval, so our dataset is now as given in Table 3.
Table 3. Number of losses for fishery vessels, Y, against the time coordinate, X, measured from the chosen origin.
We want to choose the coefficients of the polynomial function such that the total sum of the squared differences between the actual and the estimated values of losses at the observation points is minimal, that is:
$$SSE(a_0, a_1, \dots, a_5) = \sum_{i}\left(Y_i - \hat{Y}_i\right)^2 = \sum_{i}\left(Y_i - a_0 - a_1 X_i - \dots - a_5 X_i^5\right)^2 \to \min \qquad (12)$$
To minimize the function (12), we need to calculate $\partial SSE / \partial a_j = 0$, $j = 0, 1, \dots, 5$, which gives us a system of linear equations from which we obtain the coefficients $a_0, a_1, a_2, a_3, a_4, a_5$. Applying all this to our dataset, we get the polynomial equation (14); the fit of the trendline to the data is shown in Figure 5. The natural question that arises is: how good is such an estimate? As we can see in Figure 5, some original values (blue points) are quite distant from the trendline. But are they so distant that this simple model can no longer be considered a good representation of the original data?
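The least squares step described above can be sketched in plain Python. Setting the partial derivatives of the sum of squared differences to zero yields the so-called normal equations, which the sketch builds and solves by Gaussian elimination. The function names and the sample data are hypothetical (they are not the fishery data of Table 3), and this is not necessarily how the calculation was carried out in Excel.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [a_0, ..., a_degree] via the normal equations.

    Setting d/da_j of sum((y_i - p(x_i))**2) to zero for every j gives
    sum_k a_k * sum_i x_i**(j + k) = sum_i y_i * x_i**j.
    """
    m = degree + 1
    if len(xs) != len(ys) or len(xs) < m:
        raise ValueError("need at least degree + 1 paired observations")
    # Augmented matrix [A | b] of the normal equations.
    rows = []
    for j in range(m):
        row = [sum(x ** (j + k) for x in xs) for k in range(m)]
        row.append(sum(y * x ** j for x, y in zip(xs, ys)))
        rows.append(row)
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, m):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, m + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for j in range(m - 1, -1, -1):
        tail = sum(rows[j][k] * coeffs[k] for k in range(j + 1, m))
        coeffs[j] = (rows[j][m] - tail) / rows[j][j]
    return coeffs


def evaluate(coeffs, x):
    """Value of a_0 + a_1*x + ... + a_n*x**n at x (Horner's scheme)."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Hypothetical data (NOT Table 3): points lying exactly on 2 - 3x + 0.5x^2.
xs = [0, 1, 2, 3, 4, 5]
ys = [2 - 3 * x + 0.5 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, 2)
print([round(a, 6) for a in coeffs])
```

For higher degrees the normal equations become numerically ill-conditioned, so a library routine such as `numpy.polyfit` would usually be a more robust choice than this hand-written solver.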
The answer to this question is given by calculating R-squared:
$$R^2 = 1 - \frac{SSE}{SST}$$
where
$$SSE = \sum_{i}\left(Y_i - \hat{Y}_i\right)^2, \qquad SST = \sum_{i}\left(Y_i - \bar{Y}\right)^2 \qquad (17)$$
and $\bar{Y}$ is the mean of the observed values. The closer the value of R-squared is to 1, the better the model fits the data. The value of R-squared obtained for the polynomial of the 5th degree suggests that a better fit may be found. If we increase the degree of the polynomial to 8, that is:
$$\hat{Y} = a_0 + a_1 X + a_2 X^2 + \dots + a_8 X^8 \qquad (21)$$
and again calculate the values of the coefficients by the least squares method, we obtain the polynomial trend given by equation (22). The fit of the trendline is now shown in Figure 6. It is clear that the trendline fits the data much better than the previous one. So, let's check R-squared for the polynomial of the 8th degree. The calculation now gives $R^2 = 0.9872$, which is very close to 1, so this simple polynomial model fits the observed data well.
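The R-squared check can be sketched as follows, assuming the common definition of one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the observed and fitted values are hypothetical and are not the fishery-vessel data.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SSE / SST."""
    if len(observed) != len(fitted) or not observed:
        raise ValueError("observed and fitted must be non-empty and equal in length")
    mean = sum(observed) / len(observed)
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))  # residual sum of squares
    sst = sum((y - mean) ** 2 for y in observed)               # total sum of squares
    if sst == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    return 1.0 - sse / sst


# Hypothetical observed and fitted values (NOT the data of the paper).
observed = [3.0, 5.0, 4.0, 8.0]
fitted = [3.5, 4.5, 4.5, 7.5]
print(round(r_squared(observed, fitted), 4))
```

When polynomials of increasing degree are fitted to the same points by least squares, R-squared cannot decrease, so a value close to 1 describes the fit to the observed points rather than the accuracy of a forecast.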

Results and Discussion
Cargo ships accounted for more than a third of all vessels lost in 2020; according to the data presented, their share is about 44%. The most common cause of loss was sinking, and the majority of cargo ships were lost in Southeast Asian waters. From January 1, 2020 to December 31, 2020, contributing factors included bad weather, poor visibility leading to flooding and water intrusion, sinking, and machinery failure [7]. Machinery damage or failure is one of the causes of maritime incidents. Wear and tear is a common factor in the failure of mechanisms and devices, especially in the conditions of the marine environment and vessel operation. For example, excessive strain on mooring lines contributes to failures of this equipment. Container ship fires and incidents involving RoRo vessels also continue to occur. Some safety studies [3] highlight the routine and monotony of daily tasks, which reduce the concentration of crew members and thus contribute to ship collisions. It is difficult to obtain data on the causes of machinery failures, but predictive maintenance [9] and probabilistic assessments could reduce the frequency of, for example, human errors that lead to damage or failure. Graphical presentation of the average number of losses provides a clearer and more transparent picture of the movement of the observed phenomenon over a given period of time.
The results show that basic statistical methods can yield mathematical models that fit the observed data well and can be used for forecasting future vessel losses. Excel and statistical data analysis were used in the work.

Conclusion
In specific statistical research, it is preferable to analyze longer periods of time, so that more relevant conclusions can be drawn about the characteristics of the observed phenomenon. The quality of the available data may vary due to incompleteness of details and the subjectivity and experience of the person collecting the data.
Probabilistic assessment of technical risks, as well as of safety, includes procedures for identifying and analyzing hazards, determining the consequences of random, unwanted events, and determining the probability of the consequences caused by such events. Risk management includes the process of identification, assessment, selection, and application of methods aimed at reducing the risks to which human health and the environment are exposed. It is important to establish which undesirable events can occur, how probable they are, and what their consequences would be.
For further research, it is important to develop predictive methods, as well as simulation methods, in the maritime field in order to improve safety. Probabilistic assessments are analytical, extensive, and demanding; they assess the risk of events that are unlikely but have potentially large consequences.

Funding:
The research presented in the manuscript did not receive any external funding.
Author Contributions: data collection, research, concept, statistical analysis, editing, writing, review, and checking, JB; statistical analysis, checking, editing, writing, and review of the article, BDB.