Predicting main engine power and emissions for container, cargo, and tanker ships with artificial neural network analysis

Ibrahim Ozsari

13 with the 14-input model for container, cargo, and tanker ships, respectively. To make the ANN predictions as accurate as possible, the study tested different numbers of hidden neurons and inputs and presents the resulting performance. The developed model can be used in future studies on fuel consumption and energy efficiency for ships in maritime transport.


Introduction
Artificial intelligence (AI) has risen to a highly prominent position with the advancement of technology in the modern world. AI is currently considered the fastest and most critical method for solving problems in many professions. Its importance has also come to be understood in the shipping and maritime transport sector, albeit somewhat late. With the growing problems caused by global warming, many sanctions are being applied to sectors that cause environmental pollution. The International Maritime Organization (IMO) applies measures and sanctions to reduce the pollutants emitted by ships and maritime transport. Because the pollutant emissions originating from shipping constitute a significant proportion of total emissions, the measures taken in maritime transport also set an example for land-based facilities and vehicles. Ekmekcioglu et al. [1,2] analyzed one year of data and movements of ships arriving at sampled ports and presented these ships' calculated emissions of CO2, NOx, SO2, particulate matter (PM), and CO. Gunes [3] calculated main engine power and the resulting emissions through regression analyses on data from 9,174 up-to-date bulk carriers separated into six size classes; his regression model achieved an accuracy of 93.2%. Huang et al. [4] focused on dynamically calculating the emissions that occur along ships' navigational trajectories.
The biggest determinants of ship emissions are the fuel used and the main engine power. The rules IMO has adopted minimize the more harmful substances in fuels. However, main engine power is also significant for energy efficiency as well as many other factors. Therefore, many studies in the literature have focused on how best to determine ships' main engine power. Jeon et al. [5] proposed a regression analysis for accurately predicting the specific fuel consumption of a ship's main engine using data collection, clustering, and big data analysis methods in an artificial neural network (ANN) and tried many variations in order to obtain accurate results. Sahin et al. [6] used ANNs to estimate the Baltic Dry Index (BDI) and compared the results and performances of three different ANN models. Cepowski et al. [7] performed a regression analysis on data from ships built between 2000 and 2018 and presented preliminary design and engine power results for tanker, bulk, and container ships. Xhaferaj [8] wrote code to parametrically estimate ships' resistance and main engine power and validated the results against conventional boats and data. Theodoropoulos et al. [9] estimated ship main engine power, fuel consumption, and emissions using the Gaussian mixture model (GMM) and ANN methods. Cepowski et al. [10] estimated optimal container ship length using ANN and multiple nonlinear regression (MNLR) methods and compared the results of the two methods. Similarly, Cong et al. [11] optimized performance and emissions for a two-stroke main engine by applying multi-objective particle swarm optimization (MOPSO) and multivariate nonlinear regression (MNLR) analyses. Alexiou et al. [12] compared the performances of multiple regression algorithms such as ANN, tree regressions (TRs), random forest regression (RFR), k-nearest neighbor (k-NN), linear regression, and adaptive boosting (AdaBoost).

Yan et al. [13] studied a two-stage fuel consumption estimation and reduction model for a dry cargo ship and compared the results obtained using two separate regressions. Tran [14] investigated the navigation parameters of wind speed, wave height, ship speed, distance, and shaft speed using the fuzzy analytic hierarchy process (fuzzy AHP) method and thereby determined optimal load and fuel consumption values. Peng et al. [15] calculated the energy consumption of ships in port from 15 input parameters as an example of a green port; they also presented five different machine-learning calculations to increase the energy efficiency of ships in port. Jeong et al. [16] suggested using machine learning as a method for time planning and process optimization in the shipbuilding industry regarding fabrication, ship block assembly, reel manufacture, and painting. Cepowski [17] conducted an ANN study to estimate the added resistance in regular head waves using ship design parameters such as length, width, draft, and Froude number as inputs; the ANN estimates were quite similar to the experimental data supporting the model. Borua et al. [18] studied the aspects of international freight transport management (IFTM) that can be improved with machine learning and proposed four different methods. Gkerekos et al. [19] conducted a comparative analysis of multiple regression algorithms for estimating ship main engine fuel consumption by considering two different ships' data collection strategies. Yuan et al. [20] studied a Gaussian process metamodel for estimating ship fuel consumption under different scenarios during a voyage; their metamodel took into account wave and wind factors in addition to speed and trim. Yan et al. [21] presented an optimization study of the effect of route, speed, and environmental and mechanical factors on ships' main engine power while cruising and during the voyage.

Gurgen et al. [22] built an ANN for a chemical tanker in the preliminary design stage, giving weight and speed parameters as inputs and calculating five important design parameters as outputs. Yildiz [23] used ANN models with four inputs to estimate the residual drag coefficient of a trimaran hull form and showed that different ANN functions result in different performances. Kalajdzic et al. [24] conducted research on the Energy Efficiency Design Index (EEDI) and the Energy Efficiency Existing Ship Index (EEXI), to which IMO attaches great importance; they presented numerical comparative results on 153 bulk carriers built between 2000 and 2020 for reducing power consumption and increasing energy efficiency. Cepowski et al. [25] developed equations to estimate engine power and the related fuel consumption for tanker, bulk, and container ships; they compared the equations they created from numerous ANN variations and presented their performances. Farag et al. [26] examined ANN and multiple regression models for estimating ships' required power and specific fuel consumption while cruising and showed the proposed models to be compatible with previous reports.
Unlike the studies in the literature, this study conducts a comprehensive data analysis by sifting through data on thousands of ships and presents an ANN study with 14 input parameters covering 762 container, 816 cargo, and 1,442 tanker ships. It identifies the models that provide the most accurate results by testing different ANN models and arrangements. The study also presents the performances of the models and the obtained results, showing that the desired results can be obtained quickly and accurately for each ship type using ANNs.

Marine Traffic is an open community-based initiative that uses a database of ship-identifying data such as IMO number to offer real-time information about ships' whereabouts [27]. The Marine Traffic database serves as the data source for this study. The database can be regarded as a current resource on the global fleet and comprises more than 90 technical details (e.g., ID, type, shipbuilder, year built, average recorded speed, depth, deadweight [DWT], flag, engine power, and length overall [LOA]) for over 10,000 ships. Table 1 provides an example of the many different classifications that exist for each ship type [28][29][30].

The study investigates data acquired from over 10,000 ships and identifies the parameters related to ship main engine power and pollutant emissions. The first stage reduced the more than 90 parameters to 30. After ships with erroneous or missing values for the desired parameters were omitted, data from a total of 3,020 ships remained. Therefore, all the data used in this study are complete and accurate. Of these 3,020 ships, 762 are container ships, 816 are cargo ships, and 1,442 are tanker ships. As shown in detail in the flow chart in Figure 1, the number of inputs was then reduced to 14, at which point the desired margin of error was reached. Academicians and engineers who are experts in the field were consulted while eliminating the parameters.

The relevant data first had to be extracted from the records of the more than 10,000 ships. The amount of data is very important in ANN and regression analyses, but the right data must be used for the right purpose. The 30 parameters include different kinds of fields such as IMO number, flag, height, depth, and propeller type. Data unrelated to engine power and emission calculations were omitted based on the opinions of the expert engineers and academicians. Next, the number of inputs was varied with a loop in the code written in MATLAB (Figure 1) until the desired margin of error was reached. The best results were obtained with 14 inputs.

The 14 inputs used in the analysis are important parameters for calculating ship engine power and emissions. Maximum speed matters as the highest speed a ship can reach. Similarly, the average speed of the ship while cruising is necessary to obtain accurate results. The breadth of the ship affects its geometry in terms of the block coefficient. The date the ship was built is another important parameter, as it relates the required engine power to the state of technology. Ship type should also be considered, as each ship type has its own characteristics. Status indicates whether a ship is actively engaged or out of service; using only new ships in the data analysis would not be appropriate. In addition, ship length and displacement determine ship geometry and must be taken into account for ship resistance and propulsion. Because DWT and gross tonnage determine a ship's load-carrying capacity, these data relate directly to engine power. Engine cylinder size and engine stroke length are additionally needed to determine engine power. The required power also varies with the number of cylinders. Finally, including the fuel a ship uses in the ANN analysis was found to yield more sensitive results.

ANNs are an information processing technology that takes its cues from how the human brain functions. An ANN models the algorithmic process of a rudimentary biological nervous system and is essentially a digital representation of actual neurons and the synapses that connect them. The weights' starting values are first assigned randomly. The following equations are then applied to determine the output value:

$$\mathrm{net}_j = \sum_{i=1}^{n} w_{ij}\, x_i$$

$$\mathrm{output}_j = f(\mathrm{net}_j), \quad j = 1, \dots, m$$

An activation function f transfers the final summation in order to obtain the node's output. In this study, the hidden-layer and output-layer activation functions are the log-sigmoid (logsig) function and the purelin function, respectively, whose general definitions are stated in the following equations:

$$\mathrm{logsig}(x) = \frac{1}{1 + e^{-x}}$$

$$\mathrm{purelin}(x) = x$$

$$\mathrm{output}_k = f\left(\sum_{j=1}^{m} w_{jk}\, \mathrm{output}_j\right), \quad k = 1, \dots, p$$

where x is the value of the input, n is the number of inputs per neuron, output_j is the value of the output for the hidden nodes, m is the number of neurons in the hidden layer, output_k is the value of the output for the output nodes, and p is the number of neurons in the output layer. Also, w_ij is the weight between the input neurons and the hidden neurons, and w_jk is the weight between the hidden neurons and the output neurons.
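As a minimal sketch of the forward pass described above, the following Python snippet evaluates a network with one logsig hidden layer and a purelin output layer. The function names, the explicit bias terms, and the toy weights are illustrative assumptions; they are not taken from the study's MATLAB code.

```python
import math

def logsig(x):
    """Log-sigmoid transfer function: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def purelin(x):
    """Linear transfer function: returns its input unchanged."""
    return x

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass through a single hidden layer.

    x        : list of n input values
    w_hidden : m rows of n weights (input -> hidden)
    b_hidden : m hidden biases (assumed; biases are not spelled out in the text)
    w_out    : p rows of m weights (hidden -> output)
    b_out    : p output biases (assumed)
    """
    # Hidden layer: weighted sum of the inputs passed through logsig.
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Output layer: weighted sum of the hidden outputs passed through purelin.
    return [purelin(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]

# Toy example with hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
y = forward([1.0, 2.0],
            w_hidden=[[0.0, 0.0], [0.5, -0.25]],
            b_hidden=[0.0, 0.0],
            w_out=[[2.0, 4.0]],
            b_out=[1.0])
print(y)
```

For the network sizes discussed in this study, x would hold the 14 inputs, w_hidden would have one row per hidden neuron, and w_out would have a single row for main engine power.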
The mean square error (MSE) is calculated as the measure of network performance. For network comparisons, the statistical measures of mean absolute percentage error (MAPE) and coefficient of determination (R²) are also applied. These measures are computed from the target values t, the outputs o, the mean of the outputs ō, and the number of samples n.

The ANN dataset was built from the ship database using 30 input parameters and 1 output. To calculate ship emissions, five more outputs were added through extra calculations based on the correct main engine power. For the analysis of container ships, the dataset was divided into 532 samples for training and 115 samples each for validation and testing. For the analysis of cargo ships, the dataset was divided into 570 samples for training and 123 samples each for validation and testing. For the analysis of tanker ships, the dataset was divided into 1,008 samples for training and 217 samples each for validation and testing. In the ANN analysis performed in MATLAB, the boundary conditions were set according to the values in Table 2, with the model using the 14 parameters of maximum speed, average speed, breadth, year built, ship type, status, LOA, light displacement, summer displacement, fuel type, DWT, gross tonnage, engine cylinder size, and engine stroke length. The Levenberg-Marquardt technique outperforms other algorithms according to trial data [31][32][33], and the model uses only 14 input parameters because more specific parameters were removed from the dataset for the sake of simplification. As a result, the output calculation converges sufficiently. The 14-input ANN system was then trained, validated, and tested. The employed perceptron model is depicted in Figure 2.
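The three performance measures named above can be sketched in Python as follows. These are the textbook definitions, given here as an assumption: MAPE is expressed in percent, and R² is computed against the mean of the target values, which may differ in detail from the form used in the study. The sample values are hypothetical.

```python
def mse(targets, outputs):
    """Mean square error between target and output values."""
    n = len(targets)
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n

def mape(targets, outputs):
    """Mean absolute percentage error in percent (targets must be nonzero)."""
    n = len(targets)
    return 100.0 * sum(abs((t - o) / t) for t, o in zip(targets, outputs)) / n

def r_squared(targets, outputs):
    """Coefficient of determination, textbook form (assumed)."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - o) ** 2 for t, o in zip(targets, outputs))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

# Hypothetical engine powers in kW, for illustration only.
targets = [100.0, 200.0, 400.0]
outputs = [110.0, 190.0, 400.0]
print(mse(targets, outputs), mape(targets, outputs), r_squared(targets, outputs))
```

MSE depends on the scale of the output, so it is mainly useful for comparing networks trained on the same data, whereas MAPE and R² are scale-free and easier to compare across ship types.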
Calculating emissions requires the use of emission factors. Table 3 lists the various emissions produced by bulk carriers per unit of energy used (kWh). The values in Table 3 were determined using information from Entec International [34] for 31,000 ships around the world. The following equation is used to calculate emissions:

$$E = EPP \times P \times YWH$$

where E represents emissions in tons, EPP represents emissions per unit of engine power in grams per kWh, P represents installed engine power in kW, and YWH represents annual working hours.
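The emission calculation above can be sketched in Python as follows. The function name and the input values are hypothetical, and the division by 10^6 is an assumption made here to convert grams into metric tons so that the result matches the stated unit of E; the factors in Table 3 are not reproduced.

```python
GRAMS_PER_TON = 1_000_000  # grams in one metric ton (unit conversion assumed here)

def annual_emissions_tons(epp_g_per_kwh, power_kw, annual_hours):
    """Annual emissions in tons from an emission factor, engine power, and hours.

    epp_g_per_kwh : emissions per unit of engine energy (g/kWh)
    power_kw      : installed engine power (kW)
    annual_hours  : annual working hours (h)
    """
    grams = epp_g_per_kwh * power_kw * annual_hours
    return grams / GRAMS_PER_TON

# Hypothetical inputs for illustration only (not values from Table 3).
example = annual_emissions_tons(epp_g_per_kwh=600.0,
                                power_kw=10_000.0,
                                annual_hours=6_000.0)
print(example)
```

Because the formula is linear in P, any relative error in the predicted main engine power carries over unchanged to the estimated emissions.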

Results and Discussion
The results regarding main engine power for container, cargo, and tanker ships and the resulting pollutant emission values obtained using the ANN are presented in Figures 3-10. The results are reported according to the ANN model, function, number of hidden neurons, number of inputs, and changes in parameters.

ANN analysis structural results
Figure 1 shows the code flow chart in detail. The code, written in MATLAB, tested trainlm, trainscg, and trainbr as training functions and tansig-tansig, tansig-purelin, logsig-tansig, tansig-logsig, logsig-logsig, purelin-purelin, and logsig-purelin as transfer function pairs. The best performance was obtained with the trainlm training function and the tansig-purelin transfer functions, which gave both fast and accurate calculations in terms of MSE. For some results, the study shows the ANN analysis only for container ships, as showing all results for the three different ship types (i.e., container, cargo, and tanker) would be overwhelming and confusing.

Seeing the difference between the ANN model and the actual values is essential. The ANN analysis performed with 762 container ships obtained results very similar to the actual values, most of which are shown in Figure 4. The orange line indicates results that were correctly calculated. Across the wide range of results, the errors are distributed between -0.276 and +0.3. The main engine power of actual ships can therefore be estimated using an accurate ANN model.

Applying more complex structures to the ANN analysis only occasionally had a positive effect. An appropriate neural network should be created for each situation and problem. The number of middle layers and hidden neurons in an ANN analysis varies due to computation time, underfitting, overfitting, and dropping. Figure 7 shows that, in the code written for the three ship types, the best results were obtained using 30 and 40 neurons. The MSE results worsened when the number of neurons was less than 30 or greater than 40. For container ships, the lowest MSE was found with 30 neurons for training, at a value of 0.0067, and with 40 neurons for validation and testing, at values of 0.02 and 0.18, respectively. For cargo ships, the MSE was 0.081 in training, 0.134 in validation, and 0.182 in testing, all of which were obtained with the best-performing 30-neuron network. Similarly, a network with 30 neurons showed the best performance for tanker ships, with MSE values of 0.135, 0.243, and 0.245 for training, validation, and testing, respectively. As can be seen, all analyses were carried out with 30 hidden neurons in the ANN model, as this generally provided the best performance.

The data from over 10,000 ships had more than 90 parameters. These were first reduced to 30 based on the opinions of expert engineers and detailed studies. After removing ship data that had missing or incorrect information regarding the 30 parameters, data from a total of 3,020 ships remained. The analysis made with the code showed the 14-input ANN to give the most accurate results. Figure 8 shows the MSE when estimating ship main engine power while changing the number of inputs from 2 to 14. As can be seen in detail in Figure 8, the 2-input ANN analysis gave an MSE greater than 20,000 for all three ship types. Similarly, the MSE with four inputs was around 5,000, and this value continued to decrease up to 12 inputs. With 12 inputs, the MSE had dropped below 1 for all three ship types, with the container, cargo, and tanker ship MSEs being 0.1, 0.38, and 0.62, respectively. With 14 inputs, the MSE values became 0.03, 0.081, and 0.13, respectively. The ANN analyses thus provide much more accurate and sensitive results when an abundance of data is combined with an appropriate number of inputs.

Ship main engine and pollutant emissions results

Figures 9a-c present the actual ship main engine power data and the estimated values obtained from the ANN analysis. Figure 9a shows that most of the values overlap in the analysis made for container ships. The cargo ship results are similar, although the values differ slightly more at certain points. In the analysis made using the data for 1,442 tanker ships, the actual values and ANN estimates overlap at most points, showing the results to be quite accurate. As the figures show, very accurate and sensitive results were obtained. Figures 7 and 8 present the MSE values from the ANN analysis in detail. Also, the MAPE values for ship main engine power were calculated as 0.0091 for container ships, 0.012 for cargo ships, and 0.0124 for tanker ships. However, discrepancies were observed for certain ships. While better results should be obtainable from a large dataset, the ANNs were found to produce errors at some points. Eliminating these data points would yield more accurate results. However, the ANN analysis was performed over all the data, as doing otherwise would not be proper scientific practice. In general, the consistency of the results has been entirely satisfactory for this study as well as for future studies.
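The input-count search summarized in Figure 8 can be sketched as a simple loop that stops once a target error is reached. The function name, the target value, and the placeholder MSE table are assumptions for illustration; in practice, the error for each candidate would come from training and testing a network with that many inputs, as in the study's MATLAB code.

```python
def select_input_count(candidate_counts, evaluate_mse, target_mse):
    """Try input counts in order and stop once the target MSE is reached.

    candidate_counts : iterable of input counts to try, in order
    evaluate_mse     : callable returning the MSE for a given input count
                       (hypothetical; stands in for training a network)
    target_mse       : desired margin of error
    Returns the best (count, mse) pair seen before stopping.
    """
    best_count, best_mse = None, float("inf")
    for count in candidate_counts:
        error = evaluate_mse(count)
        if error < best_mse:
            best_count, best_mse = count, error
        if error <= target_mse:
            break  # desired margin of error reached
    return best_count, best_mse

# Placeholder errors loosely based on the container-ship figures quoted above;
# the 2- and 4-input entries are rough stand-ins, not reported values.
placeholder_mse = {2: 20000.0, 4: 5000.0, 12: 0.1, 14: 0.03}
chosen, err = select_input_count(sorted(placeholder_mse),
                                 placeholder_mse.__getitem__,
                                 target_mse=0.05)
print(chosen, err)
```

The same loop structure applies to the search over hidden neuron counts, with the validation MSE as the selection criterion.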
Figures 10a-d show the annual pollutant emission amounts estimated by the ANN analysis for container ships alongside the results obtained by semi-empirical coefficient correlation. The results must be examined because they fall into a different order in each ANN analysis. In addition, because annual emission totals are given in tons, the difference between the estimates and the actual values should be noted. The figure shows that the ANN model yields both quick and very sensitive results. Therefore, IMO's sensitivity regarding pollutant emissions needs to be combined with AI studies. Table 4 shows the annual average emissions from container, cargo, and tanker ships. The MAPE values in Table 4 show the ANN analysis to have a good level of accuracy, with the error of each pollutant emission calculation for all ship types being within 2%.

Conclusions
The issue of clean and efficient energy has come to the fore due to limited energy resources and the serious problems caused by environmental pollution. Maritime transport, which plays a crucial role in world transport, fulfills its duty in this regard. For this purpose, IMO attempts to reach maximum efficiency and minimum emissions targets for energy use on ships with indexes such as the EEDI, EEOI, and EEXI. Therefore, the main engine power required for ships and their resulting emissions should be calculated. As one of the essential tools of today's technology, AI should be used for these purposes. Using an ANN to estimate ship main engine power during the design stage is faster and easier than traditional methods. This study presents the most comprehensive ANN analysis made with ship data among the studies in the literature. A detailed analysis was performed with data on a total of 3,020 ships (i.e., 762 container, 816 cargo, and 1,442 tanker ships). The analysis was made with 14 inputs (i.e., maximum speed, average speed, breadth, year built, ship type, status, length overall, light displacement, summer displacement, fuel type, DWT, gross tonnage, engine cylinder size, and engine stroke length) to calculate outputs regarding the ship's main engine power and pollutant emissions. The study shows the stages of the detailed ANN analyses, the performance values, and the accuracy of the predictions and found the regression values of the ANN analysis to be 0.99773 for container ships, 0.98964 for cargo ships, 0.97755 for tanker ships, and 0.97189 for all ships in general. The study also shows the estimated results and actual data for container, cargo, and tanker ships in order to make comparisons. The accuracy of the obtained results has been shown by calculating MSE values. Meanwhile, the number of hidden neurons was tested with many variations of the ANN structure, and the best results were obtained with the ANN possessing 30 neurons. In

Figure 1. Big data analysis process for main engine power and pollutant emissions.

Figure 2. The ANN's fundamental operating principles and organizational design.

Figure 3. A performance graph for the most precise neural network model regarding container ships.

Figure 3 shows the performance graph for the ANN model built on the container ship data. The developed ANN model ran for a total of 108 iterations, with the best validation performance obtained at the 102nd iteration. The cargo and tanker ships showed similar characteristics in the ANN model, with the 98th and 112th iterations, respectively, providing the best validation performance.
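The gap between the final iteration (108) and the best validation iteration (102) would be consistent with a stopping rule that halts training after six consecutive iterations without validation improvement. Whether the study used that setting is an assumption, since the values in Table 2 are not reproduced here. The following Python sketch illustrates such a rule on a synthetic validation curve; the function name and the curve are hypothetical.

```python
def early_stopping(val_errors, max_fail=6):
    """Return (best_iteration, stop_iteration), both 1-indexed.

    Training stops once the validation error has failed to improve on its
    best value for max_fail consecutive iterations.
    """
    best_iter, best_err, fails = 0, float("inf"), 0
    for it, err in enumerate(val_errors, start=1):
        if err < best_err:
            best_iter, best_err, fails = it, err, 0
        else:
            fails += 1
            if fails >= max_fail:
                return best_iter, it
    return best_iter, len(val_errors)

# Synthetic curve: error falls until iteration 102, then rises steadily.
curve = [1.0 / it for it in range(1, 103)] + [0.02 + 0.001 * k for k in range(1, 20)]
best, stopped = early_stopping(curve)
print(best, stopped)
```

The weights kept for evaluation are those from the best validation iteration, not from the last iteration before stopping.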

Figure 4. Errors between the desired values and values from the ANN outputs for container ships, as well as the distribution of residuals.

Figure 5. The most accurate neural network model's regression graphs for container ships.

Figure 5 shows the regression graph between the values predicted by the ANN model and the actual main engine power values and presents the analysis results using data from 762 container ships. The correlation coefficients are 0.99883 for training, 0.99765 for validation, 0.99248 for testing, and 0.99773 for all data. The fact that the R values are very close to 1 is clear evidence that the results from the ANN model are consistent.
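The R values quoted above are correlation coefficients between predicted and actual main engine power. As a minimal sketch, assuming the usual Pearson definition, R can be computed as follows; the sample powers are hypothetical and are not taken from the dataset.

```python
import math

def pearson_r(targets, outputs):
    """Pearson correlation coefficient between target and output values."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)

# Hypothetical actual and predicted engine powers in kW.
actual = [5000.0, 12000.0, 20000.0, 31000.0]
predicted = [5200.0, 11800.0, 20500.0, 30500.0]
r = pearson_r(actual, predicted)
print(round(r, 4))
```

R measures how closely the predictions follow a straight line against the actual values, so a value near 1 indicates consistency but does not by itself rule out a systematic offset, which is why the error measures are reported alongside it.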

Figure 6. The most accurate neural network model's regression graphs for (a) container, (b) cargo, and (c) tanker ships, as well as for all ships (d).

Meanwhile, Figure 6 shows the regression graphs of the values estimated by the ANN model against the actual values for container, cargo, and tanker ships individually by type as well as collectively. The overall regression values, which include the training, validation, and test data for the four different cases, are 0.99773 for container ships, 0.98964 for cargo ships, 0.97755 for tanker ships, and 0.97189 for all ships. The training, validation, and test regression values for cargo ships are 0.99484, 0.98656, and 0.97094, respectively. Likewise, the ANN results for tanker ships are 0.98568 for training, 0.9551 for validation, and 0.96116 for testing, which shows the importance of data compatibility. Performing separate ANN analyses according to ship type yields more accurate results.

Figure 7. MSE for the training, validation, and testing results in terms of the number of neurons in each hidden layer.

Figure 8. Mean squared error (MSE) of the test results in terms of the number of input parameters.

Figure 9. ANN results presented alongside real marine engine power data for (a) container, (b) cargo, and (c) tanker ships.

Table 1. Standard main classes of container ships and their rough dimensions [28]

Table 2. Values for the training parameters used in the artificial neural network models

Table 4. Estimated annual pollutant emissions and MAPE results