1. INTRODUCTION
Since 2004, after the International Maritime Organization (IMO) adopted new regulations on carriage requirements for the automatic identification system (AIS) on board seagoing vessels governed by the International Convention for the Safety of Life at Sea (SOLAS) [1], the use of AIS has become an integral part of global maritime activities. SOLAS vessels have accordingly been equipped with dedicated AIS transponders, which regularly broadcast their own navigational data, referred to as AIS messages, and receive external navigational data by radio on two very high frequency (VHF) marine channels reserved exclusively for AIS [2, p. 35].
AIS data exchange is self-organised and uses various types of time division multiple access (TDMA) synchronisation [3, p. 17]. The system was designed primarily to give nautical personnel additional support in assessing traffic conditions in the proximity of their own vessel. Although officers of the watch are still not allowed to rely on AIS alone to ensure the safety of navigation, they may use AIS information to assist their visual and radar-based lookout in accordance with the International Regulations for Preventing Collisions at Sea (COLREGs) [4].
AIS users can be categorised into at least two major groups. The first comprises the aforementioned crews on board vessels, whose focus is on the most up-to-date traffic information. As soon as a new AIS message is received from a given vessel, the preceding message is discarded. Within this group, AIS data is exchanged directly between the AIS transponders of nearby vessels within the line-of-sight propagation of AIS transmissions. This may be referred to as bridge-to-bridge data exchange. The second group consists of shore-based users, which include maritime administrations, vessel traffic services (VTS), shipping companies and research facilities, to mention just a few. They gather AIS data on either a regional or a global scale and store it in long-term archives. Within this group, AIS data is obtained indirectly, through what may be referred to as bridge-to-shore data exchange, either from a network of terrestrial AIS stations [5] or from a constellation of satellites equipped with sophisticated AIS receivers, which have to deal with, among other things, packet collision and signal overlap problems [6,7]. For shore-based users, acquiring a useful account of maritime traffic has high priority, so discarding AIS data is generally avoided. In big data research and computing, one should keep in mind that every AIS message supersedes the previous one, yet both are stored in order to reveal the true nature of vessel movement [8]. It is therefore also important that historical AIS messages carry proper timestamps, so that vessel movements can be reconstructed chronologically during post-processing.
One of the numerous applications of historical AIS data in big data research is the estimation of maritime emissions [9]. Recent research activities of the Institute of Communications and Navigation (IKN) at the German Aerospace Center (DLR) include participation in the Emissions Land Karte (ELK) Project, whose name stands for Emission Map. The main objectives of the project include the allocation of emissions to various sectors of the global economy and the development of global emission inventories for different types of transportation in the reference year 2019 [10]. Within the ELK Project, the IKN was assigned the task of assessing maritime emissions worldwide. The global AIS dataset used for the computation was obtained from a commercial maritime data provider. It combined data from both terrestrial and satellite AIS reception networks and delivered raw data on more than 65 billion single vessel movements worldwide throughout 2019. It should be mentioned that the AIS data provider was chosen on economic grounds, namely its pricing policy, and one cannot exclude the possibility that competing AIS data providers could have delivered a dataset with an even higher number of vessel movements.
The AIS messages needed to track vessel activities relevant to the calculation of maritime emissions are listed in Table 1 together with their numeric identifiers.
Table 1 Overview of AIS identifiers and their contents
Source: [3, p. 105-106]
Every such AIS message has a data field, referred to as User ID, which contains the maritime mobile service identity (MMSI) [11]. The MMSI consists of 9 decimal digits and identifies the AIS transponder of a given vessel as the source of an AIS transmission. MMSI numbers are globally unique at a given point in time and are assigned by the maritime authority of the current flag state of a given vessel. The global list of all MMSI identifiers is supervised by the International Telecommunication Union (ITU) [12], which also specifies the exact structure of the MMSI identifier [11]. The MMSI data field stored in AIS messages is 30 bits long [3, p. 46]. It should be noted that the MMSI is a compulsory data field in an AIS message: it must not be left undefined in an AIS transponder configuration, as opposed to many other AIS parameters, which are allowed to contain unknown sensor data [13].
One of the challenges of processing AIS data and tracking global vessel movements to compute maritime emissions was dealing with AIS errors and anomalies in vessel tracks. AIS is vulnerable in several error-prone areas, which include human error (typically caused by unaware users), instrument failure (inappropriate design or fatigue of electronic devices) and an overloaded transmission spectrum [14]. The subject of strange vessel movements observed within AIS has been well covered in the literature. Its complexity often requires the application of various approaches such as graph theory, deep learning, neural networks or kinematic interpolation [15,16]. A comprehensive classification of such anomalies identifies five general anomalous behaviours derived from AIS vessel tracks: a route deviation, an unexpected activity, a port arrival, a close approach and a zone entry [17, p. 5]. The authors emphasise that all anomaly detection methods implicitly require the MMSI number to recover vessel tracks from individual AIS data points [17, p. 11]. Thus, the MMSI is generally assumed to be an infallible reference for distinguishing AIS vessels from one another.
Within the ELK Project, the prevailing anomaly spotted along the AIS vessel tracks was the unexpected activity, detected as a strange continuation or interruption of a given AIS vessel track. In the vast majority of those anomalous track sections, a vessel was suddenly so far off position, and had moved there so fast, that her movement could clearly be considered technically and physically impossible.
This article was inspired by those erratic motion patterns of vessels which seem to defy the laws of physics. The work focuses on analysing the effects of bit errors occurring solely in the 30-bit data field of an AIS message which contains the MMSI number. It is an attempt to answer the following questions:
1. If a certain number of bits in the User ID data field of a given AIS transponder on board a vessel are inverted due to a transponder software bug, or the User ID data field is intentionally modified by the crew, and a new MMSI number is thereby created, how likely is it that the newly created MMSI number happens to belong to a completely different AIS transponder of a vessel which exists and is traceable in the AIS data?
2. If the newly created MMSI number is real and used by a different existing AIS transponder on board a vessel, how likely is it that the false track points, injected into an AIS track to which they do not belong, will remain undetected after passing a plausibility check?
In terms of calculating maritime emissions the latter question is particularly important. If the false points injected into an AIS track of a given vessel are cleared as plausible, the false movements of that vessel will be included in the emission calculation, which could introduce additional inaccuracies into the resulting emission inventories.
2. MATERIALS AND METHODS
2.1. MMSI-based injection of false AIS track points
In order to explain the aforementioned injection of false AIS track points in greater detail, an example situation is presented in Figure 1.
(a) ideal AIS tracks without MMSI alterations
(b) injection of a false AIS track point at P6
(c) flawed AIS tracks as stored in a big-data archive
(d) AIS tracks with discarded P6 due to its implausibility
Figure 1 AIS tracks of the example vessels CRAFT-1 (MMSI1) and CRAFT-2 (MMSI2)
Source: Author’s research
Figure 1a depicts the AIS tracks of two vessels. One track originates from vessel CRAFT-1, identified by MMSI1, and contains the position points (P1, P2, P3, P4). The other track originates from vessel CRAFT-2, identified by MMSI2, and contains the position points (P5, P6, P7, P8). The overall chronology of all position reports, sorted by their timestamps, is (P1, P2, P5, P3, P6, P4, P7, P8). If a software malfunction of an AIS transponder, a deliberate act of deception or any other action affected the User ID data field of the AIS message transmitted by CRAFT-2 at the point P6 such that MMSI2 was transformed into MMSI1, as seen in Figure 1b, then after post-processing the AIS track of CRAFT-1 would contain the points (P1, P2, P3, P6, P4), whereas the AIS track of CRAFT-2 would contain the points (P5, P7, P8), as seen in Figure 1c. CRAFT-1 would ostensibly proceed along a longer route and her emissions might be overestimated, ceteris paribus. In the case of CRAFT-2, the position point P6 would irretrievably disappear from her AIS track and her emissions might be underestimated, ceteris paribus. In a fortunate situation, the falsely injected AIS track point P6 would fail a plausibility check and be discarded. This would leave the AIS track of CRAFT-1 intact, while the AIS track of CRAFT-2 would still be shortened during post-processing, as seen in Figure 1d.
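The situation in Figure 1 can be reproduced with a minimal sketch that rebuilds tracks by grouping position reports by MMSI. The integer timestamps, the tuple layout and the function name `rebuild_tracks` are illustrative assumptions and not part of the processing chain used in this work.

```python
from collections import defaultdict


def rebuild_tracks(reports):
    """Group (timestamp, mmsi, point) reports into per-MMSI tracks, in time order."""
    tracks = defaultdict(list)
    for _timestamp, mmsi, point in sorted(reports):
        tracks[mmsi].append(point)
    return dict(tracks)


# Chronology (P1, P2, P5, P3, P6, P4, P7, P8); timestamps 1..8 are hypothetical.
ideal = [
    (1, "MMSI1", "P1"), (2, "MMSI1", "P2"), (3, "MMSI2", "P5"), (4, "MMSI1", "P3"),
    (5, "MMSI2", "P6"), (6, "MMSI1", "P4"), (7, "MMSI2", "P7"), (8, "MMSI2", "P8"),
]

# The report at P6 carries MMSI1 instead of MMSI2 (Figure 1b).
flawed = [(t, "MMSI1" if p == "P6" else m, p) for t, m, p in ideal]

ideal_tracks = rebuild_tracks(ideal)
flawed_tracks = rebuild_tracks(flawed)
```

With the flawed input, the track of MMSI1 becomes (P1, P2, P3, P6, P4) and the track of MMSI2 shrinks to (P5, P7, P8), as in Figure 1c. Nothing in the grouping step itself can reveal that P6 belongs elsewhere.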
It should be emphasised that the movement of CRAFT-1 along the altered route (P3, P6, P4) is not impracticable as long as it is physically plausible. As Rule 5 of the COLREGs states, “every vessel shall at all times maintain a proper look-out by sight and hearing as well as by all available means appropriate in the prevailing circumstances and conditions so as to make a full appraisal of the situation and of the risk of collision” [4, p. 23]. This implies, to a greater or lesser extent, that the crews on board and the owners ashore have both the freedom and the responsibility to decide when and where their vessels go, and the prevailing circumstances may force them to divert from the originally planned route.
2.2. Bitwise comparison of pairs of MMSI numbers
In order to analyse how often a bit inversion in the User ID data field can transform a given MMSI number into another one which is actively used by the AIS transponder of a different vessel, an extensive list of MMSI numbers was needed. Unfortunately, due to legal restrictions it was not possible to obtain a complete database dump of all AIS-related MMSI numbers stored in the ITU maritime mobile access and retrieval system (MARS). Therefore, an alternative list of MMSI numbers was generated from the global AIS dataset of 2019 by decoding the User ID data field of all available AIS messages containing dynamic data. In this way, a total of 1063087 unique MMSI numbers was gathered. According to the ITU, as of November 2023 there were approximately 750 thousand vessels worldwide with an officially assigned MMSI for use within AIS [18]. It might therefore be argued that close to 30% of the MMSI identifiers gathered from the AIS data could have been erroneous, fabricated or bogus as a consequence of custom, unauthorised configuration of AIS transponders, which can even lead to the same MMSI being shared by multiple vessels [19, p. 414-415].
Next, a specific type of combinatorial testing, known as all-pairs testing or pairwise testing, was utilised [20, p. 4]. All pairs of gathered MMSI numbers were generated, so that every MMSI number could be compared bitwise with all the other MMSI numbers in the set. The number of iterations needed to accomplish the task was 565076453241, which is the number of all 2-combinations of the set of MMSI numbers, n(n − 1)/2 with n = 1063087. At every iteration step, a bitwise comparison of two given MMSI numbers was made. An example of a single iteration step, based on the AIS transponder MMSI 211202460 of research vessel POLARSTERN (IMO 8013132) and the AIS transponder MMSI 211627240 of research vessel SONNE (IMO 9633927), is described in Table 2.
Table 2 Algorithm for counting inverted bits of two MMSI identifiers
Source: Author’s research
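A single iteration step of this kind can be sketched as an exclusive OR followed by a count of set bits. This is a minimal illustration using the two MMSI numbers quoted above; the function name `inverted_bits` is an assumption, and the sketch is not claimed to reproduce the layout of Table 2.

```python
def inverted_bits(mmsi_a: int, mmsi_b: int) -> int:
    """Count the bit positions in which two MMSI numbers differ."""
    # XOR sets a 1 exactly where the two 30-bit patterns disagree.
    return bin(mmsi_a ^ mmsi_b).count("1")


POLARSTERN = 211202460
SONNE = 211627240

# 30-bit patterns as they would be stored in the User ID data field.
polarstern_bits = format(POLARSTERN, "030b")
sonne_bits = format(SONNE, "030b")

differing = inverted_bits(POLARSTERN, SONNE)
```

For these two values the sketch yields 12 differing bits, so 12 inverted bits in the User ID data field would suffice to turn one identifier into the other.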
When all pairs of MMSI numbers were compared, the number of inverted bits ranged from 1 to 30, so 30 categories of bit inversion were defined. For each category, the MMSI numbers affected by a pairing with the given number of inverted bits were added to a corresponding set. Finally, based on all 30 categories and the cardinality of the categorised sets of affected MMSI numbers, a histogram was constructed for assessing the level of MMSI confusion in the identification of AIS transponders of the analysed vessels.
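The categorisation can be illustrated with a small sketch that enumerates all 2-combinations of a set and collects, for each number of inverted bits, the identifiers that have at least one partner at that distance. The function name `affected_share` and the toy input are assumptions. A pure Python loop of this kind is only practical for small sets, not for the hundreds of billions of pairs processed in this work.

```python
from itertools import combinations

USER_ID_BITS = 30  # width of the User ID data field


def affected_share(mmsi_numbers):
    """Map k (1..30) to the percentage of MMSI numbers having at least one
    partner in the set that differs from them in exactly k bits."""
    unique = sorted(set(mmsi_numbers))
    if any(not 0 <= m < 2 ** USER_ID_BITS for m in unique):
        raise ValueError("MMSI values must fit into the 30-bit User ID field")
    affected = {k: set() for k in range(1, USER_ID_BITS + 1)}
    for a, b in combinations(unique, 2):
        k = bin(a ^ b).count("1")  # number of inverted bits for this pair
        affected[k].update((a, b))
    total = len(unique)
    return {k: (100.0 * len(s) / total if total else 0.0) for k, s in affected.items()}


# Toy set of four 3-bit values instead of real MMSI numbers.
share = affected_share([0b000, 0b001, 0b011, 0b111])
```

In the toy set, every value has a partner at distance 1 and at distance 2, but only two of the four values (000 and 111) have a partner at distance 3, so the third category holds 50%.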
2.3. Plausibility of a false point injected into a short-voyage AIS track
In the example traffic situation shown in Figure 1, it was also important to determine how plausible the altered route of vessel CRAFT-1 was, with special focus on the false waypoint P6. For the crews on board CRAFT-1 and CRAFT-2 this does not pose a problem, because they have other reliable means of observing, verifying and communicating the intentions and movements of other vessels. However, during big data processing of billions of vessel movements extracted from AIS data, no insider information is available on what decisions were taken on the bridge of a given vessel and what her intended actions were. In order to detect and assess situations like the one shown in Figure 1, a plausibility check of speed over ground was carried out chronologically on the AIS vessel movements of 2019. An example scheme for selecting MMSI numbers is presented in Figure 2.
Figure 2 Example of an interval-based selection of MMSI numbers for the plausibility check
Source: Author’s research
The process iterated chronologically through all available dynamic AIS messages of the AIS dataset of 2019. As soon as a dynamic AIS message was found, its MMSI, timestamp and geographic position were noted. In Figure 2, this is MMSIx. The iteration then continued until another dynamic AIS message was found whose MMSI differed from the previously noted MMSI and whose timestamp was at least 10 s later. In Figure 2, this next message, identified by MMSIy, occurred 10.6 s after the AIS message identified by MMSIx.
With the timestamps and the positions of both AIS transponders, it was possible to compute the hypothetical speed over ground needed to move between the two positions. Additionally, a bitwise comparison of MMSIx and MMSIy was made to check how many bits would have to be inverted to confuse the identity of the two dynamic AIS messages. The calculated speed over ground and the number of inverted bits were stored in the same kind of data structure as was used for the aforementioned bitwise comparison of all pairs of MMSI numbers. The iteration process then continued searching for another MMSI, this time measuring the interval from the timestamp of the AIS message identified by MMSIy.
In Figure 2, the next AIS message which met the conditions was spotted 12.4 s later and its identifier was the MMSIz. The general idea was to simulate a situation, like the example in Figure 2, when the MMSIy is transformed into the MMSIx and the vessel identified by AIS transponder MMSIx gets an additional false track point of the vessel identified by AIS transponder MMSIy, and similarly, the MMSIz is transformed into the MMSIy to generate another false track point for the vessel identified by AIS transponder MMSIy. The whole MMSI capture process continued through the whole AIS dataset. This way a histogram could be constructed for assessing how plausible such misidentified movements might be in terms of speed over ground being physically feasible.
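The selection scheme of this subsection can be sketched as a single chronological pass over the messages. The tuple layout, the function names, the spherical Earth haversine distance and the sample coordinates are assumptions made for illustration. The 60 kn limit is the threshold applied in the Results section, and the sketch does not claim to be the implementation used in this work.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_NM = 6371.0 / 1.852  # mean Earth radius (km) expressed in nautical miles
MIN_INTERVAL_S = 10.0             # minimum spacing between the two messages
MAX_SPEED_KN = 60.0               # plausibility limit for speed over ground


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine, spherical Earth)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * asin(min(1.0, sqrt(a)))


def scan_messages(messages):
    """Yield (mmsi_x, mmsi_y, inverted_bits, speed_kn, plausible) per selected pair.

    `messages` is a chronologically sorted iterable of
    (timestamp_s, mmsi, lat_deg, lon_deg) tuples.
    """
    reference = None
    for message in messages:
        if reference is None:
            reference = message
            continue
        t0, mmsi_x, lat0, lon0 = reference
        t1, mmsi_y, lat1, lon1 = message
        if mmsi_y == mmsi_x or t1 - t0 < MIN_INTERVAL_S:
            continue  # same transponder or too soon: keep searching
        speed_kn = distance_nm(lat0, lon0, lat1, lon1) / ((t1 - t0) / 3600.0)
        inverted = bin(mmsi_x ^ mmsi_y).count("1")
        yield mmsi_x, mmsi_y, inverted, speed_kn, speed_kn <= MAX_SPEED_KN
        reference = message  # the next interval is measured from this message


# Hypothetical messages; MMSI values 1, 2, 3 stand in for MMSIx, MMSIy, MMSIz.
sample = [
    (0.0, 1, 0.0, 0.0),
    (5.0, 2, 0.0, 0.1),     # skipped: less than 10 s after the reference
    (10.6, 2, 0.0, 0.001),  # selected: 10.6 s later, a few dozen metres away
    (23.0, 3, 1.0, 0.0),    # selected: 12.4 s later, about one degree away
]
pairs = list(scan_messages(sample))
```

In the sample, the first selected pair implies a speed of roughly 20 kn and would pass the check, while the second implies a speed far above 60 kn and would be rejected.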
2.4. Plausibility of a false point injected into a long-voyage AIS track
The approach described in subsection 2.3 was extended to assess the possibility of injecting false AIS track points during longer voyages. The process iterated chronologically through all available dynamic AIS messages of the AIS dataset of 2019. Its aim was to find all situations in which a dynamic AIS message from the AIS transponder of a vessel was received only once during a period of at least 2 weeks. If such a vessel was found, the AIS transponder of a different vessel was searched for, such that this second vessel also transmitted a dynamic AIS message only once, approximately in the middle of the aforementioned period of 2 weeks. In this way, it was possible to find potentially falsified voyages between two remote locations for which a vessel would need 2 weeks to make a round trip. With the timestamps and the positions of the round-trip points, it was possible to compute the hypothetical speed over ground needed to complete such a voyage. A real-world example might be a falsified two-week voyage from Lima (PELIM) to Valparaíso (CLVAP) and back, while one vessel was spotted on AIS only once in Lima and never left the harbour, and the other vessel was spotted on AIS only once in Valparaíso and never left the harbour either. As in subsection 2.3, a bitwise comparison of the two MMSI identifiers was made to check how many bits would have to be inverted to confuse the identity of the two dynamic AIS messages. The calculated speed over ground and the number of inverted bits were stored to construct a histogram for assessing how plausible such misidentified movements might be in terms of the speed over ground being physically feasible.
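The hypothetical speed of such a falsified round trip can be sketched as follows. The port coordinates are rough approximations chosen only for illustration, the function names are assumptions, and the great-circle distance ignores the actual sea route, so the result is a lower bound rather than a realistic voyage speed.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_NM = 6371.0 / 1.852  # mean Earth radius (km) expressed in nautical miles
WEEK_S = 7 * 24 * 3600


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine, spherical Earth)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * asin(min(1.0, sqrt(a)))


def round_trip_speed_kn(home, away, t_start_s, t_away_s, t_end_s):
    """Speed needed on the faster of the two legs home -> away -> home.

    `home` and `away` are (lat_deg, lon_deg) tuples; `t_away_s` is the time of
    the single foreign sighting attributed to the vessel by mistake.
    """
    leg_nm = distance_nm(*home, *away)
    outbound_kn = leg_nm / ((t_away_s - t_start_s) / 3600.0)
    inbound_kn = leg_nm / ((t_end_s - t_away_s) / 3600.0)
    return max(outbound_kn, inbound_kn)


# Approximate positions (assumed): the port of Lima and the port of Valparaíso.
LIMA = (-12.05, -77.15)
VALPARAISO = (-33.03, -71.62)

speed_kn = round_trip_speed_kn(LIMA, VALPARAISO, 0, WEEK_S, 2 * WEEK_S)
```

With the foreign sighting exactly in the middle of the two weeks, each leg of roughly 1300 nautical miles has to be covered in one week, which requires less than 10 kn. A simple speed limit therefore cannot reject a falsified voyage of this kind.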
3. RESULTS
An overall assessment of MMSI-based injection of false AIS track points due to a misidentification of vessels is shown in Figure 3.
Figure 3 Influence of the MMSI bit inversion on the possibility of injecting a false AIS track point
Source: Author’s research
It can be noticed that, in the case of an inversion of a single bit, about 70% of all analysed MMSI numbers matched another existing MMSI number. This percentage increases sharply to reach its maximum. For a number of inverted bits between 4 and 25, every MMSI number selected from the analysed set could be transformed into another actively used MMSI number. Finally, when, for example, 27 bits are inverted, which amounts to a 90% alteration of the User ID data field, still about 54% of all MMSI numbers could be transformed into another existing MMSI number. Such a transformation of the MMSI number alone would insert a false AIS track point for one vessel and, at the same time, remove a correct AIS track point from the voyage record of another vessel. It must be emphasised that an AIS message affected in this way by external interference would fortunately be rendered unusable, because it would not pass the cyclic redundancy check (CRC) [3, p. 25]. Since the bit alterations described in this work are assumed to occur only internally, within the on-board equipment of a given vessel, the AIS messages transmitted thereafter are not malformed and will pass the CRC.
At the time of writing, the design of AIS offers no means of detecting such an MMSI switchover if it happens within the AIS transponder before an AIS message is transmitted. It is noticeable that only when all 30 bits of a given MMSI are inverted is the percentage of MMSI identifiers which could be transformed into another existing MMSI so low that such an alteration of the User ID data field would very rarely produce the MMSI of another vessel. A further possible explanation of why the percentage decreases when the number of inverted bits lies between 26 and 30 is that there is a higher chance of generating an impossible MMSI greater than its theoretical limit of 999999999, since the number of values storable in the 30-bit User ID data field is 2^30 = 1073741824, which exceeds that limit by 73741825.
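The range argument can be checked with a short sketch that inverts selected bits of the 30-bit field and tests whether the result is still a 9-digit decimal value. The function names are assumptions introduced for illustration.

```python
USER_ID_BITS = 30
MMSI_MAX = 999_999_999  # largest value expressible with 9 decimal digits


def invert_bits(mmsi: int, positions) -> int:
    """Invert the given bit positions (0 = least significant) of a 30-bit value."""
    mask = 0
    for position in positions:
        if not 0 <= position < USER_ID_BITS:
            raise ValueError("bit position outside the User ID data field")
        mask |= 1 << position
    return mmsi ^ mask


def is_possible_mmsi(value: int) -> bool:
    """True if the value still fits into 9 decimal digits."""
    return 0 <= value <= MMSI_MAX


excess = 2 ** USER_ID_BITS - MMSI_MAX

# Inverting all 30 bits maps m to (2**30 - 1) - m.
all_bits = range(USER_ID_BITS)
flipped_large = invert_bits(211202460, all_bits)  # MMSI quoted in Table 2 context
flipped_small = invert_bits(1_000_000, all_bits)  # hypothetical small value
```

Inverting all 30 bits of 211202460 gives 862539363, which is still a possible MMSI, whereas the same operation on a small value such as 1000000 gives 1072741823, which exceeds the 9-digit limit.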
In an attempt to mitigate the negative effects of such an AIS track falsification, it is possible to apply various plausibility checks upon AIS data which may include a verification of speed, course, heading or position data, just to mention a few [21]. In the scope of this analysis, a simple test of the speed over ground was set up under the assumption that a vessel cannot be faster than 60 kn. So far, the high-speed ferry FRANCISCO (IMO 9610028) has been the fastest merchant vessel in the world, excluding navy and classified craft. She can reach up to roughly 58 kn [22]. It has to be emphasised that the choice of speed over ground in reference to FRANCISCO performance for plausibility verification was only meant to serve as a simple functional support for finding erratic vessel movements. A combination of various tests involving additional motion parameters like an overall direction of movement based on course over ground, or even a practicality of voyage with respect to vessel operational profile, might produce better results and limit negative effects of AIS track falsification.
An example evaluation of how a false AIS track point can escape the aforementioned plausibility check is presented in Figure 4.
Figure 4 Percentage of falsely identified AIS messages passing a speed over ground plausibility check during a short voyage
Source: Author’s research
It can be observed that, after inverting 3 bits of the User ID data field, about 8.8% of all analysed MMSI identifiers could be transformed into another plausible MMSI when the aforementioned speed limit of 60 kn is applied as a plausibility check. Alternative approaches to validating AIS plausibility can be expected to give different results. As seen in Figure 4, the percentage of plausible false MMSI identifiers which could cause the undesired injection of false AIS track points varies between 0.03% and 8.8%. In the AIS data analysed in this work, no occurrence of a plausible false MMSI was found when the number of inverted bits of the User ID data field was between 16 and 30.
The results of the extended search for falsified long-voyage AIS track points are presented in Figure 5.
Figure 5 Percentage of falsely identified AIS messages passing a speed over ground plausibility check during a long voyage
Source: Author’s research
It can be noticed that the highest percentage of falsified MMSI identifiers which still passed the speed plausibility check (8.17%) was reached after inverting 14 bits of the User ID data field. All cases of AIS transponders which seemed to have disappeared from AIS surveillance for 2 weeks involved between 6 and 21 inverted bits. Overall, about 53% of all analysed MMSI identifiers showed a potential to be affected by the long-voyage falsification of AIS tracks. This is due to various technical aspects of global AIS data gathering. The disappearance of an AIS transponder with a given MMSI from the AIS traffic picture for 2 weeks could mean either that the transponder stopped transmitting or that the AIS reception systems stopped receiving signals from that source. Regardless of the reason for such a disappearance, falsified vessel movements of this kind could significantly distort the calculated emissions.
4. DISCUSSION AND SUMMARY
The analysis presented in this work indicates that there is a strong possibility of misidentifying shipborne AIS transponders due to a bitwise modification of the User ID data field contained within an AIS message. The falsification of the AIS transponder identity of a vessel can easily remain unnoticed, especially if the falsely injected AIS track points pass a plausibility verification. Before jumping to conclusions about the negative implications of vessel misidentification for the automatic identification system in general, it is important to bear in mind what AIS was made for: supporting officers of the watch on board vessels in their tasks related to safety of navigation. Moreover, they are trained to carry out their duties and to ensure that an efficient look-out is maintained at all times, regardless of whether AIS fails or is unavailable. Therefore, any corruption of the User ID data field causing injections of false AIS track points may affect only the on-the-scene, short-term, AIS-based plotting of the traffic situation. This is a difficulty that the crews on board vessels can certainly deal with. What AIS was not devised for is reliable global surveillance of maritime traffic aimed at creating a credible long-term record of vessel movements and activities.
Fortunately, advances in radio and satellite technology make it possible to overcome numerous technical challenges in AIS data acquisition, even from the most remote areas of the globe. Thus, AIS data gathering is no longer limited to coastal waters within the range of onshore AIS reception networks, and such big-data AIS archives are nowadays successfully used in various maritime applications. Nevertheless, considering the results of this study, it is worth pointing out that the identification of vessels within AIS data, which is currently based only on the MMSI stored in the User ID data field, shows its weakness when used on a global scale instead of only within the line-of-sight propagation of AIS transmissions. The reason is that post-processing is more likely to run into injected false AIS track points in a global AIS data archive with thousands of MMSI records than in local AIS tracks limited to easily identifiable vessels in proximity.
An attempt to provide the best possible way of addressing the aforementioned problems with the misidentification of shipborne AIS transponders is beyond the scope of this work. However, in the context of the results presented in this study, it might be worthwhile to consider the following recommendations addressed to the community responsible for further development of the automatic identification system. First, the AIS position report should be enhanced to contain an absolute timestamp. It could be defined as the number of non-leap seconds which have passed since a given epoch, for example midnight UTC on 1 January 2000, or any suitable point in time prior to the worldwide deployment of AIS. Such an upgrade would surely come at the price of an additional data field to be inserted into the AIS transmission. However, its foremost advantage would lie in delivering valuable input to plausibility verification algorithms and in perfecting the chronology of vessel movements stored in big-data archives. Second, an additional checksum of the MMSI could be implemented to verify the integrity of the AIS message identifier. Such an approach, using a check digit, is currently applied to IMO numbers, to give just one example [21, p. 4]. AIS messages containing dynamic data have a Spare data field of 3 bits which is currently not used and remains reserved for future use [3, p. 110]. It might be used to store a checksum of the MMSI computed by an algorithm suitable for small blocks of data. This approach provides an example of using the existing data structure of the dynamic AIS message. Having a way of verifying the integrity of the MMSI would make it more difficult, but still not impossible, to overlook a falsely injected AIS track point following a bitwise alteration of the MMSI.
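Both recommendations can be illustrated with a minimal sketch. The XOR folding of the 30-bit MMSI into 3 bits is a hypothetical choice made here for illustration only; neither this work nor the AIS specification prescribes a particular checksum algorithm. The function names are likewise assumptions.

```python
from datetime import datetime, timezone

# Example epoch from the text: midnight UTC on 1 January 2000.
EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def absolute_timestamp(moment: datetime) -> int:
    """Non-leap seconds elapsed since EPOCH; `moment` must be timezone-aware."""
    return int((moment - EPOCH).total_seconds())


def mmsi_checksum3(mmsi: int) -> int:
    """Fold a 30-bit MMSI into 3 bits by XOR-ing its ten 3-bit groups.

    Hypothetical scheme: it detects every single-bit inversion, but not every
    combination of several inverted bits.
    """
    check = 0
    for _ in range(10):
        check ^= mmsi & 0b111
        mmsi >>= 3
    return check


one_day = absolute_timestamp(datetime(2000, 1, 2, tzinfo=timezone.utc))

mmsi = 211202460
single_flips_detected = all(
    mmsi_checksum3(mmsi ^ (1 << bit)) != mmsi_checksum3(mmsi) for bit in range(30)
)
# Inverting bits 0 and 3 together leaves this checksum unchanged.
double_flip_missed = mmsi_checksum3(mmsi ^ 0b1001) == mmsi_checksum3(mmsi)
```

The sketch detects any single inverted bit, while two inverted bits whose positions are a multiple of three apart cancel each other out. This matches the expectation that a 3-bit checksum makes undetected alterations less likely without ruling them out.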
In the near future, AIS will become part of an upgraded radio communication infrastructure, called the VHF data exchange system (VDES), which will operate between vessels, shore stations and satellites [23]. The MMSI will continue to be used as an identifier of data transmissions. It should be emphasised that authentication of data messages will be a feature which VDES can provide. The future VDES authentication message will include the MMSI together with other data fields such as a timestamp or slot number, thus making it more difficult to falsify the MMSI alone in an untraceable way [24, p. 52-53]. Nevertheless, AIS in its well-established form will keep running alongside the modern VDES extensions. It is therefore important to take into consideration the weakness of the system reflected in the aforementioned potential to falsify the identity of AIS transmissions, whether unintentionally or as a malicious act of identity theft in the general context of cybercrime.
Author Contributions: Conceptualization, P.B.; methodology, P.B.; software, P.B.; validation, J.S.F. and M.G.; formal analysis, P.B. and J.S.F.; writing—original draft preparation, P.B.; writing—review and editing, J.S.F. and M.G. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Conflicts of Interest: The authors declare no conflict of interest.