Zero-Shot Learning in Maritime Domain: Classification of Marine Objects using CLIP

Authors

  • Ivan Lorencin, Juraj Dobrila University of Pula, Faculty of Informatics
  • Domagoj Frank
  • Damir Vusić

Keywords:

CLIP, Marine Objects, Vision-language models, Transformers, Zero-shot learning

Abstract

Maritime security and monitoring are essential for global trade, environmental protection,
and national defense. Traditional machine learning models have been effective in recognizing and
classifying maritime objects, but their reliance on large, labeled datasets poses challenges,
particularly in dynamic environments where new and unforeseen objects frequently emerge. This
study explores the application of Zero-Shot Learning (ZSL) to the maritime domain, leveraging the
CLIP model to classify maritime objects with minimal labeled data. A custom dataset comprising
1,438 images was used to evaluate the performance of various CLIP model variants. Our findings
indicate that CLIP models, particularly the "clip-vit-large-patch14-336" variant, achieve high
classification accuracy, with AUC values approaching 1.0 across most classes. However, challenges
remain in handling rare or ambiguous classes such as cargo ships, where F2 scores point to
inconsistent recall and precision. Additionally, the study highlights the potential limitations of
these models, including their dependency on dataset diversity and the risk of overfitting to specific
data characteristics. The "clip-vit-large-patch14-336" model is identified as the most balanced and
reliable option, offering a strong foundation for enhancing maritime situational awareness and
supporting diverse maritime applications.
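
As a rough illustration of the zero-shot scoring step and the F2 metric mentioned above, the following minimal sketch uses toy three-dimensional vectors in place of real CLIP embeddings. It is not the study's pipeline: the prompt texts (other than the cargo ship class), the embedding values, the function names, and the scale of 100.0 are all assumptions made for illustration. In practice the embeddings would come from a CLIP image encoder and text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probs(image_emb, text_embs, scale=100.0):
    """Softmax over scaled image-text cosine similarities.

    `scale` is an assumed temperature; CLIP learns its own value.
    """
    logits = [scale * cosine(image_emb, t) for t in text_embs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from raw counts; beta=2 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical class prompts and toy embeddings (not taken from the study).
labels = [
    "a photo of a cargo ship",
    "a photo of a sailboat",
    "a photo of a buoy",
]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_emb = [0.9, 0.1, 0.0]

probs = zero_shot_probs(image_emb, text_embs)
predicted = labels[probs.index(max(probs))]
print(predicted)

# Hypothetical confusion counts for one class.
print(f_beta(tp=8, fp=2, fn=2))
```

No labeled training images are involved in this scheme: adding a class only requires adding a text prompt, which is what makes the approach attractive when unforeseen object types appear.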

Published

2024-10-22