Maritime anomaly detection is crucial for ensuring navigational safety and marine security. However, global navigational safety is significantly challenged by the limited understanding of abnormal ship behaviors. Drawing inspiration from the advanced reasoning capabilities of large language models, we introduce VLMAR, a novel vision-language framework that combines retrieval-augmented knowledge grounding with chain-of-thought reasoning to address these challenges. Our approach consists of two key innovations: (1) the VLMAR dataset, a large-scale multimodal repository containing 80,000 automatic identification system (AIS) records, 11,500 synthetic aperture radar (SAR) images, 5,750 AIS text reports, and 27,000 behavioral narratives; and (2) the VLMAR model architecture, which links real-time sensor data with maritime knowledge through dynamic retrieval and uses chain-of-thought fusion to interpret complex behaviors. Experimental results show that VLMAR achieves 94.77% Rank-1 accuracy in AIS retrieval and 89.10% accuracy in anomaly detection, significantly outperforming existing vision-language models (VLMs). Beyond performance, VLMAR reveals that aligning spatiotemporal AIS data with SAR imagery enables interpretable detection of hidden anomalies such as AIS spoofing and unauthorized route deviations, offering reliable explanations for safety-critical maritime decisions. This research establishes a new benchmark for maritime artificial intelligence systems, demonstrating how hybrid retrieval-generation paradigms can enhance situational awareness and support human-aligned decision-making.
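The retrieve-then-reason flow described above can be sketched minimally as follows. This is a hypothetical illustration, not the paper's implementation: the embeddings, knowledge-base entries, and function names (`retrieve`, `build_cot_prompt`) are all invented for exposition, and the real system would use learned multimodal encoders and a VLM backbone.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, kb):
    """Dynamic retrieval step: return the knowledge entry closest to the query."""
    return max(kb, key=lambda entry: cosine(query_vec, entry["vec"]))

def build_cot_prompt(ais_report, knowledge):
    """Ground the AIS report in retrieved knowledge, then elicit stepwise reasoning."""
    return (
        f"AIS report: {ais_report}\n"
        f"Retrieved maritime knowledge: {knowledge}\n"
        "Reason step by step: is the vessel's behavior anomalous?"
    )

# Toy knowledge base with 2-D placeholder embeddings (illustrative only).
kb = [
    {"vec": [1.0, 0.0], "text": "Vessels disabling AIS near port approaches may be spoofing."},
    {"vec": [0.0, 1.0], "text": "Fishing vessels commonly loiter inside designated zones."},
]

hit = retrieve([0.9, 0.1], kb)
prompt = build_cot_prompt("6-hour AIS track gap near a port approach", hit["text"])
```

In a full system the prompt would be passed, together with the aligned SAR image, to the vision-language model for chain-of-thought fusion; here it simply demonstrates how retrieved knowledge conditions the reasoning request.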
