Real-time vessel detection in maritime environments is crucial for diverse applications requiring speed and accuracy. Static camera views often introduce blind spots, compromising detection efficiency. This paper proposes a novel, real-time UAV-based system that uses a dynamic camera control strategy to address this limitation. This strategy leverages pre-defined search patterns, historical data (if available), and real-time sensor information (e.g., radar or LiDAR) to dynamically adjust the UAV's camera gimbal angles. This ensures comprehensive search area coverage while minimizing the risk of undetected vessels. Beyond dynamic camera control, our system incorporates a unique feature-based prioritization scheme for real-time target vessel identification. This scheme analyzes features extracted from captured images, including object size and shape. Additionally, movement analysis helps distinguish stationary objects from potential vessels. The combined approach of dynamic camera control and feature-based prioritization offers significant advantages. Firstly, it enhances search efficiency by systematically scanning the area and prioritizing promising candidates based on dynamic camera adjustments and feature analysis. Secondly, it improves detection accuracy by employing feature similarity (cosine similarity with a reference vessel stored in the system using a ResNet50 module) to reduce false positives and expedite target identification, especially in scenarios with multiple vessels. A comprehensive evaluation process has been conducted to validate the effectiveness of our proposed system in diverse simulated and real-world environments encompassing various conditions (weather, traffic density, background clutter). The results from this evaluation are highly promising and suggest the system's strong potential for real-time vessel detection in maritime environments.
Lyes Saad Saoud, Zikai Jia, Siyuan Yang, Muhayy Ud Din, Lakmal Seneviratne, Shaoming He, Irfan Hussain. "Prioritized Real-Time UAV-Based Vessel Detection for Efficient Maritime Search." Journal of Field Robotics 43(2): 561-577, 2025. DOI: 10.1002/rob.70048
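The abstract above describes reducing false positives by scoring each detected candidate's deep features (extracted with a ResNet50 module) against a stored reference vessel via cosine similarity. The sketch below illustrates that prioritization step in isolation; the `prioritize_candidates` helper, the 0.8 threshold, and the plain-list feature vectors (stand-ins for ResNet50 embeddings) are illustrative assumptions, not the authors' implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def prioritize_candidates(reference, candidates, threshold=0.8):
    """Rank candidate feature vectors by cosine similarity to the
    reference vessel's features, discarding low-similarity candidates
    (likely false positives).  Returns (id, score) pairs, best first."""
    scored = [(cid, cosine_similarity(reference, feat))
              for cid, feat in candidates.items()]
    kept = [(cid, s) for cid, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

In a real pipeline the vectors would be 2048-dimensional ResNet50 embeddings of image crops, but the ranking logic is the same: the most reference-like candidate is examined first, which expedites target identification when multiple vessels are in view.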
Yingxiu Chang, Yongqiang Cheng, John Murray, Muhammad Khalid, Umar Manzoor
Vision-based deep learning models have been widely adopted in autonomous agents such as unmanned aerial vehicles (UAVs), particularly in reactive control policies that serve as a key component of navigation systems. These policies enable agents to respond instantaneously to dynamic environments without relying on pre-existing maps. However, open challenges remain in improving an agent's reactive control performance: (1) Can future states be anticipated at the current moment to benefit control precision, and if so, how? (2) Can future states be anticipated for different sub-tasks when the agent's control consists of both discrete classification and continuous regression commands, and if so, how? Inspired by the Chinese idiom "Mirror Flower, Water Moon," this paper hypothesizes that future states in the latent space can be learned from sequential images using contrastive learning, and consequently proposes a lightweight Multi-task Visual Prospective Representation Learning (MulVPRL) framework to benefit reactive control. Specifically, (1) the paper leverages contrastive learning to correlate the representations obtained from the latest sequential images with that of a future image, and (2) it constructs an integrated contrastive-learning loss function covering both the classification and regression sub-tasks. The MulVPRL framework outperforms the benchmark models on the public HDIN and DroNet datasets and achieved the best performance in real-world experiments.
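The abstract describes using contrastive learning to pull the representation of the latest image sequence toward that of a future frame while pushing it away from unrelated frames. A common way to realize such an objective is an InfoNCE-style loss; the minimal sketch below shows that general formulation with cosine-similarity scores and a hypothetical temperature of 0.1, and is not a reconstruction of the MulVPRL loss itself.

```python
import math

def info_nce_loss(query, keys, positive_idx, temperature=0.1):
    """InfoNCE-style contrastive loss: the query (representation of the
    latest image sequence) should score highest against the positive key
    (the future frame's representation) among all keys.  Returns the
    negative log-softmax of the positive's similarity score."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(y * y for y in b)))

    logits = [cos(query, k) / temperature for k in keys]
    # Numerically stable log-sum-exp over all candidate keys.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[positive_idx]
```

For the multi-task setting the paper describes, one such contrastive term per sub-task (classification and regression) could be summed with the usual task losses into a single training objective; the weighting between terms would be a design choice of the framework.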