Nearshore optical video object detector based on temporal branch and spatial feature enhancement
Journal: Engineering Applications of Artificial Intelligence (impact factor 7.5; JCR Q1, Automation & Control Systems)
Publication date: 2024-10-01
DOI: 10.1016/j.engappai.2024.109387
URL: https://www.sciencedirect.com/science/article/pii/S0952197624015458
Citation count: 0
Abstract
The computing power of nearshore and ship-borne devices is limited, which makes accurate real-time object detection on such devices challenging. We propose a nearshore video object detector (NVID) to address this. Because nearshore scenes contain many moving objects, we developed You Can Look More (YCLM) to capture the temporal characteristics of these objects. To improve the network's ability to detect objects of different sizes, we designed parallel deformable attention (PDA) based on the spatial features of objects. We also developed fast re-parameterization convolution (FREConv) and faster conv (FConv) and, building on them, propose a fast re-parameterization network (FRENet) tailored to produce low-parameter, multi-scale feature outputs. Trained end to end, our pipeline outperforms other state-of-the-art (SOTA) methods on the nearshore objects (NearshoreObjects) dataset: an AP50 (average precision) of 90.4 (+4.7), 9.3M parameters (Params) (−1.0M), and 24.8 frames per second (FPS) on a Jetson Nano (+0.6). NVID also performs well on the on-board (OnBoard) dataset: an AP50 of 90.3 (+2.8), 9.3M Params (−1.0M), and 26.5 FPS on a Jetson Nano (+0.8). The source code is available at https://github.com/Yuanlin-Zhao/NVID.
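As a reading aid for the figures reported above, the short sketch below works out two quantities the abstract leaves implicit: the reference values implied by each signed delta, and the average per-frame time budget at the reported frame rates. It assumes that each delta in parentheses is measured against the strongest compared method and that the parameter count is in millions; the abstract states neither explicitly. The names `REPORTED`, `implied_reference`, and `frame_budget_ms` are illustrative and do not come from the NVID repository.

```python
# Figures copied from the abstract as (reported value, signed delta).
# Assumption: parameter counts are in millions, as the "-1.0M" delta suggests.
REPORTED = {
    "NearshoreObjects": {"ap50": (90.4, 4.7), "params_m": (9.3, -1.0), "fps": (24.8, 0.6)},
    "OnBoard": {"ap50": (90.3, 2.8), "params_m": (9.3, -1.0), "fps": (26.5, 0.8)},
}


def implied_reference(value, delta):
    """Reference figure implied by a reported value and its signed delta.

    Assumption: the delta is relative to the best competing method.
    """
    return round(value - delta, 1)


def frame_budget_ms(fps):
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


for dataset, metrics in REPORTED.items():
    refs = {name: implied_reference(v, d) for name, (v, d) in metrics.items()}
    budget = frame_budget_ms(metrics["fps"][0])
    print(f"{dataset}: implied reference {refs}, frame budget {budget:.1f} ms")
```

Read this way, the reported frame rates on the Jetson Nano correspond to a budget of roughly 40 ms per frame, which covers the whole detector: the temporal branch, the attention module, and the backbone together.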
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, and machine learning methodologies have advanced remarkably. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research on the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI used in real-world engineering applications, validated on publicly available datasets to ensure that results can be replicated. Join us in exploring the transformative potential of AI in engineering.