Title: Spatiotemporal Feature Enhancement Network for Blur Robust Underwater Object Detection
Authors: Hao Zhou; Lu Qi; Hai Huang; Xu Yang; Jing Yang
DOI: 10.1109/TCDS.2024.3386664
Journal: IEEE Transactions on Cognitive and Developmental Systems, vol. 16, no. 5, pp. 1814-1828 (JCR Q1, Computer Science, Artificial Intelligence; Impact Factor 5.0)
Publication date: 2024-04-12 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10498101/
Citations: 0
Abstract
Underwater object detection is challenged by the presence of image blur induced by light absorption and scattering, resulting in substantial performance degradation. It is hypothesized that the attenuation of light is directly correlated with the camera-to-object distance, manifesting as variable degrees of image blur across different regions within underwater images. Specifically, regions in close proximity to the camera exhibit less pronounced blur compared to distant regions. Within the same object category, objects situated in clear regions share similar feature embeddings with their counterparts in blurred regions. This observation underscores the potential for leveraging objects in clear regions to aid in the detection of objects within blurred areas, a critical requirement for autonomous agents, such as autonomous underwater vehicles, engaged in continuous underwater object detection. Motivated by this insight, we introduce the spatiotemporal feature enhancement network (STFEN), a novel framework engineered to autonomously extract discriminative features from objects in clear regions. These features are then harnessed to enhance the representations of objects in blurred regions, operating across both spatial and temporal dimensions. Notably, the proposed STFEN seamlessly integrates into two-stage detectors, such as the faster region-based convolutional neural networks (Faster R-CNN) and feature pyramid networks (FPN). Extensive experimentation conducted on two benchmark underwater datasets, URPC 2018 and URPC 2019, conclusively demonstrates the efficacy of the STFEN framework. It delivers substantial enhancements in performance relative to baseline methods, yielding improvements in the mAP evaluation metric ranging from 3.7% to 5.0%.
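The abstract does not specify how STFEN transfers information from clear to blurred regions, only that objects of the same category share similar feature embeddings across the two kinds of regions. As a rough illustration of that core idea, the sketch below uses single-head dot-product cross-attention in NumPy: each blurred-region feature attends over clear-region features and receives a residual update. The function name, the attention formulation, and all dimensions are assumptions for illustration, not the authors' actual module.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def enhance_blurred_features(blurred, clear):
    """Illustrative (not the paper's) enhancement step:
    blurred: (m, d) embeddings of objects in blurred regions
    clear:   (n, d) embeddings of objects in clear regions
    returns: (m, d) enhanced blurred-region embeddings
    """
    d = blurred.shape[1]
    # Scaled dot-product similarity between blurred queries and clear keys.
    attn = softmax(blurred @ clear.T / np.sqrt(d), axis=1)  # (m, n)
    # Aggregate clear-region features and add them residually,
    # so similar-category clear objects reinforce blurred ones.
    return blurred + attn @ clear

rng = np.random.default_rng(0)
blurred = rng.normal(size=(4, 16))  # 4 hypothetical blurred-region proposals
clear = rng.normal(size=(6, 16))    # 6 hypothetical clear-region proposals
enhanced = enhance_blurred_features(blurred, clear)
print(enhanced.shape)
```

In a two-stage detector such as Faster R-CNN with FPN, an enhancement like this would plausibly operate on RoI features before the classification and regression heads; the paper additionally applies it across the temporal dimension, which this spatial-only sketch omits.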
Journal Introduction
The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.