USOD10K: A New Benchmark Dataset for Underwater Salient Object Detection

IEEE Transactions on Image Processing (IF 10.8; Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence). Pub Date: 2023-04-14. DOI: 10.1109/TIP.2023.3266163

Lin Hong, Xin Wang, Gan Zhang, Ming Zhao


Citations: 0

Abstract


Underwater salient object detection (USOD) is attracting increasing interest for its promising performance in various underwater visual tasks. However, USOD research is still in its early stages because it lacks large-scale datasets in which salient objects are well defined and annotated pixel-wise. To address this issue, this paper introduces a new dataset named USOD10K. It consists of 10,255 underwater images covering 70 categories of salient objects in 12 different underwater scenes. In addition, salient object boundaries and depth maps are provided for all images. USOD10K is the first large-scale dataset in the USOD community, making a significant leap in diversity, complexity, and scalability. Secondly, a simple but strong baseline termed TC-USOD is designed for USOD10K. TC-USOD adopts a hybrid encoder-decoder architecture that uses transformer and convolution as the basic computational building blocks of the encoder and decoder, respectively. Thirdly, we summarize 35 cutting-edge SOD/USOD methods and benchmark them on the existing USOD dataset and on USOD10K. The results show that TC-USOD obtained superior performance on all datasets tested. Finally, several other use cases of USOD10K are discussed, and future directions for USOD research are pointed out. This work will promote the development of USOD research and facilitate further research on underwater visual tasks and visually guided underwater robots. To pave the way for research in this field, the dataset, code, and benchmark results are all publicly available: https://github.com/LinHong-HIT/USOD10K.
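The abstract does not say which metrics the benchmark reports. As a rough illustration of how a predicted saliency map can be scored against a pixel-wise annotated mask, the sketch below computes two measures that are common in salient object detection work in general: mean absolute error (MAE) and a thresholded F-measure. The function names, the fixed threshold of 0.5, and the weight `beta_sq = 0.3` are assumptions made for this example, not details taken from the paper or its repository.

```python
from typing import Sequence

# A saliency map or mask is a 2-D grid of values in [0, 1] (assumed layout).
Map = Sequence[Sequence[float]]


def mean_absolute_error(pred: Map, mask: Map) -> float:
    """Average per-pixel absolute difference between prediction and mask."""
    total = 0.0
    count = 0
    for pred_row, mask_row in zip(pred, mask):
        for p, g in zip(pred_row, mask_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


def f_measure(pred: Map, mask: Map,
              threshold: float = 0.5, beta_sq: float = 0.3) -> float:
    """F-beta score after binarizing the prediction at `threshold`."""
    tp = fp = fn = 0
    for pred_row, mask_row in zip(pred, mask):
        for p, g in zip(pred_row, mask_row):
            predicted = p >= threshold   # pixel predicted as salient
            actual = g >= 0.5            # pixel annotated as salient
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    if tp == 0:
        return 0.0  # no correct salient pixel: precision and recall are zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 2x2 example: one salient pixel in the top-left corner (made-up data).
example_pred = [[0.9, 0.2], [0.6, 0.1]]
example_mask = [[1.0, 0.0], [0.0, 0.0]]
print("MAE:", mean_absolute_error(example_pred, example_mask))
print("F-measure:", f_measure(example_pred, example_mask))
```

Lower MAE and higher F-measure indicate closer agreement with the annotation. A real evaluation would average such scores over every image in a test split.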


Source journal

IEEE Transactions on Image Processing (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 20.90

Self-citation rate: 6.60%

Articles published: 774

Review time: 7.6 months

About the journal: The IEEE Transactions on Image Processing delves into groundbreaking theories, algorithms, and structures concerning the generation, acquisition, manipulation, transmission, scrutiny, and presentation of images, video, and multidimensional signals across diverse applications. Topics span mathematical, statistical, and perceptual aspects, encompassing modeling, representation, formation, coding, filtering, enhancement, restoration, rendering, halftoning, search, and analysis of images, video, and multidimensional signals. Pertinent applications range from image and video communications to electronic imaging, biomedical imaging, image and video systems, and remote sensing.
