USOD10K: A New Benchmark Dataset for Underwater Salient Object Detection
Authors: Lin Hong, Xin Wang, Gan Zhang, Ming Zhao
Journal: IEEE Transactions on Image Processing, vol. PP (impact factor 10.8; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2023-04-14
Publication type: Journal Article
DOI: 10.1109/TIP.2023.3266163 (https://doi.org/10.1109/TIP.2023.3266163)
Citations: 0
Abstract
Underwater salient object detection (USOD) is attracting increasing interest for its promising performance in various underwater visual tasks. However, USOD research is still in its early stages due to the lack of large-scale datasets in which salient objects are well defined and annotated pixel-wise. To address this issue, this paper first introduces a new dataset named USOD10K. It consists of 10,255 underwater images covering 70 categories of salient objects in 12 different underwater scenes. In addition, salient object boundaries and depth maps are provided for all images in the dataset. USOD10K is the first large-scale dataset in the USOD community, making a significant leap in diversity, complexity, and scalability. Second, a simple but strong baseline termed TC-USOD is designed for USOD10K. TC-USOD adopts a hybrid encoder-decoder architecture that uses transformers as the basic computational building block of the encoder and convolutions as that of the decoder. Third, we summarize 35 cutting-edge SOD/USOD methods and benchmark them on the existing USOD dataset and on USOD10K. The results show that our TC-USOD obtained superior performance on all datasets tested. Finally, several other use cases of USOD10K are discussed, and future directions for USOD research are pointed out. This work will promote the development of USOD research and facilitate further research on underwater visual tasks and visually guided underwater robots. To pave the way for this research field, the dataset, code, and benchmark results are publicly available: https://github.com/LinHong-HIT/USOD10K.
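The abstract describes benchmarking methods against pixel-wise annotated images but does not name the metrics used. As a rough illustration only, the sketch below computes mean absolute error (MAE), a metric commonly used in salient object detection, between a predicted saliency map and a ground-truth mask. The function name and the toy 2x2 maps are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence

Map2D = Sequence[Sequence[float]]


def mean_absolute_error(pred: Map2D, mask: Map2D) -> float:
    """Average per-pixel |pred - mask| for two same-sized 2-D maps in [0, 1]."""
    # Both maps must have the same number of rows and the same row lengths.
    if len(pred) != len(mask) or any(len(p) != len(m) for p, m in zip(pred, mask)):
        raise ValueError("prediction and mask must have the same shape")

    total = 0.0
    count = 0
    for pred_row, mask_row in zip(pred, mask):
        for p, m in zip(pred_row, mask_row):
            total += abs(p - m)
            count += 1

    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


# Toy example with made-up values (assumption: not real USOD10K data).
prediction = [[0.9, 0.1], [0.8, 0.0]]
ground_truth = [[1.0, 0.0], [1.0, 0.0]]
print(mean_absolute_error(prediction, ground_truth))
```

A lower value means the predicted map is closer to the annotation. In practice the score would be averaged over every image in a test split.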
About the journal:
The IEEE Transactions on Image Processing delves into groundbreaking theories, algorithms, and structures concerning the generation, acquisition, manipulation, transmission, scrutiny, and presentation of images, video, and multidimensional signals across diverse applications. Topics span mathematical, statistical, and perceptual aspects, encompassing modeling, representation, formation, coding, filtering, enhancement, restoration, rendering, halftoning, search, and analysis of images, video, and multidimensional signals. Pertinent applications range from image and video communications to electronic imaging, biomedical imaging, image and video systems, and remote sensing.