TAFormer: A Transmission-Aware Transformer for Underwater Image Enhancement

Authors: Yuanyuan Li; Zetian Mi; Yulin Wang; Shuaiyong Jiang; Xianping Fu
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 1, pp. 601-616
DOI: 10.1109/TCSVT.2024.3455353
Published: 2024-09-06
URL: https://ieeexplore.ieee.org/document/10669071/
Citations: 0
Abstract
The attenuation and scattering of light underwater are wavelength- and distance-dependent, causing diverse degradation in underwater images. When enhancing underwater images, many deep learning-based methods rely solely on convolutional neural networks to learn a mapping from degraded images to clear images. However, such methods struggle to capture long-range dependencies and therefore cannot accurately model the global information of an image. Although Transformers can address this problem, they lack inductive bias during training because training datasets exhibiting particular degradation phenomena are limited. To address this issue, a novel Swin Transformer based on physical perception is proposed for the first time: the Swin Transformer resolves the long- and short-range dependency problem, and the underwater image degradation process is incorporated into the network design to compensate for the weak inductive bias. Combining the advantages of physical imaging, convolutional neural networks, and Transformers effectively improves the visual quality of underwater images. Extensive qualitative and quantitative experimental results show that our Transformer achieves competitive performance on five benchmark datasets.
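The wavelength- and distance-dependent degradation the abstract describes is commonly written as the simplified underwater image formation model, I_c = J_c · t_c + B_c · (1 − t_c), where t_c = exp(−β_c · d) is the per-channel transmission. The sketch below is not from the paper; it is a minimal illustration of that standard model, with the attenuation coefficients `beta` and background light `B` chosen as hypothetical values (red light attenuates fastest in water, and the veiling light is blue-green).

```python
import numpy as np

def degrade_underwater(J, depth,
                       beta=(0.40, 0.10, 0.07),   # per-channel attenuation (R, G, B); red largest
                       B=(0.10, 0.50, 0.60)):     # per-channel background (veiling) light
    """Simulate wavelength- and distance-dependent underwater degradation.

    J     : clean image, float array (H, W, 3) in [0, 1], channels ordered R, G, B
    depth : per-pixel scene distance in metres, array (H, W)

    Returns the degraded image I = J * t + B * (1 - t),
    where t = exp(-beta * depth) is the transmission map.
    """
    t = np.exp(-np.asarray(beta) * depth[..., None])     # transmission, shape (H, W, 3)
    return J * t + np.asarray(B) * (1.0 - t)             # direct signal + backscatter
```

A "transmission-aware" network in this spirit would estimate t (or quantities derived from it) and use that physical prior to guide enhancement, rather than learning a purely unconstrained image-to-image mapping.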
Journal Introduction:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.