Linling Wang, Xiaoyan Xu, Shunmin An, Bing Han, Yi Guo
{"title":"CodeUNet:通过水下编码本先验实现自主潜水器真实视觉增强","authors":"Linling Wang , Xiaoyan Xu , Shunmin An , Bing Han , Yi Guo","doi":"10.1016/j.isprsjprs.2024.06.009","DOIUrl":null,"url":null,"abstract":"<div><p>The vision enhancement of autonomous underwater vehicle (AUV) has received increasing attention and rapid development in recent years. However, existing methods based on prior knowledge struggle to adapt to all scenarios, while learning-based approaches lack paired datasets from real-world scenes, limiting their enhancement capabilities. Consequently, this severely hampers their generalization and application in AUVs. Besides, the existing deep learning-based methods largely overlook the advantages of prior knowledge-based approaches. To address the aforementioned issues, a novel architecture called CodeUNet is proposed in this paper. Instead of relying on physical scattering models, a real-world scene vision enhancement network based on a codebook prior is considered. First, the VQGAN is pretrained on underwater datasets to obtain a discrete codebook, encapsulating the underwater priors (UPs). The decoder is equipped with a novel feature alignment module that effectively leverages underwater features to generate clean results. Then, the distance between the features and the matches is recalibrated by controllable matching operations, enabling better matching. Extensive experiments demonstrate that CodeUNet outperforms state-of-the-art methods in terms of visual quality and quantitative metrics. The testing results of geometric rotation, SIFT salient point detection, and edge detection applications are shown in this paper, providing strong evidence for the feasibility of CodeUNet in the field of autonomous underwater vehicles. Specifically, on the full reference dataset, the proposed method outperforms most of the 14 state-of-the-art methods in four evaluation metrics, with an improvement of up to 3.7722 compared to MLLE. On the no-reference dataset, the proposed method achieves excellent results, with an improvement of up to 0.0362 compared to MLLE. Links to the dataset and code for this project can be found at: <span>https://github.com/An-Shunmin/CodeUNet</span><svg><path></path></svg>.</p></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":10.6000,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CodeUNet: Autonomous underwater vehicle real visual enhancement via underwater codebook priors\",\"authors\":\"Linling Wang , Xiaoyan Xu , Shunmin An , Bing Han , Yi Guo\",\"doi\":\"10.1016/j.isprsjprs.2024.06.009\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The vision enhancement of autonomous underwater vehicle (AUV) has received increasing attention and rapid development in recent years. However, existing methods based on prior knowledge struggle to adapt to all scenarios, while learning-based approaches lack paired datasets from real-world scenes, limiting their enhancement capabilities. Consequently, this severely hampers their generalization and application in AUVs. Besides, the existing deep learning-based methods largely overlook the advantages of prior knowledge-based approaches. To address the aforementioned issues, a novel architecture called CodeUNet is proposed in this paper. 
Instead of relying on physical scattering models, a real-world scene vision enhancement network based on a codebook prior is considered. First, the VQGAN is pretrained on underwater datasets to obtain a discrete codebook, encapsulating the underwater priors (UPs). The decoder is equipped with a novel feature alignment module that effectively leverages underwater features to generate clean results. Then, the distance between the features and the matches is recalibrated by controllable matching operations, enabling better matching. Extensive experiments demonstrate that CodeUNet outperforms state-of-the-art methods in terms of visual quality and quantitative metrics. The testing results of geometric rotation, SIFT salient point detection, and edge detection applications are shown in this paper, providing strong evidence for the feasibility of CodeUNet in the field of autonomous underwater vehicles. Specifically, on the full reference dataset, the proposed method outperforms most of the 14 state-of-the-art methods in four evaluation metrics, with an improvement of up to 3.7722 compared to MLLE. On the no-reference dataset, the proposed method achieves excellent results, with an improvement of up to 0.0362 compared to MLLE. Links to the dataset and code for this project can be found at: <span>https://github.com/An-Shunmin/CodeUNet</span><svg><path></path></svg>.</p></div>\",\"PeriodicalId\":50269,\"journal\":{\"name\":\"ISPRS Journal of Photogrammetry and Remote Sensing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":10.6000,\"publicationDate\":\"2024-07-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ISPRS Journal of Photogrammetry and Remote Sensing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0924271624002478\",\"RegionNum\":1,\"RegionCategory\":\"地球科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"GEOGRAPHY, PHYSICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ISPRS Journal of Photogrammetry and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0924271624002478","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GEOGRAPHY, PHYSICAL","Score":null,"Total":0}
CodeUNet: Autonomous underwater vehicle real visual enhancement via underwater codebook priors
Vision enhancement for autonomous underwater vehicles (AUVs) has attracted increasing attention and developed rapidly in recent years. However, existing methods based on prior knowledge struggle to adapt to all scenarios, while learning-based approaches lack paired datasets from real-world scenes, which limits their enhancement capability and severely hampers their generalization and application in AUVs. Moreover, existing deep learning-based methods largely overlook the advantages of prior knowledge-based approaches. To address these issues, this paper proposes a novel architecture called CodeUNet. Instead of relying on physical scattering models, CodeUNet is a real-world scene vision enhancement network built on a codebook prior. First, a VQGAN is pretrained on underwater datasets to obtain a discrete codebook that encapsulates the underwater priors (UPs). The decoder is equipped with a novel feature alignment module that effectively leverages underwater features to generate clean results. Then, the distance between features and their codebook matches is recalibrated by controllable matching operations, enabling better matching. Extensive experiments demonstrate that CodeUNet outperforms state-of-the-art methods in both visual quality and quantitative metrics. Test results for geometric rotation, SIFT salient point detection, and edge detection applications are reported, providing strong evidence for the feasibility of CodeUNet in the field of autonomous underwater vehicles. Specifically, on the full-reference dataset, the proposed method outperforms most of the 14 state-of-the-art methods on four evaluation metrics, with an improvement of up to 3.7722 over MLLE. On the no-reference dataset, the proposed method also achieves excellent results, with an improvement of up to 0.0362 over MLLE. Links to the dataset and code for this project can be found at: https://github.com/An-Shunmin/CodeUNet.
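To make the codebook-prior pipeline concrete, below is a minimal PyTorch sketch of matching encoder features against a discrete codebook of underwater priors, with a temperature that recalibrates the feature-to-code distances. This is one plausible reading of the abstract's "controllable matching operations"; the CodebookMatcher class, the temperature parameter, and the soft lookup are illustrative assumptions, not the authors' actual implementation.

# A minimal sketch, assuming a codebook already pretrained (e.g. by a VQGAN).
import torch
import torch.nn.functional as F

class CodebookMatcher(torch.nn.Module):
    def __init__(self, num_codes: int = 1024, code_dim: int = 256):
        super().__init__()
        # Discrete codebook of underwater priors (UPs); randomly
        # initialized here purely for illustration.
        self.codebook = torch.nn.Parameter(torch.randn(num_codes, code_dim))

    def forward(self, feats: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
        # feats: (B, N, C) encoder features with spatial dims flattened to N.
        # Euclidean distance from every feature to every code: (B, N, num_codes).
        dist = torch.cdist(feats, self.codebook.unsqueeze(0))
        # Recalibrate distances with a temperature: a small value yields a
        # near-hard nearest-neighbour match, a large one blends several codes.
        weights = F.softmax(-dist / temperature, dim=-1)
        # Soft lookup of matched codebook entries: (B, N, C).
        return weights @ self.codebook

# Usage: match a 64x64, 256-channel feature map against the codebook prior.
matcher = CodebookMatcher()
feats = torch.randn(2, 64 * 64, 256)
matched = matcher(feats, temperature=0.5)
print(matched.shape)  # torch.Size([2, 4096, 256])

The temperature is what makes the matching "controllable" in this sketch: as it approaches zero the soft lookup converges to standard vector-quantization nearest-neighbour assignment, while larger values let degraded features draw on a weighted mixture of prior codes.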
Journal introduction:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) is the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It provides a platform for scientists and professionals worldwide working in photogrammetry, remote sensing, spatial information systems, computer vision, and related disciplines, facilitating the communication and dissemination of advances in these fields while serving as a comprehensive source of reference and archive.
P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields.
In particular, P&RS encourages submissions of broad scientific interest that showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, address topics that have received limited attention in P&RS or related journals, or explore new scientific or professional directions. Theoretical papers should preferably include practical applications, while papers focusing on systems and applications should include a theoretical background.