Orhlr-net: one-stage residual learning network for joint single-image specular highlight detection and removal

Wenzhe Shi, Ziqi Hu, Hao Chen, Hengjia Zhang, Jiale Yang, Li Li

DOI: 10.1007/s00371-024-03607-9
Journal: The Visual Computer
Publication date: 2024-08-24 (Journal Article)
Detecting and removing specular highlights is a complex task that can greatly enhance various visual tasks in real-world environments. Although previous works have made great progress, they often ignore specular highlight areas or produce unsatisfactory results with visual artifacts such as color distortion. In this paper, we present a framework that utilizes an encoder–decoder structure for the combined task of specular highlight detection and removal in single images, employing specular highlight mask guidance. The encoder uses EfficientNet as a feature extraction backbone network to convert the input RGB image into a series of feature maps. The decoder gradually restores these feature maps to their original size through up-sampling. In the specular highlight detection module, we enhance the network by utilizing residual modules to extract additional feature information, thereby improving detection accuracy. For the specular highlight removal module, we introduce the Convolutional Block Attention Module, which dynamically captures the importance of each channel and spatial location in the input feature map. This enables the model to effectively distinguish between foreground and background, resulting in enhanced adaptability and accuracy in complex scenes. We evaluate the proposed method on the publicly available SHIQ dataset, and its superiority is demonstrated through a comparative analysis of the experimental results. The source code will be available at https://github.com/hzq2333/ORHLR-Net.
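The abstract describes a joint pipeline: an encoder-decoder extracts features, a detection head predicts a specular-highlight mask, and a mask-guided removal branch refined with the Convolutional Block Attention Module (CBAM) reconstructs the highlight-free image. The sketch below illustrates that overall structure in PyTorch. It is a minimal, hypothetical rendering of the idea, not the authors' implementation: the tiny convolutional encoder stands in for the EfficientNet backbone, and all module names, channel counts, and head designs are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention, then spatial attention."""

    def __init__(self, channels, reduction=4, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention from average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention from channel-wise average and max maps.
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))


class JointHighlightNet(nn.Module):
    """Illustrative one-stage detect-and-remove network (shapes and heads are assumptions)."""

    def __init__(self, feat=16):
        super().__init__()
        # Stand-in encoder; the paper uses an EfficientNet feature-extraction backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Decoder restores the feature maps to the input resolution via up-sampling.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.detect = nn.Conv2d(feat, 1, 1)      # highlight-mask head
        self.cbam = CBAM(feat + 1)               # attention over mask-guided features
        self.remove = nn.Conv2d(feat + 1, 3, 1)  # highlight-free RGB head

    def forward(self, x):
        f = self.decoder(self.encoder(x))
        mask = torch.sigmoid(self.detect(f))
        # Mask guidance: the removal branch attends over features plus the predicted mask.
        g = self.cbam(torch.cat([f, mask], dim=1))
        return mask, self.remove(g)


img = torch.rand(1, 3, 64, 64)
mask, clean = JointHighlightNet()(img)
```

Running the forward pass on a 64x64 RGB image yields a 1-channel mask and a 3-channel restored image at the same resolution, mirroring the joint detection-and-removal outputs the abstract describes.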