UPT-Flow: Multi-scale transformer-guided normalizing flow for low-light image enhancement

IF 7.5 | Zone 1, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pattern Recognition | Pub Date: 2024-10-11 | DOI: 10.1016/j.patcog.2024.111076

Lintao Xu, Changhui Hu, Yin Hu, Xiaoyuan Jing, Ziyun Cai, Xiaobo Lu

Pattern Recognition, Volume 158, Article 111076. Full text: https://www.sciencedirect.com/science/article/pii/S0031320324008276

Citations: 0

Abstract

Low-light images often suffer from information loss and RGB value degradation due to extremely low or nonuniform lighting conditions. Many existing methods primarily focus on optimizing the appearance distance between the enhanced image and the normal-light image, while neglecting the explicit modeling of information loss regions or incorrect information points in low-light images. To address this, this paper proposes an Unbalanced Points-guided multi-scale Transformer-based conditional normalizing Flow (UPT-Flow) for low-light image enhancement. We design an unbalanced point map prior based on the differences in the proportion of RGB values for each pixel in the image, which is used to modify traditional self-attention and mitigate the negative effects of areas with information distortion in the attention calculation. The Multi-Scale Transformer (MSFormer) is composed of several global-local transformer blocks, which encode rich global contextual information and local fine-grained details for conditional normalizing flow. In the invertible network of flow, we design cross-coupling conditional affine layers based on channel and spatial attention, enhancing the expressive power of a single flow step. Without bells and whistles, extensive experiments on low-light image enhancement, night traffic monitoring enhancement, low-light object detection, and nighttime image segmentation have demonstrated that our proposed method achieves state-of-the-art performance across a variety of real-world scenes. The code and pre-trained models will be available at https://github.com/NJUPT-IPR-XuLintao/UPT-Flow.
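The abstract mentions conditional affine layers inside the invertible flow network. As a rough illustration of what a conditional affine coupling step does in general, here is a minimal NumPy sketch. It is not the authors' cross-coupling layer: the class name `ConditionalAffineCoupling`, the linear conditioner, and the tanh-bounded log-scale are all assumptions made for this example, and the channel and spatial attention described in the abstract are omitted. It only shows the generic idea that one half of the input is rescaled and shifted by quantities computed from the other half plus a conditioning vector, which keeps the step exactly invertible.

```python
import numpy as np


class ConditionalAffineCoupling:
    """Toy conditional affine coupling step (illustrative only, not the UPT-Flow layer)."""

    def __init__(self, dim, cond_dim, seed=0):
        assert dim % 2 == 0, "dim must be even so the input splits into two halves"
        rng = np.random.default_rng(seed)
        self.half = dim // 2
        # Hypothetical linear conditioner: maps [x_a, cond] to a log-scale and a shift.
        self.W = rng.normal(scale=0.1, size=(self.half + cond_dim, 2 * self.half))
        self.b = np.zeros(2 * self.half)

    def _scale_shift(self, x_a, cond):
        h = np.concatenate([x_a, cond], axis=-1) @ self.W + self.b
        raw_log_s, t = h[..., :self.half], h[..., self.half:]
        # tanh keeps the log-scale bounded (an assumption chosen for numerical stability).
        return np.tanh(raw_log_s), t

    def forward(self, x, cond):
        x_a, x_b = x[..., :self.half], x[..., self.half:]
        log_s, t = self._scale_shift(x_a, cond)
        y_b = x_b * np.exp(log_s) + t      # affine transform of the second half
        log_det = log_s.sum(axis=-1)       # log |det Jacobian| of this step
        return np.concatenate([x_a, y_b], axis=-1), log_det

    def inverse(self, y, cond):
        y_a, y_b = y[..., :self.half], y[..., self.half:]
        # y_a equals x_a, so the same scale and shift can be recomputed exactly.
        log_s, t = self._scale_shift(y_a, cond)
        x_b = (y_b - t) * np.exp(-log_s)
        return np.concatenate([y_a, x_b], axis=-1)


# Usage: a batch of 4 feature vectors with a 3-dimensional conditioning vector each.
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 6))
cond = rng.normal(size=(4, 3))
layer = ConditionalAffineCoupling(dim=6, cond_dim=3)
y, log_det = layer.forward(x, cond)
x_rec = layer.inverse(y, cond)
```

In this sketch only the second half of the input is transformed per step; practical flows alternate or permute the halves across steps so that every dimension is eventually updated.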

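Source journal

Pattern Recognition (Engineering and Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months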

About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.