黑暗中的文字:极低照度下的文字图像增强

IF 3.4 3区 工程技术 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Signal Processing-Image Communication Pub Date : 2024-10-28 DOI:10.1016/j.image.2024.117222
Che-Tsung Lin , Chun Chet Ng , Zhi Qin Tan , Wan Jun Nah , Xinyu Wang , Jie Long Kew , Pohao Hsu , Shang Hong Lai , Chee Seng Chan , Christopher Zach
{"title":"黑暗中的文字:极低照度下的文字图像增强","authors":"Che-Tsung Lin ,&nbsp;Chun Chet Ng ,&nbsp;Zhi Qin Tan ,&nbsp;Wan Jun Nah ,&nbsp;Xinyu Wang ,&nbsp;Jie Long Kew ,&nbsp;Pohao Hsu ,&nbsp;Shang Hong Lai ,&nbsp;Chee Seng Chan ,&nbsp;Christopher Zach","doi":"10.1016/j.image.2024.117222","DOIUrl":null,"url":null,"abstract":"<div><div>Extremely low-light text images pose significant challenges for scene text detection. Existing methods enhance these images using low-light image enhancement techniques before text detection. However, they fail to address the importance of low-level features, which are essential for optimal performance in downstream scene text tasks. Further research is also limited by the scarcity of extremely low-light text datasets. To address these limitations, we propose a novel, text-aware extremely low-light image enhancement framework. Our approach first integrates a Text-Aware Copy-Paste (Text-CP) augmentation method as a preprocessing step, followed by a dual-encoder–decoder architecture enhanced with Edge-Aware attention modules. We also introduce text detection and edge reconstruction losses to train the model to generate images with higher text visibility. Additionally, we propose a Supervised Deep Curve Estimation (Supervised-DCE) model for synthesizing extremely low-light images, allowing training on publicly available scene text datasets such as IC15. To further advance this domain, we annotated texts in the extremely low-light See In the Dark (SID) and ordinary LOw-Light (LOL) datasets. The proposed framework is rigorously tested against various traditional and deep learning-based methods on the newly labeled SID-Sony-Text, SID-Fuji-Text, LOL-Text, and synthetic extremely low-light IC15 datasets. Our extensive experiments demonstrate notable improvements in both image enhancement and scene text tasks, showcasing the model’s efficacy in text detection under extremely low-light conditions. Code and datasets will be released publicly at <span><span>https://github.com/chunchet-ng/Text-in-the-Dark</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"130 ","pages":"Article 117222"},"PeriodicalIF":3.4000,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Text in the dark: Extremely low-light text image enhancement\",\"authors\":\"Che-Tsung Lin ,&nbsp;Chun Chet Ng ,&nbsp;Zhi Qin Tan ,&nbsp;Wan Jun Nah ,&nbsp;Xinyu Wang ,&nbsp;Jie Long Kew ,&nbsp;Pohao Hsu ,&nbsp;Shang Hong Lai ,&nbsp;Chee Seng Chan ,&nbsp;Christopher Zach\",\"doi\":\"10.1016/j.image.2024.117222\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Extremely low-light text images pose significant challenges for scene text detection. Existing methods enhance these images using low-light image enhancement techniques before text detection. However, they fail to address the importance of low-level features, which are essential for optimal performance in downstream scene text tasks. Further research is also limited by the scarcity of extremely low-light text datasets. To address these limitations, we propose a novel, text-aware extremely low-light image enhancement framework. Our approach first integrates a Text-Aware Copy-Paste (Text-CP) augmentation method as a preprocessing step, followed by a dual-encoder–decoder architecture enhanced with Edge-Aware attention modules. We also introduce text detection and edge reconstruction losses to train the model to generate images with higher text visibility. Additionally, we propose a Supervised Deep Curve Estimation (Supervised-DCE) model for synthesizing extremely low-light images, allowing training on publicly available scene text datasets such as IC15. To further advance this domain, we annotated texts in the extremely low-light See In the Dark (SID) and ordinary LOw-Light (LOL) datasets. The proposed framework is rigorously tested against various traditional and deep learning-based methods on the newly labeled SID-Sony-Text, SID-Fuji-Text, LOL-Text, and synthetic extremely low-light IC15 datasets. Our extensive experiments demonstrate notable improvements in both image enhancement and scene text tasks, showcasing the model’s efficacy in text detection under extremely low-light conditions. Code and datasets will be released publicly at <span><span>https://github.com/chunchet-ng/Text-in-the-Dark</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":49521,\"journal\":{\"name\":\"Signal Processing-Image Communication\",\"volume\":\"130 \",\"pages\":\"Article 117222\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2024-10-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Signal Processing-Image Communication\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0923596524001231\",\"RegionNum\":3,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Signal Processing-Image Communication","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0923596524001231","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

摘要

极低照度下的文本图像给场景文本检测带来了巨大挑战。现有方法在文本检测前使用低照度图像增强技术来增强这些图像。然而,这些方法未能解决低层次特征的重要性问题,而低层次特征对于优化下游场景文本任务的性能至关重要。此外,极低照度文本数据集的缺乏也限制了进一步的研究。为了解决这些局限性,我们提出了一种新颖的、文本感知的极低照度图像增强框架。我们的方法首先集成了文本感知复制-粘贴(Text-CP)增强方法作为预处理步骤,然后采用边缘感知注意模块增强的双编码器-解码器架构。我们还引入了文本检测和边缘重建损失,以训练模型生成具有更高文本可见性的图像。此外,我们还提出了一种用于合成极低光图像的监督深度曲线估计(Supervised-DCE)模型,允许在 IC15 等公开可用的场景文本数据集上进行训练。为了进一步推动这一领域的发展,我们在极低照度的 "See In the Dark"(SID)和普通的 "LOw-Light"(LOL)数据集中对文本进行了注释。在新标注的 SID-Sony-文本、SID-Fuji-文本、LOL-文本和合成的极低照度 IC15 数据集上,我们针对各种传统方法和基于深度学习的方法对所提出的框架进行了严格测试。我们的大量实验表明,该模型在图像增强和场景文本任务方面都有显著改进,展示了该模型在极弱光条件下进行文本检测的功效。代码和数据集将在 https://github.com/chunchet-ng/Text-in-the-Dark 上公开发布。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Text in the dark: Extremely low-light text image enhancement
Extremely low-light text images pose significant challenges for scene text detection. Existing methods enhance these images using low-light image enhancement techniques before text detection. However, they fail to address the importance of low-level features, which are essential for optimal performance in downstream scene text tasks. Further research is also limited by the scarcity of extremely low-light text datasets. To address these limitations, we propose a novel, text-aware extremely low-light image enhancement framework. Our approach first integrates a Text-Aware Copy-Paste (Text-CP) augmentation method as a preprocessing step, followed by a dual-encoder–decoder architecture enhanced with Edge-Aware attention modules. We also introduce text detection and edge reconstruction losses to train the model to generate images with higher text visibility. Additionally, we propose a Supervised Deep Curve Estimation (Supervised-DCE) model for synthesizing extremely low-light images, allowing training on publicly available scene text datasets such as IC15. To further advance this domain, we annotated texts in the extremely low-light See In the Dark (SID) and ordinary LOw-Light (LOL) datasets. The proposed framework is rigorously tested against various traditional and deep learning-based methods on the newly labeled SID-Sony-Text, SID-Fuji-Text, LOL-Text, and synthetic extremely low-light IC15 datasets. Our extensive experiments demonstrate notable improvements in both image enhancement and scene text tasks, showcasing the model’s efficacy in text detection under extremely low-light conditions. Code and datasets will be released publicly at https://github.com/chunchet-ng/Text-in-the-Dark.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Signal Processing-Image Communication
Signal Processing-Image Communication 工程技术-工程:电子与电气
CiteScore
8.40
自引率
2.90%
发文量
138
审稿时长
5.2 months
期刊介绍: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: To present a forum for the advancement of theory and practice of image communication. To stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems. To contribute to a rapid information exchange between the industrial and academic environments. The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The Journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world. Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments. Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, architectures for image/video processing and communication.
期刊最新文献
SES-ReNet: Lightweight deep learning model for human detection in hazy weather conditions HOI-V: One-stage human-object interaction detection based on multi-feature fusion in videos Text in the dark: Extremely low-light text image enhancement High efficiency deep image compression via channel-wise scale adaptive latent representation learning Double supervision for scene text detection and recognition based on BMINet
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1