DADNet：基于边界适应的无人机视角任意形状文本检测

IF 5 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Complex & Intelligent Systems Pub Date : 2024-11-14 DOI:10.1007/s40747-024-01617-7

Jun Liu, Jianxun Zhang, Ting Tang, Shengyuan Wu

{"title":"DADNet：基于边界适应的无人机视角任意形状文本检测","authors":"Jun Liu, Jianxun Zhang, Ting Tang, Shengyuan Wu","doi":"10.1007/s40747-024-01617-7","DOIUrl":null,"url":null,"abstract":"<p>The rapid development of drone technology has made drones one of the essential tools for acquiring aerial information. The detection and localization of text information through drones greatly enhance their understanding of the environment, enabling tasks of significant importance such as community commercial planning and autonomous navigation in intelligent environments. However, the unique perspective and complex environment during drone photography lead to various challenges in text detection, including diverse text shapes, large-scale variations, and background interference, making traditional methods inadequate. To address this issue, we propose a drone-based text detection method based on boundary adaptation. We first conduct an in-depth analysis of text characteristics from a drone’s perspective. Using ResNet50 as the backbone network, we introduce the proposed Hybrid Text Attention Mechanism into the backbone network to enhance the perception of text regions in the feature extraction module. Additionally, we propose a Spatial Feature Fusion Module to adaptively fuse text features of different scales, thereby enhancing the model’s adaptability. Furthermore, we introduce a text detail transformer by incorporating a local feature extractor into the transformer of the text detail boundary iteration optimization module. This enables the precise optimization and localization of text boundaries by reducing the interference of complex backgrounds, eliminating the need for complex post-processing. Extensive experiments on challenging text detection datasets and drone-based text detection datasets validate the high robustness and state-of-the-art performance of our proposed method, laying a solid foundation for practical applications.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"26 1","pages":""},"PeriodicalIF":5.0000,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DADNet: text detection of arbitrary shapes from drone perspective based on boundary adaptation\",\"authors\":\"Jun Liu, Jianxun Zhang, Ting Tang, Shengyuan Wu\",\"doi\":\"10.1007/s40747-024-01617-7\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The rapid development of drone technology has made drones one of the essential tools for acquiring aerial information. The detection and localization of text information through drones greatly enhance their understanding of the environment, enabling tasks of significant importance such as community commercial planning and autonomous navigation in intelligent environments. However, the unique perspective and complex environment during drone photography lead to various challenges in text detection, including diverse text shapes, large-scale variations, and background interference, making traditional methods inadequate. To address this issue, we propose a drone-based text detection method based on boundary adaptation. We first conduct an in-depth analysis of text characteristics from a drone’s perspective. Using ResNet50 as the backbone network, we introduce the proposed Hybrid Text Attention Mechanism into the backbone network to enhance the perception of text regions in the feature extraction module. Additionally, we propose a Spatial Feature Fusion Module to adaptively fuse text features of different scales, thereby enhancing the model’s adaptability. Furthermore, we introduce a text detail transformer by incorporating a local feature extractor into the transformer of the text detail boundary iteration optimization module. This enables the precise optimization and localization of text boundaries by reducing the interference of complex backgrounds, eliminating the need for complex post-processing. Extensive experiments on challenging text detection datasets and drone-based text detection datasets validate the high robustness and state-of-the-art performance of our proposed method, laying a solid foundation for practical applications.</p>\",\"PeriodicalId\":10524,\"journal\":{\"name\":\"Complex & Intelligent Systems\",\"volume\":\"26 1\",\"pages\":\"\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-11-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Complex & Intelligent Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s40747-024-01617-7\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01617-7","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

无人机技术的快速发展使无人机成为获取空中信息的重要工具之一。通过无人机对文字信息进行检测和定位，可以大大提高无人机对环境的理解能力，从而实现社区商业规划和智能环境下的自主导航等重要任务。然而，无人机拍摄时独特的视角和复杂的环境给文本检测带来了各种挑战，包括文本形状多样、大尺度变化和背景干扰等，使得传统方法难以胜任。针对这一问题，我们提出了一种基于边界自适应的无人机文本检测方法。我们首先从无人机的角度对文本特征进行了深入分析。利用 ResNet50 作为骨干网络，我们在骨干网络中引入了所提出的混合文本关注机制，以增强特征提取模块对文本区域的感知。此外，我们还提出了空间特征融合模块，以自适应地融合不同尺度的文本特征，从而增强模型的适应性。此外，我们还引入了文本细节变换器，将局部特征提取器纳入文本细节边界迭代优化模块的变换器中。这样就能通过减少复杂背景的干扰来实现文本边界的精确优化和定位，从而省去了复杂的后处理。在具有挑战性的文本检测数据集和基于无人机的文本检测数据集上进行的大量实验验证了我们提出的方法的高鲁棒性和先进性能，为实际应用奠定了坚实的基础。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

DADNet: text detection of arbitrary shapes from drone perspective based on boundary adaptation

The rapid development of drone technology has made drones one of the essential tools for acquiring aerial information. The detection and localization of text information through drones greatly enhance their understanding of the environment, enabling tasks of significant importance such as community commercial planning and autonomous navigation in intelligent environments. However, the unique perspective and complex environment during drone photography lead to various challenges in text detection, including diverse text shapes, large-scale variations, and background interference, making traditional methods inadequate. To address this issue, we propose a drone-based text detection method based on boundary adaptation. We first conduct an in-depth analysis of text characteristics from a drone’s perspective. Using ResNet50 as the backbone network, we introduce the proposed Hybrid Text Attention Mechanism into the backbone network to enhance the perception of text regions in the feature extraction module. Additionally, we propose a Spatial Feature Fusion Module to adaptively fuse text features of different scales, thereby enhancing the model’s adaptability. Furthermore, we introduce a text detail transformer by incorporating a local feature extractor into the transformer of the text detail boundary iteration optimization module. This enables the precise optimization and localization of text boundaries by reducing the interference of complex backgrounds, eliminating the need for complex post-processing. Extensive experiments on challenging text detection datasets and drone-based text detection datasets validate the high robustness and state-of-the-art performance of our proposed method, laying a solid foundation for practical applications.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Complex & Intelligent Systems COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-

CiteScore

9.60

自引率

10.30%

发文量

297

期刊介绍： Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.