WA-Net: Wavelet Integrated Attention Network for Silk and Bamboo character recognition
Authors: Shengnan Li, Chi Zhou, Kaili Wang
Journal: Engineering Applications of Artificial Intelligence, Volume 140, Article 109674
Published: 2024-11-25
DOI: 10.1016/j.engappai.2024.109674
URL: https://www.sciencedirect.com/science/article/pii/S0952197624018323
Impact factor: 7.5 (JCR Q1, Automation & Control Systems)
Citation count: 0
Abstract
Chu Bamboo and Silk ancient Chinese characters (CBSC) originated in the Chu state over 2000 years ago and represent an intermediate script between oracle bone script and seal script. Surviving text images are degraded and damaged because of their age and insufficient preservation. CBSC also differ significantly from contemporary characters in structure and stroke texture, which poses challenges for intelligent recognition. To address these characteristics, we propose the Wavelet Integrated Attention Network (WA-Net). The method integrates the discrete wavelet transform with attention mechanisms to extract more discriminative features from degraded text images with severe noise interference. Because no public dataset was available, we also created a dataset for CBSC recognition named Chu Bamboo and Silk 730 (Chu730). WA-Net introduces discrete wavelet attention among layers (L-DWT) to broaden the feature learning space of convolutional neural networks into the wavelet domain, capturing latent information across various frequencies. A wavelet convolution (C-DWT) module is then proposed to mitigate the partial information loss of conventional convolution operations. In the W-bneck module, the SE (Squeeze-and-Excitation) attention module and average-pooling downsampling are introduced to enhance the extraction of valuable feature maps. In extensive experiments, the baseline method achieved a top-1 recognition accuracy of 87.42%, while the proposed method achieved 89.27%; its other top-n results also significantly surpassed the baseline. Further experimental results demonstrate the superiority of the proposed modules and their valuable applications in intelligent recognition of ancient text and digital preservation of cultural heritage. Furthermore, this approach holds significant promise for the study of other handwritten or ancient character recognition tasks. Dataset and code are available at: https://github.com/Nancy45-ui/WA-Net.
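To make the two building blocks named in the abstract more concrete, the sketch below shows a single-level 2D discrete wavelet transform and a Squeeze-and-Excitation style channel gate in plain Python. It is an illustration only and not the authors' L-DWT, C-DWT, or W-bneck implementation: the Haar basis is an assumption (the abstract does not name the wavelet), the function names `haar_dwt2` and `se_gate` are hypothetical, and the weights `w1` and `w2` are untrained placeholders supplied by the caller.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def haar_dwt2(image: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """Single-level 2D Haar DWT (assumed basis) of an even-sized image.

    Returns four half-resolution sub-bands. Sub-band naming conventions
    vary between libraries; here they are described by what they measure.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, rd_row, cd_row, dg_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block:  a b
            #                 d e
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2.0)   # low-frequency approximation
            rd_row.append((a + b - d - e) / 2.0)  # difference between the two rows
            cd_row.append((a - b + d - e) / 2.0)  # difference between the two columns
            dg_row.append((a - b - d + e) / 2.0)  # diagonal detail
        approx.append(a_row)
        row_diff.append(rd_row)
        col_diff.append(cd_row)
        diag.append(dg_row)
    return approx, row_diff, col_diff, diag


def se_gate(channels: List[Matrix], w1: Matrix, w2: Matrix):
    """Squeeze-and-Excitation style re-weighting of C feature maps.

    w1 has shape (hidden, C) and w2 has shape (C, hidden); both are
    hypothetical placeholder weights, not trained parameters.
    """
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h))))
         for row in w2]
    # Scale: multiply every value in channel k by its gate s[k].
    scaled = [[[s_k * x for x in row] for row in ch]
              for s_k, ch in zip(s, channels)]
    return scaled, s
```

For example, the four sub-bands returned by `haar_dwt2` can be passed to `se_gate` as a four-channel input, so that the gate learns how strongly each frequency band should contribute. With the normalization used here the transform is orthonormal, so the sum of squared values is the same before and after the transform.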
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.