
Engineering Applications of Artificial Intelligence: Latest Articles

Paying more attention to local contrast: Improving infrared small target detection performance via prior knowledge
IF 7.5; Zone 2, Computer Science; Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2025-02-20; DOI: 10.1016/j.engappai.2025.110244
Peichao Wang, Jiabao Wang, Yao Chen, Rui Zhang, Yang Li, Zhuang Miao
Data-driven methods for InfraRed Small Target Detection (IRSTD) have achieved promising results. However, these methods typically incorporate modules with high computational complexity, which improve performance at the expense of computational efficiency. Using human expert knowledge to help data-driven methods learn better at lower cost is worth exploring. To guide the model to focus on targets’ spatial features, this paper proposes the Local Contrast Attention Enhanced infrared small target detection Network (LCAE-Net), which combines prior knowledge with data-driven deep learning. LCAE-Net is a U-shaped neural network built around two new modules: a Local Contrast Enhancement (LCE) module and a Channel Attention Enhancement (CAE) module. The LCE module draws on prior knowledge, using handcrafted convolution operators to compute Local Contrast Attention (LCA), which suppresses the background while enhancing potential target regions and thus guides the network to attend to the locations of potential infrared small targets. To make effective use of the response information throughout the downsampling process, the CAE module fuses information across the channels of the feature maps. Experimental results indicate that LCAE-Net outperforms the comparison methods on three public datasets, with a detection speed of up to 70 Frames Per Second (FPS). The model has 1.945 Million (M) parameters and requires 4.862 Giga (G) Floating-Point Operations (FLOPs), which makes it suitable for deployment on edge devices. Code will be available at https://github.com/boa2004plaust/LCAENet.
Volume 146, Article 110244.
Citations: 0
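The idea of a handcrafted local-contrast operator mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's LCE module: the 3x3 center-minus-surround kernel, the function name `local_contrast_map`, and the toy frame are all assumptions chosen for illustration. The sketch only shows how a fixed operator can suppress a flat background while keeping a response at a small bright spot.

```python
def local_contrast_map(image):
    """Center-minus-surround response with a fixed 3x3 neighbourhood.

    Hypothetical illustration: each interior pixel is compared with the
    mean of its 8 neighbours, and negative responses are clipped to zero.
    Border pixels are left at zero for simplicity.
    """
    height, width = len(image), len(image[0])
    response = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            neighbours = [
                image[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            surround_mean = sum(neighbours) / len(neighbours)
            response[r][c] = max(0.0, image[r][c] - surround_mean)
    return response


# Toy frame: flat background of 10 with one bright pixel standing in
# for a small target (values are made up for the example).
frame = [[10.0] * 5 for _ in range(5)]
frame[2][2] = 50.0
contrast = local_contrast_map(frame)
```

In this toy frame only the bright pixel keeps a non-zero response, which is the kind of map that could then be used as a spatial attention weight.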
Graph-aware pre-trained language model for political sentiment analysis in Filipino social media
IF 7.5; Zone 2, Computer Science; Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2025-02-20; DOI: 10.1016/j.engappai.2025.110317
Jean Aristide Aquino, Di Jie Liew, Yung-Chun Chang
Elections are emotionally and sentimentally charged events that offer unique opportunities for analysis of sentiments not typically observed during non-election periods. Unlike recurring phenomena, elections are inherently singular events, with each election shaped by distinct political, social, and cultural contexts. In the digital age, social media has become a direct channel for politicians and political parties to engage with voters, making it a critical platform for sentiment analysis. However, challenges such as imbalanced datasets, the prevalence of noisy non-text elements (e.g., emojis, hashtags, user mentions), and the need for effective integration of graph-based learning remain significant hurdles in sentiment prediction. To address these challenges, we constructed an imbalanced dataset of 8035 manually annotated tweets and approximately 516,000 weakly labeled Filipino tweets related to the 2022 Philippine National Election. Leveraging these datasets, we designed a Bidirectional Encoder Representations from Transformers (BERT) and Graph Convolution Network (GCN) model, which uniquely incorporates emojis, hashtags, and user mentions as features to enhance semantic understanding. Differing from the prior literature that focused solely on textual data or discarded non-textual elements, our model integrates these features to achieve a robust performance that outperforms baseline models with a macro-recall score of 64.73% and a macro F1-score of 68.72% on the imbalanced dataset. Additionally, we introduce a topic modeling framework that combines BERT embeddings with Latent Dirichlet Allocation (LDA) and Log-Likelihood Ratio (LLR) to yield more distinct topic clusters for deeper sentiment analysis. Our work therefore contributes two novel datasets in Filipino as well as methodologies that bridge sentiment prediction and analysis, and in so doing, provides valuable resources for future research.
Volume 146, Article 110317.
Citations: 0
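The abstract above reports macro-recall and macro F1 on an imbalanced dataset. As a minimal sketch of why macro-averaged scores are informative in that setting, the snippet below computes both from scratch. The labels and predictions are made-up placeholders, not data from the paper, and `macro_recall_and_f1` is a hypothetical helper name.

```python
def macro_recall_and_f1(y_true, y_pred):
    """Return (macro recall, macro F1): per-class scores averaged with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    recalls, f1_scores = [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return sum(recalls) / len(labels), sum(f1_scores) / len(labels)


# Made-up imbalanced example: 8 "neg" tweets and 2 "pos" tweets,
# with one of the two "pos" tweets misclassified as "neg".
y_true = ["neg"] * 8 + ["pos"] * 2
y_pred = ["neg"] * 8 + ["neg", "pos"]
macro_recall, macro_f1 = macro_recall_and_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here accuracy is 0.9 while macro recall is 0.75, because the minority class counts as much as the majority class in the average.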
A rolling bearing fault diagnosis framework under variable working conditions considers dynamic feature extraction
IF 7.5; Zone 2, Computer Science; Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2025-02-19; DOI: 10.1016/j.engappai.2025.110255
Wang Jia, Hui Shi, Zengshou Dong, Xiaoyi Zhang
As a key component in industrial machinery, rolling bearings usually operate at variable speeds. Speed changes make the features in the signal dynamic, and the different features are complementary and correlated. However, this complementarity and correlation have been neglected when mining dynamic signal features, which reduces the accuracy of fault classification models and their adaptability to variable working conditions. To address this problem, this study proposes a multi-scale asymmetric feature reproduction plots-shifted window transformer (MAFRP-ST) framework for rolling bearing fault diagnosis under variable speed conditions. The framework comprises a dynamic feature capture module and a dynamic feature learning module. The dynamic feature capture module converts signals into multi-scale asymmetric feature reproduction plots (MAFRP) that contain time–frequency domain features, so that deeper dynamic features can be captured by exploiting the complementarity and correlation. In the dynamic feature learning module, a shifted window (Swin) transformer adapted to dynamic features at different scales computes local attention according to the window size in each layer and enlarges the receptive field layer by layer. Compared with recently proposed similar methods, the MAFRP-ST framework improves diagnosis accuracy by about 4.1% and 2.2% on average on two datasets, respectively, and shows better robustness to noise.
Volume 146, Article 110255.
Citations: 0
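The shifted-window mechanism referred to in the abstract above restricts attention to fixed-size windows and then shifts the grid so that neighbouring windows exchange information in the next layer. The sketch below shows only that generic partition-and-shift step on a toy 4x4 grid. It is not the MAFRP-ST implementation, and the helper names `window_partition` and `cyclic_shift` are assumptions.

```python
def window_partition(feature_map, window_size):
    """Split a 2D grid into non-overlapping window_size x window_size windows.

    Each window is returned as a flat list of its tokens in row-major order.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % window_size or width % window_size:
        raise ValueError("grid size must be divisible by window_size")
    windows = []
    for top in range(0, height, window_size):
        for left in range(0, width, window_size):
            windows.append([
                feature_map[r][c]
                for r in range(top, top + window_size)
                for c in range(left, left + window_size)
            ])
    return windows


def cyclic_shift(feature_map, shift):
    """Roll the grid up and left by `shift` positions, wrapping around."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [feature_map[(r + shift) % height][(c + shift) % width] for c in range(width)]
        for r in range(height)
    ]


# Toy 4x4 grid whose entries are token indices 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
regular_windows = window_partition(grid, 2)
shifted_windows = window_partition(cyclic_shift(grid, 1), 2)
```

The first regular window holds tokens 0, 1, 4, 5, while the first shifted window holds tokens 5, 6, 9, 10, which straddle the original window boundaries. Local attention would be computed inside each such window.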
Learning multi-color curve for image harmonization
IF 7.5; Zone 2, Computer Science; Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2025-02-19; DOI: 10.1016/j.engappai.2025.110277
Jingrong Yuan, Hao Wu, Lidong Xie, Lei Zhang, Jichen Xing
Because of varying shooting conditions, the foreground and background of composite images often look inconsistent, which reduces realism. Image harmonization is an important and challenging visual task that can effectively improve the visual quality of composite images. Current image harmonization methods achieve satisfactory performance on public datasets. However, on challenging examples with substantial color disparities between the foreground and the background, existing methods produce poor results. To solve this problem, we propose a Multi-color Curve Net that processes images in multiple color spaces to capture richer color information. The Multi-color Curve Net performs multi-stage curve learning in different color spaces, with an encoder composed of modified Transformer blocks. We also introduce a Multi-color Integration Module to fuse the information extracted from the different color spaces, and further improve the results with a lightweight Fine-grained Optimization Module. The Multi-color Curve Net achieves high performance while keeping the parameter count small. Benchmark experiments demonstrate that it outperforms state-of-the-art methods in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and foreground mean squared error (fMSE) with fewer parameters. The code is available at https://github.com/gmrj2024/MC2Net.
Volume 146, Article 110277.
Citations: 0
Short-term offshore wind speed forecasting approach based on multi-stage decomposition and deep residual network with self-attention
IF 7.5; Zone 2, Computer Science; Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2025-02-19; DOI: 10.1016/j.engappai.2025.110313
Hakan Acikgoz, Deniz Korkmaz
Wind energy is a widely used renewable energy source. Wind speed forecasting supports wind energy production and helps ensure the sustainability of the power system. However, offshore wind speed forecasting is challenging because of the complex variables and highly nonlinear temporal dynamics of the ocean. This paper proposes a hybrid and robust offshore wind speed forecasting approach based on multi-stage decomposition, a deep convolutional neural network (CNN), and an extreme learning machine (ELM). Unlike conventional preprocessing in renewable energy forecasting, the proposed approach combines two efficient decomposition methods: complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and ensemble empirical mode decomposition (EEMD). This approach separates the high-frequency and low-frequency components of the wind speed. The high-frequency components are further decomposed with EEMD, while the low-frequency components are sent directly to the ELM model. The mode functions obtained from EEMD are then fed to the designed network for forecasting. The CNN model is built with a deep residual network and a self-attention (SA) mechanism to improve network performance. In the comparative evaluations, the other approaches yield root mean square error (RMSE) values between 0.8233 and 2.1885, whereas the proposed method achieves the lowest RMSE of 0.5400. The experimental results show that the proposed method delivers more accurate and robust forecasts than other model combinations and deep learning models.
Volume 146, Article 110313.
Citations: 0
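The comparison in the abstract above is stated in terms of root mean square error (RMSE), where a lower value means a closer forecast. As a reminder of how that figure is computed, here is a minimal sketch. The wind speed values are invented placeholders, not data from the paper.

```python
import math


def rmse(observed, forecast):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - f) ** 2 for o, f in zip(observed, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up wind speeds (same units for both series).
observed_speed = [8.0, 9.0, 7.0, 10.0]
forecast_speed = [7.0, 9.0, 8.0, 12.0]
error = rmse(observed_speed, forecast_speed)
```

Because the errors are squared before averaging, the single 2-unit miss in this toy series contributes four times as much as each 1-unit miss.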
A survey on learning with noisy labels in Natural Language Processing: How to train models with label noise
IF 7.5; Zone 2, Computer Science; Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2025-02-19; DOI: 10.1016/j.engappai.2025.110157
Han Zhang, Yazhou Zhang, Jiajun Li, Junxiu Liu, Lixia Ji
When applying deep neural network language models to related systems (e.g., question answering systems, chatbots, and intelligent assistants), many datasets contain different types or degrees of label noise. Label noise can lead to a decline in model performance and an increase in resource consumption. Therefore, learning with noisy labels is becoming an important task in Natural Language Processing (NLP). This paper aims to collect, analyze, and evaluate methods for learning with label noise in NLP. First, we analyze the relationship between data feature extraction, prediction output, and optimization in the context of noise robustness to help researchers understand the mechanisms behind noise generation. Based on this, we classify noise-handling methods into five types according to the training process: feature vector, transition matrix, prediction confidence, loss improvement, and data weighting. We analyze each method and conduct a systematic evaluation across six metrics. In addition, we summarize commonly used resources such as datasets and open-source code. Finally, we analyze the challenges in current research and the potential opportunities. As a comprehensive survey, this work will help researchers and industry developers understand the current state of research and the unique challenges facing label-noise learning, and will facilitate the selection and combination of different methods in applications to drive further advances.
Volume 146, Article 110157.
Citations: 0
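Among the five categories named in the abstract above, transition-matrix methods are often explained through forward loss correction: the model's clean-class probabilities are mapped through a matrix T, with T[i][j] = P(noisy label j | clean label i), before the loss is computed against the observed noisy label. The sketch below assumes that formulation. The abstract does not say which variants the survey covers, and the matrix values and helper names are made up for illustration.

```python
import math


def noisy_class_probs(clean_probs, transition):
    """Map clean-class probabilities to noisy-label probabilities.

    transition[i][j] is the assumed probability that a sample whose true
    class is i carries the observed (noisy) label j.
    """
    num_noisy = len(transition[0])
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(len(clean_probs)))
        for j in range(num_noisy)
    ]


def forward_corrected_nll(clean_probs, transition, noisy_label):
    """Negative log-likelihood of the observed label after forward correction."""
    return -math.log(noisy_class_probs(clean_probs, transition)[noisy_label])


# Hypothetical two-class setup: class 0 is flipped to 1 in 20% of cases,
# class 1 is flipped to 0 in 10% of cases.
T = [[0.8, 0.2],
     [0.1, 0.9]]
model_probs = [0.9, 0.1]  # model is confident the clean class is 0
noisy_probs = noisy_class_probs(model_probs, T)
corrected_loss = forward_corrected_nll(model_probs, T, noisy_label=1)
uncorrected_loss = -math.log(model_probs[1])
```

With these numbers the corrected loss for an observed label of 1 is smaller than the uncorrected one, so a confident and possibly correct prediction is penalized less when the label may simply have been flipped.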
MIRROR: Multi-scale iterative refinement for robust chinese text recognition
IF 7.5; Zone 2, Computer Science; Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2025-02-18; DOI: 10.1016/j.engappai.2025.110270
Hengnian Qi, Qiuyi Xin, Jiabin Ye, Hao Yang, Kai Zhang, Chu Zhang, Qing Lang
Text recognition has become a key area of research due to its wide applications in various fields. As an important branch of computer vision, Chinese text recognition has gained increasing research and practical value. However, existing Chinese text recognition methods are still limited. This paper proposes an innovative Chinese text recognition method, Multi-Scale Iterative Refinement for Robust Chinese Text Recognition (MIRROR). The model significantly improves the recognition accuracy of Chinese text through advanced algorithms and structural design. The MIRROR model consists of two core components: a feature extractor and a Next-Character Decoder. Specifically, this paper proposes a Spatial Local Self-Attention Module to enhance the model’s ability to capture long-distance dependencies in complex character sequences, addressing the problem of complex distributions in medium-to-long distance Chinese character sequences. The Character Refinement Module effectively captures multi-scale information, handles stroke feature differences, and resolves inter-class similarity issues. By combining multi-scale feature extraction with iterative optimization for feature refinement, the model identifies common features across different styles of the same character, addresses the intra-class variation problem, and improves model robustness. In addition, this paper introduces a Three-Dimensional Weight Attention Module to refine the granularity of character features. Experiments show that MIRROR significantly outperforms baseline models on Chinese benchmark datasets. Performance improves by 3.08 percentage points on scene datasets (from 76.90% to 79.98%), by 1.46 points on web datasets (from 70.43% to 71.89%), by 0.38 points on document datasets (from 98.72% to 99.10%), and by 9.29 points on handwriting datasets (from 50.26% to 59.55%).
{"title":"MIRROR: Multi-scale iterative refinement for robust chinese text recognition","authors":"Hengnian Qi ,&nbsp;Qiuyi Xin ,&nbsp;Jiabin Ye ,&nbsp;Hao Yang ,&nbsp;Kai Zhang ,&nbsp;Chu Zhang ,&nbsp;Qing Lang","doi":"10.1016/j.engappai.2025.110270","DOIUrl":"10.1016/j.engappai.2025.110270","url":null,"abstract":"<div><div>Text recognition has become a key area of research due to its wide applications in various fields. As an important branch of computer vision, Chinese text recognition has gained increasing research and practical value. However, the existing Chinese text recognition methods are still limited. This paper proposes an innovative Chinese text recognition method, <em><strong>M</strong>ulti-Scale <strong>I</strong>terative <strong>R</strong>efinement for <strong>Ro</strong>bust Chinese Text <strong>R</strong>ecognition</em> (MIRROR). The model significantly improves the recognition accuracy of Chinese text through advanced algorithms and structural design. The MIRROR model consists of two core components: a feature extractor and a Next-Character Decoder. Specifically, this paper proposes a Spatial Local Self-Attention Module to enhance the model’s ability to model long-distance dependencies in complex character sequences, addressing the problem of complex distributions in medium-to-long distance Chinese character sequences. The Character Refinement Module effectively captures multi-scale information, handles stroke feature differences, and resolves inter-class similarity issues. By combining multi-scale feature extraction with iterative optimization for feature refinement, the model identifies common features across different styles of the same character, solves the intra-class variation problem, and improves model robustness. In addition, this paper introduces a Three-Dimensional Weight Attention Module to refine the granularity of character features. 
Experiments show that MIRROR significantly outperforms baseline models on Chinese benchmark datasets. On scene datasets, performance improves by 3.08% (from 76.90% to 79.98%), on web datasets by 1.46% (from 70.43% to 71.89%), on document datasets by 0.38% (from 98.72% to 99.10%), and on handwriting datasets by 9.29% (from 50.26% to 59.55%).</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"146 ","pages":"Article 110270"},"PeriodicalIF":7.5,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143429486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Innovative integration of machine learning and colorimetry for precise potential of hydrogen monitoring in printed hydrogel sensors
IF 7.5 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-02-18 DOI: 10.1016/j.engappai.2025.110293
Abdelrahman Sakr, Ahmed R. El shamy, Haider Butt
Accurate potential of hydrogen (pH) monitoring has wide applications in environmental monitoring, clinical diagnostics, and a variety of industrial processes. However, traditional pH sensors often face challenges related to adaptability, portability, and environmental compatibility. Recently developed hydrogel-based sensors offer several advantages across a wide variety of applications, owing to the flexibility and biocompatibility of the material. Although integration techniques have advanced considerably, precision and reliability still need improvement. The present work describes a novel pH-sensing methodology that integrates hydrogel-based sensors with machine learning algorithms. Hydrogel sensors impregnated with a pH-sensitive dye were fabricated using three-dimensional (3D) printing, and colorimetric data analysis was combined with five machine learning models, namely Decision Trees, eXtreme Gradient Boosting, K-Nearest Neighbours, Random Forests, and Neural Networks, to classify pH from Red, Green, Blue (RGB) data. The sensor can detect pH values between 4 and 10 with high speed, stability, and reversibility. Precision, recall, and F1-scores all exceed 99%, demonstrating the efficiency of the RGB-based classification approach and supporting the potential of the developed sensors for real-time monitoring and diagnostics, contributing to the evolution of pH sensing and paving the way for smarter, more adaptable sensor solutions.
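To make the RGB-to-pH classification step in the abstract above concrete, here is a minimal sketch of a K-Nearest Neighbours classifier, one of the five model families the abstract names. It is not the authors' implementation: the function name `classify_ph`, the choice of `k`, and every RGB value in `TRAINING` are hypothetical and invented purely for illustration.

```python
from collections import Counter
from math import dist

# Hypothetical (R, G, B) readings paired with pH class labels.
# These values are invented for illustration; they are not data from the paper.
TRAINING = [
    ((220, 60, 50), 4), ((215, 70, 55), 4),
    ((210, 140, 40), 6), ((205, 150, 50), 6),
    ((90, 170, 80), 8), ((100, 160, 90), 8),
    ((60, 80, 190), 10), ((70, 90, 180), 10),
]


def classify_ph(rgb, training=TRAINING, k=3):
    """Return the majority pH label among the k training colors nearest to rgb."""
    # Sort the labeled samples by Euclidean distance in RGB space, keep the k closest.
    nearest = sorted(training, key=lambda item: dist(item[0], rgb))[:k]
    # Majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify_ph((218, 65, 52)))
print(classify_ph((65, 85, 185)))
```

A real pipeline would presumably normalize for lighting and validate on held-out readings; the sketch only shows the distance-and-vote mechanism.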
Citations: 0
Adaptive prompt guided unified image restoration with latent diffusion model
IF 7.5 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-02-18 DOI: 10.1016/j.engappai.2025.110267
Xiang Lv, Mingwen Shao, Yecong Wan, Yuanjian Qiao, Changzhong Wang
Recently, Diffusion Models (DMs) have achieved remarkable success in image restoration tasks. However, lacking a degradation prior, DMs are not flexible or adaptive when dealing with multiple uncertain forms of image degradation (e.g., noise and blur), which results in undesirable boundary artifacts. In addition, DMs require a large number of inference iterations to restore a clean image, which consumes massive computational resources. To address the aforementioned limitations, we propose an adaptive unified two-stage restoration method based on a latent diffusion model, termed APDiff, that can effectively and adaptively handle real-world images with various degradation types. Specifically, in Stage I, we pre-train a Degradation Adaptive Prompt Learning Network (DAPLNet-S1) to obtain a degradation prompt by adaptively exploring the differences between low quality (LQ) and ground truth (GT) images. We then encode it into the latent space as key discriminant information for different degraded images. In Stage II, we propose a latent diffusion model that directly estimates a degradation prompt similar to that of the pre-trained DAPLNet-S1 using only LQ images. Meanwhile, to restore differently degraded images effectively, we design a Prompt Guided Fourier Transformer Restorer that integrates the extracted prompt, enhancing the model's ability to characterize global frequency features and local spatial information. Because the generated prompts are low-dimensional latent vector representations, the computational complexity of the diffusion model is significantly reduced. As a result, during inference our method takes only 0.09 s to restore an image of SPA+. Extensive experiments demonstrate that APDiff achieves state-of-the-art performance for multi-degradation tasks.
Citations: 0
Beyond spatial neighbors: Utilizing multivariate transfer entropy for interpretable graph-based spatio–temporal forecasting
IF 7.5 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-02-17 DOI: 10.1016/j.engappai.2025.110161
Safaa Berkani, Adil Bahaj, Bassma Guermah, Mounir Ghogho
Spatio–temporal forecasting is a challenging task that requires modeling complex interactions between multiple time series. While graph-based models have emerged as compelling tools for this task, their effectiveness depends heavily on the underlying graph structure, which captures spatial dependencies but ignores temporal relationships. To address this challenge, we propose the Multivariate Transfer Entropy-Multivariate Time series forecasting with Graph Neural Networks (MTE-MTGNN), a hybrid approach that combines statistical and deep learning methods. MTE-MTGNN introduces an interpretable graph construction layer founded on Multivariate Transfer Entropy, which effectively captures both spatial and temporal dependencies in the data. Empirical evaluations across five benchmark datasets demonstrate the superiority of our proposed approach in terms of predictive accuracy. The model shows particular strength in few-shot scenarios where traditional forecasting approaches typically struggle, achieving performance improvements of up to 3% on the RRSE metric in the exchange rate dataset and up to 4% on the correlation metric in the Hungarian Chickenpox dataset compared to state-of-the-art baselines. The findings observed across the experiments translate into significant practical benefits for real-world engineering applications and different domains.
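As a rough illustration of the quantity behind the graph construction layer described above, the sketch below estimates pairwise transfer entropy between two discrete series. It rests on several assumptions: the paper uses a multivariate formulation whose estimator is not given in the abstract, whereas this sketch uses the standard bivariate definition with a history length of 1 and a simple plug-in (frequency count) estimate. The function name `transfer_entropy` and the toy series are hypothetical.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, with history length 1.

    TE = sum over (y_next, y_prev, x_prev) of
         p(y_next, y_prev, x_prev) * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    """
    # Each triple is (target at t+1, target at t, source at t).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)                               # (y_next, y_prev, x_prev)
    c_prev = Counter((yp, xp) for _, yp, xp in triples)     # (y_prev, x_prev)
    c_self = Counter((yn, yp) for yn, yp, _ in triples)     # (y_next, y_prev)
    c_y = Counter(yp for _, yp, _ in triples)               # y_prev
    te = 0.0
    for (yn, yp, xp), count in c_full.items():
        p_joint = count / n
        p_given_both = count / c_prev[(yp, xp)]
        p_given_self = c_self[(yn, yp)] / c_y[yp]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


# Toy example (invented): y copies x with a one-step delay, so x should inform y.
x = [0, 1, 1, 0, 1, 0, 0, 1]
y = [0] + x[:-1]
print(transfer_entropy(x, y))          # positive: past x helps predict y
print(transfer_entropy([0] * 8, y))    # constant source carries no information
```

In a graph-construction setting, one plausible use is to compute such a score for each ordered pair of series and keep an edge where it is large, which gives directed, inspectable edges; whether MTE-MTGNN thresholds scores this way is not stated in the abstract.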
Citations: 0