Engineering Applications of Artificial Intelligence: Latest Articles

Trajectory time impact on error stability for hyper-redundant continuum Manipulators: A comparative study
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114057
Elsayed Atif Aner, Mohamed Fawzy El-Khatib
The precise trajectory tracking of hyper-redundant continuum manipulators is essential for applications requiring both high accuracy and adaptability, such as minimally invasive surgery and confined space exploration. However, existing Artificial Intelligence (AI)-based control strategies often struggle to maintain precision under dynamic conditions characterized by rapid motion transitions and complex trajectories, particularly in scenarios involving short durations and tight curves. This study addresses this challenge by evaluating the performance of two proposed controllers—Particle Swarm Optimization-based Fuzzy Logic Controller (PSO-FLC) and Sliding Mode Controller (SMC)—in tracking an infinity-shaped trajectory across three distinct durations: 8 s, 4 s, and 2 s. Performance metrics, including trajectory accuracy, end-effector position error, speed profiles, and statistical error analysis, are used to systematically evaluate the controllers. The results indicate that both controllers deliver reliable performance during slower trajectories (8 s); however, the proposed SMC demonstrates superior robustness at higher speeds. It achieves lower position errors, smoother speed profiles, and greater dynamic stability, whereas the PSO-FLC exhibits significant performance degradation under rapid motion constraints. The model was implemented in MATLAB (Matrix Laboratory) and Simulink (Simulation and Link Editor), validated for fidelity, and subsequently tested with the proposed controller under various time constraints. The findings of this study establish the proposed SMC as a robust and reliable solution for high-speed dynamic applications, while positioning the PSO-FLC as a viable option for scenarios with less demanding motion requirements. These insights contribute to the optimization of controller design and selection for hyper-redundant continuum manipulators operating in complex environments.
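To make the three trajectory durations concrete, the following minimal sketch samples a figure-eight reference path and compares the peak speed required at 8 s, 4 s, and 2 s. The Lissajous parameterization, the amplitude, and the function names are assumptions chosen for illustration; the abstract does not specify how the authors define the infinity-shaped trajectory.

```python
import math

def infinity_trajectory(duration_s, amplitude_m=0.1, n_samples=201):
    """Sample a figure-eight (1:2 Lissajous) path traversed once in duration_s.

    The shape and amplitude are illustrative assumptions, not the paper's values.
    """
    points = []
    for i in range(n_samples):
        t = duration_s * i / (n_samples - 1)
        phase = 2.0 * math.pi * t / duration_s
        x = amplitude_m * math.sin(phase)
        y = amplitude_m * math.sin(2.0 * phase) / 2.0
        points.append((t, x, y))
    return points

def peak_speed(points):
    """Largest finite-difference speed between consecutive samples (m/s)."""
    best = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        best = max(best, math.hypot(x1 - x0, y1 - y0) / (t1 - t0))
    return best

# Same path, shorter duration: the required speed grows in proportion.
for duration in (8.0, 4.0, 2.0):
    print(duration, round(peak_speed(infinity_trajectory(duration)), 4))
```

Because the path geometry is fixed, halving the duration doubles the reference speed, which is one way to see why the 2 s case stresses a tracking controller far more than the 8 s case.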
Citations: 0
Seismic fragility assessment of curved girder bridges under vehicle-induced risks: A specialized deep learning-based neural network approach
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114066
Wei-zuo Guo, Wei-you Guo, Yan Gong, Shu-mao Qiu
The special horizontal alignment of curved girder bridges often leads to higher seismic demands than those of straight bridges, resulting in greater seismic fragility. With the continuing growth in transportation and logistics demand, the likelihood of heavy vehicles being stranded on bridges during earthquakes further increases, amplifying the seismic risk of curved girder bridges. However, existing data-driven seismic fragility assessment methods generally neglect the additional risks introduced by vehicle loads. Therefore, this study develops a specialized deep learning model, the seismic fragility embedding neural network under vehicle-induced risks for curved girder bridges (SFENR), to assess their seismic fragility under combined vehicle–earthquake effects. An automated parametric finite element (APFE) program is developed to efficiently simulate the vehicle–curved girder bridge system and batch-produce nonlinear dynamic responses, thereby providing essential data support for training the SFENR model. A case study is then conducted on a typical three-span continuous curved box girder bridge to systematically investigate how vehicles with different weights and positions affect the seismic fragility of bridge components. The results demonstrate that the proposed SFENR model substantially outperforms conventional neural networks in terms of both memory efficiency and prediction accuracy. Specifically, the SFENR achieves a nearly 50% reduction in memory usage while improving Accuracy by 2–10%, with both Precision and Recall consistently maintained above 70%. Furthermore, the fragility curves of structural components exhibit greater sensitivity to variations in the tangential rather than radial positions of vehicles on the bridge deck. The presence of vehicles induces a non-monotonic effect on the seismic fragility of curved girder bridges: vehicles increase fragility at lower ground motion intensities but reduce it at higher intensities. This highlights the importance of considering vehicle effects in seismic risk evaluation and advances the development of more reliable fragility assessment for highway bridges.
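The non-monotonic vehicle effect described above can be pictured with the lognormal fragility form that is commonly used in seismic engineering. The sketch below is only an illustration under that assumption: the abstract does not state which fragility model the authors use, and every median and dispersion value here is made up to show how two curves can cross.

```python
import math

def fragility(im, median, beta):
    """P(damage state reached | intensity measure im), lognormal model."""
    if im <= 0.0:
        return 0.0
    z = math.log(im / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

# Hypothetical parameters (NOT from the paper): a flatter curve with the
# vehicle present crosses the bare-bridge curve at the shared median.
BARE = {"median": 0.5, "beta": 0.4}
WITH_VEHICLE = {"median": 0.5, "beta": 0.6}

for im in (0.25, 0.5, 1.0):
    print(im,
          round(fragility(im, **BARE), 3),
          round(fragility(im, **WITH_VEHICLE), 3))
```

With these placeholder values the vehicle case is more fragile below the crossing intensity and less fragile above it, matching the qualitative pattern the abstract reports.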
Citations: 0
Lightweight spatio-temporal residual neural network and transformer architecture with positional gating for video-based smoke and fire detection
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.113996
Rafaqat Alam Khan, Usama Ijaz Bajwa, Rana Hammad Raza, Muhammad Umar Farooq
Fire incidents are a common hazard that not only endangers human lives but also harms the economy and the environment. Detecting fire and smoke in their initial stages is essential to prevent them from becoming uncontrollable. Conventional sensor-based detectors have limitations such as geographic area coverage, the time required to reach the sensor, and false alarm rates. As a result, they are increasingly being replaced by smart video-based detectors, which provide effective monitoring, detection, and detailed analysis of smoke and fire in both indoor and outdoor environments. This study introduces a real-time automated artificial intelligence (AI)-based video model for early-stage detection of smoke and fire that effectively mitigates false alarms caused by clouds, fog, or other fire-colored backgrounds or objects. The model, Dual Attention Multi-Resolution Three-Dimensional Network with Positional Gating Unit (DAMR3DNet_PGU), was trained using a hybrid Spatio-Temporal Residual Neural Network and Transformer architecture with Positional Gating on a wide range of unique smoke and fire patterns sourced from publicly available benchmark video datasets. Experimental results show significant improvements in True Positive Rate (TPR), True Negative Rate (TNR), False Positives (FP), False Negatives (FN), false alarms, and accuracy compared with various state-of-the-art methods. The results affirm the efficacy of the proposed DAMR3DNet_PGU method for fire and smoke detection using conventional closed-circuit television (CCTV) cameras. The proposed technique demonstrated robust performance across multiple datasets, achieving high accuracy for smoke and fire detection while significantly reducing false negatives and false alarms with a lighter-weight model than existing approaches.
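The evaluation metrics named above follow directly from the four confusion-matrix counts. The helper below is a minimal sketch using the standard definitions; the function name and the example counts are illustrative and are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard binary-detection metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Guard against empty classes instead of raising ZeroDivisionError.
        return num / den if den else 0.0

    return {
        "tpr": ratio(tp, tp + fn),               # recall on fire/smoke clips
        "tnr": ratio(tn, tn + fp),               # specificity on normal clips
        "false_alarm_rate": ratio(fp, fp + tn),  # equals 1 - TNR
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }

# Made-up counts, purely to show the call.
print(detection_metrics(tp=90, tn=80, fp=20, fn=10))
```

Reporting the false alarm rate alongside TPR matters here because fire-colored backgrounds inflate FP without changing TPR.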
Citations: 0
Learning-based target fencing control for delay-tolerant unmanned aerial vehicle swarm
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114069
Hao Yu, Xiu-xia Yang, Yi Zhang, Wen-qiang Yao
This study focuses on the cooperative fencing mission for an unmanned aerial vehicle (UAV) swarm under communication delays, proposing an adaptive self-organized control framework based on a Radial Basis Function-Brain Emotional Learning-Based Intelligent Controller (RBF-BELBIC). Firstly, a fixed-time convergent observer is developed to estimate multiple states of the target simultaneously, achieving precise estimation independent of initial states through dual-channel Hurwitz polynomial configuration. Secondly, a self-organized distributed control scheme integrating a consensus term, a navigation term, and a potential field term is constructed. This strategy enables the UAV swarm to autonomously generate a dynamic fencing convex hull around the target, eliminating the dependency on predefined geometric configurations while guaranteeing collision avoidance. Thirdly, a dual-layer intelligent robust controller driven by the RBF-BELBIC network is designed to tackle the control lag effects caused by communication delays. This architecture establishes a hierarchical structure where the RBF network serves as an upper layer for online gain optimization, and the BELBIC acts as a lower reactive control layer, thereby enabling simultaneous disturbance compensation and dynamic control policy adaptation. Closed-loop stability is analytically established using Lyapunov theory. Simulations verify that the proposed control strategy extends the tolerable delay bound by an order of magnitude over conventional methods (from 100 ms to 1000 ms). Concurrently, it reduces fencing position and velocity errors by 99.36% and 97.45%, respectively, compared to single-layer learning networks under large delays, demonstrating superior robustness in complex environments.
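A rough picture of how a consensus term, a navigation term, and a potential field term might be combined for one agent is sketched below. This is not the authors' control law: the 2D setting, the gains, the ring-shaped navigation objective, and the repulsion form are all assumptions made for illustration, and the sketch includes no delay compensation or learning layer.

```python
import math

def fencing_input(i, positions, velocities, target, fence_radius=5.0,
                  safe_dist=2.0, k_cons=0.5, k_nav=1.0, k_rep=2.0):
    """Illustrative 2D acceleration command for agent i (hypothetical gains)."""
    px, py = positions[i]
    vx, vy = velocities[i]
    ux = uy = 0.0
    # Consensus term: pull this agent's velocity toward the other agents'.
    for j, (qvx, qvy) in enumerate(velocities):
        if j != i:
            ux += k_cons * (qvx - vx)
            uy += k_cons * (qvy - vy)
    # Navigation term: drive the range to the target toward fence_radius.
    dx, dy = px - target[0], py - target[1]
    dist = math.hypot(dx, dy)
    if dist > 1e-9:
        gain = -k_nav * (dist - fence_radius) / dist
        ux += gain * dx
        uy += gain * dy
    # Potential-field term: repel from any agent closer than safe_dist.
    for j, (qx, qy) in enumerate(positions):
        if j != i:
            rx, ry = px - qx, py - qy
            r = math.hypot(rx, ry)
            if 1e-9 < r < safe_dist:
                push = k_rep * (1.0 / r - 1.0 / safe_dist) / r
                ux += push * rx
                uy += push * ry
    return ux, uy

# A lone agent outside the fence radius is steered toward the target.
print(fencing_input(0, [(10.0, 0.0)], [(0.0, 0.0)], (0.0, 0.0)))
```

Because no formation slots are assigned, the spacing around the target emerges from the balance between the ring attraction and the pairwise repulsion, which is the sense in which such a scheme avoids predefined geometric configurations.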
Citations: 0
Enhanced steel alloy identification using hybrid attention mechanism with femtosecond laser ablation-spark-induced breakdown spectroscopy
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114002
Zhenman Gao, Xudong Chen, Jianyong Zhuang, Chunhui Huang, Zuxiang Xie, Xiaoyong He
This paper proposes a deep learning architecture integrating a hybrid attention mechanism with femtosecond laser ablation-spark-induced breakdown spectroscopy technology to overcome efficiency and accuracy limitations in steel alloy classification associated with traditional laser-induced breakdown spectroscopy techniques. Traditional methods often struggle with feature redundancy and noise interference when processing high-dimensional, low signal-to-noise ratio spectral data. This paper introduces a dynamic feature optimization strategy through heterogeneous fusion of an efficient channel attention network and Transformer self-attention modules, enabling adaptive multi-level feature selection and enhancement. The model utilizes channel attention in the shallow layers to identify critical spectral response regions, while deeper layers use self-attention to model global spectral sequence relationships, establishing a “local-to-global” bimodal feature extraction approach. This design reduces data dimensions from 2048 spectral points to a 6-dimensional feature vector (0.3% of the original features) while improving temporal spectral analysis through synergistic feature selection and noise suppression. Focusing on iron element responses in steel matrices, experiments achieved 100% test-set classification accuracy, demonstrating the superiority of the hybrid attention mechanism. Theoretical and empirical analyses reveal that Convolutional Neural Networks-Transformer integration dynamically balances local detail sensitivity and global context awareness, yielding 40% faster inference. The developed dual-stage architecture (shallow efficient channel attention + deep Transformer) offers significant advantages for rapid metal material detection, providing an innovative approach for industrial-grade online steel alloy identification. This work establishes a novel “convolution-attention” co-optimization paradigm for complex spectral signal analysis.
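The shallow-layer channel attention mentioned above can be sketched in the general style of efficient channel attention: pool each channel, mix neighboring channel descriptors with a small 1D convolution, and gate each channel with a sigmoid weight. The version below is a pure-Python illustration with a fixed uniform kernel standing in for learned weights; the function name and kernel size are assumptions, not details from the paper. The last lines check the stated 2048-to-6 reduction ratio.

```python
import math

def channel_attention(features, kernel_size=3):
    """Gate channels by a sigmoid of locally mixed channel averages.

    features: list of channels, each a list of spectral samples.
    A trained network would learn the kernel; a uniform one is assumed here.
    """
    pooled = [sum(ch) / len(ch) for ch in features]  # global average pooling
    half = kernel_size // 2
    weights = []
    for c in range(len(pooled)):
        acc = 0.0
        for k in range(-half, half + 1):  # zero-padded 1D conv over channels
            if 0 <= c + k < len(pooled):
                acc += pooled[c + k] / kernel_size
        weights.append(1.0 / (1.0 + math.exp(-acc)))  # sigmoid gate in (0, 1)
    scaled = [[w * v for v in ch] for w, ch in zip(weights, features)]
    return weights, scaled

# Ratio of retained features: 6 of 2048 spectral points.
reduction_percent = 100.0 * 6 / 2048
print(round(reduction_percent, 2))
```

The ratio works out to just under 0.3%, consistent with the rounded figure quoted in the abstract.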
Citations: 0
A Local-Global Fusion Vision Mamba UNet Framework for medical image segmentation
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.113987
Yanbo Li, Zihan Mao, Feiwei Qin, Yong Peng, Guodao Zhang, Xugang Xi, Xiaoqin Ma, Huanhuan Yu, Yu Zhou, Zhu Zhu
As a State Space Model (SSM) that achieves long-range dependency modeling with linear computational complexity, Mamba demonstrates significant efficiency advantages in medical image segmentation. However, while Mamba-based methods enable long-range modeling with linear complexity, their global dependency mechanisms often lead to local feature attenuation, particularly affecting the processing of complex anatomical structures. Existing multi-scale fusion methods also exhibit limited compatibility with State Space Models. To address these challenges, this paper proposes the Local-Global Fusion Vision Mamba UNet (LGFVM-UNet) framework. Its core innovation lies in the Dynamic Gating-enhanced Local-Global Fusion Visual State Space (LGF-VSS) block, which enables the synergistic modeling of global context and local details. We also design a Multi-level Cross-scale Feature Fusion Block (MCFB) that enhances multi-scale feature representation through bidirectional resampling and spatial-channel dual attention mechanisms. Finally, we propose a Gradient Statistics-based Adaptive Hierarchical Loss that dynamically adjusts multi-level supervision weights to optimize the learning process. The proposed method is experimentally validated on five public medical image segmentation datasets spanning diverse imaging modalities and anatomical structures. Results demonstrate that our approach outperforms state-of-the-art methods, excelling in long-range dependency modeling, local detail capture, and multi-scale feature fusion. The source code of our work is available at https://github.com/NicoleDyson/LGFVM-UNet.
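One plausible reading of adjusting multi-level supervision weights from gradient statistics is to down-weight levels whose gradients are already large so that no single decoder level dominates training. The sketch below implements that idea with inverse-gradient-norm weighting; this rule, the function names, and the numbers are assumptions for illustration and should not be taken as the loss defined in the paper or its repository.

```python
def adaptive_weights(grad_norms, eps=1e-8):
    """Normalized weights inversely proportional to each level's gradient norm.

    This inverse-norm rule is an assumed stand-in, not the paper's formula.
    """
    inverse = [1.0 / (g + eps) for g in grad_norms]
    total = sum(inverse)
    return [v / total for v in inverse]

def hierarchical_loss(level_losses, grad_norms):
    """Weighted sum of per-level supervision losses."""
    weights = adaptive_weights(grad_norms)
    return sum(w * loss for w, loss in zip(weights, level_losses))

# Hypothetical per-level losses and running gradient norms.
print(adaptive_weights([1.0, 3.0]))
print(hierarchical_loss([2.0, 4.0], [1.0, 3.0]))
```

In a real training loop the gradient norms would be running statistics recomputed every few steps, so the weights drift as the levels converge at different rates.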
Citations: 0
Machine learning to enhance strain-resilience humidity sensing on flexible surface acoustic wave platform
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114088
Yanhong Xia, Zhangbin Ji, Jian Zhou, Yihao Guo, Hui Chen, Jinbo Zhang, Yongqing Fu
Flexible surface acoustic wave (SAW) humidity sensors have garnered considerable attention in fields such as environmental monitoring and healthcare, mainly owing to their wearability, applicability in non-planar scenarios, quasi-digital output, and wireless passive capabilities. However, improving the performance of these flexible SAW humidity sensors faces major challenges, including a low electromechanical coupling coefficient, poor humidity response or sensitivity, and detection errors caused by mechanical strain interference. Herein, we developed a flexible SAW humidity sensor utilizing an aluminum scandium nitride (AlScN) piezoelectric film deposited on ultrathin glass substrates, incorporating ternary nanocomposites of graphene quantum dots-polyethyleneimine-silica nanoparticles (GQDs-PEI-SiO2 NPs) as the sensitive layers, which demonstrated an ultra-high sensitivity of 5.02 kHz/%RH (relative humidity). To address the critical issue of strain interference under random bending or deformation conditions, we applied machine learning (ML) algorithms to establish correlations between the sensor's response signal features and humidity labels, thereby effectively mitigating unreliable humidity measurements caused by significant strain interference, with improved precision and specificity. After comprehensive evaluation and analysis using various artificial intelligence algorithms, a multilayer perceptron regression model was identified as the best performer in humidity prediction under strain interference, with a coefficient of determination as high as 0.997 and a mean square error of ∼0.479. The reliability and generalization capabilities of this model were verified. This strategy not only significantly enhances the performance metrics of flexible humidity sensors but also provides an innovative, high-precision solution for flexible SAW sensors under various strain interferences.
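The abstract above reports a coefficient of determination and a mean square error for the humidity predictions. As a minimal sketch of how these two metrics are computed, the snippet below uses NumPy on made-up readings. The values in `rh_true` and `rh_pred` are hypothetical illustrations, not data from the paper, and the helper names are our own.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)


def mean_squared_error(y_true, y_pred):
    """Average of the squared prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Hypothetical humidity labels and model predictions, in %RH.
rh_true = [20.0, 40.0, 60.0, 80.0]
rh_pred = [21.0, 39.0, 61.0, 79.0]

r2 = r2_score(rh_true, rh_pred)
mse = mean_squared_error(rh_true, rh_pred)
print(f"R^2 = {r2:.3f}, MSE = {mse:.3f}")
```

A coefficient of determination close to 1 means the predictions explain almost all of the variance in the labels, while the mean square error is expressed in squared units of the target (here, %RH squared).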
Citations: 0
An encoder-decoder model with self-attention mechanism for airport aviation noise estimation
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114007
Weili Zeng, Wentao Guo, Hao Feng, Yadong Zhou
Estimating aviation noise around airports is the prerequisite and foundation for assessing and controlling noise impacts. However, mainstream models are unable to capture the time-series correlation between noise impact factors. This deficiency directly undermines the estimation accuracy and generalization ability of the models. To this end, this paper proposes an encoder-decoder model with a self-attention mechanism (SAM-EDM) to estimate the aviation noise around airports. The model employs a deep autoencoder to reduce the dimensionality of noise-related factors, which not only captures complex nonlinear relationships among different variables but also eliminates redundancy and anomalies in the input features. On this basis, the encoder incorporates a Bidirectional Long Short-Term Memory (Bi-LSTM) network to learn bidirectional temporal dependencies across different time steps. The decoder incorporates physical prior knowledge and a self-attention mechanism into a Gated Recurrent Unit (GRU) and subsequently employs a fully connected layer to produce the final noise estimation outputs. A case study based on Hefei Xinqiao International Airport in China demonstrates that the SAM-EDM model achieves a coefficient of determination of 0.94 across different aircraft types. The model achieves a mean absolute error of 1.17 dB on the test set and 1.19 dB under unseen scenarios, outperforming traditional physical models, lightweight physics-guided neural networks, and pure deep learning models, demonstrating high estimation accuracy and strong generalization capability.
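The decoder described above relies on a self-attention mechanism over time steps. The snippet below is a minimal NumPy sketch of generic scaled dot-product self-attention, not the authors' SAM-EDM implementation. The sequence length, feature size, and random projection matrices are assumptions chosen only for illustration.

```python
import numpy as np


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a (time_steps, features) array."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])              # pairwise similarity of time steps
    scores = scores - scores.max(axis=-1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ v, weights


# Hypothetical input: 6 time steps, 4 noise-related features per step.
rng = np.random.default_rng(0)
steps, dim = 6, 4
x = rng.normal(size=(steps, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))

context, weights = self_attention(x, w_q, w_k, w_v)
print(context.shape, weights.shape)
```

Each row of `weights` indicates how strongly one time step attends to every other step, which is the general mechanism such a model can use to capture time-series correlation between noise impact factors.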
Citations: 0
Biologically inspired vision fusion: Central-peripheral synergy for medical image classification
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114026
Rui Lu, Long Yu, Shengwei Tian, Yukun Xiao
The synergy between foveal and peripheral processing is fundamental to the efficiency of biological vision. While hybrid Convolutional Neural Network (CNN)-Transformer architectures aim to capture both local and global features, they often rely on static, predefined structures that struggle to dynamically align information and adaptively allocate computational resources, ultimately limiting their performance. To address this limitation, we introduce the Central-Peripheral Vision Transformer (CPVT), a novel architecture that explicitly and hierarchically mimics this biological dichotomy. CPVT employs fine-grained, convolutionally modulated attention in its shallow layers to emulate foveal vision, while seamlessly transitioning to a coarse-grained, global attention mechanism in deeper layers to emulate peripheral vision. This design is enhanced by two specialized Feed-Forward Networks that facilitate synergistic information interaction. Rigorously validated on diverse medical imaging benchmarks, CPVT achieves state-of-the-art performance, attaining classification accuracies of 87.98% on the International Skin Imaging Collaboration (ISIC) 2018 challenge dataset and 90.41% on the Kvasir dataset. These results demonstrate that an adaptive, hierarchical integration of biological vision principles can significantly enhance machine perception for medical image analysis.
Citations: 0
Task-aware evolution in physics-informed neural networks: Application to Saint-Venant torsion problems
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-31 DOI: 10.1016/j.engappai.2026.113988
Suyeong Jo, Sanghyeon Park, Jeesuk Shin, Jongcheon Park, Hosung Kim, Seungchan Ko, Sangseung Lee, Joongoo Jeon
The Saint-Venant torsion theory is a classical theory for analyzing the torsional behavior of structural components. Conventional numerical methods, including the finite element method (FEM), typically rely on mesh-based approaches, which often result in significant increases in computational cost. The objective of this study is to develop a series of novel numerical methods based on physics-informed neural networks (PINN) for solving the Saint-Venant torsion equations. Utilizing the automatic differentiation capability of neural networks, the PINN can serve as a partial differential equation (PDE) solver without the need for intricate computational techniques. We present an integrated framework that simultaneously addresses single-instance stiffness (via VS-PINN) and multi-query parametric efficiency (via Parametric PINN) for torsion problems, which has not been explored in prior work. First, a PINN solver was developed to compute the torsional constant for bars with arbitrary cross-sectional geometries. This was followed by the development of a solver capable of handling cases with sharp geometric transitions: the variable-scaling PINN (VS-PINN). Finally, a parametric PINN was constructed to address the limitations of the conventional single-instance PINN. The results from all three solvers showed good agreement with reference solutions, demonstrating their accuracy and robustness. Specifically, we report 0.1% error for circular and square sections, 3.0% error for triangular sections with PINN, a reduction from 0.97% to 0.11% with VS-PINN in the stiff 1D case, and 1.0% error for the Parametric PINN across varying torque parameters. Once training has been completed, the parametric PINN can predict solutions with remarkable efficiency for varying PDE data or problem settings. Thanks to the retraining-free nature of the parametric PINN, the model can achieve over 100× faster speed than FEM for a large number of instances. Each solver can be selectively employed in a task-aware manner, ensuring that its utilization aligns with specific objectives such as geometry-specific solving, handling stiffness, or parametric generalization.
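For context on what a torsional-constant reference solution looks like, the sketch below solves the Prandtl stress-function form of the Saint-Venant problem on a square section with a plain finite-difference grid. This is a conventional mesh-based baseline and is not the paper's PINN. It assumes the normalization Gθ = 1, so that ∇²φ = −2 inside the section, φ = 0 on the boundary, and J = 2∬φ dA. The function name and grid size are our own choices.

```python
import numpy as np


def torsional_constant_square(side=1.0, m=29):
    """Finite-difference estimate of J for a square bar of the given side length.

    Solves laplacian(phi) = -2 with phi = 0 on the boundary (G * theta = 1),
    using m x m interior grid points, then returns J = 2 * integral of phi.
    """
    h = side / (m + 1)  # grid spacing
    # 1D second-difference matrix with Dirichlet boundaries.
    d = (np.diag(-2.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1))
    eye = np.eye(m)
    laplacian = (np.kron(eye, d) + np.kron(d, eye)) / h**2  # 5-point stencil
    phi = np.linalg.solve(laplacian, -2.0 * np.ones(m * m))
    # phi vanishes on the boundary, so the trapezoid rule reduces to a plain sum.
    return float(2.0 * phi.sum() * h**2)


J = torsional_constant_square()
print(f"J is approximately {J:.4f} (classical series value: about 0.1406 * side**4)")
```

The classical series solution gives J ≈ 0.1406 s⁴ for a square of side s, so the grid result can be checked against it. A PINN replaces the grid with a neural network whose derivatives are obtained by automatic differentiation, but it targets the same boundary-value problem.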
Citations: 0