
Engineering Applications of Artificial Intelligence: Latest Articles

Green supplier evaluation based on ISO14001 standard via entropy-integrated proximity indexed value method under Linear Diophantine fuzzy sets
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-10 DOI: 10.1016/j.engappai.2026.114049
Sait Gül, Ali Aydoğdu, Umut Hulusi İnan
The linear Diophantine fuzzy set (LDFS) incorporates two reference parameters, thereby enabling a more comprehensive representation of human judgment. This structure provides flexibility, as the decision-maker can adjust the meaning of the reference parameters to reflect changes in the decision context. Among the many applications of fuzzy sets, information measures such as distance and entropy are particularly significant. Entropy is widely employed in objective attribute-weighting procedures of Multiple Attribute Decision Making (MADM) applications, as it captures the intrinsic information content of attributes. The first contribution of this study is therefore the development of a new entropy measure for LDFS. The second contribution is the extension of the Proximity Indexed Value (PIV) method into the LDFS framework, marking the first proposal of an LDF-oriented PIV in the literature. PIV was selected for its flexibility, ease of application, and proven strength against the rank reversal phenomenon. The proposed entropy measure is integrated into the attribute-weighting procedure of this new LDF-En-PIV extension. The third contribution is an application-oriented decision model for selecting green suppliers with respect to their performance in building and employing an ISO14001 Environmental Management System. In the case study, a panel of experienced industry experts evaluated four green supplier alternatives across the main components of ISO14001, and rankings were obtained through the LDF-En-PIV approach. The robustness of the proposed approach was assessed through comparative analyses with crisp PIV and LDF-ARAS. All comparisons yield consistent rankings, demonstrating the reliability of the proposed approach.
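The abstract above relies on entropy for objective attribute weighting. As a rough illustration only, the sketch below implements the classical Shannon entropy weight method on a crisp decision matrix; it is not the LDFS entropy measure proposed in the paper, and the function name `entropy_weights` and the score matrix are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Classical Shannon-entropy attribute weights for a crisp decision matrix.

    matrix[i][j] is the non-negative score of alternative i on attribute j.
    Assumes at least two alternatives and at least one non-constant attribute.
    """
    m = len(matrix)          # number of alternatives
    n = len(matrix[0])       # number of attributes
    k = 1.0 / math.log(m)    # scales each entropy into [0, 1]
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Column-normalize so the scores behave like a probability distribution.
        probs = [value / total for value in column]
        entropy = -k * sum(p * math.log(p) for p in probs if p > 0)
        # Low entropy means the attribute separates the alternatives well.
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]

# Hypothetical scores: 4 suppliers x 3 attributes (not data from the paper).
scores = [
    [7.0, 5.0, 5.0],
    [3.0, 5.0, 6.0],
    [9.0, 5.0, 4.0],
    [1.0, 5.0, 5.0],
]
weights = entropy_weights(scores)
```

In this toy matrix the second attribute is identical for every supplier, so it carries no discriminating information and receives a weight of essentially zero, while the most dispersed attribute receives the largest weight.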
Volume 169, Article 114049
Citations: 0
Deep learning-based coke dry quenching material location prediction using physical information reconstruction features
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-09 DOI: 10.1016/j.engappai.2026.114117
Xinyang Meng, Keliang Pang, Zhiyuan Gu, Youzhi Zheng, Fujun Liu, Chaoran Wan, Haotian Wu, Minmin Sun, Hua Zhao
Coke dry quenching (CDQ) is a common, environmentally friendly technology applied in iron and steel production and plays an important role in improving coke quality as well as in reducing emissions and pollution. Material location prediction is crucial for ensuring the stable operation of dry quenching systems. In this paper, we propose a novel artificial intelligence approach for predicting the location of coke materials in CDQ furnaces by incorporating a method known as physical information feature reconstruction (PIFR). This method integrates physical a priori knowledge (such as the law of mass conservation and furnace structural characteristics) into the feature engineering process, effectively improving the accuracy and stability of time-series predictions in both single-step and multi-step forecasting tasks. The experimental results demonstrate that PIFR significantly enhances the performance of various deep learning models. Specifically, for the long short-term memory model, the mean squared error and mean absolute error decreased by 51.25% and 37.63%, respectively, whereas the coefficient of determination increased to 0.941. Moreover, PIFR effectively mitigates issues commonly encountered in multi-step prediction, such as cumulative error and prediction curve flattening. The application of PIFR not only improves the accuracy of the model but also significantly enhances its generalization capability.
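The abstract above reports percentage decreases in mean squared error and mean absolute error alongside a coefficient of determination. The following sketch shows how such figures are conventionally computed; the helper names and the toy series are assumptions for illustration and are unrelated to the paper's data.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_reduction(baseline, improved):
    """Percentage by which an error metric drops relative to the baseline."""
    return 100.0 * (baseline - improved) / baseline

# Hypothetical material-level readings and two sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [2.0, 2.0, 2.0, 4.0]   # e.g. a model without extra features
improved_pred = [1.5, 2.0, 3.0, 4.0]   # e.g. the same model with extra features

mse_drop = relative_reduction(mse(y_true, baseline_pred), mse(y_true, improved_pred))
mae_drop = relative_reduction(mae(y_true, baseline_pred), mae(y_true, improved_pred))
r2_improved = r2(y_true, improved_pred)
```

A percentage decrease of this kind is always relative to the baseline model's own error, so it says nothing about the absolute error level unless that is reported as well.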
Volume 169, Article 114117
Citations: 0
System for diagnosis and optimization of combustion in pulverized coal boilers based on artificial intelligence methods
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-09 DOI: 10.1016/j.engappai.2026.114103
S.S. Abdurakipov, E.B. Butakov, E.P. Kopyev, D.M. Markovich
Efficient and reliable monitoring of processes in coal-fired boilers requires advanced combustion diagnostic and optimization methods. This paper introduces an integrated system based on machine learning and deep learning technologies to diagnose and optimize combustion regimes in pulverized coal-fired boilers, aiming to improve efficiency and safety. An artificial neural network accurately simulated thermogravimetric mass loss curves, achieving an average coefficient of determination of 99%. Deep learning methods were employed to detect combustion regimes by monitoring the coal flame and identifying anomalies in flame images. For datasets lacking precise measurements, an unsupervised autoencoder was developed. It achieved an average precision of 77% and a recall of 66%. For datasets with measurements, a supervised convolutional neural network provided a higher average recall of 89%. Various machine learning algorithms were employed to predict deviations from stable combustion modes, and a long short-term memory network with an attention mechanism performed best. It had a mean absolute percentage error of up to 8% and an average coefficient of determination of 91%.
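The anomaly-detection results in the abstract above are stated as precision and recall. A minimal sketch of how these two quantities are derived from binary labels follows; the function name and the label vectors are hypothetical and do not come from the paper.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels, where 1 marks an anomaly."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical flame-image labels: 1 = anomalous regime, 0 = stable regime.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Precision above recall, as in the autoencoder figures quoted in the abstract, means the detector raises relatively few false alarms but misses a larger share of the true anomalies.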
Volume 169, Article 114103
Citations: 0
Enhanced nitrogen-oxides prediction in biomass combustion via a dual-channel neural network with flame imaging and residual attention
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-11 DOI: 10.1016/j.engappai.2026.114090
Runfang Hao, Mingyu Wang, Shengjun Chang, Li Qin, Yongqiang Cheng, Gang Lu
Reliable and accurate monitoring of nitrogen oxides (NOx) in flue gas is crucial for emission control in electrical power generation plants due to environmental concerns. Traditional data-driven methods for NOx prediction are often based on single-channel strategies and use multiple variables from the combustion process, which show insufficient feature correlation, making reliable and accurate NOx prediction very difficult. To tackle these limitations, this study proposes a Dual-Channel Deep Neural Network incorporating a Residual Attention Mechanism (DCDNN-RAM) which integrates flame visual features with a residual attention mechanism. An Enhanced Laplacian of Gaussian (ELG) filtering algorithm is employed to optimize the preprocessing of flame images and reduce significantly feature discrimination. An innovative heterogeneous dual-channel parallel architecture is developed, where the primary channel utilizes small convolutional kernels to extract local detail features while the secondary channel employs large kernels to capture global contextual information, coupled with a spatial-frequency collaborative feature extraction module for effective fusion of deep local and shallow global features. Notably, the incorporated dual residual attention mechanism (RAM) effectively enhances key feature representation via channel-spatial adaptive weight allocation. Experimental validation under oxy-biomass combustion conditions demonstrates that the proposed model achieves a coefficient of determination (R2) of 0.946 with a mean absolute error (MAE) of 4.26, outperforming four benchmark single-channel models with MAE reductions of 63.71%, 51.2%, 38.44%, and 24.33%, respectively. This study provides a promising solution for the reliable and accurate prediction of NOx emissions, and thus offers important practical value for promoting cleaner production and supporting the carbon neutrality goal of the power generation industry.
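The abstract above mentions an Enhanced Laplacian of Gaussian (ELG) filter for flame-image preprocessing without giving details. For orientation only, the sketch below builds the standard Laplacian of Gaussian kernel that such a filter presumably starts from; the enhancement itself is not reproduced, and `log_kernel`, `response`, the kernel size, and sigma are all assumptions.

```python
import math

def log_kernel(size, sigma):
    """Sampled Laplacian-of-Gaussian kernel (odd size), shifted to sum to zero."""
    half = size // 2
    s2 = sigma * sigma
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            value = (-1.0 / (math.pi * s2 * s2)
                     * (1.0 - r2 / (2.0 * s2))
                     * math.exp(-r2 / (2.0 * s2)))
            row.append(value)
        kernel.append(row)
    # Subtract the mean so that uniform image regions produce zero response.
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def response(patch, kernel):
    """Filter response for one image patch of the same size as the kernel."""
    return sum(p * k for prow, krow in zip(patch, kernel)
               for p, k in zip(prow, krow))

kernel = log_kernel(5, 1.0)
flat_patch = [[7.0] * 5 for _ in range(5)]                       # uniform brightness
spot_patch = [[1.0 if (r, c) == (2, 2) else 0.0 for c in range(5)]
              for r in range(5)]                                  # single bright pixel
```

With this sign convention a uniform patch gives a response of approximately zero, while an isolated bright spot gives a negative response, which is why the filter highlights blobs and edges rather than overall brightness.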
Volume 169, Article 114090
Citations: 0
Scene graph-driven reasoning for action planning of humanoid robot
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-12 DOI: 10.1016/j.engappai.2026.114150
Dmitry Yudin, Alexander Lazarev, Eva Bakaeva, Angelika Kochetkova, Alexey Kovalev, Aleksandr Panov
Recent progress in visual data analysis has significantly improved the ability of autonomous robots to understand their surroundings and perform complex tasks. This paper presents a modular method named Scene Graph-driven Reasoning for Action Planning (SG-RAPL) designed for high-level planning in dynamic environments, enabling adaptive control of humanoid robots. The method employs a three-dimensional (3D) scene graph to represent the environment and detect abnormal situations, while a large language model (LLM) translates natural-language commands into consecutive low-level actions. An original perceptual segmentation and tracking module constructs the scene graph in real time by providing instance segmentation, obstacle detection, and object pose estimation using data fusion with Augmented Reality University of Cordoba (ArUco) markers. The Planner module decomposes high-level tasks into subtasks such as navigation and object manipulation. Extensive experiments conducted on a manually collected and annotated dataset demonstrate that the proposed artificial intelligence-based approach efficiently plans complex actions in both virtual and real-world warehouse environments. The code and dataset of the proposed approach will be made publicly available.
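To make the idea of a scene graph feeding a task planner more concrete, here is a deliberately small sketch. It is not the SG-RAPL implementation: there is no perception module or language model, and every class, relation name, and subtask label below is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    name: str
    position: tuple  # (x, y, z) in metres, hypothetical world frame

@dataclass
class SceneGraph:
    objects: dict = field(default_factory=dict)    # name -> SceneObject
    relations: list = field(default_factory=list)  # (subject, predicate, target)

    def add(self, obj):
        self.objects[obj.name] = obj

    def relate(self, subject, predicate, target):
        self.relations.append((subject, predicate, target))

    def support_of(self, name):
        """Return the object that `name` rests on, or None if unknown."""
        for subject, predicate, target in self.relations:
            if subject == name and predicate == "on":
                return target
        return None

def plan_move(graph, item, destination):
    """Decompose 'move item to destination' into navigation and manipulation steps."""
    source = graph.support_of(item)
    if item not in graph.objects or destination not in graph.objects or source is None:
        return []  # treated as an abnormal situation; a real planner would replan
    return [
        ("navigate", source),
        ("pick", item),
        ("navigate", destination),
        ("place", item, destination),
    ]

graph = SceneGraph()
for obj in (SceneObject("box", (1.0, 2.0, 0.8)),
            SceneObject("shelf", (1.0, 2.0, 0.0)),
            SceneObject("table", (4.0, 0.5, 0.0))):
    graph.add(obj)
graph.relate("box", "on", "shelf")
plan = plan_move(graph, "box", "table")
```

Returning an empty plan when the graph lacks the requested object is one simple way to surface an abnormal situation to the caller; the abstract does not say how the actual system handles this case.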
Volume 169, Article 114150
Citations: 0
A Local-Global Fusion Vision Mamba UNet Framework for medical image segmentation
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.113987
Yanbo Li, Zihan Mao, Feiwei Qin, Yong Peng, Guodao Zhang, Xugang Xi, Xiaoqin Ma, Huanhuan Yu, Yu Zhou, Zhu Zhu
As a State Space Model (SSM) that achieves long-range dependency modeling with linear computational complexity, Mamba demonstrates significant efficiency advantages in medical image segmentation. However, while Mamba-based methods enable long-range modeling with linear complexity, their global dependency mechanisms often lead to local feature attenuation, particularly affecting the processing of complex anatomical structures. Existing multi-scale fusion methods also exhibit limited compatibility with State Space Models. To address these challenges, this paper proposes the Local-Global Fusion Vision Mamba UNet (LGFVM-UNet) framework. Its core innovation lies in the Dynamic Gating-enhanced Local-Global Fusion Visual State Space (LGF-VSS) block, which enables the synergistic modeling of global context and local details. In addition, we design a Multi-level Cross-scale Feature Fusion Block (MCFB) that enhances multi-scale feature representation through bidirectional resampling and spatial-channel dual attention mechanisms. Finally, we propose a Gradient Statistics-based Adaptive Hierarchical Loss that dynamically adjusts multi-level supervision weights to optimize the learning process. The proposed method is experimentally validated on five public medical image segmentation datasets spanning diverse imaging modalities and anatomical structures. Results demonstrate that our approach outperforms state-of-the-art methods, excelling in long-range dependency modeling, local detail capture, and multi-scale feature fusion. The source code of our work is available at https://github.com/NicoleDyson/LGFVM-UNet.
Volume 169, Article 113987
Citations: 0
A deep learning model for photovoltaic soiling loss prediction and estimation based on Large Kernel Cross-Attention Fusion
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-12 DOI: 10.1016/j.engappai.2026.114097
Shaokai Zheng, Peng Yan, Shengsu Ni, Daolei Wang
The loss of photovoltaic (PV) power due to environmental soiling presents a significant challenge to the PV power generation industry, making accurate prediction and estimation of power loss critical. However, most existing algorithmic models rely on traditional fusion methods to integrate PV images and environmental factors (time and irradiance) across modalities, limiting their ability to effectively utilize high-quality cross-modal information for downstream tasks. This paper proposes a novel cross-modal interactive fusion mechanism, Large Kernel Cross-Attention Fusion (LKCA Fusion), and introduces a new photovoltaic soiling loss (PVSL) prediction and estimation model, Large Kernel Fusion Solar Network (LKFSolarNet). LKFSolarNet utilizes an improved image backbone architecture to efficiently extract features from PV soiling images, followed by LKCA Fusion to perform cross-modal fusion between these image features and environmental factors. LKCA Fusion incorporates lightweight large kernel convolutions to enhance the model's ability to capture global information across different PV modalities and improve cross-modal interaction. Additionally, a Gradient Flow Enhanced branch is introduced to further strengthen the training of the image backbone network, enhancing overall model performance. Experiments on the open-source Solar Panel Soiling Image dataset demonstrate that LKFSolarNet reduces Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) by 3.9% and 4.0%, respectively, in the prediction task and improves accuracy by 3.6% in the 16-class estimation task. Compared to the latest methods, LKFSolarNet reduces MAE and RMSE by 19.7% and 5.9%, respectively, and shows some improvement in estimation accuracy.
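The abstract above describes fusing image features with environmental factors through cross-attention. The sketch below shows only plain scaled dot-product cross-attention, in which image-side queries attend to environment-side keys and values; the large kernel convolutions of LKCA Fusion are not modelled, and all names and token values are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention; each argument is a list of vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        fused = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0]))]
        outputs.append(fused)
    return outputs

# Hypothetical tokens: queries from image features, keys/values from
# environmental factors such as time and irradiance embeddings.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
env_keys = [[1.0, 0.0], [0.0, 1.0]]
env_values = [[10.0, 0.0], [0.0, 10.0]]
fused = cross_attention(image_tokens, env_keys, env_values)
```

Each fused vector is a convex combination of the environment-side value vectors, weighted toward the entries whose keys best match the image-side query.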
Volume 169, Article 114097
Citations: 0
Structural-aware key node identification in hypergraphs via representation learning and fine-tuning
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-07 DOI: 10.1016/j.engappai.2026.114108
Xiaonan Ni, Guangyuan Mei, Su-Su Zhang, Yang Chen, Xin Xu, Chuang Liu, Xiu-Xiu Zhan
The ability to pinpoint strategically important nodes plays a decisive role in shaping diffusion outcomes and maintaining the stability of complex systems. Yet, most existing approaches remain rooted in pairwise interaction assumptions, making them ill-suited for systems where collective participation and attribute-sharing give rise to higher-order structures. In this work, we introduce AHGA, a learning-driven framework that leverages autoencoder-based representations, hypergraph neural network pre-training, and an active learning mechanism to uncover nodes that jointly influence propagation dynamics and structural cohesion. Rather than relying on handcrafted descriptors, AHGA learns informative higher-order features and progressively refines node importance through selective supervision. Evaluations on eight empirical hypergraphs show that this strategy leads to substantially more reliable rankings, with improvements of up to 36.8% over classical baselines. Beyond ranking accuracy, nodes prioritized by AHGA exhibit pronounced structural leverage: their removal triggers an accelerated loss of network efficiency, reaching 0.6628, markedly exceeding the disruptive effect achieved by competing methods. These findings demonstrate that AHGA not only advances higher-order node identification methodology, but also offers practical guidance for intervention strategies in scenarios such as misinformation containment and infrastructure robustness.
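The node-removal evaluation mentioned in this abstract can be illustrated with a small sketch. The abstract does not define its efficiency measure, so the code below assumes standard global efficiency (mean inverse shortest-path length) computed on the clique expansion of the hypergraph; the function names `clique_expand`, `global_efficiency`, and `efficiency_loss` are hypothetical and are not taken from AHGA.

```python
from collections import deque
from itertools import combinations


def clique_expand(hyperedges, removed=frozenset()):
    """Adjacency sets of the pairwise graph that links all surviving
    members of each hyperedge (an assumed simplification)."""
    adj = {}
    for edge in hyperedges:
        members = [v for v in edge if v not in removed]
        for v in members:
            adj.setdefault(v, set())
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def global_efficiency(adj, n_total):
    """Mean of 1/d(u, v) over ordered pairs of the n_total original nodes.
    Removed or unreachable nodes contribute 0."""
    if n_total < 2:
        return 0.0
    total = 0.0
    for source in adj:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n_total * (n_total - 1))


def efficiency_loss(hyperedges, removed):
    """Relative drop in global efficiency after deleting `removed` nodes."""
    nodes = {v for edge in hyperedges for v in edge}
    before = global_efficiency(clique_expand(hyperedges), len(nodes))
    after = global_efficiency(
        clique_expand(hyperedges, frozenset(removed)), len(nodes)
    )
    return 0.0 if before == 0 else (before - after) / before


# Toy hypergraph: node "c" bridges the two hyperedges.
edges = [("a", "b", "c"), ("c", "d")]
print(efficiency_loss(edges, {"c"}))  # about 0.8
print(efficiency_loss(edges, {"a"}))  # about 0.5
```

In this toy case, removing the bridging node causes the larger efficiency loss, which is the kind of comparison a node-ranking method would be judged on under the stated assumption.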
Citations: 0
A domain knowledge and cognitive law driven approach to anti-vibration hammer defect detection
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-10 DOI: 10.1016/j.engappai.2026.114100
Hang Niu, Xinyu Ge, Xiaoyu Zhao, Ke Yang, Qianming Wang, Yongjie Zhai, Zhedong Hu
The intelligent detection of anti-vibration hammer defects in transmission lines via computer vision is challenging because defect samples are scarce and defect classes closely resemble one another. To this end, a domain knowledge and cognitive law driven approach to anti-vibration hammer defect detection is proposed, which integrates a Structural Knowledge and Geometric Feature-driven image generation method (SKGF) with a Cognitive Law-guided Multilevel Progressive target Detection framework (CLMP-Det). First, SKGF imposes morphological and tilt-angle constraints based on prior knowledge of the anti-vibration hammer’s structure and its tilt-angle distribution. These constraints guide the generation of artificial anti-vibration hammer samples that are semantically consistent with the real physical structure, addressing the shortage of defective samples. Second, CLMP-Det simulates the human visual cognitive law through a progressive strategy that moves from easy to difficult. This strategy comprises two sequential phases, preliminary perception and in-depth discrimination, which enhance the model’s capacity to distinguish between the challenging normal and tilt defect categories. Experimental results show that the proposed method significantly improves the overall detection performance of several widely used detectors. Compared to the baseline model, the approach achieves a 7.1% improvement in mean average precision, which supports the method’s generalization capability and its potential for engineering applications.
Citations: 0
Cross-domain few-shot hyperspectral classification via orthogonal feature disentanglement
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-07 DOI: 10.1016/j.engappai.2026.114071
Yurong Zhang, Jinrong He, Yuhang Li
In cross-domain few-shot hyperspectral image classification, the limited availability of labeled target-domain samples renders the model highly sensitive to domain shifts. Existing feature disentanglement approaches struggle to simultaneously suppress domain-specific noise and retain cross-domain discriminative semantics. This leads to the entanglement of domain-shared and domain-specific components, increases the risk of negative transfer, and ultimately becomes the bottleneck limiting further improvements in accuracy and generalization. To address these challenges, this paper presents an Orthogonal Feature Disentanglement Network (OFD-Net). Using orthogonal subspace decomposition, OFD-Net projects features into two mutually exclusive subspaces: domain-shared and domain-specific. The domain-shared subspace focuses on extracting cross-domain invariant features, while the domain-specific subspace retains local domain discriminative information. This dual-stream architecture effectively suppresses interference from irrelevant inter-domain features. Additionally, feature orthogonality constraints enhance the model's adaptability to target domain shifts. OFD-Net also adopts a multi-task learning framework. A cross-domain alignment loss ensures the consistency of shared feature distributions between the source and target domains, while an inter-class discriminative loss improves the class separability of specific features, creating a hierarchical feature optimization mechanism. On four public benchmark datasets (Indian Pines, Pavia University, Salinas, and Houston), OFD-Net achieves overall accuracy of 80.33%, 85.80%, 93.33%, and 80.05%, respectively. It outperforms existing state-of-the-art methods, demonstrating superior cross-domain transfer robustness and feature discriminative capability.
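A feature orthogonality constraint of the kind described in this abstract is often expressed as a penalty on the cross-correlation between the two feature sets. The sketch below is an assumption for illustration only: it uses the squared Frobenius norm of the product of the transposed shared-feature matrix and the specific-feature matrix, which is a common choice but is not confirmed to be the loss used in OFD-Net. The function name `orthogonality_penalty` is hypothetical.

```python
def orthogonality_penalty(shared, specific):
    """Squared Frobenius norm of shared^T @ specific.

    shared, specific: lists of rows, one row per sample.
    Returns 0.0 when every shared feature column is orthogonal to
    every specific feature column over the batch.
    """
    if len(shared) != len(specific):
        raise ValueError("both feature matrices need one row per sample")
    d_shared = len(shared[0])
    d_specific = len(specific[0])
    penalty = 0.0
    for i in range(d_shared):
        for j in range(d_specific):
            # Entry (i, j) of shared^T @ specific: dot product of two columns.
            dot = sum(s[i] * p[j] for s, p in zip(shared, specific))
            penalty += dot * dot
    return penalty


# Two samples, one feature per branch.
print(orthogonality_penalty([[1.0], [0.0]], [[0.0], [1.0]]))  # 0.0, orthogonal
print(orthogonality_penalty([[1.0], [1.0]], [[1.0], [1.0]]))  # 4.0, entangled
```

Minimizing such a term alongside the task losses pushes the two branches toward encoding non-overlapping information, which is the general idea behind separating domain-shared from domain-specific features.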
Citations: 0