
Engineering Applications of Artificial Intelligence: Latest Articles

An efficient physics-informed neural network model for predicting methane and carbon dioxide adsorption in shale: Simultaneous enhancement of recovery and carbon sequestration
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-24 DOI: 10.1016/j.engappai.2026.113911
Yu Zhou, Jinyang Li, Wenchuan Liu, Xinlong Lu, Zizuo Liu, Xiaoping Li, Dengwei Jing
Machine learning (ML) models offer rapid and low-cost prediction of methane (CH4) and carbon dioxide (CO2) adsorption in shale, which is crucial for enhancing recovery and achieving CO2 geological sequestration. However, because adsorption mechanism models are not yet fully established, existing purely data-driven ML lacks reliable physical constraints and exhibits weak interpretability, limited accuracy, and poor generalization. To address this gap, a novel fractal supercritical Dubinin-Radushkevich-Langmuir (FSDR-L) model was derived to describe the adsorption behaviors of CH4 and CO2 in shale and to directly quantify the critical pore size for the gas adsorption mechanism transition. The results indicate that increasing temperature shifts CH4/CO2 molecules in shale toward monolayer adsorption while reducing the contribution of pore-filling. Subsequently, a physics-informed neural network (PINN) model guided by the insights of the FSDR-L model was developed for the first time to predict CH4/CO2 adsorption amounts in shale. The findings reveal that the PINN model achieved reductions of 38.93% in mean absolute percentage error, 39.47% in mean absolute error, and 57.46% in root mean square error compared to the best-performing conventional ML model, demonstrating superior predictive performance and generalization capability in capturing complex shale-gas adsorption behaviors. Finally, to enhance the interpretability of the PINN model, a variance-based sensitivity analysis was conducted, revealing that total organic carbon, pressure, temperature, and pore volume are the key factors governing CH4/CO2 adsorption capacity in shale.
Volume 167, Article 113911
Citations: 0
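For the shale adsorption entry above, the general idea of guiding a network with a physical model can be sketched as a loss that combines a data-fit term with a penalty for departing from an isotherm. This is only an illustration under stated assumptions: the classical Langmuir isotherm stands in for the paper's FSDR-L model (whose form is not given in the abstract), and the function names, the `weight` parameter, and the sample values are hypothetical.

```python
def langmuir(pressure, n_max, k):
    """Classical Langmuir isotherm (a stand-in, not the paper's FSDR-L model)."""
    return n_max * k * pressure / (1.0 + k * pressure)


def physics_guided_loss(pred, measured, pressure, n_max, k, weight=0.1):
    """Mean squared data error plus a weighted penalty for leaving the isotherm.

    pred, measured, pressure are equal-length lists of floats.
    weight is a hypothetical trade-off hyperparameter.
    """
    n = len(pred)
    # Data term: how far predictions are from measurements.
    data_term = sum((p - m) ** 2 for p, m in zip(pred, measured)) / n
    # Physics term: how far predictions are from the isotherm at the same pressure.
    physics_term = sum(
        (p - langmuir(x, n_max, k)) ** 2 for p, x in zip(pred, pressure)
    ) / n
    return data_term + weight * physics_term


# Made-up numbers purely for illustration.
loss = physics_guided_loss(
    pred=[2.0], measured=[1.0], pressure=[1.0], n_max=2.0, k=1.0
)
```

In a real training loop the same two terms would be computed on network outputs and minimized by gradient descent; the pure-Python version here only shows how the terms are combined.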
Environmental air pollution forecasting using sensor networks, edge computing, and temporal fusion transformers
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-24 DOI: 10.1016/j.engappai.2026.113961
Chintalapudi V. Suresh, Muralidhar Nayak Bhukya, Yogendra Arya
Air pollution predictions are fundamental to effective monitoring and public health policy in urban areas. This work presents an air quality forecasting mechanism that uses distributed sensor networks, edge computing, and the Temporal Fusion Transformer (TFT) to forecast future Air Quality Index (AQI) categories. The proposed framework uses multivariate time series acquired through air quality sensors, coupled with dynamic pollutant and meteorological parameter measurements and static context information, to accommodate intricate temporal dependencies. Edge-assisted preprocessing is applied to reduce the communication load and increase system responsiveness, while centralized transformer-based modeling is proposed for scalable and interpretable forecasting. Extensive experimental results indicate that the proposed model exceeds recurrent and attention-based baselines in classification accuracy, precision, recall, and F1-score, and provides lower inference latency. Ablation and latency analyses additionally support the role of static features, temporal attention, and edge-level processing. These findings demonstrate that the proposed framework is a feasible, effective, interpretable, and deployable solution for real-time urban air quality forecasting.
Volume 167, Article 113961
Citations: 0
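The air quality entry above says edge-assisted preprocessing reduces communication load but does not say how. One common way to do that, shown here purely as an assumption, is to average raw sensor readings over fixed windows on the edge device and transmit only the averages. The function name and window size are hypothetical.

```python
def window_means(readings, window):
    """Average consecutive readings in fixed-size windows.

    The last window may be shorter than `window`; it is averaged over
    however many readings it actually contains.
    """
    means = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


# Five raw readings become three transmitted values with window=2.
summary = window_means([1, 2, 3, 4, 5], 2)
```

With a window of 2, the number of transmitted values is roughly halved, which is the kind of trade-off between resolution and bandwidth that edge-level aggregation involves.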
A multi-channel adaptive neural network for querying the optimal time-varying damage route with collective spatial keywords
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-24 DOI: 10.1016/j.engappai.2026.113906
Zhilei Xu, Wei Huang
In geographic information system services, optimal route planning with collective spatial keywords plays a crucial role in providing efficient and feasible travel solutions. Existing research on how the damage conditions of points of interest in road networks change over time remains incomplete. To address this issue, we propose an innovative multi-channel adaptive neural network model and algorithm for querying the optimal path with collective spatial keywords on a time-varying damage network. To handle the variations in edge lengths in time-varying networks, we have designed multi-channel gated neurons that incorporate node state judgment, departure time selection, and transmission time control. These neurons are integrated with a logic gate to manage the time-varying edge lengths. To ensure the accuracy of data exchange, we have introduced a multi-channel mechanism for data isolation. We have analyzed the time complexity and correctness of the proposed algorithms and conducted comparative experiments using a public road network dataset. The experimental results demonstrate the effectiveness and superiority of the proposed method in solving the optimal-path query problem with collective spatial keywords on time-varying damage networks, providing technical support for intelligent traffic planning projects.
Volume 167, Article 113906
Citations: 0
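To make the notion of time-varying edge lengths in the entry above concrete, here is a minimal sketch of a classical time-dependent Dijkstra search. It is not the authors' multi-channel neural network and it ignores the collective-keyword constraint; it only shows how an edge cost that depends on departure time changes the best route. The graph, node names, and travel-time functions are made up, and the search assumes the FIFO property (leaving later never means arriving earlier).

```python
import heapq


def earliest_arrival(graph, source, target, depart_time):
    """Earliest arrival time at target when leaving source at depart_time.

    graph maps a node to a list of (neighbor, travel_time_fn) pairs, where
    travel_time_fn(t) is the non-negative travel time when leaving at time t.
    Returns None if target is unreachable.
    """
    best = {source: depart_time}
    heap = [(depart_time, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == target:
            return t
        if t > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, travel_time in graph.get(node, []):
            arrival = t + travel_time(t)
            if arrival < best.get(neighbor, float("inf")):
                best[neighbor] = arrival
                heapq.heappush(heap, (arrival, neighbor))
    return None


# Hypothetical network: edge A->C becomes slow (e.g. damaged) from t = 5 on.
graph = {
    "A": [("B", lambda t: 2.0), ("C", lambda t: 1.0 if t < 5 else 10.0)],
    "B": [("D", lambda t: 2.0)],
    "C": [("D", lambda t: 1.0)],
}
early = earliest_arrival(graph, "A", "D", 0.0)  # route via C
late = earliest_arrival(graph, "A", "D", 6.0)   # route via B
```

The same origin and destination yield different best routes depending on when the trip starts, which is the core difficulty a time-varying damage network adds to route queries.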
A twin-branch decoupled network for multi-class unsupervised anomaly detection
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-23 DOI: 10.1016/j.engappai.2026.113891
Bohan Wang, Jihong Wan, Jie Zhao, Xiaocao Ouyang, Xiaoping Li
The use of powerful pre-trained Vision Transformer (ViT) encoders in Multi-Class Unsupervised Anomaly Detection (MUAD) can lead to an “identity mapping shortcut”, where the model’s strong generalization inadvertently reconstructs anomalies. To address this specific manifestation of the over-generalization problem, this paper proposes DAD-Net, an innovative hybrid framework combining ViT and Convolutional Neural Networks (CNNs) that imposes synergistic constraints from both the model architecture and the training objective. Architecturally, a novel asymmetric twin-branch CNN decoder is designed to achieve a multi-scale reconstruction of normal patterns. Its shallow branch is specialized for reconstructing high-frequency textures, while its deep branch models abstract semantics. At the objective level, a hard feature loss compels the model to focus on the most complex normal patterns, effectively inhibiting the formation of the “identity mapping shortcut”. Comprehensive experiments validate DAD-Net’s direct applicability to engineering tasks. For industrial defect detection, the framework achieves superior performance on standard benchmarks. Furthermore, the model shows excellent generalization on a challenging cross-domain medical dataset. This highlights its potential as a versatile tool for other critical domains, such as medical diagnostic support. Ablation studies confirm the effectiveness of our core designs, positioning DAD-Net as a robust and practical solution for real-world quality control systems.
Volume 167, Article 113891
Citations: 0
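The DAD-Net entry above mentions a hard feature loss that makes the model focus on the most complex normal patterns. The abstract does not define it, so the following is only one plausible reading: average the reconstruction error over the hardest fraction of feature positions instead of over all of them. The function name and `keep_ratio` parameter are hypothetical.

```python
def hard_feature_loss(features, reconstruction, keep_ratio=0.1):
    """Mean reconstruction error over the hardest positions only.

    features and reconstruction are equal-length lists of feature vectors
    (lists of floats). keep_ratio is the fraction of positions, ranked by
    error, that contribute to the loss; at least one position is kept.
    """
    # Per-position mean squared error between target and reconstruction.
    errors = [
        sum((f - r) ** 2 for f, r in zip(feat, rec)) / len(feat)
        for feat, rec in zip(features, reconstruction)
    ]
    k = max(1, round(keep_ratio * len(errors)))
    hardest = sorted(errors, reverse=True)[:k]
    return sum(hardest) / k


# Made-up one-dimensional features at four positions.
target = [[0.0], [0.0], [0.0], [0.0]]
recon = [[0.0], [1.0], [2.0], [3.0]]
loss_half = hard_feature_loss(target, recon, keep_ratio=0.5)
```

Because easy positions are excluded from the average, the gradient signal in a real training setup would come only from poorly reconstructed regions, which is one way to discourage a trivial copy-through solution.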
Enriched multi-view ensemble approach for high-dimensional imbalanced data classification
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-23 DOI: 10.1016/j.engappai.2026.113940
Yuhong Xu, Dongyi Ding, Peijie Huang, Zhiwen Yu, C.L. Philip Chen
High-dimensional imbalanced data classification is a challenging issue in real-world applications, where massive numbers of invalid features and class imbalance severely impede classifier performance. Because of the high-dimensional features, imbalanced-learning approaches struggle to yield adequate results. To tackle these issues, this paper proposes an enriched multi-view ensemble approach (EMEA), aiming to construct an accurate and resilient classifier ensemble system for high-dimensional class-skewed data. First, an enriched multi-view optimization (EMO) is designed to extract effective and diverse features from high-dimensional imbalanced data; it promotes classification ability through subview learning on multiple diverse scenarios. Then a prioritized integration of subviews (PIS) is developed to conduct selective integration of subviews, aiming to construct a high-quality view that enhances decision-making for high-dimensional imbalanced data classification. Finally, EMEA employs resampling to construct a balanced subset, mitigating the impact of class imbalance on the base classifier. Experiments on 16 high-dimensional class-skewed datasets demonstrate that EMEA is superior to other mainstream imbalanced ensemble approaches.
Volume 167, Article 113940
Citations: 0
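The final step in the EMEA entry above, resampling to construct a balanced subset, is not specified further in the abstract. A minimal sketch of one standard option, random undersampling of every class down to the size of the smallest one, is given below. The function name, the seed parameter, and the toy data are assumptions for illustration only.

```python
import random
from collections import defaultdict


def balanced_subset(samples, labels, seed=0):
    """Randomly undersample every class to the size of the smallest class.

    Returns a shuffled list of (sample, label) pairs.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    size = min(len(group) for group in by_class.values())
    subset = []
    for label, group in by_class.items():
        # Draw `size` distinct samples from this class without replacement.
        for sample in rng.sample(group, size):
            subset.append((sample, label))
    rng.shuffle(subset)
    return subset


# Toy data: eight majority samples (label 0) and two minority samples (label 1).
subset = balanced_subset(list(range(10)), [0] * 8 + [1] * 2)
```

In an ensemble setting, drawing a different balanced subset per base classifier (for example by varying `seed`) lets the ensemble see most of the majority class overall while each member trains on balanced data.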
Revolutionizing artificial intelligence enabled predictive analytics with smart consumer electronics for real-time healthcare monitoring
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-23 DOI: 10.1016/j.engappai.2025.113712
Ala Saleh Alluhaidan, Amal M. Aqlan, Mashael Maashi, Ahmed Alsayat, Mashail N. Alkhomsan, Faten Derouez, Rakan Alanazi, Tawfiq Hasanin
The healthcare field has undergone a significant shift in the last few years with the advent of data streaming technology. Data streaming refers to the continuous transfer and analysis of real-time data from multiple sources. In the healthcare environment, data streaming enables healthcare providers to monitor patients' health, predict health issues, and provide personalized care. Real-time observation of patient well-being and predictive analytics for disease analysis and prevention have become increasingly significant in healthcare, as they permit healthcare providers to detect probable health problems before they arise or become severe. Consumer electronics health technology has transformed health monitoring by permitting constant tracking of vital signs, physical activity, and other health parameters. Incorporating artificial intelligence (AI) and deep learning (DL) into consumer electronic devices promises to improve personalized healthcare by aiding real-time data analysis and early recognition of health problems. In this manuscript, a Personal Health Monitoring with Predictive Analytics and Consumer Electronics using Dimensionality Reduction and Ensemble Classifiers (PHMPACE-DREC) model is presented. The intention is to propose a consumer electronics method for real-time health monitoring and predictive analytics using advanced models to enable proactive and personalized healthcare solutions. To accomplish that, the PHMPACE-DREC model first applies a data pre-processing stage that uses min-max normalization to convert the input data into a suitable format. Next, a feature selection step is applied; this is a critical stage because it decreases the data dimensionality and enhances efficiency, using three methods: Fast Correlation-Based Filter (FCBF), Recursive Feature Elimination (RFE), and Least Absolute Shrinkage and Selection Operator (LASSO). Finally, classification is performed by an ensemble of three classifiers: Elman Neural Network (ENN), Deep Q-Network (DQN), and Conditional Variational Autoencoder (CVAE). In the experimental analysis, the PHMPACE-DREC approach achieved an accuracy of 99.11% on the Wearables dataset, outperforming existing methods.
Volume 167, Article 113712
Citations: 0
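The min-max normalization named as the pre-processing stage in the entry above rescales each feature to the range [0, 1] via (x - min) / (max - min). A minimal sketch for a single feature column follows; the function name, the handling of constant columns, and the sample heart-rate values are assumptions, not details from the paper.

```python
def min_max_normalize(column):
    """Rescale a list of numbers linearly so the minimum maps to 0 and the maximum to 1.

    A constant column has no spread, so it is mapped to all zeros here
    (a convention chosen for this sketch).
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Made-up heart-rate readings in beats per minute.
scaled = min_max_normalize([60, 80, 100])
```

In practice the minimum and maximum would be computed on the training split only and reused for new data, so that test-time readings are scaled consistently.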
Fourier-enhanced sequence-to-sequence latent graph neural networks for multi-node spatiotemporal forecasting in a hydroelectric reservoir
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-23 DOI: 10.1016/j.engappai.2026.113939
Laio Oriel Seman, Stefano Frizzo Stefenon, Kin-Choong Yow, Leandro dos Santos Coelho, Viviana Cocco Mariani
This paper presents a Fourier-enhanced dynamic sequence-to-sequence latent graph neural network (Seq2SeqLatentGNN), a deep learning architecture for multi-node spatiotemporal forecasting in hydroelectric reservoir systems. The model integrates three key components: (i) a custom Fourier layer that analyzes global temporal patterns through frequency-domain transformations, (ii) a latent correlation graph convolutional network that infers relational structures between monitoring stations without requiring predefined adjacency matrices, and (iii) an attention-based sequence-to-sequence model that processes temporal dependencies while enabling multi-step forecasting. The architecture simultaneously learns graph structure and forecasting tasks, adapting to changing spatial relationships between reservoir nodes. The proposed architecture was evaluated using a comprehensive dataset derived from 19 interconnected hydroelectric reservoirs located in southern Brazil. The dataset encompasses multiple years of high-resolution (hourly) measurements, including reservoir water levels, inflow and outflow rates, precipitation records, and energy production metrics. Experimental results demonstrate that Seq2SeqLatentGNN achieves superior performance compared to conventional statistical models and contemporary machine learning methods, as measured by standard error metrics. Analysis of the learned latent correlations reveals meaningful spatial dependencies that align with hydrological principles. The model exhibits consistent performance across varying temporal patterns, adapts to regime transitions, and captures both periodic and nonstationary dynamics. The proposed architecture contributes to spatiotemporal forecasting by combining spectral processing, dynamic graph learning, and sequence modeling in a unified framework applicable to systems with evolving connectivity patterns.
Citations: 0
Micro ribonucleic acids-drug sensitivity prediction by variational graph auto-encoder and collaborative matrix factorization
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date : 2026-01-23 DOI: 10.1016/j.engappai.2026.113901
Yunyin Li, Shudong Wang, Yuanyuan Zhang, Chuanru Ren, Shanchen Pang, Tiyao Liu, Yingye Liu
The mechanisms of action of numerous drugs involve micro ribonucleic acids (miRNAs), which makes miRNA-mediated drug sensitivity an important subject in drug discovery and disease treatment. Despite advances in computational approaches, effectively extracting drug and miRNA features and accurately predicting their associations remain challenging. Existing drug and miRNA similarity networks urgently need to be supplemented with more comprehensive similarity information. In addition, most computational methods extract only single-level features without combining information from different levels, which limits model performance. To overcome these challenges, we combine a Variational graph auto-encoder and Collaborative matrix factorization to identify MiRNA-Drug Sensitivity (VCMDS). VCMDS computes Gaussian Interaction Profile (GIP) kernel similarities for drugs and for miRNAs and adds them to the respective similarity networks. Because it aggregates multiple sources of information, the GIP kernel similarity takes a wider network of interactions into account and measures similarity more accurately. VCMDS then extracts features of miRNAs and drugs at various levels by applying the variational graph auto-encoder and collaborative matrix factorization. Combining linear and nonlinear features yields high-quality features and thus improves prediction performance. Finally, predicted scores are obtained using a fully connected network. VCMDS achieves an average Area Under Curve (AUC) of 0.9632 in 5-fold Cross-Validation (CV) experiments, outperforming other competitive methods. Two types of case studies further demonstrate the effectiveness of VCMDS.
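The GIP kernel similarity mentioned in the abstract can be sketched as follows. The abstract does not give the bandwidth, so this example assumes the commonly used normalization in which the bandwidth is divided by the mean squared norm of the interaction profiles; the function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix are illustrative assumptions.

```python
import math


def gip_kernel(interactions, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel between the rows of a binary matrix.

    interactions: list of rows, one interaction profile per entity
                  (for example one row per drug, one column per miRNA).
    gamma_prime:  assumed bandwidth hyperparameter (1.0 is a common default).
    """
    n = len(interactions)
    mean_sq_norm = sum(sum(v * v for v in row) for row in interactions) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one known interaction is required")
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sq_dist(a, b)) for b in interactions]
            for a in interactions]


# Toy drug-by-miRNA matrix: drugs 0 and 1 share the same interaction profile.
toy = [[1, 0, 1],
       [1, 0, 1],
       [0, 1, 0]]
similarity = gip_kernel(toy)
```

Applying the same function to the transposed matrix would give the similarity between the column entities, so one interaction matrix yields a similarity network on each side.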
Citations: 0
Research on cable terminal interface defect state detection based on electric field characteristics and multi-core improved support vector machine
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date : 2026-01-23 DOI: 10.1016/j.engappai.2026.113739
Yujing Tang, Yang Fu, Qin Cai, Jieping Wu, Qi Wang, Guoqiang Gao
Cable terminals are key equipment for high-speed rail power transmission and for connecting high-voltage systems, and they are crucial to the stable operation of the railway system. However, existing detection methods for cable terminals are easily affected by on-site noise and have low detection accuracy. This paper therefore proposes a method for detecting the interface defect state of high-speed cable terminals based on an electric field strength feature set and a multi-kernel support vector machine (MK-SVM). First, a spatial electric field detection platform was built to extract the electric field intensity of prefabricated defective cable terminals of different lengths. Second, the electric field strength feature parameters of the defective cable terminals were optimized using the Pearson coefficient method. To improve recognition performance and model generalization, an MK-SVM combining a linear kernel function and a radial basis kernel function was proposed. Finally, the optimization effects of the particle swarm algorithm, firefly algorithm, simulated annealing algorithm, and genetic algorithm on the MK-SVM were compared. The results show that optimizing the MK-SVM parameters with the genetic algorithm performs best, with recognition accuracy, average precision, average recall, and average F1 score of 95.6 %, 96 %, 95.6 %, and 0.96, respectively. Compared with the unoptimized SVM, these four metrics increased by 8.9 %, 7.9 %, 8.9 %, and 9.6 %, respectively.
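One common way to combine a linear kernel with a radial basis kernel is a convex combination, sketched below. The abstract does not state how the two kernels are combined, so the weighting form, the names `multi_kernel` and `gram_matrix`, the parameters `weight` and `gamma`, and the sample vectors are all assumptions for illustration.

```python
import math


def linear_kernel(x, y):
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def multi_kernel(x, y, weight=0.5, gamma=1.0):
    """Convex combination of the linear and RBF kernels (assumed form)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * linear_kernel(x, y) + (1.0 - weight) * rbf_kernel(x, y, gamma)


def gram_matrix(samples, weight=0.5, gamma=1.0):
    """Pairwise kernel values for a list of samples."""
    return [[multi_kernel(a, b, weight, gamma) for b in samples] for a in samples]


# Hypothetical two-feature electric field samples.
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gram = gram_matrix(samples)
```

A Gram matrix of this kind can be handed to an SVM solver that accepts precomputed kernels. Quantities such as `weight` and `gamma` are the sort of parameters a search method like a genetic algorithm could tune, although the abstract does not list which parameters were optimized.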
Citations: 0
Balance divergence for knowledge distillation
IF 8 Zone 2 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date : 2026-01-23 DOI: 10.1016/j.engappai.2026.113943
Yafei Qi, Chen Wang, Zhaoning Zhang, Yaping Liu, Yongmin Zhang
Knowledge distillation (KD) represents a fundamental artificial intelligence (AI) technique for model compression and optimization. In computer vision AI applications, most KD methods use Kullback–Leibler (KL) divergence to align teacher–student output probabilities, but often neglect crucial negative aspects of teacher “dark knowledge” by underweighting low-probability signals. This limitation leads to suboptimal logit mimicry and unbalanced knowledge transfer to the student network. In this paper, we investigate the impact of this imbalance and propose a novel method, named Balance Divergence Distillation (BDD). By introducing a compensatory operation using reverse KL divergence, our method improves the modeling of the teacher's extremely small values on the negative side while preserving the learning capacity for the positive side. Furthermore, we test the impact of different temperature coefficient adjustments, which can further balance knowledge transfer. The evaluation results demonstrate that our method achieves accuracy improvements of 1% to 3% for lightweight student networks over standard KD methods on both the Canadian Institute for Advanced Research 100 classes (CIFAR-100) and ImageNet datasets. Additionally, when applied to semantic segmentation, our approach enhances the student by 4.55% in mean Intersection over Union (mIoU) compared to the baseline on the Cityscapes dataset. These experiments confirm that our method provides a simple yet highly effective solution that can be seamlessly integrated with various KD frameworks across different vision tasks.
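The idea of pairing the usual forward KL term with a reverse KL term can be sketched as below. The abstract does not give the exact BDD loss, so the weighting `alpha`, the separate temperatures `t_fwd` and `t_rev`, and the function names are assumptions; this is an illustration of forward plus reverse KL, not the authors' formulation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def balanced_divergence(teacher_logits, student_logits,
                        t_fwd=4.0, t_rev=1.0, alpha=0.5):
    """Weighted sum of forward KL(teacher || student) and reverse KL(student || teacher)."""
    forward = kl_divergence(softmax(teacher_logits, t_fwd),
                            softmax(student_logits, t_fwd))
    reverse = kl_divergence(softmax(student_logits, t_rev),
                            softmax(teacher_logits, t_rev))
    return alpha * forward + (1.0 - alpha) * reverse


# Hypothetical three-class logits for one sample.
teacher = [5.0, 1.0, 0.0]
student = [1.0, 1.0, 1.0]
loss = balanced_divergence(teacher, student)
```

Forward KL weights each class by the teacher probability, so classes the teacher considers unlikely contribute little; reverse KL weights by the student probability instead, which penalizes a student that places mass where the teacher's probability is very small.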
Citations: 0