
Engineering Applications of Artificial Intelligence: Latest Articles

Adaptive critical speed prediction for straddle-type monorail operational safety: A meta-learning framework with few-shot deployment
IF 8; Zone 2 (Computer Science); Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.engappai.2026.114063
Junchao Zhou, Ao Chen, Shangwu Huang, Jianjie Gao, Haiping Du
Accelerating global urbanization has intensified the demand for efficient and sustainable transportation in high-density areas, where traditional ground-based transit systems face congestion and pollution in spatially constrained regions. Against this backdrop, the Straddle-type Monorail System (SMS), distinguished by its lightweight structure, lower infrastructure costs, and efficient use of elevated space, has emerged as a critical option for optimizing urban commuting networks. However, a fundamental challenge for Straddle-type Monorail Vehicle (SMV) operational safety is lateral shimmy vibration instability. Conventional dynamic modelling approaches struggle to predict shimmy bifurcation boundaries because they are computationally inefficient and generalize poorly across parameters. To address these limitations, this research proposes a meta-learning framework named MAML-CNN-LSTM-Attention (M-CLA) for few-shot critical speed prediction, which integrates Model-Agnostic Meta-Learning (MAML), a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and an attention mechanism. Trained on a 7-DOF vehicle-track coupling model, the M-CLA framework processes lateral displacement and velocity time-series data and achieves 99.67% prediction accuracy for the critical speed under few-shot conditions. The framework rapidly adapts to new operational scenarios with minimal data, outperforming traditional deep learning methods in both prediction accuracy and cross-condition generalization. It provides infrastructure managers with an Artificial Intelligence (AI)-driven tool for dynamic optimization and safety evaluation of SMS, contributing to derailment prevention, maintenance cost reduction, and enhanced operational safety across diverse urban rail transit environments.
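The abstract does not describe M-CLA's internals, so the following is only a toy sketch of the MAML idea it builds on: an inner gradient step adapts to each task from a shared initialisation, and an outer step moves that initialisation so that one-step adaptation works well on average. The scalar model `y = w * x`, the task slopes, and all function names and hyperparameters are hypothetical stand-ins, not the authors' network or data.

```python
import random


def task_loss_grad(w, slope, xs):
    """MSE of y = w*x against the task target y = slope*x, with its gradient in w."""
    m = sum(x * x for x in xs) / len(xs)  # mean of x^2 over the samples
    loss = (w - slope) ** 2 * m
    grad = 2.0 * (w - slope) * m
    return loss, grad, m


def adapt(w, slope, xs, inner_lr=0.1):
    """One inner-loop gradient step on a task's few-shot support set."""
    _, grad, _ = task_loss_grad(w, slope, xs)
    return w - inner_lr * grad


def maml_train(task_slopes, inner_lr=0.1, outer_lr=0.05, steps=500, k_shot=5, seed=0):
    """Second-order MAML on scalar tasks; returns the meta-learned initial weight."""
    rng = random.Random(seed)
    w = 0.0  # meta-initialisation shared by all tasks
    for _ in range(steps):
        meta_grad = 0.0
        for slope in task_slopes:
            support = [rng.uniform(-1.0, 1.0) for _ in range(k_shot)]
            query = [rng.uniform(-1.0, 1.0) for _ in range(k_shot)]
            _, g_support, m_support = task_loss_grad(w, slope, support)
            w_adapted = w - inner_lr * g_support  # inner step
            _, g_query, _ = task_loss_grad(w_adapted, slope, query)
            # Chain rule through the inner step: d(w_adapted)/dw = 1 - 2*inner_lr*m_support
            meta_grad += g_query * (1.0 - 2.0 * inner_lr * m_support)
        w -= outer_lr * meta_grad / len(task_slopes)  # outer (meta) step
    return w
```

Calling `maml_train([1.0, 2.0, 3.0])` and then `adapt(...)` on a handful of samples from an unseen slope mirrors the few-shot deployment setting in miniature: the meta-learned weight is a starting point that a single gradient step can specialise.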
Volume 169, Article 114063
Citations: 0
Smart indoor occupancy detection based on optimized camera placement, multi-view de-duplication, and large language model semantic understanding
IF 8; Zone 2 (Computer Science); Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.engappai.2026.114157
Deli Liu, Xiaoping Zhou, Dongxiao Chen, Yu Li
Accurate occupancy detection in indoor environments is essential for optimizing energy use, enhancing occupant comfort, and ensuring safety in smart buildings. This study aims to design and validate an end-to-end framework that not only counts occupants reliably but also generates rich semantic descriptions of their behaviors and spatial interactions. We propose a four-stage methodology: (1) multi-objective optimization of camera placement through field-of-view analysis and grid modeling to maximize coverage and minimize blind spots; (2) on-device human detection using a fine-tuned You Only Look Once version 8 (YOLOv8) model; (3) cross-camera identity tracking using Simple Online and Realtime Tracking with a Deep Association Metric (DeepSORT) to assign unique global identifiers and eliminate duplicate counts; and (4) a multimodal large language model (LLM) that consumes annotated, identity-aware multi-view images to produce coherent natural-language summaries and structured outputs detailing occupant numbers, actions, and locations. Extensive evaluations conducted on a diverse multi-view dataset—including challenging scenarios of heavy occlusion and clothing changes—demonstrate the robustness and real-time applicability of the proposed framework. The key contribution of this work is the first demonstration of integrating identity-aware, multi-camera de-duplication with large language model–driven scene interpretation, enabling automated, actionable insights that extend beyond simple occupancy counts. This novel combination advances intelligent building management by providing both precise occupancy analytics and contextual understanding to support adaptive control and energy-efficient operation.
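Stage (1) of the methodology, grid-based camera placement, can be illustrated with a small sketch. The abstract describes a multi-objective optimization whose details are not given here; the code below substitutes a plain greedy max-coverage heuristic with a circular, obstacle-free field of view. The function names, the grid model, and the radius parameter are all assumptions made for illustration.

```python
import math


def visible_cells(cam, grid_w, grid_h, radius):
    """Grid cells within `radius` of a camera at `cam` (no walls or occlusion modelled)."""
    cx, cy = cam
    return {
        (x, y)
        for x in range(grid_w)
        for y in range(grid_h)
        if math.hypot(x - cx, y - cy) <= radius
    }


def greedy_placement(candidates, grid_w, grid_h, radius, n_cameras):
    """Pick up to n_cameras candidate positions, each adding the most uncovered cells."""
    covered, chosen = set(), []
    for _ in range(n_cameras):
        best = max(
            candidates,
            key=lambda c: len(visible_cells(c, grid_w, grid_h, radius) - covered),
        )
        gain = visible_cells(best, grid_w, grid_h, radius) - covered
        if not gain:  # no candidate reduces the blind spots any further
            break
        chosen.append(best)
        covered |= gain
    coverage = len(covered) / (grid_w * grid_h)
    return chosen, coverage
```

A real layout would replace `visible_cells` with a field-of-view model that accounts for camera orientation and walls, and would weigh coverage against other objectives such as camera count.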
Volume 169, Article 114157
Citations: 0
Research on a residual learning based neural-kernel framework with applications in short-term load forecasting
IF 8; Zone 2 (Computer Science); Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.engappai.2026.113989
Wangyi Xu, Yushu Xiang, Xin Ma, Wangpeng Li
Short-term load forecasting is essential for power system operation, yet it remains challenging due to the non-stationary nature of load data and the difficulty of capturing complex nonlinear relationships. To address this issue, a residual learning-based neural kernel framework is proposed for short-term load forecasting. The framework integrates a Fourier kernel-based neural kernel module into a deep residual network as a residual function. The Fourier kernel enables automatic identification and separation of periodic components and long-term trends in load data, while the non-parametric property of the kernel model helps reduce model complexity. Meanwhile, the shortcut connections in the residual network effectively alleviate the vanishing gradient problem, ensuring stable and efficient model training. To further improve model performance, the Artificial Bee Colony (ABC) algorithm is employed for hyperparameter optimization, allowing efficient approximation of the global optimum. In addition, a novel Theil UII-S loss function is introduced to enhance the model’s sensitivity to abnormal load fluctuations through adaptive gradient regulation. Experimental results on four real-world power datasets demonstrate that the proposed model outperforms 23 benchmark methods in terms of prediction accuracy. Ablation studies further verify the individual contributions of the Fourier kernel, the loss function, and the ABC algorithm, providing useful insights for future research.
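Two ideas in this abstract lend themselves to a short illustration: a kernel that treats points one period apart as similar, and a shortcut connection that adds a block's input to its output. The abstract does not give the form of the paper's Fourier kernel, so the sketch below uses the standard periodic (exp-sine-squared) kernel as a stand-in; the 24-hour period, the length scale, and the function names are assumptions.

```python
import math


def periodic_kernel(t1, t2, period=24.0, length_scale=1.0):
    """Standard periodic kernel: similarity depends only on the phase difference."""
    s = math.sin(math.pi * abs(t1 - t2) / period)
    return math.exp(-2.0 * s * s / length_scale ** 2)


def residual_block(x, residual_fn):
    """Shortcut connection: output = input + residual_fn(input), element-wise."""
    return [xi + ri for xi, ri in zip(x, residual_fn(x))]
```

With an hourly load series, `periodic_kernel(0, 24)` evaluates to 1 (same hour on consecutive days), while `periodic_kernel(0, 12)` is much smaller. In `residual_block`, a residual function that outputs zeros leaves the input unchanged, which is the property that keeps gradients flowing through deep stacks.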
Volume 169, Article 113989
Citations: 0
Dual-path adaptive feature elevation system for detecting small targets in remote sensing imagery
IF 8; Zone 2 (Computer Science); Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.engappai.2026.114132
Liangjun Xu, Hui Ma
Detecting small targets in remote sensing imagery has long been a challenge because of weak target features and complex backgrounds. Existing methods primarily focus on improving detection efficiency, often at the cost of suboptimal accuracy for small targets. This study proposes the dual-path adaptive feature elevation system (DAES-Net) for detecting small targets in remote sensing imagery, which significantly improves small-target detection accuracy while maintaining reasonable detection efficiency. DAES-Net first integrates a proprietary dual-path self-calibration module (DSM). This module optimizes feature fusion through global modeling and local denoising, enhancing global feature correlation while reducing redundancy to provide more precise fused features for the detection system. Second, a dynamic normalized Wasserstein distance (D-NWD) loss function is designed to localize minute targets more precisely. By dynamically adjusting the regression weights of the constraint terms in the normalized Wasserstein distance (NWD) loss function, D-NWD implements an optimal localization strategy for small targets, thereby improving the model's localization efficiency for them. Finally, a one-time aggregated feature reuse reparameterized convolution (FRRO) is proposed. This feature reuse structure prevents information loss for small targets while accelerating model inference. Experimental results demonstrate that DAES-Net achieves the highest mean average precision (mAP) across four public small object detection datasets, outperforming existing state-of-the-art methods. This highlights the significant contribution of this research to the field of small object detection.
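For readers unfamiliar with the NWD loss that D-NWD extends, a minimal sketch of the base metric follows. It models each box `(cx, cy, w, h)` as a 2-D Gaussian and maps the 2-Wasserstein distance between the two Gaussians to a similarity in (0, 1]. The dynamic weighting that defines D-NWD is not specified in the abstract and is not reproduced here; the normalising constant `c` is dataset dependent, and the default below is only a placeholder.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two boxes given as (cx, cy, w, h).

    Each box is modelled as a Gaussian N((cx, cy), diag(w^2/4, h^2/4)), so the squared
    2-Wasserstein distance reduces to a squared Euclidean distance between
    (cx, cy, w/2, h/2) vectors.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2_squared = (
        (ax - bx) ** 2
        + (ay - by) ** 2
        + ((aw - bw) / 2.0) ** 2
        + ((ah - bh) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Regression loss: 0 for identical boxes, approaching 1 as boxes diverge."""
    return 1.0 - nwd(pred_box, target_box, c)
```

Unlike IoU, which drops to zero as soon as two tiny boxes stop overlapping, this similarity decays smoothly with distance, so the loss still provides a gradient for non-overlapping small targets.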
Volume 169, Article 114132
Citations: 0
Multi-granularity alignment and cross-modal reasoning for fake news video explanation
IF 8; Zone 2 (Computer Science); Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.engappai.2026.114153
Chao Cheng, Weiwei Jiang
Fake news video explanation generation aims to provide accurate and insightful explanations through in-depth analysis of news video content. However, existing methods typically align video context with overall descriptions and generate explanations via multi-modal fusion, often neglecting the rich details of key semantic elements such as nouns and verbs. To address this limitation, this paper proposes a unified Artificial Intelligence (AI) framework named Multi-Granularity Alignment and Reasoning (MGAR). MGAR not only aligns the semantics of overall descriptions but also examines semantic elements in language, particularly nouns and verbs, and aligns them with frame-level and motion-level features of fake news videos for multi-granularity reasoning. Additionally, we design a unified residual-structured multi-granularity language module that employs a context exchange mechanism to adapt semantic understanding to different granularities (e.g., word level and sentence level). Extensive experiments on the FakeVE dataset demonstrate the superiority of MGAR, with improvements of +10.1% in BLEU-1 and +11.1% in ROUGE-L over state-of-the-art baselines, showcasing the potential of AI applications in combating false information.
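The two reported metrics can be made concrete with a short sketch. The abstract does not state the evaluation settings (tokenisation, number of references, the ROUGE-L weighting), so the code below uses the textbook single-reference forms with whitespace tokenisation and a balanced F-measure; these choices and the function names are assumptions.

```python
import math
from collections import Counter


def bleu1(candidate, reference):
    """BLEU-1: clipped unigram precision multiplied by the brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    clipped = sum(min(n, ref_counts[tok]) for tok, n in Counter(cand).items())
    precision = clipped / len(cand)
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1.0 - len(ref) / len(cand))
    return brevity * precision


def rouge_l_f1(candidate, reference):
    """ROUGE-L as the F1 of precision and recall over the longest common subsequence."""
    cand, ref = candidate.split(), reference.split()
    prev = [0] * (len(ref) + 1)  # LCS lengths for the previous candidate token
    for c_tok in cand:
        cur = [0]
        for j, r_tok in enumerate(ref, 1):
            cur.append(prev[j - 1] + 1 if c_tok == r_tok else max(prev[j], cur[j - 1]))
        prev = cur
    lcs = prev[-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2.0 * precision * recall / (precision + recall)
```

BLEU-1 rewards word choice regardless of order, while ROUGE-L rewards words that appear in the same order as the reference, which is why explanation-generation papers commonly report both.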
Volume 169, Article 114153
Citations: 0
Enhanced nitrogen-oxides prediction in biomass combustion via a dual-channel neural network with flame imaging and residual attention
IF 8; Zone 2 (Computer Science); Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.engappai.2026.114090
Runfang Hao, Mingyu Wang, Shengjun Chang, Li Qin, Yongqiang Cheng, Gang Lu
Reliable and accurate monitoring of nitrogen oxides (NOx) in flue gas is crucial for emission control in electrical power generation plants due to environmental concerns. Traditional data-driven methods for NOx prediction are often based on single-channel strategies and use multiple variables from the combustion process, which show insufficient feature correlation, making reliable and accurate NOx prediction very difficult. To tackle these limitations, this study proposes a Dual-Channel Deep Neural Network incorporating a Residual Attention Mechanism (DCDNN-RAM), which integrates flame visual features with a residual attention mechanism. An Enhanced Laplacian of Gaussian (ELG) filtering algorithm is employed to optimize the preprocessing of flame images and significantly reduce feature discrimination. An innovative heterogeneous dual-channel parallel architecture is developed, in which the primary channel uses small convolutional kernels to extract local detail features while the secondary channel employs large kernels to capture global contextual information, coupled with a spatial-frequency collaborative feature extraction module for effective fusion of deep local and shallow global features. Notably, the incorporated dual residual attention mechanism (RAM) effectively enhances key feature representation via channel-spatial adaptive weight allocation. Experimental validation under oxy-biomass combustion conditions demonstrates that the proposed model achieves a coefficient of determination (R²) of 0.946 with a mean absolute error (MAE) of 4.26, outperforming four benchmark single-channel models with MAE reductions of 63.71%, 51.2%, 38.44%, and 24.33%, respectively. This study provides a promising solution for the reliable and accurate prediction of NOx emissions, and thus offers important practical value for promoting cleaner production and supporting the carbon neutrality goal of the power generation industry.
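The reported figures combine two standard regression metrics with a relative error reduction. The sketch below shows how each is computed, assuming (the abstract does not say) that each reduction is expressed relative to the corresponding benchmark's MAE. Under that assumption, a benchmark MAE can be back-calculated from the proposed model's MAE of 4.26; the function names are hypothetical and the MAE unit is not stated in the abstract.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae_reduction_pct(baseline_mae, model_mae):
    """Relative MAE reduction of the model versus a baseline, in percent."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


def implied_baseline_mae(model_mae, reduction_pct):
    """Baseline MAE implied by a model MAE and a relative reduction (assumed definition)."""
    return model_mae / (1.0 - reduction_pct / 100.0)
```

For instance, `implied_baseline_mae(4.26, 63.71)` gives roughly 11.7, meaning the weakest benchmark would have had an MAE nearly three times that of the proposed model if the reductions are defined this way.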
Volume 169, Article 114090
Citations: 0
Seeing the unseen: Semantic segmentation and uncertainty quantification for delamination detection in building facades
IF 8; Zone 2 (Computer Science); Q1 AUTOMATION & CONTROL SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.engappai.2026.114129
Yuebo Meng, Guotong Yin, Songtao Ye, Qiaoqiao Wang, Guanghui Liu, Xiaohan Li, Xiaojiao Geng
Accurate detection of delamination in building facades is critical for prolonging service life and ensuring structural safety. Current inspection methodologies rely heavily on manual interpretation and lack efficiency and robustness. While infrared thermography provides a non-destructive means of detecting subsurface delamination, its accuracy is often compromised by low thermal contrast under uncontrolled conditions and by the absence of uncertainty quantification in deep learning models. To address these limitations, this paper proposes TIHSNet, a novel delamination detection framework based on semantic segmentation and uncertainty quantification. Specifically, a physics-informed thermal gradient attention module is introduced to emphasize thermodynamically meaningful gradients and enable accurate delineation of delamination boundaries. Subsequently, a dual-output mechanism is proposed to generate prediction and uncertainty maps simultaneously, enabling quantitative assessment of predictive reliability and identification of regions requiring expert review. To further enhance spatial localization, visible-light images are integrated to capture tile boundary information and support spatial classification of delamination. Experiments were conducted on a self-constructed dataset comprising 2102 infrared thermography and visible-light images collected from reinforced concrete and brick masonry walls. The results demonstrate that TIHSNet achieves a precision of 96.1%, surpassing traditional thresholding methods with a 27.9% gain, and further outperforming existing deep learning approaches by 10.5%. The uncertainty quantification results further validate the model’s robustness and its ability to support reliable decision-making in real-world inspection scenarios.
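The idea of pairing a prediction map with an uncertainty map can be sketched in a few lines. How TIHSNet's dual-output head estimates uncertainty is not described in the abstract; the code below uses the entropy of the per-pixel delamination probability as one common proxy. The function names, the 0.5 decision threshold, and the review threshold are assumptions.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli probability, clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def prediction_and_uncertainty(prob_map, review_threshold=0.5):
    """Return a binary mask, a normalised uncertainty map, and pixels flagged for review.

    prob_map holds per-pixel delamination probabilities as a list of rows; uncertainty
    is entropy divided by ln(2), so it lies in [0, 1] and peaks at p = 0.5.
    """
    pred = [[1 if p >= 0.5 else 0 for p in row] for row in prob_map]
    unc = [[binary_entropy(p) / math.log(2.0) for p in row] for row in prob_map]
    review = [
        (r, c)
        for r, row in enumerate(unc)
        for c, u in enumerate(row)
        if u >= review_threshold
    ]
    return pred, unc, review
```

Pixels whose probability sits near 0.5 receive uncertainty close to 1 and are routed to an inspector, while confidently classified pixels are accepted automatically.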
Citations: 0
Intelligent control framework for Unmanned Aerial Vehicle autonomous docking based on Linear Active Disturbance Rejection Control and improved Particle Swarm Optimization
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-11 DOI: 10.1016/j.engappai.2026.114070
Mingzhi Shao, Xin Liu, Wenchao Cui, Chengmeng Sun, Haiwen Yuan
Autonomous aerial docking of Unmanned Aerial Vehicles (UAVs) is essential for aerial refueling, payload replacement, and cooperative operations, yet existing methods often suffer from low docking accuracy, weak disturbance rejection, and reliance on empirical parameter tuning. To overcome these limitations, this study proposes an intelligent control framework that integrates Linear Active Disturbance Rejection Control (LADRC) with an Improved Particle Swarm Optimization (IPSO) algorithm. First, a six-degree-of-freedom dynamic model of the UAV and cone sleeve system is developed, incorporating wind disturbance, turbulence, and parameter perturbations. Second, the LADRC method realizes decoupled control of the altitude, lateral, and velocity channels, ensuring robust dynamic compensation. Third, the IPSO algorithm, an Artificial Intelligence (AI) based optimization approach, is employed to adaptively tune the controller bandwidth and observer gains. This AI-enhanced parameter learning process improves the generalization capability of LADRC under varying flight conditions. Simulation and scaled flight experiments demonstrate that the proposed AI-driven LADRC achieves stable docking under 50% perturbations, with a trajectory root mean square error of 0.04 m and a relative velocity error of 0.03 m/s. Compared with conventional controllers, the tracking error is reduced by up to 38%. These results confirm that combining LADRC with AI-based optimization offers a robust and precise solution for UAV autonomous aerial docking in complex and uncertain environments.
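To make "controller bandwidth and observer gains" concrete, here is a minimal sketch of the widely used bandwidth parameterization for a second-order LADRC loop, in which all observer poles are placed at the observer bandwidth and all controller poles at the controller bandwidth. Whether the paper uses exactly this parameterization is an assumption; `ladrc_gains`, `LadrcGains`, and the example bandwidth values are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LadrcGains:
    kp: float     # proportional gain of the PD state-feedback law
    kd: float     # derivative gain of the PD state-feedback law
    beta1: float  # extended state observer gain on the output estimate
    beta2: float  # extended state observer gain on the rate estimate
    beta3: float  # extended state observer gain on the total-disturbance estimate


def ladrc_gains(omega_c: float, omega_o: float) -> LadrcGains:
    """Map controller bandwidth omega_c and observer bandwidth omega_o to gains.

    Controller polynomial: (s + omega_c)^2 = s^2 + 2*omega_c*s + omega_c^2
    Observer polynomial:   (s + omega_o)^3 = s^3 + 3*omega_o*s^2
                                             + 3*omega_o^2*s + omega_o^3
    """
    if omega_c <= 0 or omega_o <= 0:
        raise ValueError("bandwidths must be positive")
    return LadrcGains(
        kp=omega_c ** 2,
        kd=2 * omega_c,
        beta1=3 * omega_o,
        beta2=3 * omega_o ** 2,
        beta3=omega_o ** 3,
    )


# Illustrative values only; an optimizer such as a particle swarm would search
# over (omega_c, omega_o) for each channel instead of hand-picking them.
gains = ladrc_gains(omega_c=4.0, omega_o=20.0)
```

The appeal of this parameterization is that each channel has only two numbers to tune, which keeps the search space small for a swarm-based optimizer.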
Citations: 0
Few-shot transfer learning for laser welding prediction
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-11 DOI: 10.1016/j.engappai.2026.113983
Luchen Wu, Shijie Wu, Hongxin Hu, Hao Sun, Shuang Ma, Zhenya Wang
Laser wire filling welding is a key joining technique in the manufacturing of aluminum battery packs for new energy vehicles, yet predictive modeling of melt pool geometry remains limited by scarce experimental data. A two-stage transfer learning framework is developed by integrating multiphysics numerical simulation, data augmentation, and a Bayesian neural network (BNN). High-fidelity multiphysics simulations are generated within the experimental process window to expand parameter space coverage and to provide physics-informed data for model pretraining. Limited experimental samples are augmented using a Wasserstein Generative Adversarial Network with gradient penalty (WGAN-GP), applied to laser power and wire feed speed. Gaussian perturbations of travel speed are introduced to represent measurement uncertainty. A shallow BNN is pretrained on simulated samples and fine-tuned on the augmented experimental dataset using physics-consistent regularization and partial layer-freezing strategies. The augmentation strategy is evaluated through leave-one-out cross-validation on eight experimental samples, and generalization is examined using a separate test under previously unobserved travel-speed conditions. After inverse normalization, the framework achieves root mean square errors of 0.027 mm for melt pool depth and 0.025 mm for width, with coefficients of determination of 0.788 and 0.741, respectively. After model validation, an uncertainty-aware quantitative analysis based on Sobol sensitivity indices and reliability assessment is conducted to characterize dominant parameter influences and to identify high-confidence process windows under limited data conditions. The proposed framework provides a general simulation-informed and uncertainty-aware learning strategy for manufacturing processes with severely limited experimental data.
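The evaluation protocol named in the abstract (leave-one-out cross-validation on eight samples, scored by root mean square error and coefficient of determination) can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the model itself is omitted, and the toy numbers are made up to exercise the metrics rather than drawn from the paper.

```python
import math


def leave_one_out_splits(n_samples: int):
    """Yield (train_indices, test_index) pairs, holding out each sample once."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out


def rmse(y_true, y_pred) -> float:
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Eight experimental samples give eight folds, each training on the other seven.
splits = list(leave_one_out_splits(8))

# Made-up values purely to exercise the metric functions.
toy_true = [1.0, 2.0, 3.0]
toy_pred = [1.0, 2.0, 2.0]
toy_rmse = rmse(toy_true, toy_pred)
toy_r2 = r_squared(toy_true, toy_pred)
```

In a real run, each fold would fit the model on the seven training indices, predict the held-out sample, and the eight held-out predictions would be pooled before computing the two metrics.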
Citations: 0
Rephrasing detection in machine generated content using deep learning transformers and feature engineering with local agnostic interpretability
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-02-10 DOI: 10.1016/j.engappai.2026.114056
Syeda Hira Amjad, Hikmat Ullah Khan, Ali Daud, Anam Naz, Aseel Smerat
Artificial Intelligence Content Generation (AIGC) has revolutionized how content is produced worldwide for various types of data using AI tools. Identifying rephrased content and separating it from human-written content is an active research area. However, several AI tools use various writing styles to rephrase AIGC, which makes it more difficult to detect. To address this new research challenge, this study explores a comprehensive set of content-based linguistic features, ranging from raw quantity metrics to higher-order measures of vocabulary complexity, grammatical complexity, and specificity-expressiveness, to capture the complex patterns. The methodology applies a transformer-based model called Distillation Bidirectional Encoder Representations from Transformers (DistilBERT), which uses self-attention mechanisms to encode long-range dependencies within text. The empirical analysis examines features including part-of-speech tag diversity, Flesch–Kincaid readability scores, word entropy, and affective term counts. The data are split using the holdout method (80% training, 20% testing), ensuring that no rephrased variants of the same source appear in both sets, which prevents parallel-example leakage. Model performance is assessed using accuracy, precision, recall, and F1-score on the hold-out test set, with consistent results observed across repeated runs under fixed random seeds. Quantitatively, the DistilBERT model achieves the highest overall classification accuracy at 93%, outperforming both the classical transformer baseline and all sequential models. Qualitatively, to support model interpretability, explainable AI techniques including Local Interpretable Model-agnostic Explanations (LIME) produce local explanations that highlight the top six features influencing each style prediction.
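The leakage safeguard described above (keeping every rephrased variant of one source on the same side of the split) amounts to splitting by source rather than by row. The sketch below illustrates that idea under stated assumptions: each record carries a hypothetical `source_id` field, the function name `group_holdout_split` is invented here, and the 80/20 ratio is applied to sources, which only approximates an 80/20 split of rows when sources have similar numbers of variants.

```python
import random
from collections import defaultdict


def group_holdout_split(records, test_fraction=0.2, seed=0):
    """Split records into (train, test) so that no source_id appears in both.

    Sketch only: assumes at least two distinct sources; with a single source
    the training set would be empty.
    """
    # Collect all variants that stem from the same source text.
    by_source = defaultdict(list)
    for rec in records:
        by_source[rec["source_id"]].append(rec)

    # Shuffle sources deterministically so repeated runs give the same split.
    source_ids = sorted(by_source)
    random.Random(seed).shuffle(source_ids)

    # Assign whole sources, never individual variants, to the test side.
    n_test = max(1, round(test_fraction * len(source_ids)))
    test = [r for sid in source_ids[:n_test] for r in by_source[sid]]
    train = [r for sid in source_ids[n_test:] for r in by_source[sid]]
    return train, test


# Synthetic placeholder data: 10 sources with 3 rephrased variants each.
records = [{"source_id": i // 3, "text": f"variant {i}"} for i in range(30)]
train_set, test_set = group_holdout_split(records)
train_sources = {r["source_id"] for r in train_set}
test_sources = {r["source_id"] for r in test_set}
```

A plain row-level shuffle would likely place variants of one source on both sides, letting a model score well by recognizing the shared source rather than the rephrasing style.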
Citations: 0