
Engineering Applications of Artificial Intelligence: Latest Articles

Lightweight spatio-temporal residual neural network and transformer architecture with positional gating for video-based smoke and fire detection
IF 8 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-02-06 | DOI: 10.1016/j.engappai.2026.113996
Rafaqat Alam Khan, Usama Ijaz Bajwa, Rana Hammad Raza, Muhammad Umar Farooq
Fire incidents are a common hazard that endangers human lives and damages the economy and the environment. Detecting fire and smoke in their initial stages is essential to keep them from becoming uncontrollable. Conventional sensor-based detectors are limited by geographic area coverage, the time required to reach the sensor, and false alarm rates. For these reasons, traditional sensor-based detectors are being replaced by smart video-based detectors, which provide effective monitoring, detection, and detailed analysis of smoke and fire in both indoor and outdoor environments. This study introduces a real-time, automated, artificial intelligence (AI)-based video model for early-stage detection of smoke and fire that mitigates false alarms caused by clouds, fog, or other fire-colored backgrounds or objects. The model, Dual Attention Multi-Resolution Three-Dimensional Network with Positional Gating Unit (DAMR3DNet_PGU), combines a Spatio-Temporal Residual Neural Network with a Transformer architecture and Positional Gating, and was trained on a wide range of unique smoke and fire patterns drawn from publicly available benchmark video datasets. Experimental results show significant improvements in True Positive Rate (TPR), True Negative Rate (TNR), False Positives (FP), False Negatives (FN), false alarms, and accuracy compared with various state-of-the-art methods. The results affirm the efficacy of DAMR3DNet_PGU for fire and smoke detection using conventional closed-circuit television (CCTV) cameras. The proposed technique performed robustly across multiple datasets, achieving high accuracy for smoke and fire detection while significantly reducing false negatives and false alarms with a lighter model than existing approaches.
Engineering Applications of Artificial Intelligence, Volume 169, Article 113996.
Citations: 0
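The evaluation metrics named in the smoke and fire detection abstract above (TPR, TNR, false alarms, accuracy) all derive from a confusion matrix. The following is a minimal sketch of how they are commonly computed; it assumes the false-alarm rate is FP / (FP + TN), which the paper may define differently, and the names `Confusion` and `rates` and the example counts are illustrative, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # fire/smoke clips correctly flagged
    tn: int  # normal clips correctly passed
    fp: int  # normal clips wrongly flagged (false alarms)
    fn: int  # fire/smoke clips missed


def rates(c: Confusion) -> dict[str, float]:
    """Return TPR, TNR, false-alarm rate and accuracy for one confusion matrix."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    total = positives + negatives
    return {
        "tpr": c.tp / positives if positives else 0.0,
        "tnr": c.tn / negatives if negatives else 0.0,
        # Assumption: false-alarm rate = FP / (FP + TN) = 1 - TNR.
        "false_alarm_rate": c.fp / negatives if negatives else 0.0,
        "accuracy": (c.tp + c.tn) / total if total else 0.0,
    }


# Made-up counts for illustration only; these are not results from the paper.
example = rates(Confusion(tp=90, tn=80, fp=20, fn=10))
```

Under this definition the false-alarm rate and TNR always sum to one, so a detector that suppresses cloud or fog false positives improves both at once.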
Dual-path adaptive feature elevation system for detecting small targets in remote sensing imagery
IF 8 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-02-11 | DOI: 10.1016/j.engappai.2026.114132
Liangjun Xu, Hui Ma
Detecting small targets in remote sensing imagery has long been challenging because of weak target features and complex backgrounds. Existing methods focus primarily on detection efficiency, often at the cost of accuracy on small targets. This study proposes the dual-path adaptive feature elevation system (DAES-Net) for detecting small targets in remote sensing imagery, which significantly improves small-target detection accuracy while maintaining reasonable detection efficiency. First, DAES-Net integrates a proprietary dual-path self-calibration module (DSM). This module optimizes feature fusion through global modeling and local denoising, strengthening global feature correlation while reducing redundancy so that the detection system receives more precise fused features. Second, the dynamic normalized Wasserstein distance (D-NWD) loss function is designed to localize minute targets more precisely. By dynamically adjusting the regression weights of the constraint terms in the normalized Wasserstein distance (NWD) loss function, D-NWD implements an optimal localization strategy for small targets, improving the model's localization efficiency for them. Finally, the one-time aggregated feature reuse reparameterized convolution (FRRO) is proposed. This feature reuse structure prevents information loss for small targets while speeding up model inference. Experimental results show that DAES-Net achieves the highest mean average precision (mAP) across four public small object detection datasets, outperforming existing state-of-the-art methods. This highlights the contribution of this research to the field of small object detection.
Engineering Applications of Artificial Intelligence, Volume 169, Article 114132.
Citations: 0
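The D-NWD loss in the abstract above builds on the normalized Wasserstein distance (NWD) between bounding boxes. As a rough sketch of that base quantity only, the code below models each box as a 2D Gaussian and maps the resulting distance into (0, 1]. The dynamic re-weighting that makes it "D-NWD" is not described in the abstract and is not reproduced here; the function names and the default constant `c` are assumptions chosen for illustration.

```python
import math

Box = tuple[float, float, float, float]  # (center_x, center_y, width, height)


def nwd(a: Box, b: Box, c: float = 12.8) -> float:
    """NWD similarity in (0, 1]; 1.0 means the two boxes are identical."""
    # Each box is modeled as a Gaussian with mean (cx, cy) and covariance
    # diag(w^2 / 4, h^2 / 4). For such Gaussians the squared 2-Wasserstein
    # distance reduces to a squared Euclidean distance on (cx, cy, w/2, h/2).
    w2_sq = (
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    # c is a dataset-dependent scale; the default here is an assumption.
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(pred: Box, target: Box, c: float = 12.8) -> float:
    """Regression loss that is 0 for a perfect match and approaches 1 as boxes diverge."""
    return 1.0 - nwd(pred, target, c)
```

Unlike IoU, this similarity stays smooth and non-zero when two tiny boxes do not overlap at all, which is the usual motivation for using it on small targets.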
Interpretable and disentangled image editing by manipulating the semantic latent space in diffusion models
IF 8 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-02-10 | DOI: 10.1016/j.engappai.2026.114149
Tian Qiu, Qianmu Li
Diffusion models (DMs) have gained prominence in image manipulation, surpassing earlier methods such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) in quality and realism. However, precise and interpretable attribute manipulation directly in their high-dimensional Gaussian noise space X remains challenging. This limitation motivates a more controllable and interpretable editing framework for practical attribute-level manipulation. To overcome it, we propose leveraging an intermediate semantic latent space H together with generated semantic attention masks, achieving efficient, high-fidelity image editing with enhanced disentanglement. Our method supports targeted editing in the spatial domain, so that specific attributes can be modified without affecting unrelated regions of the image. Specifically, we develop a remapper network that maps textual prompt embeddings into semantic latent representations within H, ensuring that editing operations align closely with the textual prompts. To further improve disentanglement and editing efficiency, we design an attention module with three different attention mask strategies applied to the adjusted latent representation. The attention mask shows which area the DM focuses on at each time step during image editing. We conduct extensive experiments on a variety of datasets, including human faces, dogs, and oil paintings. Both qualitative and quantitative results demonstrate the superiority of our approach over state-of-the-art diffusion-based editing baselines in terms of editing quality, target alignment, and reduced non-target drift.
Engineering Applications of Artificial Intelligence, Volume 169, Article 114149.
Citations: 0
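The image-editing abstract above describes applying an attention mask to an adjusted latent representation so that only the targeted region changes. One simple way to picture this, offered purely as an assumption about the general idea and not as the authors' formulation, is an element-wise blend h' = h + strength * mask * delta. The function `masked_edit` and its parameters are hypothetical.

```python
Grid = list[list[float]]


def masked_edit(h: Grid, delta: Grid, mask: Grid, strength: float = 1.0) -> Grid:
    """Add an edit direction to latent h only where the mask is active.

    h:      latent feature map (one channel, for simplicity)
    delta:  edit direction, e.g. derived from a text prompt (hypothetical)
    mask:   attention mask in [0, 1]; 0 leaves a location untouched
    """
    return [
        [h_ij + strength * m_ij * d_ij for h_ij, d_ij, m_ij in zip(h_row, d_row, m_row)]
        for h_row, d_row, m_row in zip(h, delta, mask)
    ]


# Toy 2x2 latent: only the top-left cell is fully edited,
# the bottom-right cell is edited at half strength.
edited = masked_edit(
    h=[[1.0, 1.0], [1.0, 1.0]],
    delta=[[2.0, 2.0], [2.0, 2.0]],
    mask=[[1.0, 0.0], [0.0, 0.5]],
)
```

Locations where the mask is zero pass through unchanged, which is the property the abstract refers to as avoiding drift in unrelated regions.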
Long-term cooperative path planning for stratospheric airships based on hierarchical multi-agent reinforcement learning
IF 8 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-02-10 | DOI: 10.1016/j.engappai.2026.114156
Chao Lv, Ming Zhu, Xiao Guo, Jiajun Ou, Baojin Zheng, Liran Sun
Stratospheric airships are increasingly used for long-term collaborative tasks, which require efficient path planning for multiple airships. Traditional methods struggle with collaborative optimization and state space explosion in such tasks. To address these issues, this paper presents a hierarchical cooperative airship path planning method (HiCAPP). HiCAPP employs a dual-layer control architecture in which the high-level controller is responsible for task allocation and the low-level controller concentrates on path planning. Experimental results show that HiCAPP outperforms traditional multi-agent reinforcement learning methods on two critical metrics: average remaining energy and average distance to the task center. In addition, experiments with varying numbers of agents, task durations, and disturbances demonstrate its robustness and scalability. These results confirm its effectiveness in long-term cooperative monitoring tasks and highlight the advantages of hierarchical decision-making in multi-agent systems.
Engineering Applications of Artificial Intelligence, Volume 169, Article 114156.
Citations: 0
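The dual-layer split described in the airship abstract above (a high-level controller for task allocation, a low-level controller for path planning) can be illustrated with a toy sketch. Note that this deliberately swaps the learned reinforcement-learning policies for hand-written rules, greedy nearest-task assignment and straight-line stepping, only to show how the two layers hand off; wind, energy, and all names here are simplifications or hypothetical.

```python
import math

Point = tuple[float, float]


def assign_tasks(airships: list[Point], tasks: list[Point]) -> dict[int, int]:
    """High level: greedily give each airship the nearest still-unassigned task."""
    free = set(range(len(tasks)))
    plan: dict[int, int] = {}
    for i, pos in enumerate(airships):
        if not free:
            break  # more airships than tasks; the rest stay idle
        j = min(free, key=lambda t: math.dist(pos, tasks[t]))
        plan[i] = j
        free.remove(j)
    return plan


def step_toward(pos: Point, goal: Point, max_step: float) -> Point:
    """Low level: move at most max_step along the straight line to the goal."""
    d = math.dist(pos, goal)
    if d <= max_step:
        return goal
    s = max_step / d
    return (pos[0] + s * (goal[0] - pos[0]), pos[1] + s * (goal[1] - pos[1]))


# The high level runs rarely (per task period); the low level runs every control step.
airships = [(0.0, 0.0), (10.0, 0.0)]
tasks = [(9.0, 0.0), (1.0, 0.0)]
plan = assign_tasks(airships, tasks)
airships = [step_toward(p, tasks[plan[i]], max_step=0.5) for i, p in enumerate(airships)]
```

Separating the two time scales is what keeps the low-level state space small: each airship only needs to reason about its own assigned goal.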
Multi-exposure high dynamic range reconstruction by incorporating imaging knowledge
IF 8 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-02-12 | DOI: 10.1016/j.engappai.2026.114177
Hu Wang, Mao Ye, Dengyan Luo, Yan Gan
Existing photographic equipment cannot capture the full dynamic range of natural scenes. This gives rise to the problem of reconstructing high dynamic range (HDR) images from multi-exposure low dynamic range (LDR) images, each of which preserves different details. Existing methods do not fully leverage imaging knowledge from the LDR image generation pipeline, resulting in design redundancy and inefficient resource utilization. We propose a new Multi-Exposure HDR reconstruction method that incorporates Imaging Knowledge (MEIK) for efficient HDR image reconstruction. Our method consists of two parts: fusion of LDR features and reconstruction of the HDR feature. Because of object motion and exposure time effects, LDR features from different exposures need to be fused. A Multi-Exposure Information Aggregation (MEIA) module based on Mamba is proposed to fuse the LDR features. An Inverse imaging Knowledge-Driven (IKD) cluster, a cascade of IKD blocks at different scales, is then employed to reconstruct the HDR feature. Each IKD block consists of three parts, HDR information recovery, imaging parameter adjustment, and noise suppression, which together simulate the mathematical formula for multi-exposure HDR imaging. Experimental results demonstrate that the proposed MEIK model outperforms existing state-of-the-art models and exhibits strong scalability.
Engineering Applications of Artificial Intelligence, Volume 169, Article 114177.
Citations: 0
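The HDR abstract above refers to a mathematical formula for multi-exposure HDR imaging without stating it. For orientation, here is a sketch of the classical, non-learned way to merge exposures at a single pixel: linearize each LDR value, divide by its exposure time, and take a weighted average that distrusts under- and over-exposed samples. The gamma value of 2.2, the triangle weight, and the name `merge_hdr` are assumptions for illustration, not details of MEIK.

```python
def merge_hdr(pixels: list[float], exposure_times: list[float], gamma: float = 2.2) -> float:
    """Estimate scene radiance at one pixel from several LDR exposures.

    pixels:          LDR values in [0, 1], one per exposure
    exposure_times:  matching exposure times in seconds
    """
    num = 0.0
    den = 0.0
    for z, t in zip(pixels, exposure_times):
        # Triangle ("hat") weight: trust mid-tones, distrust values
        # near black (noisy) or near white (saturated).
        w = 1.0 - abs(2.0 * z - 1.0)
        # Undo the assumed display gamma, then normalize by exposure time.
        num += w * (z ** gamma) / t
        den += w
    return num / den if den > 0 else 0.0


# Simulate one scene point with radiance 0.2 captured at three exposure times.
true_radiance = 0.2
times = [0.5, 1.0, 2.0]
ldr = [(true_radiance * t) ** (1 / 2.2) for t in times]
recovered = merge_hdr(ldr, times)
```

In this noise-free simulation every exposure implies the same radiance, so the weights only matter once clipping or sensor noise is introduced.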
A Contextual Multimodal Federated Transformer with dual distillation for decentralized chronic obstructive pulmonary disease related lung pathology classification
IF 8 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-02-07 | DOI: 10.1016/j.engappai.2026.114046
Ayesha Jabbar, Huang Jianjun, Muhammad Kashif Jabbar
The use of multimodal data within decentralized healthcare presents an opportunity to improve artificial intelligence (AI)-based classification of lung pathology related to chronic obstructive pulmonary disease (COPD) while maintaining patient privacy. Nonetheless, heterogeneous client modalities, non-independent and non-identically distributed (non-IID) data, and equity challenges in aggregating models remain unresolved. To address these issues, we propose CMF-Former-D (Contextual Multimodal Federated Transformer with Dual-Level Distillation), a privacy-preserving decentralized learning method for peer-to-peer training across heterogeneous edge devices. CMF-Former-D employs an attention-based fusion mechanism that integrates audio features from respiratory sounds and image features from chest X-rays. We further introduce PeerMesh-Distill, a server-free protocol that enables decentralized knowledge sharing with equitable distribution of model updates across heterogeneous client sites, and FairWeight-Gossip, a strategy that promotes fair update aggregation among clients. The model is dynamically adapted to client hardware and modality constraints using resource-aware configurations. CMF-Former-D achieves 98.3% accuracy, a macro-averaged F1-score (macro-F1) of 0.987, and an area under the receiver operating characteristic curve (AUROC) of 0.972 on a synthetic proxy-aligned benchmark. End-to-end multimodal inference latency is 150 milliseconds (ms) on a central processing unit (CPU) at batch size 1, while graphics processing unit (GPU) latency is reported separately on an edge GPU at batch size 16. Statistical tests indicate significant improvements (p < 0.01), and client-level analysis shows reduced performance disparities across clients.
Engineering Applications of Artificial Intelligence, Volume 169, Article 114046.
Citations: 0
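The federated-learning abstract above relies on server-free, peer-to-peer aggregation. The actual PeerMesh-Distill and FairWeight-Gossip rules are not given in the abstract, so the sketch below shows only the generic building block they presumably extend: one synchronous gossip round in which every client replaces its parameters with the plain average over itself and its direct neighbours. Uniform weights, the function name `gossip_round`, and the toy topology are all assumptions.

```python
Params = dict[str, list[float]]


def gossip_round(params: Params, edges: list[tuple[str, str]]) -> Params:
    """One synchronous gossip round with uniform weights and no central server."""
    # Every client always includes its own parameters in the average.
    neighbours = {name: [name] for name in params}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return {
        name: [sum(vals) / len(vals) for vals in zip(*(params[n] for n in group))]
        for name, group in neighbours.items()
    }


# Three clients on a line topology a - b - c, each holding a one-parameter "model".
clients = {"a": [0.0], "b": [2.0], "c": [4.0]}
after = gossip_round(clients, edges=[("a", "b"), ("b", "c")])
```

A fairness-aware variant would replace the uniform average with per-client weights, which is where a strategy such as the one named in the abstract would plug in.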
Smart indoor occupancy detection based on optimized camera placement, multi-view de-duplication, and large language model semantic understanding
IF 8 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-02-11 | DOI: 10.1016/j.engappai.2026.114157
Deli Liu, Xiaoping Zhou, Dongxiao Chen, Yu Li
Accurate occupancy detection in indoor environments is essential for optimizing energy use, enhancing occupant comfort, and ensuring safety in smart buildings. This study aims to design and validate an end-to-end framework that not only counts occupants reliably but also generates rich semantic descriptions of their behaviors and spatial interactions. We propose a four-stage methodology: (1) multi-objective optimization of camera placement through field-of-view analysis and grid modeling to maximize coverage and minimize blind spots; (2) on-device human detection using a fine-tuned You Only Look Once version 8 (YOLOv8) model; (3) cross-camera identity tracking using Simple Online and Realtime Tracking with a Deep Association Metric (DeepSORT) to assign unique global identifiers and eliminate duplicate counts; and (4) a multimodal large language model (LLM) that consumes annotated, identity-aware multi-view images to produce coherent natural-language summaries and structured outputs detailing occupant numbers, actions, and locations. Extensive evaluations conducted on a diverse multi-view dataset—including challenging scenarios of heavy occlusion and clothing changes—demonstrate the robustness and real-time applicability of the proposed framework. The key contribution of this work is the first demonstration of integrating identity-aware, multi-camera de-duplication with large language model–driven scene interpretation, enabling automated, actionable insights that extend beyond simple occupancy counts. This novel combination advances intelligent building management by providing both precise occupancy analytics and contextual understanding to support adaptive control and energy-efficient operation.
{"title":"Smart indoor occupancy detection based on optimized camera placement, multi-view de-duplication, and large language model semantic understanding","authors":"Deli Liu,&nbsp;Xiaoping Zhou,&nbsp;Dongxiao Chen,&nbsp;Yu Li","doi":"10.1016/j.engappai.2026.114157","DOIUrl":"10.1016/j.engappai.2026.114157","url":null,"abstract":"<div><div>Accurate occupancy detection in indoor environments is essential for optimizing energy use, enhancing occupant comfort, and ensuring safety in smart buildings. This study aims to design and validate an end-to-end framework that not only counts occupants reliably but also generates rich semantic descriptions of their behaviors and spatial interactions. We propose a four-stage methodology: (1) multi-objective optimization of camera placement through field-of-view analysis and grid modeling to maximize coverage and minimize blind spots; (2) on-device human detection using a fine-tuned You Only Look Once version 8 (YOLOv8) model; (3) cross-camera identity tracking using Simple Online and Realtime Tracking with a Deep Association Metric (DeepSORT) to assign unique global identifiers and eliminate duplicate counts; and (4) a multimodal large language model (LLM) that consumes annotated, identity-aware multi-view images to produce coherent natural-language summaries and structured outputs detailing occupant numbers, actions, and locations. Extensive evaluations conducted on a diverse multi-view dataset—including challenging scenarios of heavy occlusion and clothing changes—demonstrate the robustness and real-time applicability of the proposed framework. The key contribution of this work is the first demonstration of integrating identity-aware, multi-camera de-duplication with large language model–driven scene interpretation, enabling automated, actionable insights that extend beyond simple occupancy counts. 
This novel combination advances intelligent building management by providing both precise occupancy analytics and contextual understanding to support adaptive control and energy-efficient operation.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"169 ","pages":"Article 114157"},"PeriodicalIF":8.0,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146174848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Research on a residual learning based neural-kernel framework with applications in short-term load forecasting
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-11 DOI: 10.1016/j.engappai.2026.113989
Wangyi Xu, Yushu Xiang, Xin Ma, Wangpeng Li
Short-term load forecasting is essential for power system operation, yet it remains challenging due to the non-stationary nature of load data and the difficulty of capturing complex nonlinear relationships. To address this issue, a residual learning-based neural kernel framework is proposed for short-term load forecasting. The framework integrates a Fourier kernel-based neural kernel module into a deep residual network as a residual function. The Fourier kernel enables automatic identification and separation of periodic components and long-term trends in load data, while the non-parametric property of the kernel model helps reduce model complexity. Meanwhile, the shortcut connections in the residual network effectively alleviate the vanishing gradient problem, ensuring stable and efficient model training. To further improve model performance, the Artificial Bee Colony (ABC) algorithm is employed for hyperparameter optimization, allowing efficient approximation of the global optimum. In addition, a novel Theil UII-S loss function is introduced to enhance the model’s sensitivity to abnormal load fluctuations through adaptive gradient regulation. Experimental results on four real-world power datasets demonstrate that the proposed model outperforms 23 benchmark methods in terms of prediction accuracy. Ablation studies further verify the individual contributions of the Fourier kernel, the loss function, and the ABC algorithm, providing useful insights for future research.
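The shortcut-connection idea can be sketched as follows. This is only an illustration under stated assumptions: the residual function here is a plain sine/cosine feature map followed by a linear layer, standing in for the paper's Fourier kernel module, whose actual form the abstract does not give. The names `fourier_features` and `residual_block` and all shapes are hypothetical.

```python
import numpy as np


def fourier_features(x, freqs):
    """Map inputs to cosine/sine features (a stand-in for a Fourier kernel).

    x:     array of shape (n, d)
    freqs: array of shape (d, m), assumed frequency matrix
    returns an array of shape (n, 2 * m)
    """
    proj = x @ freqs
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=1)


def residual_block(x, freqs, weights):
    """Shortcut connection: output = x + F(x).

    weights: array of shape (2 * m, d), so F(x) has the same shape as x.
    """
    return x + fourier_features(x, freqs) @ weights


x = np.ones((4, 3))
freqs = np.ones((3, 5))
zero_weights = np.zeros((10, 3))
out = residual_block(x, freqs, zero_weights)
print(out.shape)  # (4, 3)
```

With zero weights the block reduces to the identity, which is the property that lets gradients pass through the shortcut unchanged.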
Citations: 0
Q-learning-driven adaptive rewiring for cooperative control in heterogeneous networks
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-09 DOI: 10.1016/j.engappai.2026.114024
Yi-Ning Weng, Hsuan-Wei Lee
Cooperation emergence in multi-agent systems represents a fundamental statistical physics problem where microscopic learning rules drive macroscopic collective behavior transitions. We propose a Q-learning-based variant of adaptive rewiring that builds on mechanisms studied in the literature. This method combines temporal difference learning with network restructuring so that agents can optimize strategies and social connections based on interaction histories. Through neighbor-specific Q-learning, agents develop sophisticated partnership management strategies that enable cooperator cluster formation, creating spatial separation between cooperative and defective regions. Using power-law networks that reflect real-world heterogeneous connectivity patterns, we evaluate emergent behaviors under varying rewiring constraint levels, revealing distinct cooperation regimes across parameter space, characterized by qualitative changes in macroscopic cooperation behavior. Our systematic analysis identifies three behavioral regimes: a permissive regime (low constraints) enabling rapid cooperative cluster formation, an intermediate regime with sensitive dependence on dilemma strength, and a patient regime (high constraints) where strategic accumulation gradually optimizes network structure. Comparative analysis against Bush–Mosteller stimulus–response learning demonstrates that Q-learning’s temporal credit assignment capabilities produce superior cooperation outcomes, particularly under intermediate rewiring constraints where long-term relationship assessment becomes crucial. Simulation results show that while moderate constraints create transition-like zones that suppress cooperation, fully adaptive rewiring enhances cooperation levels through systematic exploration of favorable network configurations. Quantitative analysis reveals that increased rewiring frequency drives large-scale cluster formation. 
Our results establish a new paradigm for understanding intelligence-driven cooperation pattern formation in complex adaptive systems, revealing how machine learning serves as an alternative driving force for spontaneous organization in multi-agent networks.
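A minimal sketch of neighbor-specific Q-learning, assuming the standard temporal-difference update and a simple "drop the least valuable partner" rule. The helper names `td_update` and `weakest_neighbor`, the action labels, and the reward values are hypothetical; the paper's actual state, reward, and rewiring definitions are not given in the abstract.

```python
def td_update(q, neighbor, action, reward, next_best, alpha=0.1, gamma=0.9):
    """Standard Q-learning update for one (neighbor, action) pair.

    q is a dict keyed by (neighbor, action); missing entries count as 0.
    """
    key = (neighbor, action)
    old = q.get(key, 0.0)
    q[key] = old + alpha * (reward + gamma * next_best - old)
    return q[key]


def weakest_neighbor(q, neighbors, actions=("C", "D")):
    """Pick the neighbor whose best Q-value is lowest (rewiring candidate)."""
    return min(neighbors, key=lambda n: max(q.get((n, a), 0.0) for a in actions))


q = {}
td_update(q, "n1", "C", reward=1.0, next_best=0.0, alpha=0.5)   # good partner
td_update(q, "n2", "C", reward=-1.0, next_best=0.0, alpha=0.5)  # exploitative partner
candidate = weakest_neighbor(q, ["n1", "n2"])
print(candidate)  # n2
```

Keeping a separate value per neighbor is what lets an agent judge each relationship on its own history instead of reacting only to the latest payoff.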
Citations: 0
Learning-based target fencing control for delay-tolerant unmanned aerial vehicle swarm
IF 8 Zone 2 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-02-06 DOI: 10.1016/j.engappai.2026.114069
Hao Yu, Xiu-xia Yang, Yi Zhang, Wen-qiang Yao
This study focuses on the cooperative fencing mission for an unmanned aerial vehicle (UAV) swarm under communication delays, proposing an adaptive self-organized control framework based on a Radial Basis Function-Brain Emotional Learning-Based Intelligent Controller (RBF-BELBIC). First, a fixed-time convergent observer is developed to estimate multiple states of the target simultaneously, achieving precise estimation independent of initial states through a dual-channel Hurwitz polynomial configuration. Second, a self-organized distributed control scheme integrating a consensus term, a navigation term, and a potential field term is constructed. This strategy enables the UAV swarm to autonomously generate a dynamic fencing convex hull around the target, eliminating the dependency on predefined geometric configurations while guaranteeing collision avoidance. Third, a dual-layer intelligent robust controller driven by the RBF-BELBIC network is designed to tackle the control lag effects caused by communication delays. This architecture establishes a hierarchical structure where the RBF network serves as an upper layer for online gain optimization, and the BELBIC acts as a lower reactive control layer, thereby enabling simultaneous disturbance compensation and dynamic control policy adaptation. Closed-loop stability is analytically established using Lyapunov theory. Simulations verify that the proposed control strategy extends the tolerable delay bound by an order of magnitude over conventional methods (from 100 ms to 1000 ms). Concurrently, it reduces fencing position and velocity errors by 99.36% and 97.45%, respectively, compared with single-layer learning networks under large delays, demonstrating superior robustness in complex environments.
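The three-term structure of the distributed scheme can be sketched in a generic flocking style. This is an assumption-laden illustration, not the paper's control law: the consensus term is taken to be velocity alignment, the navigation term a proportional pull toward the target, and the potential field a short-range repulsion. The function name `fencing_control`, the gains, and `safe_dist` are all hypothetical.

```python
import numpy as np


def fencing_control(i, positions, velocities, target, neighbors,
                    k_c=1.0, k_n=1.0, k_r=1.0, safe_dist=1.0):
    """Control input for agent i as a sum of three assumed terms."""
    p_i, v_i = positions[i], velocities[i]
    u = k_n * (target - p_i)                 # navigation term
    for j in neighbors:
        u = u + k_c * (velocities[j] - v_i)  # consensus (velocity alignment)
        diff = positions[j] - p_i
        dist = np.linalg.norm(diff)
        if 0.0 < dist < safe_dist:
            # Repulsion from the potential 0.5 * k_r * (1/dist - 1/safe_dist)**2
            strength = k_r * (1.0 / dist - 1.0 / safe_dist) / dist**2
            u = u - strength * diff / dist
    return u


positions = np.array([[0.0, 0.0], [0.5, 0.0]])
velocities = np.zeros((2, 2))
target = np.array([1.0, 0.0])

u_alone = fencing_control(0, positions, velocities, target, neighbors=[])
u_close = fencing_control(0, positions, velocities, target, neighbors=[1])
print(u_alone, u_close)
```

With no neighbors the agent simply heads for the target; with a neighbor inside `safe_dist` the repulsion dominates and pushes it away, which is the collision-avoidance behavior the abstract attributes to the potential field term.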
Citations: 0