
Expert Systems with Applications: Latest Publications

A point cloud completion network integrating Mamba and transformer architectures
IF 7.5 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2025-12-15 DOI: 10.1016/j.eswa.2025.130826
Weichao Wu , Yongyang Xu , Zhong Xie
Point cloud completion aims to reconstruct complete structures from incomplete point clouds by extracting fine-grained local details and global features. Current state-of-the-art methods rely on Transformer architectures, which suffer from quadratic complexity, leading to high computational costs and trade-offs in resolution and feature extraction. To address this limitation, we propose a novel point cloud completion network that integrates the Mamba model, a state space framework with linear complexity, for feature extraction in the encoding phase. Our approach replaces the self-attention module with Mamba and introduces a multi-scale encoding network to enhance the extraction and fusion of features from incomplete point clouds. A cross-attention decoding module processes centre points and incomplete features to predict a complete point cloud. Experiments on synthetic and real-world datasets show that our method achieves comparable performance to existing state-of-the-art approaches on benchmark datasets, achieving an average CDL1 score of 6.50 on the PCN dataset. In addition, our method demonstrates superior accuracy when processing large-volume point cloud data, highlighting Mamba’s effectiveness in handling such challenges compared with Transformer-based models.
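The abstract's linear-complexity claim rests on the state-space recurrence at the heart of Mamba-style blocks: a fixed-size hidden state is updated once per input element, so the cost grows as O(L) in sequence length rather than self-attention's O(L²). A toy scalar-input sketch of that recurrence (shapes and names here are illustrative, not the paper's network):

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Minimal linear-time state-space scan (the core idea behind
    Mamba-style blocks): h_t = A @ h_{t-1} + B * x_t, y_t = C @ h_t.
    Constant work per step gives O(L) total cost in sequence length L,
    versus O(L^2) for pairwise self-attention."""
    d_state = A.shape[0]
    h = np.zeros(d_state)
    ys = []
    for x_t in x:                    # single pass over the sequence
        h = A @ h + B * x_t          # state update, constant work per step
        ys.append(C @ h)             # readout
    return np.array(ys)
```

The real model uses input-dependent (selective) parameters and vector-valued inputs, but the per-step recurrence, and hence the linear scaling, is the same.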
Citations: 0
Coherence-aware and snap-triggered: A novel mechanism for audio-visual cooperative tasks
IF 7.5 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-07 DOI: 10.1016/j.eswa.2026.131559
Cunhan Guo, Heyan Huang, Ruiqi Hu, Danjie Han
Audio-Visual Cooperative tasks underpin multimodal scene understanding and compel models to reconcile continuous temporal evolution with abrupt sensory transitions. We propose the Coherence-Aware and Snap-Triggered (CAST) mechanism, a plug-in temporal refinement layer that neither perturbs backbone parameters nor demands additional modalities. The Exponential Memory based Coherence-Aware module attenuates distant frame contributions through an exponentially decaying weight envelope, thereby preventing the persistent influence of obsolete disruptions. Complementarily, the Optical Flow based Snap-Triggered module registers instantaneous motion discontinuities and reallocates attention toward nascent events. Operating in concert, these modules yield a representation that remains coherent across smooth transitions yet responsive to sudden perturbations. Empirical evaluation across multiple AVC benchmarks demonstrates consistent superiority over established baselines, corroborating that CAST enhances temporal fidelity and, by extension, the reliability of downstream multimodal decisions.
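The "exponentially decaying weight envelope" in the Coherence-Aware module can be pictured as a normalized weighting over past frames in which recent frames dominate and stale ones fade. A minimal numpy illustration of the weighting idea (the `decay` knob is a made-up parameter, not the CAST implementation):

```python
import numpy as np

def coherence_weights(num_frames, decay=0.5):
    """Exponentially decaying envelope over past frames: age 0 is the
    most recent frame, and older frames are attenuated so that obsolete
    disruptions cannot exert persistent influence."""
    ages = np.arange(num_frames)[::-1]   # age 0 = most recent frame
    w = np.exp(-decay * ages)
    return w / w.sum()                   # normalize to a convex combination

def temporal_pool(frame_feats, decay=0.5):
    """Pool per-frame feature vectors with the envelope above."""
    w = coherence_weights(len(frame_feats), decay)
    return (w[:, None] * frame_feats).sum(axis=0)
```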
Citations: 0
Breaking the low-cost barrier: a memory-augmented reactive navigation system for UAVs in cluttered indoor environments
IF 7.5 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131469
Jiale Quan , Weijun Hu , Xianlong Ma , Gang Chen
Achieving robust indoor autonomous flight for Unmanned Aerial Vehicles (UAVs) under strict hardware and computational constraints remains a formidable challenge. Conventional solutions relying on high-end sensors or global mapping are often inapplicable to resource-constrained micro-UAVs. In this paper, we propose a mapless integrated navigation framework aimed at achieving stable flight using a low-cost single-line 2D LiDAR. To address the limitations of sparse sensing, we propose a window-neighborhood-based denoising filtering algorithm and a velocity estimation-based motion distortion correction module. The system combines a risk-aware local planner and a short-sighted trajectory memory mechanism to navigate through cluttered spaces. The system operates in an O(N) loop with sub-millisecond latency. To overcome the local minima inherent in reactive planning, a deadlock escape layer is introduced, which formalizes navigation difficulty through trajectory entropy analysis, and generates recovery waypoints using discrete polar coordinate search. Validation through high-fidelity simulations and real-world experiments show that the system is capable of collision-free navigation at speeds up to 6 m/s, using low-cost sensors. This work provides an efficient solution for deploying intelligent aerial robots in perception-constrained indoor environments.
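The deadlock-escape layer's "discrete polar coordinate search" amounts to enumerating candidate recovery waypoints on rings around the stuck position. A minimal sketch of that enumeration (the radii and angle count are assumed values, and the paper's entropy-based difficulty scoring is not reproduced here):

```python
import math

def recovery_waypoints(x, y, radii=(1.0, 2.0), num_angles=8):
    """Candidate recovery waypoints on a discrete polar grid around the
    stuck position (x, y): num_angles evenly spaced bearings at each
    radius. A downstream scorer would pick the best collision-free one."""
    pts = []
    for r in radii:
        for k in range(num_angles):
            theta = 2 * math.pi * k / num_angles
            pts.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return pts
```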
Citations: 0
IVYA-FMGRU: A frequency-domain context interaction model with bio-inspired optimization for significant wave height prediction
IF 7.5 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-06 DOI: 10.1016/j.eswa.2026.131534
Xiujing Gao , Yongfeng Xie , Fanchao Lin , Chiwang Lin , Hongwu Huang , Ziru Wang
Accurate prediction of significant wave heights is crucial for the safety of marine structures and ships. Traditional models struggle to capture the key frequency and periodic characteristics in wave height data. To address this issue, a novel Ivy Algorithm-Fast Fourier Transform Mogrifier Gated Recurrent Unit (IVYA-FMGRU) model is proposed, which integrates the gated recurrent unit (GRU) with the fast Fourier transform (FFT) and Mogrifier operations. The FFT extracts periodic features, the Mogrifier enhances the interaction between the GRU and frequency information, and the Ivy algorithm (IVYA), a bio-inspired optimization method, optimizes the model parameters. In addition, random forest (RF) is employed for feature selection. Experimental results show that the IVYA-FMGRU model achieves R² scores of 0.8505, 0.8683, and 0.8910 on datasets 46027, 46083, and 46084, respectively, outperforming other baseline models. Furthermore, error statistical analysis across different wave height intervals confirms the model’s accuracy and stability within each interval, demonstrating its superior performance and generalization capability in wave height prediction.
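The FFT front end of IVYA-FMGRU extracts periodic structure from the wave height series, for example the dominant period. A toy numpy illustration of that step (not the paper's feature pipeline):

```python
import numpy as np

def dominant_period(series, dt=1.0):
    """Return the dominant period of a signal from the FFT magnitude
    spectrum: detrend by the mean, take the real FFT, skip the DC bin,
    and invert the frequency of the strongest peak."""
    spec = np.abs(np.fft.rfft(series - series.mean()))
    freqs = np.fft.rfftfreq(len(series), d=dt)
    k = spec[1:].argmax() + 1        # +1 restores the index after skipping DC
    return 1.0 / freqs[k]
```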
Citations: 0
An intelligent approach to maritime autonomous surface ship performance evaluation
IF 7.5 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-10 DOI: 10.1016/j.eswa.2026.131631
Changyuan Chen, Chuanbo Duan, Yipeng Wang, Guiyang Zhang
Maritime Autonomous Surface Ships (MASS) require reliable performance evaluation methods to ensure safe and efficient operation, yet existing research and regulations lack an integrated approach for automatic assessment of maneuvering, path planning, obstacle avoidance, and motion control performance. To fill this gap, this study proposes a comprehensive assessment tool that integrates assessment aspects, test scenarios, key performance indicators (KPIs), and evaluation criteria. Three assessment modules were developed: the Maneuvering Assessment Module (MAM), the Path planning and Obstacle avoidance Assessment Module (POAM), and the Motion Control Assessment Module (MCAM). The applicability of the proposed assessment tool is studied through simulations, including maneuvering tests under three water-depth conditions, comparative evaluation of ten path-planning algorithms, and an analysis of path-following control performance. The results indicate that maneuvering performance is poorer in shallow water (water depth-draft ratio h/T=1.4) compared with deep (h/T=10.0) and medium-deep water (h/T=2.0). Moreover, the proposed Tuned Fast Marching Square (TFMS) method generates safer and more cost-effective paths than Fast Marching Method (FMM) and FMS. These findings confirm that the assessment tool can provide quantitative and reproducible evaluations of MASS performance. The developed tool offers a practical platform for both researchers and practitioners, with potential extensions toward environmentally oriented (“green”) performance assessments in future work.
Citations: 0
Towards robust brain tumor segmentation under modality incompleteness: A contribution-optimized edge-enhanced network
IF 7.5 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131396
Yanfeng He, Fangning Hu, Guoxiang Tong
Multimodal medical image segmentation plays a crucial role in disease diagnosis, as different MRI modalities provide complementary structural and lesion information. However, in clinical practice, the absence of certain modalities often leads to a significant decline in segmentation performance, limiting the application of multimodal methods. To address this issue, we propose a multimodal segmentation model called MECS-Net, which combines modality contribution optimization, edge enhancement, and efficient feature fusion. Based on four MRI modalities (FLAIR, T1ce, T1, T2), we further introduce edge features as auxiliary modalities to enhance the perception of critical structural boundaries. The model incorporates a modality contribution measurement mechanism to quantify the actual predictive value of each modality at the sample level and performs resampling training on low-contribution modalities to mitigate performance degradation caused by missing modalities. The feature fusion module combines multi-head cross-attention and state space modeling (Mamba), where the former enhances fine-grained interactions between modalities and the latter models cross-modal global dependencies, synergistically improving semantic alignment and fusion effects. Extensive experiments on the BraTS 2020 dataset demonstrate that MECS-Net achieves outstanding performance under both complete and incomplete modality conditions. The Dice coefficients for WT (whole tumor area) and TC (tumor core area) reach 91.8% and 86.4%, respectively, under complete modality conditions, and average 86.7% and 79.1%, respectively, under incomplete modality conditions.
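The contribution-measurement and resampling idea can be sketched as sampling probabilities that are inversely proportional to each sample's measured contribution, so low-contribution cases are revisited more often during training. The inverse-proportional rule below is an assumption for illustration; MECS-Net's exact scheme may differ:

```python
import numpy as np

def sampling_probs(contributions, eps=1e-8):
    """Per-sample resampling probabilities, inversely proportional to the
    measured modality contribution (eps guards against division by zero)."""
    inv = 1.0 / (np.asarray(contributions, dtype=float) + eps)
    return inv / inv.sum()

def resample(sample_ids, contributions, rng=None):
    """Draw one epoch's worth of sample ids with the probabilities above,
    oversampling the cases the model currently handles worst."""
    rng = rng or np.random.default_rng(0)
    return rng.choice(sample_ids, size=len(sample_ids),
                      p=sampling_probs(contributions)).tolist()
```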
Citations: 0
Conflict-aware semi-supervised mutual learning for medical image segmentation
IF 7.5 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131544
Wenlong Hang , Beijing Wang , Shuang Liang , Qingfeng Zhang , Qiang Wu , Yukun Jin , Qiong Wang , Jing Qin
Semi-supervised learning (SSL) has shown promising performance in medical image segmentation by effectively utilizing extensive unlabeled images. However, inaccurate predictions of unlabeled images can significantly impair the segmentation performance of SSL models. Furthermore, most current SSL methods lack mechanisms to handle cognitive bias, causing the model to overfit easily to inaccurate predictions and making self-correction challenging. In this work, we propose a conflict-aware semi-supervised mutual learning framework (CSSML), which integrates two different subnetworks and selectively utilizes conflicting pseudo-labels for mutual supervision to address these challenges. Specifically, we introduce two subnetworks with different architectures incorporating a conflict-aware distinct feature learning (CDFL) regularization to avoid the homogenization of subnetworks while promoting diversified predictions. To handle potential inaccurate predictions, we introduce a geometry-aware mutual pseudo supervision (GMPS) regularization to determine the reliability of conflicting pseudo-labels of unlabeled images, and selectively leverage the more reliable pseudo-labels in the two subnetworks to supervise the other one. The synergistic learning between CDFL and GMPS regularizations during the training process enables each subnetwork to selectively incorporate reliable knowledge from the other subnetwork, thereby helping the model overcome cognitive bias. Extensive experiments on three public medical image datasets demonstrate that the proposed CSSML achieves an average of 80.65% DSC, 87.83% Precision, and 14.48 mm 95HD using only 20% labeled data, highlighting its superior performance. The code is available at: https://github.com/Mwnic-AI/CSSML.
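The selective use of conflicting pseudo-labels can be illustrated with a simplified rule: where the two subnetworks agree, keep the shared label; where they conflict, trust the more confident subnetwork. The confidence heuristic below is a stand-in for CSSML's geometry-aware reliability measure, which the abstract does not spell out:

```python
import numpy as np

def mutual_pseudo_labels(prob_a, prob_b):
    """Resolve pseudo-labels between two subnetworks given per-pixel class
    probabilities of shape (..., num_classes). Agreeing predictions pass
    through; conflicts are settled by the higher-confidence subnetwork."""
    lab_a, lab_b = prob_a.argmax(-1), prob_b.argmax(-1)
    conf_a, conf_b = prob_a.max(-1), prob_b.max(-1)
    return np.where(lab_a == lab_b, lab_a,
                    np.where(conf_a >= conf_b, lab_a, lab_b))
```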
Citations: 0
ITAdapter: Image-Tag adapter framework with retrieval knowledge enhancer for radiology report generation
IF 7.5 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-06 DOI: 10.1016/j.eswa.2026.131494
Shuaipeng Ding, Jianan Shui, Mingyuan Ge, Mengnan Fan, Xin Li, Yijie Zhu, Mingyong Li
Automated radiology report generation has emerged as a crucial technology for improving clinical workflow efficiency and alleviating the documentation burden on radiologists. Current approaches predominantly employ encoder-decoder architectures, which often overemphasize text generation while neglecting two critical issues: inherent biases in textual data distribution that limit abnormal-region descriptions, and inadequate cross-modal interaction. To address these challenges, we propose an innovative Image-Tag Adapter (ITAdapter) framework that dynamically balances visual and diagnostic information during decoding, with particular attention to optimizing feature selection for different types of generated words. The framework incorporates two key components: a Retrieval Knowledge Enhancer (RKE) that utilizes the cross-modal retrieval capability of pre-trained CLIP models to obtain relevant clinical reports as diagnostic references, and an Image-Tag Adapter (ITA) that intelligently fuses visual information with diagnostic information from disease tags. For model optimization, we combine reinforcement learning with knowledge distillation to enable effective knowledge transfer through iterative training. Extensive experiments on the IU X-ray and MIMIC-CXR benchmark datasets demonstrate our method's effectiveness in generating more accurate and clinically relevant reports, achieving the highest performance scores: on IU X-ray, BLEU-1 = 0.536, BLEU-4 = 0.206 and METEOR = 0.220; on MIMIC-CXR, BLEU-1 = 0.411, BLEU-4 = 0.141 and METEOR = 0.152.
{"title":"ITAdapter: Image-Tag adapter framework with retrieval knowledge enhancer for radiology report generation","authors":"Shuaipeng Ding ,&nbsp;Jianan Shui ,&nbsp;Mingyuan Ge ,&nbsp;Mengnan Fan ,&nbsp;Xin Li ,&nbsp;Yijie Zhu ,&nbsp;Mingyong Li","doi":"10.1016/j.eswa.2026.131494","DOIUrl":"10.1016/j.eswa.2026.131494","url":null,"abstract":"<div><div>Automated radiology report generation has emerged as a crucial technology for improving clinical workflow efficiency and alleviating the documentation burden on radiologists. Current approaches predominantly employ encoder-decoder architectures, they often overemphasize text generation while neglecting two critical issues: inherent biases in textual data distribution that limit abnormal region descriptions, and inadequate cross- modal interaction. To address these challenges, we propose an innovative Image-Tag Adapter (ITAdapter) framework that dynamically balances visual information and diagnostic information during decoding, with particular attention to optimizing feature selection for different types of generated words. The framework incorporates two key components: a Retrieval Knowledge Enhancer (RKE) that utilizes pre-trained CLIP models’ cross-modal retrieval capability to obtain relevant clinical reports as diagnostic references, and an Image-Tag Adapter (ITA) that intelligently fuses visual information with diagnostic information from disease tags. For model optimization, we combine reinforcement learning with knowledge distillation to enable effective knowledge transfer through iterative training. 
Extensive experiments on IU X-ray and MIMIC-CXR benchmark datasets demonstrate our method’s effectiveness in generating more accurate and clinically relevant reports, achieving the highest performance scores: on IU X-ray, BLEU-1 = 0.536, BLEU-4 = 0.206 and METEOR = 0.220; on MIMIC-CXR, BLEU-1 = 0.411, BLEU-4 = 0.141 and METEOR = 0.152.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"313 ","pages":"Article 131494"},"PeriodicalIF":7.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
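The retrieval step of an RKE-style module reduces to a nearest-neighbor search in a shared embedding space: encode the query image, rank a corpus of report embeddings by cosine similarity, and take the top-k reports as diagnostic references. A minimal sketch, assuming precomputed CLIP-style embeddings — the function name, shapes, and toy vectors are illustrative, not the paper's API:

```python
import numpy as np

def retrieve_reports(image_emb, report_embs, k=2):
    """Rank report embeddings by cosine similarity to a query image
    embedding and return the indices of the top-k reports.
    image_emb: (D,) query vector; report_embs: (M, D) corpus."""
    q = image_emb / np.linalg.norm(image_emb)
    r = report_embs / np.linalg.norm(report_embs, axis=1, keepdims=True)
    sims = r @ q                     # cosine similarity per report
    return np.argsort(-sims)[:k]     # indices of the k most similar
```

In a full pipeline the returned reports would be encoded and fed to the decoder as additional context; here the retrieval itself is the only step shown.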
TOM: An open-source tongue segmentation method with multi-teacher distillation and task-specific data augmentation
IF 7.5 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131499
Jiacheng Xie, Ziyang Zhang, Biplab Poudel, Congyu Guo, Yang Yu, Guanghui An, Xiaoting Tang, Lening Zhao, Chunhui Xu, Dong Xu
Tongue imaging serves as a valuable diagnostic modality, particularly in Traditional Chinese Medicine (TCM). The quality of tongue surface segmentation significantly affects the accuracy of tongue image classification and subsequent diagnosis in intelligent tongue diagnosis systems. However, existing research on tongue image segmentation exhibits significant limitations, including sensitivity to lighting and background noise, color similarity to surrounding tissues, and a lack of robust, user-friendly segmentation tools. This paper proposes a tongue image segmentation method (TOM) based on multi-teacher knowledge distillation. By introducing a novel diffusion-based data augmentation method, we notably improve the generalization ability of the segmentation model while reducing its parameter size. Notably, after reducing the parameter count by 96.6% compared to the largest teacher models, the student model still achieves an impressive segmentation performance of 95.22% mIoU. Furthermore, we packaged and deployed the trained model as an online and offline segmentation tool (available at https://itongue.cn/), allowing TCM practitioners and researchers to use it without any programming experience. We also present a case study on TCM constitution classification using segmented tongue patches. Experimental results demonstrate that training with tongue patches yields higher classification performance and better interpretability than using original tongue images. To the best of our knowledge, this is the first open-source, freely available tongue image segmentation tool.
{"title":"TOM: An open-source tongue segmentation method with multi-teacher distillation and task-specific data augmentation","authors":"Jiacheng Xie ,&nbsp;Ziyang Zhang ,&nbsp;Biplab Poudel ,&nbsp;Congyu Guo ,&nbsp;Yang Yu ,&nbsp;Guanghui An ,&nbsp;Xiaoting Tang ,&nbsp;Lening Zhao ,&nbsp;Chunhui Xu ,&nbsp;Dong Xu","doi":"10.1016/j.eswa.2026.131499","DOIUrl":"10.1016/j.eswa.2026.131499","url":null,"abstract":"<div><div>Tongue imaging serves as a valuable diagnostic modality, particularly in Traditional Chinese Medicine (TCM). The quality of tongue surface segmentation significantly affects the accuracy of tongue image classification and subsequent diagnosis in intelligent tongue diagnosis systems. However, existing research on tongue image segmentation exhibits significant limitations, including sensitivity to lighting and background noise, similarity in color with surrounding tissues, and a lack of robust and user-friendly segmentation tools. This paper proposes a <strong>to</strong>ngue image segmentation <strong>m</strong><strong>ethod</strong> (TOM) based on multi-teacher knowledge distillation. By introducing a novel diffusion-based data augmentation method, we notably improved the generalization ability of the segmentation model while reducing its parameter size. Notably, after reducing the parameter count by 96.6% compared to the largest teacher models, the student model still achieves an impressive segmentation performance of 95.22% mIoU. Furthermore, we packaged and deployed the trained model as an online and offline segmentation tool (available at <span><span>https://itongue.cn/</span><svg><path></path></svg></span>), allowing TCM practitioners and researchers to use it without any programming experience. We also present a case study on TCM constitution classification using segmented tongue patches. 
Experimental results demonstrate that training with tongue patches yields higher classification performance and better interpretability than original tongue images. To the best of our knowledge, this is the first open-source and freely available tongue image segmentation tool.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"313 ","pages":"Article 131499"},"PeriodicalIF":7.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
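The core of multi-teacher distillation is a soft target assembled from several teachers' softened predictions, which the student then matches. A minimal per-pixel sketch, assuming uniform teacher weighting and a temperature-scaled softmax — the paper may weight its teachers differently:

```python
import numpy as np

def multi_teacher_soft_targets(teacher_logits, T=2.0):
    """Build a distillation target from several teachers.
    teacher_logits: (n_teachers, C) raw logits for one pixel.
    Each teacher's logits are softened with temperature T, then the
    per-teacher probability vectors are averaged (uniform weights —
    an assumption) into one soft target for the student."""
    z = np.asarray(teacher_logits, dtype=float) / T
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)      # per-teacher softmax
    return p.mean(axis=0)                  # averaged soft target
```

The student would typically minimize a KL divergence between its own temperature-scaled output and this target, alongside the usual supervised segmentation loss.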
ZPD-guided adversarial learning for safety-critical autonomous driving
IF 7.5 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-08 DOI: 10.1016/j.eswa.2026.131547
Wei Wu, Xiaohui Hou, Minggang Gan, Jie Chen
Ensuring the safety and robustness of autonomous vehicles (AVs) in complex and safety-critical driving scenarios remains a fundamental challenge in the advancement of autonomous driving technology. Traditional training methods often exhibit limitations in coping with uncertainty and rare extreme events encountered in real-world driving environments. To address these challenges, this paper proposes an adversarial learning framework guided by the Zone of Proximal Development (ZPD), aiming to enhance the adaptability and robustness of autonomous driving decision-making policies in complex environments. Specifically, the proposed approach embeds ZPD-inspired guidance into adversarial learning to generate safety-critical traffic interactions that are both extreme and learnable. To regulate adversarial behaviors and maintain a balance between challenge and solvability, the framework incorporates structured constraints based on the Ideal Return Ceiling (IRC) and fine-grained collision severity modeling. Furthermore, a Vehicle Potential Threat Level (VPTL) mechanism is employed to adaptively adjust adversarial training difficulty in accordance with the evolving capability of the ego vehicle, thereby facilitating continuous learning and policy adaptation. Experimental results indicate that, compared with representative baseline methods such as SAC and TD3, the proposed approach reduces the Damage Index by approximately 20-40% across a wide range of evaluation settings, while simultaneously lowering collision severity and maintaining task executability. These results suggest that the proposed framework provides a viable approach for improving safety-oriented learning behavior in complex traffic environments.
{"title":"ZPD-guided adversarial learning for safety-critical autonomous driving","authors":"Wei Wu ,&nbsp;Xiaohui Hou ,&nbsp;Minggang Gan ,&nbsp;Jie Chen","doi":"10.1016/j.eswa.2026.131547","DOIUrl":"10.1016/j.eswa.2026.131547","url":null,"abstract":"<div><div>Ensuring the safety and robustness of autonomous vehicles (AVs) in complex and safety–critical driving scenarios remains a fundamental challenge in the advancement of autonomous driving technology. Traditional training methods often exhibit limitations in coping with uncertainty and rare extreme events encountered in real-world driving environments. To address these challenges, this paper proposes an adversarial learning framework guided by the Zone of Proximal Development (ZPD), aiming to enhance the adaptability and robustness of autonomous driving decision-making policies in complex environments. Specifically, the proposed approach embeds ZPD-inspired guidance into adversarial learning to generate safety–critical traffic interactions that are both extreme and learnable. To regulate adversarial behaviors and maintain a balance between challenge and solvability, the framework incorporates structured constraints based on the Ideal Return Ceiling (IRC) and fine-grained collision severity modeling. Furthermore, a Vehicle Potential Threat Level (VPTL) mechanism is employed to adaptively adjust adversarial training difficulty in accordance with the evolving capability of the ego vehicle, thereby facilitating continuous learning and policy adaptation. Experimental results indicate that, compared with representative baseline methods such as SAC and TD3, the proposed approach reduces the Damage Index by approximately 20–40% across a wide range of evaluation settings, while simultaneously lowering collision severity and maintaining task executability. 
These results suggest that the proposed framework provides a viable approach for improving safety-oriented learning behavior in complex traffic environments.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"313 ","pages":"Article 131547"},"PeriodicalIF":7.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
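The role of the VPTL mechanism — keeping adversarial scenarios hard enough to teach but not so hard that learning stalls — can be sketched as a simple success-rate-driven difficulty controller. The target rate, step size, and bounds below are illustrative assumptions, not the paper's actual update rule:

```python
def adapt_difficulty(difficulty, ego_success_rate,
                     target=0.6, step=0.05, lo=0.0, hi=1.0):
    """Toy ZPD-style curriculum controller: raise the adversary's
    threat level when the ego policy succeeds more often than a
    target rate, lower it when the ego struggles, and clamp to a
    learnable band. All constants are illustrative assumptions."""
    if ego_success_rate > target:
        difficulty += step   # ego is comfortable: harden scenarios
    elif ego_success_rate < target:
        difficulty -= step   # ego is failing: ease scenarios
    return min(hi, max(lo, difficulty))
```

Called once per evaluation window, such a controller keeps the adversary's difficulty tracking the ego vehicle's evolving capability, which is the ZPD intuition the abstract describes.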
Journal: Expert Systems with Applications