
Complex & Intelligent Systems: Latest Articles

PHOENIX: A Hybrid Metaheuristic Framework for Multi-UAV Collaborative Trajectory Planning in Complex Three-Dimensional Environments
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-20 | DOI: 10.1007/s40747-025-02196-x
Ershen Wang, Haolong Xu, Guipeng Ji, Tengli Yu, Song Xu, Fei Liu, Fan Li
Citations: 0
IRG-ResNet: distillation model for corn disease recognition
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-18 | DOI: 10.1007/s40747-025-02203-1
Shaoqiu Zhu, Lujie Bai, Haitao Gao
Citations: 0
StockCI: a hybrid model integrating CEEMDAN and informer for enhanced long-term stock price forecasting
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-18 | DOI: 10.1007/s40747-025-02209-9
Mo-Ce Gao
Citations: 0
COLIN: complementary and competitive balanced learning network for multi-modal multi-label emotion recognition
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-18 | DOI: 10.1007/s40747-025-02198-9
Xiaoyu Liu, Ting Wang, Aixiang Cui, Xiaowen Zhang
Citations: 0
Physics-informed neural network and momentum contrastive learning for battery state of health estimation
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-17 | DOI: 10.1007/s40747-025-02194-z
Jiwoo Jung, Yipene Cedric Francois Bassole, Yunsick Sung
Citations: 0
Ulcod-net: an ultra-lightweight camouflage object detection framework with gated multi-level feature fusion and dual-constraint refinement
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-16 | DOI: 10.1007/s40747-025-02201-3
He Xiao, Ziyang Liu, Fugui Luo, Xue Chen, Liping Deng
In resource-constrained environments such as embedded devices, unmanned platforms, and edge computing systems, lightweight camouflage object detection (LCOD) is critical for efficient and accurate target detection, as it facilitates the extraction of discriminative features in challenging scenes where the target blends visually into the background. Existing LCOD models reduce computational demands but often struggle to balance detection accuracy and parameter efficiency in complex scenarios. To address this, we propose ULCOD-Net, an ultra-lightweight COD framework that integrates gate-based multi-level feature fusion with dual-constraint (boundary and region) refinement. Specifically, we introduce a lightweight boundary-region decoder (LBRD) that leverages initial region and boundary cues to enhance object localization. A gate-based multi-level feature fusion module (GMFFM) enables multi-level feature interaction via an attention-based gating mechanism, improving global information propagation and compensating for the limited capacity of lightweight networks. Additionally, a region-constrained feature refinement module (RFRM) progressively refines multi-layer features to produce high-quality camouflage maps. Extensive experiments on four benchmark datasets demonstrate that ULCOD-Net, with only 2.5 million (M) parameters and a computational complexity of 3.1 giga (G), achieves F-measure scores of 0.837, 0.758, 0.714, and 0.787 on CHAMELEON, CAMO, COD10K, and NC4K, respectively, outperforming existing lightweight COD models and even surpassing several state-of-the-art heavyweight methods. These results highlight ULCOD-Net's significant potential for real-time applications in resource-limited settings.
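The F-measure scores quoted above can be read against the general F-beta definition. The sketch below is illustrative only: the abstract does not say which beta or binarization threshold the authors use, so `beta_sq=0.3` (a common choice in saliency-style benchmarks) and `threshold=0.5` are assumptions, and `f_measure` is a hypothetical helper, not the paper's evaluation code.

```python
def f_measure(pred, target, beta_sq=0.3, threshold=0.5):
    """F-beta score between a flat predicted probability map and a flat binary mask.

    beta_sq and threshold are assumed defaults, not values taken from the paper.
    """
    # Binarize the prediction at the chosen threshold.
    binary = [1 if p >= threshold else 0 for p in pred]
    tp = sum(1 for b, t in zip(binary, target) if b == 1 and t == 1)
    fp = sum(1 for b, t in zip(binary, target) if b == 1 and t == 0)
    fn = sum(1 for b, t in zip(binary, target) if b == 0 and t == 1)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 4-pixel example: two true positives, one false positive, no misses.
score = f_measure([0.9, 0.8, 0.2, 0.6], [1, 1, 0, 0])
```

With `beta_sq` below 1, precision is weighted more heavily than recall, which is why a single false positive lowers the toy score noticeably even though every target pixel is found.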
Citations: 0
Enhancing traffic flow prediction through multi-view attention mechanism and dilated convolutional networks
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-15 | DOI: 10.1007/s40747-025-02146-7
Wei Li, Hao Wei, Xin Liu, Jialin Liu, Dazhi Zhan, Xiao Han, Wei Tao
Accurate traffic flow forecasting serves as a cornerstone for intelligent transportation systems, enabling proactive accident prevention and metropolitan mobility optimization. However, existing approaches face fundamental limitations in modeling the spatiotemporal heterogeneity of traffic dynamics, particularly in simultaneously addressing (1) the decaying significance of temporal dependencies across input sequences and prediction horizons, (2) multi-scale spatial interactions spanning local congestion patterns and global functional correlations, and (3) inter-sample temporal variance in evolving traffic states. To address these limitations, this paper proposes MVA-DCNet (Multi-View Attention Dilated Convolutional Network), a novel deep learning architecture incorporating a multidimensional temporal analysis framework that examines temporal influence mechanisms from three complementary perspectives: inter-sample variance, intra-sequence temporal importance, and output sequence temporal propagation. The proposed model addresses temporal data heterogeneity through three corresponding mechanisms: variance-aware data augmentation, adaptive temporal attention, and decaying loss weighting. To improve spatial correlation modeling, we develop a dilated convolutional architecture with wider receptive field coverage and multi-scale spatial pattern recognition capabilities. Empirical validation on two urban traffic datasets demonstrates superior efficacy in capturing complex spatiotemporal evolution patterns, achieving relative reductions of 12.7% and 9.3% in Root Mean Square Error (RMSE), respectively, compared with state-of-the-art benchmarks.
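One way to picture "decaying loss weighting" is a per-step weight that shrinks with the prediction horizon. The following is a minimal sketch under stated assumptions: the abstract does not give the decay schedule, so the geometric factor `gamma`, the normalization to sum to 1, and the function names `decaying_weights` and `weighted_mse` are all hypothetical.

```python
def decaying_weights(horizon, gamma=0.9):
    """Geometrically decaying weights over prediction steps, normalized to sum to 1.

    The geometric form and gamma are assumptions; the paper may use another schedule.
    """
    raw = [gamma ** step for step in range(horizon)]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mse(pred, target, weights):
    """Squared error per prediction step, combined with the given step weights."""
    return sum(w * (p - t) ** 2 for w, p, t in zip(weights, pred, target))


# Three-step forecast where only the farthest step is wrong by 1.0.
weights = decaying_weights(3, gamma=0.5)
loss = weighted_mse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0], weights)
```

Under this weighting an error at the last step contributes less to the loss than the same error at the first step, reflecting the idea that near-term predictions matter more during training.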
Citations: 0
ReqNet: an LLM-driven computational framework for automated requirements extraction from unstructured documents
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-15 | DOI: 10.1007/s40747-025-02143-w
Summra Saleem, Muhammad Nabeel Asim, Andreas Dengel
Within the software development life cycle, requirements guide the entire development process from inception to completion by ensuring alignment between stakeholder expectations and the final product. Extracting requirements from miscellaneous information is a challenging and complex task. Manual extraction is not only prone to human error but also increases project costs and delays project timelines. To automate the requirements extraction process, researchers have investigated the potential of deep learning (DL) architectures, large language models (LLMs), and generative language models such as ChatGPT and Gemini. However, the performance of requirements extraction could be further enhanced by developing predictive pipelines that combine the potential of language models and deep learning architectures. To this end, this study presents the ReqNet framework. The framework encompasses 7 of the most widely used LLMs, including their variants (small, large, Xlarge, XXlarge), and 2 DL architectures (LSTM, GRU). It facilitates the development of three distinct types of predictive pipelines: standalone LLMs, LLMs + external classifiers, and an ensemble of multiple LLM representations + external classifiers. Extensive experimentation with 48 predictive pipelines across 2 public core datasets and 1 independent test set demonstrates that predictive pipelines built from LLMs and DL architectures generally exhibit superior performance compared to pipelines reliant solely on LLMs. In addition, an ensemble of three distinct LLMs (ALBERT, BERT, and XLNet) with an LSTM classifier achieved a 3% improvement in F1-score over state-of-the-art predictors on the PURE dataset, a 10% improvement on the Dronology dataset, and a 3% improvement on the RFI independent test set.
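The third pipeline type, an ensemble of several model representations feeding an external classifier, can be pictured roughly as follows. This is a toy sketch, not the ReqNet implementation: the three `encoder_*` functions are hypothetical stand-ins for real language-model encoders, fusing by concatenation is an assumption, and a simple linear rule replaces the LSTM classifier named in the abstract.

```python
from typing import Callable, List, Sequence

Encoder = Callable[[str], List[float]]


def encoder_a(text: str) -> List[float]:
    """Hypothetical stand-in encoder: character and word counts."""
    return [float(len(text)), float(text.count(" ") + 1)]


def encoder_b(text: str) -> List[float]:
    """Hypothetical stand-in encoder: presence of the modal verb 'shall'."""
    return [float("shall" in text.lower())]


def encoder_c(text: str) -> List[float]:
    """Hypothetical stand-in encoder: whether the sentence ends with a period."""
    return [float(text.strip().endswith("."))]


def fused_representation(text: str, encoders: Sequence[Encoder]) -> List[float]:
    """Concatenate the vectors produced by every encoder (assumed fusion scheme)."""
    features: List[float] = []
    for encode in encoders:
        features.extend(encode(text))
    return features


def linear_classifier(features: Sequence[float], weights: Sequence[float], bias: float) -> bool:
    """External classifier placeholder: True means 'looks like a requirement'."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return score > 0


sentence = "The system shall log errors."
features = fused_representation(sentence, [encoder_a, encoder_b, encoder_c])
is_requirement = linear_classifier(features, [0.0, 0.0, 1.0, 0.0], bias=-0.5)
```

The point of the structure is that the classifier sees one joint feature vector, so it can weigh evidence from each encoder differently rather than averaging their separate predictions.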
Citations: 0
Seed perception learning for weakly supervised semantic segmentation
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-15 | DOI: 10.1007/s40747-025-02152-9
Wanchun Sun, Shujia Li, Xinyu Duan
The core challenge in image-level weakly supervised semantic segmentation lies in generating high-quality object localization maps from simple image labels. Class Activation Maps (CAMs) produced by existing methods commonly suffer from two major flaws: incomplete coverage of target regions and severe background interference. To address these issues, we present a CAM-native perception-optimization framework for weakly supervised semantic segmentation. First, we design a CAM generation mechanism guided by image-level weak supervision, which refines activated regions via discriminative region enhancement and spatial noise suppression. This process promotes fine-grained pixel clustering and improves the completeness of object localization. Second, we introduce a spatial cue generator to enhance the adaptability of class representations, coupled with an inter-class relation propagation module that explicitly models inter-class relationships to suppress erroneous activations and significantly reduce spatial noise. Additionally, we incorporate a dynamic contrastive matching strategy to eliminate background activations closely associated with the target object, ultimately producing class activation maps that are both complete and compact. Extensive experiments on PASCAL VOC 2012 and MS COCO 2014 show that our method substantially outperforms existing weakly supervised approaches, validating the effectiveness of class-aware guidance and inter-class relational modeling in improving segmentation accuracy.
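For readers unfamiliar with the baseline, a plain class activation map is a weighted sum of the final convolutional feature maps, using the classifier weights of one class. The sketch below shows only that standard baseline, not the refinement modules proposed in the paper; the ReLU and max-normalization steps are a common convention assumed here, and `class_activation_map` is an illustrative name.

```python
def class_activation_map(feature_maps, class_weights):
    """Baseline CAM: weighted sum of K feature maps (each H x W) for one class.

    Negative responses are clipped and the map is scaled to [0, 1]; both steps
    are a common convention assumed for this sketch.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # Keep only positive evidence for the class.
    cam = [[max(value, 0.0) for value in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Two 2x2 feature maps; the second one counts against the class.
maps = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
cam = class_activation_map(maps, [1.0, -1.0])
```

Because the map only highlights locations that push the class score up, it tends to cover the most discriminative parts of an object, which is the incomplete-coverage flaw the abstract describes.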
Citations: 0
Llm-ga: A gradient-based multi-label adversarial attack by large language models
IF 5.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-12 | DOI: 10.1007/s40747-025-02184-1
Yujiang Liu, Yamin Hu, Zhijian Chen, Shiyin Wang, Wenjian Luo
Citations: 0