Selective Multiple Classifiers for Weakly Supervised Semantic Segmentation
Zilin Guo, Dongyue Wu, Changxin Gao, Nong Sang
CAAI Transactions on Intelligence Technology, 10(6), 1688-1702. DOI: 10.1049/cit2.70042. Published 2025-08-24.

Existing weakly supervised semantic segmentation (WSSS) methods based on image-level labels typically rely on class activation maps (CAMs), which measure the relationships between features and classifiers. However, CAMs focus only on the most discriminative regions of images, resulting in poor coverage. We attribute this to the limited recognition ability of a single classifier and to the negative impact of feature magnitudes during CAM normalisation. To address these issues, we propose to construct selective multiple classifiers (SMC). During training, we extract multiple prototypes for each class and store them in a corresponding memory bank. These prototypes are divided into foreground and background prototypes: the former identify foreground objects, while the latter prevent the false activation of background pixels. At inference, multiple prototypes are adaptively selected from the memory bank for each image to form the SMC, and CAMs are then generated by measuring the angle between the SMC and the features. Adaptively constructing multiple classifiers for each image enhances recognition ability, while relying solely on angle measurement to generate CAMs alleviates the suppression caused by magnitudes. Furthermore, SMC can be integrated into other WSSS approaches to help generate better CAMs. Extensive experiments on standard WSSS benchmarks, PASCAL VOC 2012 and MS COCO 2014, demonstrate the superiority of the proposed method.
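The core idea of angle-only activation can be sketched in a few lines: if both the per-pixel features and the selected prototypes are L2-normalised, their inner product is the cosine of the angle between them, so feature magnitude drops out of the map entirely. The sketch below is illustrative only, under assumed shapes; the function name `angle_cam` and the max-over-prototypes aggregation are our assumptions, not the paper's exact formulation.

```python
import numpy as np

def angle_cam(features, prototypes):
    """Illustrative angle-only CAM sketch (not the paper's exact method).

    features:   (C, H, W) feature map for one image.
    prototypes: (K, C) adaptively selected class prototypes (the "SMC").
    Returns an (H, W) activation map built purely from cosine similarity
    (i.e. the angle), so feature magnitudes cannot suppress activations.
    """
    C, H, W = features.shape
    feats = features.reshape(C, -1)                                       # (C, H*W)
    feats = feats / (np.linalg.norm(feats, axis=0, keepdims=True) + 1e-8) # unit-norm pixels
    protos = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    sims = protos @ feats                                                 # (K, H*W) cosines
    return sims.max(axis=0).reshape(H, W)                                 # best prototype per pixel
```

Because every value is a cosine, the map is bounded in [-1, 1] before any min-max rescaling, which is the property the abstract appeals to.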
Hybrid Distributed and Decentralised Reinforcement Learning for Formation Control of Multi-Robots With Obstacle Avoidance
Yaoqian Peng, Xinglong Zhang, Haibin Xie, Xin Xu
CAAI Transactions on Intelligence Technology, 10(5), 1337-1349. DOI: 10.1049/cit2.70002. Published 2025-08-21.

Recently, learning-based control for multi-robot systems (MRS) with obstacle avoidance has received increasing attention. Because the goals of formation control and obstacle avoidance are intrinsically coupled, developing a safe and near-optimal control policy with an actor-critic structure is challenging. Therefore, a hybrid distributed and decentralised asynchronous actor-critic reinforcement learning (Di-De-RL) technique is proposed to address this problem. First, the integrated formation control and collision avoidance problem is decomposed into two successive subproblems. To solve them, we design a distributed reinforcement learning (Di-RL) algorithm that employs a neural-network-based actor-critic structure for formation control, and a decentralised RL (De-RL) algorithm that incorporates a potential-field (PF)-based actor-critic structure for collision avoidance. In Di-RL, the actor-critic pairs are trained in a distributed manner to achieve near-optimal consensus formation control. With the trained Di-RL policy fixed, the PF actor-critic pairs in De-RL are then trained in a decentralised manner for safe collision avoidance. This asynchronous training of the hybrid Di-RL and De-RL stages enables weight convergence and control safety during learning. Simulated and real-world experiments demonstrate the effectiveness and improved performance of the approach for formation control with both static and dynamic obstacle avoidance, highlighting its advantages in resolving the conflict between the safety objective and optimal control.
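For readers unfamiliar with the potential-field building block that the De-RL actor is based on, the classic repulsive term looks as follows. This is a textbook sketch, not the paper's actor parameterisation; the function name and the specific potential U = 0.5 * gain * (1/d - 1/r)^2 are our assumptions.

```python
import numpy as np

def repulsive_action(pos, obstacle, safe_radius, gain=1.0):
    """Textbook repulsive potential-field term (illustrative only).

    Returns the negative gradient of U = 0.5 * gain * (1/d - 1/r)^2,
    which pushes the robot away from an obstacle once it enters the
    safety radius r, and has no influence outside it.
    """
    diff = pos - obstacle
    d = np.linalg.norm(diff)
    if d >= safe_radius or d == 0.0:
        return np.zeros_like(pos)          # outside the influence region
    # Magnitude grows without bound as d -> 0, guaranteeing avoidance.
    return gain * (1.0 / d - 1.0 / safe_radius) * (1.0 / d**2) * (diff / d)
```

A PF-based actor can treat such a term as a structured prior and learn residual corrections on top of it, which is one way to reconcile the safety objective with near-optimal control.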
A Prior Causality-Guided Multi-View Diffusion Network for Brain Disorder Classification
Xubin Wu, Yan Niu, Xia Li, Jie Xiang, Yidi Li
CAAI Transactions on Intelligence Technology, 10(6), 1731-1744. DOI: 10.1049/cit2.70046. Published 2025-08-19.

Functional brain networks have been used to diagnose brain disorders such as autism spectrum disorder (ASD) and attention-deficit/hyperactivity disorder (ADHD). However, existing methods not only fail to fully consider the various levels of interaction between brain regions, but also limit the transmission of information among unconnected regions, resulting in node information loss and bias. To address these issues, we propose a causality-guided multi-view diffusion (CG-MVD) network, which more comprehensively captures node information that is difficult to observe when aggregating direct neighbours alone. Specifically, our approach designs multi-view brain graphs and multi-hop causality graphs to represent multi-level node interactions and to guide the diffusion of interaction information. Building on this, a multi-view diffusion graph attention module is proposed to learn multi-dimensional node embeddings by broadening the interaction range and extending the receptive field. Additionally, we propose a bilinear adaptive fusion module that generates and fuses connectivity-based features, addressing the challenge of high-dimensional node-level features and integrating richer feature information to enhance classification. Experimental results on the ADHD-200 and ABIDE-I datasets demonstrate the effectiveness of the CG-MVD network, which achieves average accuracies of 79.47% and 80.90%, respectively, surpassing state-of-the-art methods.
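The multi-hop diffusion idea, letting a node receive information from regions beyond its direct neighbours, can be illustrated with plain adjacency-matrix propagation. The sketch below is a minimal, attention-free stand-in under assumed shapes; the function name `multi_hop_features` and the uniform averaging over hops are our assumptions, not CG-MVD's attention-weighted scheme.

```python
import numpy as np

def multi_hop_features(X, A, hops=2):
    """Minimal multi-hop diffusion sketch (not CG-MVD's attention module).

    X: (N, F) node features; A: (N, N) non-negative adjacency.
    Each hop applies the row-normalised adjacency (with self-loops),
    so after k hops a node has aggregated information from its
    k-hop neighbourhood; the hop outputs are averaged.
    """
    X = np.asarray(X, dtype=float)
    A_hat = A + np.eye(A.shape[0])                    # self-loops keep own features
    A_hat = A_hat / A_hat.sum(axis=1, keepdims=True)  # row-normalise to a diffusion operator
    out, H = np.zeros_like(X), X
    for _ in range(hops):
        H = A_hat @ H                                 # one further hop of diffusion
        out += H
    return out / hops
```

In CG-MVD the multi-hop causality graphs play the role of A here, dictating along which edges the diffusion is allowed to travel, while attention replaces the uniform row-normalised weights.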