You He, Shulan Ruan, Dong Wang, Huchuan Lu, Zhi Li, Yang Liu, Xu Chen, Shaohui Li, Jie Zhao, Jiaxuan Liang
With the rapid development of large AI models, large decision models have further broken through the limits of human cognition and promoted the innovation of decision-making paradigms in fields as diverse as medicine and transportation. In this paper, we systematically expound the intelligent decision-making technology and prospects driven by large AI models. Specifically, we first review the development of large AI models in recent years. Then, from the perspective of methods, we introduce important theories and technologies of large decision models, such as model architecture and model adaptation. Next, from the perspective of applications, we introduce the cutting-edge applications of large decision models in various fields, such as autonomous driving and knowledge decision-making. Finally, we discuss existing challenges, such as security issues, decision bias and hallucination phenomena, as well as future prospects, from the perspectives of both technology development and domain applications. We hope this review can help researchers understand the important progress of intelligent decision-making driven by large AI models.
"Intelligent Decision-Making Driven by Large AI Models: Progress, Challenges and Prospects". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1573-1592, published 2025-11-18. DOI: 10.1049/cit2.70084. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70084
Jinfu Liu, Zhongzien Jiang, Xinhua Xu, Wenhao Li, Mengyuan Liu, Hong Liu
Indoor scene semantic segmentation is essential for enabling robots to understand and interact with their environments effectively. However, numerous challenges remain unresolved, particularly in single-robot systems, which often struggle with the complexity and variability of indoor scenes. To address these limitations, we introduce a novel multi-robot collaborative framework based on multiplex interactive learning (MPIL) in which each robot specialises in a distinct visual task within a unified multitask architecture. During training, the framework employs task-specific decoders and cross-task feature sharing to enhance collaborative optimisation. At inference time, robots operate independently with optimised models, enabling scalable, asynchronous and efficient deployment in real-world scenarios. Specifically, MPIL employs specially designed modules that integrate RGB and depth data, refine feature representations and facilitate the simultaneous execution of multiple tasks, such as instance segmentation, scene classification and semantic segmentation. By leveraging these modules, distinct agents within multi-robot systems can effectively handle specialised tasks, thereby enhancing the overall system's flexibility and adaptability. This collaborative effort maximises the strengths of each robot, resulting in a more comprehensive understanding of environments. Extensive experiments on two public benchmark datasets demonstrate MPIL's competitive performance compared to state-of-the-art approaches, highlighting the effectiveness and robustness of our multi-robot system in complex indoor environments.
"Multi-Robot Collaborative Complex Indoor Scene Segmentation via Multiplex Interactive Learning". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1646-1660, published 2025-11-18. DOI: 10.1049/cit2.70066. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70066
Shumeng He, Jie Shen, Houqun Yang, Gaodi Xu, Laurence T. Yang
Change detection identifies dynamic changes in surface cover and feature status by comparing remote sensing images taken at different points in time, and has wide application value in fields such as disaster early warning, urban management and ecological monitoring. Mainstream datasets are dominated by long-term imagery; to support short-term change detection, we collected a new dataset, HNU-CD (Hainan University change detection), which contains small and hard-to-identify change regions. A temporal correlation network (TCNet) is also proposed to address these challenges. First, foreground information is enhanced by interactively modelling foreground relations, while background noise is smoothed. Secondly, the temporal correlation between bi-temporal images is utilised to refine the feature representation and minimise false alarms due to irrelevant changes. Finally, a U-Net-inspired architecture is adapted for dense upsampling to preserve details. TCNet demonstrates excellent performance on both the HNU-CD dataset and three widely used public datasets, indicating enhanced generalisation capability. The ablation experiments demonstrate that temporal correlation modelling reduces the impact caused by pseudo-variation.
"A Temporal Correlation Networks Based on Interactive Modelling for Remote Sensing Images Change Detection". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1904-1918, published 2025-11-17. DOI: 10.1049/cit2.70080. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70080
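For contrast with learnt methods such as TCNet, the simplest bi-temporal change-detection baseline is a per-pixel absolute difference followed by a threshold. The grey-scale images and threshold below are toy values for illustration only.

```python
def change_map(img_a, img_b, threshold=0.2):
    """Per-pixel absolute difference between two co-registered grey-scale
    images, thresholded into a binary change mask. This naive baseline is
    what learnt methods aim to improve on: it cannot distinguish real
    surface change from irrelevant variation (lighting, season, noise)."""
    return [[1 if abs(a - b) > threshold else 0
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]
```

Pseudo-variation such as a small illumination shift (here, a 0.05 intensity change) stays below the threshold, while a genuine change exceeds it.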
Yixing Lan, Xin Xu, Jiahang Liu, Xinglong Zhang, Yang Lu, Long Cheng
Reinforcement learning (RL) has been widely studied as an efficient class of machine learning methods for adaptive optimal control under uncertainties. In recent years, the applications of RL in optimised decision-making and motion control of intelligent vehicles have received increasing attention. Due to the complex and dynamic operating environments of intelligent vehicles, it is necessary to improve the learning efficiency and generalisation ability of RL-based decision and control algorithms under different conditions. This survey systematically examines the theoretical foundations, algorithmic advancements and practical challenges of applying RL to intelligent vehicle systems operating in complex and dynamic environments. The major algorithm frameworks of RL are first introduced, and the recent advances in RL-based decision-making and control of intelligent vehicles are overviewed. In addition to self-learning decision and control approaches using state measurements, the developments of deep reinforcement learning (DRL) methods for end-to-end driving control of intelligent vehicles are summarised. The open problems and directions for future research are also discussed.
"A Survey on Reinforcement Learning for Optimal Decision-Making and Control of Intelligent Vehicles". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1593-1615, published 2025-11-06. DOI: 10.1049/cit2.70073. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70073
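As a concrete illustration of the basic self-learning loop that such surveys build on, here is a minimal tabular Q-learning sketch on a toy state chain. The environment, reward and hyperparameters are invented for illustration and are not taken from the survey.

```python
import random

def q_learning(env_step, n_states, n_actions, episodes=300,
               alpha=0.1, gamma=0.99, epsilon=0.2, seed=0):
    """Tabular Q-learning: the self-learning loop underlying many RL methods."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s2, r, done = env_step(s, a)
            # temporal-difference update towards the bootstrapped target
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy 3-state chain (hypothetical): action 1 moves right, action 0 stays;
# reaching the last state yields reward 1 and ends the episode.
def chain_step(s, a):
    s2 = min(s + 1, 2) if a == 1 else s
    done = (s2 == 2)
    return s2, (1.0 if done else 0.0), done
```

Running `q_learning(chain_step, 3, 2)` learns to prefer action 1 (move right) in both non-terminal states; DRL methods replace the Q-table with a neural network over raw sensor inputs.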
Qilong Yuan, Enze Shi, Di Zhu, Xiaoshan Zhang, Kui Zhao, Dingwen Zhang, Tianming Liu, Shu Zhang
Electroencephalography (EEG) is a widely used neuroimaging technique for decoding brain states. The transformer architecture is gaining attention in EEG signal decoding due to its powerful ability to capture global features. However, relying solely on a single feature extracted by a traditional transformer model is insufficient for the domain shift problem caused by the temporal variability and complexity of EEG signals. In this paper, we propose a novel Transferable Fusion Multi-band EEG Transformer (TF-MEET) to enhance cross-session decoding of EEG signals. TF-MEET consists of three parts: (1) the EEG signals are transformed into spatial images and band images; (2) an encoder extracts spatial features and band features from the two types of images, and comprehensive fusion features are obtained through a weight-adaptive fusion module; (3) cross-session EEG decoding is achieved by aligning the joint distribution of features and categories across domains through multi-loss domain adversarial training. Experimental results demonstrate that (1) TF-MEET outperforms other advanced transfer learning methods on two public EEG emotion recognition datasets, SEED and SEED_IV, achieving an accuracy of 91.68% on SEED and 76.21% on SEED_IV; (2) TF-MEET demonstrates the effectiveness of the transferable fusion module; (3) TF-MEET can identify explainable activation areas in the brain. We demonstrate that TF-MEET captures comprehensive, transferable and interpretable features from EEG signals and performs well in cross-session EEG decoding, which can promote the development of brain–computer interface systems.
"TF-MEET: A Transferable Fusion Multi-Band Transformer for Cross-Session EEG Decoding". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1799-1812, published 2025-11-04. DOI: 10.1049/cit2.70056. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70056
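The weight-adaptive fusion idea can be illustrated with a minimal sketch: two feature views (e.g. spatial and band features) are combined with softmax-normalised weights. The function name and the fixed logits are hypothetical; in TF-MEET the fusion weights would be learnt jointly with the encoder.

```python
import math

def adaptive_fusion(feat_a, feat_b, logit_a, logit_b):
    """Fuse two feature views with softmax-normalised weights.
    In a trained model the logits would come from a small learnt head;
    here they are passed in directly for illustration."""
    m = max(logit_a, logit_b)                       # for numerical stability
    ea, eb = math.exp(logit_a - m), math.exp(logit_b - m)
    wa, wb = ea / (ea + eb), eb / (ea + eb)
    return [wa * a + wb * b for a, b in zip(feat_a, feat_b)]
```

With equal logits the two views contribute equally; as one logit grows, its view dominates the fused representation.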
Imran Khan, Javed Rashid, Anwar Ghani, Muhammad Shoaib Saleem, Muhammad Faheem, Humera Khan
Secure and automated sharing of medical information in a standard format among stakeholders such as patients, hospitals, doctors, law enforcement agencies and health insurance companies has always been a challenging problem. Current methods for ensuring compliance with medical privacy laws require specialists who are deeply familiar with these laws' complex requirements to verify the lawful exchange of medical information. This article introduces a Smart Medical Data Exchange Engine (SMDEE) designed to automate the extraction of logical rules from medical privacy legislation using advanced techniques. These rules facilitate the secure extraction of information, safeguarding patient privacy and confidentiality. In addition, SMDEE can generate standardised clinical documents according to Health Level 7 (HL7) standards and standardise the nomenclature of requested medical data, enabling accurate decision-making when accessing patient data. All access requests to patient information are processed through SMDEE to ensure authorised access. The proposed system's efficacy is evaluated using the Health Insurance Portability and Accountability Act (HIPAA), a fundamental privacy law in the United States. However, SMDEE's flexibility allows its application worldwide, accommodating various medical privacy laws. Beyond facilitating global information exchange, SMDEE aims to enhance the timely and appropriate treatment of international patients.
"Access and Privacy Control for Healthcare Decision Support System: A Smart Medical Data Exchange Engine (SMDEE)". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1616-1632, published 2025-10-31. DOI: 10.1049/cit2.70077. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70077
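A minimal sketch of the default-deny, rule-based access check such an engine could perform once logical rules have been extracted from a privacy law. The rule format, role names and data categories below are hypothetical simplifications, not SMDEE's actual representation or the HIPAA rule set.

```python
# Hypothetical rule format: each extracted rule permits a requester role to
# access one data category for one stated purpose; "*" is a wildcard.
# Any request not matched by some rule is denied by default.
RULES = [
    {"role": "doctor", "category": "clinical_notes", "purpose": "treatment"},
    {"role": "insurer", "category": "billing", "purpose": "payment"},
    {"role": "patient", "category": "*", "purpose": "*"},
]

def is_authorised(role, category, purpose, rules=RULES):
    """Default-deny check of an access request against the extracted rules."""
    for r in rules:
        if (r["role"] == role
                and r["category"] in ("*", category)
                and r["purpose"] in ("*", purpose)):
            return True
    return False
```

Under this toy rule set a doctor may read clinical notes for treatment, while an insurer requesting the same data for the same purpose is refused.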
Continual learning aims to empower a model to learn new tasks continuously while reducing forgetting so as to retain previously learnt knowledge. When receiving streaming data that are not constrained by the independent and identically distributed (IID) assumption, continual learning transforms and leverages previously learnt knowledge through various methodologies to complete the learning of new tasks, enhancing the model's generalisation performance and learning efficiency over a sequence of tasks. However, class imbalance in continual learning scenarios critically undermines model performance. In particular, in the class-incremental scenario, class imbalance biases the model towards new task classes while degrading performance on previously learnt classes, leading to catastrophic forgetting. In this paper, a novel method based on balanced contrast is proposed for class-incremental continual learning. The method utilises gradient balancing to mitigate the impact of class imbalance in the class-incremental scenario, and leverages contrastive learning and gradient modifications to facilitate balanced processing of data across different classes. The proposed method surpasses existing baseline approaches in the class-incremental learning scenario on standard image datasets such as CIFAR-100, CIFAR-10 and mini-ImageNet. The results reveal that the proposed method effectively mitigates catastrophic forgetting of previously learnt classes, markedly improving the efficacy of continual learning and offering a powerful solution for further advancing continual learning performance.
Shiqi Yu, Luojun Lin, Yuanlong Yu. "Balanced Contrast Class-Incremental Learning". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1867-1879, published 2025-10-26. DOI: 10.1049/cit2.70060. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70060
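One common way to balance gradients between over- and under-represented classes is to re-weight the loss by the "effective number" of samples per class. This is a generic sketch of that re-weighting idea under the stated assumptions, not the paper's specific balanced-contrast formulation.

```python
def class_balanced_weights(counts, beta=0.999):
    """Per-class loss weights from the 'effective number' of samples,
    so that rare (old) classes contribute gradients comparable to
    frequent (new) classes. Weights are normalised to sum to len(counts)."""
    effective = [(1.0 - beta ** n) / (1.0 - beta) for n in counts]
    raw = [1.0 / e for e in effective]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]
```

For a class-incremental split with 1000 new-class samples and only 10 replayed old-class samples, the old class receives a much larger weight, counteracting the bias towards new classes.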
Class-incremental learning studies the problem of continually learning new classes from data streams. However, networks suffer from catastrophic forgetting, losing past knowledge when acquiring new knowledge. Among the different approaches, replay methods have shown exceptional promise for this challenge, but their performance is still limited in two respects: (i) imbalanced data distribution and (ii) semantic inconsistency between networks. First, owing to the limited memory buffer, there is an imbalance between old and new classes. Direct optimisation skews the feature space towards new classes, degrading performance on old classes. Second, existing methods normally leverage the previous network to regularise the present network. However, the previous network was not trained on the new classes, which means the two networks are semantically inconsistent, leading to misleading guidance. To address these two problems, we propose BCSD (BiaMix contrastive learning and memory similarity distillation). For the imbalanced distribution, we design Biased MixUp, in which mixed samples carry high weight from old classes and low weight from new classes, so the network learns to push decision boundaries towards new classes. We further leverage label information to construct contrastive learning to ensure discriminability. For the semantic inconsistency, we distill knowledge from the previous network by capturing the similarity of the new classes in the current task to the old classes in the memory buffer and transferring that knowledge to the present network. Empirical results on various datasets demonstrate its effectiveness and efficiency.
Mang Ye, Wenke Huang, Zekun Shi, Zhiwei Ye, Bo Du. "BiaMix Contrastive Learning and Memory Similarity Distillation in Class-Incremental Learning". CAAI Transactions on Intelligence Technology, vol. 10, no. 6, pp. 1745-1758, published 2025-10-23. DOI: 10.1049/cit2.70064. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70064
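The Biased MixUp idea (mixed samples carry high weight from old classes and low weight from new ones) can be sketched on feature vectors as follows. The uniform sampling range for the mixing coefficient is an illustrative assumption, not taken from the paper.

```python
import random

def biased_mixup(x_old, x_new, lam_low=0.6, lam_high=0.9, rng=None):
    """Mix an old-class sample with a new-class sample, biasing the mixing
    coefficient towards the old class (lam >= 0.5), so training on the mix
    pushes decision boundaries towards the new classes. The uniform range
    [lam_low, lam_high] is a hypothetical choice for illustration."""
    rng = rng or random.Random(0)
    lam = rng.uniform(lam_low, lam_high)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_old, x_new)]
    return mixed, lam
```

The corresponding label for the mixed sample would be weighted the same way, with weight `lam` on the old class and `1 - lam` on the new class.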
Estimating probability density functions (PDFs) is critical in data analysis, particularly for complex multimodal distributions. Traditional kernel density estimator (KDE) methods often face challenges in accurately capturing multimodal structures due to their uniform weighting scheme, leading to mode loss and degraded estimation accuracy. This paper presents the flexible kernel density estimator (F-KDE), a novel nonparametric approach designed to address these limitations. F-KDE introduces the concept of kernel unit inequivalence, assigning adaptive weights to each kernel unit, which better models local density variations in multimodal data. The method optimises an objective function that integrates estimation error and log-likelihood, using a particle swarm optimisation (PSO) algorithm that automatically determines optimal weights and bandwidths. Through extensive experiments on synthetic and real-world datasets, we demonstrate that (1) the weights and bandwidths in F-KDE stabilise as the optimisation algorithm iterates, (2) F-KDE effectively captures multimodal characteristics and (3) F-KDE outperforms state-of-the-art density estimation methods in terms of accuracy and robustness. The results confirm that F-KDE provides a valuable solution for accurately estimating multimodal PDFs.
{"title":"A Novel Flexible Kernel Density Estimator for Multimodal Probability Density Functions","authors":"Jia-Qi Chen, Yu-Lin He, Ying-Chao Cheng, Philippe Fournier-Viger, Ponnuthurai Nagaratnam Suganthan, Joshua Zhexue Huang","doi":"10.1049/cit2.70063","DOIUrl":"https://doi.org/10.1049/cit2.70063","url":null,"abstract":"<p>Estimating probability density functions (PDFs) is critical in data analysis, particularly for complex multimodal distributions. Traditional kernel density estimator (KDE) methods often face challenges in accurately capturing multimodal structures due to their uniform weighting scheme, leading to mode loss and degraded estimation accuracy. This paper presents the flexible kernel density estimator (F-KDE), a novel nonparametric approach designed to address these limitations. F-KDE introduces the concept of <i>kernel unit inequivalence</i>, assigning adaptive weights to each kernel unit, which better models local density variations in multimodal data. The method optimises an objective function that integrates estimation error and log-likelihood, using a particle swarm optimisation (PSO) algorithm that automatically determines optimal weights and bandwidths. Through extensive experiments on synthetic and real-world datasets, we demonstrated that (1) the weights and bandwidths in F-KDE stabilise as the optimisation algorithm iterates, (2) F-KDE effectively captures the multimodal characteristics and (3) F-KDE outperforms state-of-the-art density estimation methods regarding accuracy and robustness. 
The results confirm that F-KDE provides a valuable solution for accurately estimating multimodal PDFs.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 6","pages":"1759-1782"},"PeriodicalIF":7.3,"publicationDate":"2025-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70063","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145824455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
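The core idea of the F-KDE abstract above, kernel unit inequivalence, can be illustrated with a minimal weighted-KDE sketch: each sample contributes its own weight and bandwidth instead of a uniform 1/n share and a single shared bandwidth. The weights and bandwidths below are hand-picked for a toy bimodal sample rather than produced by the paper's PSO optimisation, and all names are illustrative.

```python
import numpy as np

def weighted_kde(x, samples, weights, bandwidths):
    """Evaluate a weighted KDE at points x.

    Unlike classical KDE (uniform weight 1/n, shared bandwidth h),
    each kernel unit i has its own weight w_i and bandwidth h_i,
    loosely following the kernel-unit-inequivalence idea.
    """
    x = np.asarray(x, dtype=float)[:, None]           # shape (m, 1)
    samples = np.asarray(samples, dtype=float)[None]  # shape (1, n)
    w = np.asarray(weights, dtype=float)
    h = np.asarray(bandwidths, dtype=float)
    # Gaussian kernel per unit, each scaled by its own bandwidth
    k = np.exp(-0.5 * ((x - samples) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    return k @ w                                      # weighted sum over units

# Bimodal toy data with modes near -2 and +2
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-2, 0.3, 50), rng.normal(2, 0.3, 50)])
n = len(samples)
dens = weighted_kde([-2.0, 0.0, 2.0], samples,
                    weights=np.full(n, 1.0 / n),
                    bandwidths=np.full(n, 0.3))
# dens is high near the two modes and low in the valley at 0
```

With uniform weights this reduces to classical KDE; a PSO-style optimiser, as in the paper, would instead search over the `weights` and `bandwidths` vectors jointly to minimise a combined error/log-likelihood objective.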
Photovoltaic (PV) systems are electrical systems designed to convert solar energy into electrical energy. The power output of photovoltaic cells, a crucial component of PV systems, is influenced by harsh weather conditions, panel temperature and solar irradiance. Therefore, accurately identifying the parameters of PV models is essential for simulating, controlling and evaluating PV systems. In this study, we propose an enhanced weighted-mean-of-vectors optimisation (EINFO) for efficiently determining the unknown parameters in PV systems. EINFO introduces a Lambert W-based explicit objective function for the PV model, enhancing the computational accuracy of the algorithm's population fitness. This addresses the challenge of improving the identification accuracy of metaheuristic algorithms for unknown parameters in PV models. We experimentally apply EINFO to three types of PV models (single-diode, double-diode and PV-module models) to validate its accuracy and stability in parameter identification. The results demonstrate that EINFO achieves root mean square errors (RMSEs) of 7.7301E-04, 6.8553E-04 and 2.0608E-03 for the single-diode model, double-diode model and PV-module model, respectively, surpassing those obtained by the original INFO algorithm and other methods in terms of convergence speed, accuracy and stability. Furthermore, comprehensive experimental findings on three commercial PV modules (ST40, SM55 and KC200GT) indicate that EINFO consistently maintains high accuracy across varying temperatures and irradiation levels. In conclusion, EINFO emerges as a highly competitive and practical approach for parameter identification in diverse types of PV models.
{"title":"Parameter Identification of Photovoltaic Models Using an Enhanced INFO Algorithm","authors":"Ying Chen, Peng Min, Huiling Chen, Cheng Tao, Zeye Long, Ali Asghar Heidari, Shuihua Wang, Yudong Zhang","doi":"10.1049/cit2.70065","DOIUrl":"https://doi.org/10.1049/cit2.70065","url":null,"abstract":"<p>Photovoltaic (PV) systems are electrical systems designed to convert solar energy into electrical energy. The power output of photovoltaic cells, a crucial component of PV systems, is influenced by harsh weather conditions, panel temperature and solar irradiance. Therefore, accurately identifying the parameters of PV models is essential for simulating, controlling and evaluating PV systems. In this study, we propose an enhanced weighted-mean-of-vectors optimisation (EINFO) for efficiently determining the unknown parameters in PV systems. EINFO introduces a Lambert W-based explicit objective function for the PV model, enhancing the computational accuracy of the algorithm's population fitness. This addresses the challenge of improving the identification accuracy of metaheuristic algorithms for unknown parameters in PV models. We experimentally apply EINFO to three types of PV models (single-diode, double-diode and PV-module models) to validate its accuracy and stability in parameter identification. The results demonstrate that EINFO achieves root mean square errors (RMSEs) of 7.7301E-04, 6.8553E-04 and 2.0608E-03 for the single-diode model, double-diode model and PV-module model, respectively, surpassing those obtained by the original INFO algorithm and other methods in terms of convergence speed, accuracy and stability. Furthermore, comprehensive experimental findings on three commercial PV modules (ST40, SM55 and KC200GT) indicate that EINFO consistently maintains high accuracy across varying temperatures and irradiation levels. 
In conclusion, EINFO emerges as a highly competitive and practical approach for parameter identification in diverse types of PV models.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 6","pages":"1844-1866"},"PeriodicalIF":7.3,"publicationDate":"2025-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70065","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145852640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
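The Lambert W-based explicit objective mentioned in the abstract above rests on a standard transformation: the single-diode model I = Iph − Io·(exp((V + I·Rs)/a) − 1) − (V + I·Rs)/Rsh is implicit in I, but the Lambert W function yields a closed form, so a candidate parameter vector's RMSE fitness can be evaluated without inner root-finding. The sketch below shows that generic transformation, not the paper's EINFO code; the sample parameter values, the temperature, and all names are illustrative assumptions.

```python
import numpy as np
from scipy.special import lambertw

BOLTZMANN, Q_ELECTRON = 1.380649e-23, 1.602176634e-19  # J/K, C

def sdm_current(V, Iph, Io, Rs, Rsh, n, T=306.15, Ns=1):
    """Closed-form single-diode-model current via the Lambert W function.

    Solves I = Iph - Io*(exp((V + I*Rs)/a) - 1) - (V + I*Rs)/Rsh
    explicitly, where a = n*Ns*k*T/q is the modified thermal voltage.
    """
    a = n * Ns * BOLTZMANN * T / Q_ELECTRON
    arg = (Rs * Io * Rsh / (a * (Rs + Rsh))
           * np.exp(Rsh * (Rs * Iph + Rs * Io + V) / (a * (Rs + Rsh))))
    return (Rsh * (Iph + Io) - V) / (Rs + Rsh) - (a / Rs) * lambertw(arg).real

def rmse_fitness(params, V_meas, I_meas):
    """RMSE between measured and modelled currents: the population-fitness
    quantity a metaheuristic such as INFO/EINFO would minimise."""
    Iph, Io, Rs, Rsh, n = params
    I_pred = sdm_current(np.asarray(V_meas, float), Iph, Io, Rs, Rsh, n)
    return float(np.sqrt(np.mean((I_pred - np.asarray(I_meas, float)) ** 2)))

# Illustrative (not benchmark) parameter vector, evaluated at one bias point
I = sdm_current(0.5, Iph=0.7608, Io=3.23e-7, Rs=0.0364, Rsh=53.76, n=1.4811)
```

Because the explicit form is algebraically exact, substituting the returned current back into the implicit equation leaves a residual at floating-point level, which is what makes the fitness evaluation both fast and accurate inside the optimisation loop.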