IEEE Transactions on Artificial Intelligence: Latest Articles

DIFF-FECG: A Conditional Diffusion-Based Method for Fetal ECG Extraction From Abdominal ECG
Pub Date : 2025-06-10 DOI: 10.1109/TAI.2025.3578007
Zhenqin Chen;Yiwei Lin;Qiong Luo;Jinshan Xu
Fetal electrocardiography (FECG) is a crucial tool for assessing fetal cardiac health and pregnancy status. Direct invasive FECG provides reliable fetal heart rate signals, but poses risks and is limited to use during labor. Conversely, non-invasive monitoring of the fetal heart is possible via abdominal electrocardiography (AECG), which detects fetal heart waveforms using electrodes positioned on the mother’s abdomen. However, this method is often subject to interference from maternal cardiac activity and other external sources. To address this issue, we propose a novel diffusion method, DIFF-FECG, aimed at improving the extraction of FECG signals from AECG recordings. This method leverages a condition-driven diffusion process to learn specific conditional probability distributions, enabling the effective separation of high-quality FECG signals from noisy AECG data. By adaptively managing the inherent non-Gaussian noise characteristics of MECG within the AECG, DIFF-FECG achieves more effective FECG reconstruction. Furthermore, the quality of the generated FECG signals is also enhanced by adding reconstruction loss and multiple reconstructions. Experimental results on two public databases demonstrate that the proposed DIFF-FECG method yields satisfactory results, with an average Pearson correlation coefficient of 0.922 for the estimated FECG. These findings underscore the potential of diffusion probabilistic models in advancing FECG signal extraction techniques, thereby contributing to improved fetal health monitoring.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 534-546. Journal Article.
Citations: 0
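The DIFF-FECG abstract reports its result as an average Pearson correlation coefficient between estimated and reference FECG. The following is a minimal sketch of that metric only, using the standard definition; the function name `pearson_r` and the toy signals are illustrative assumptions, not taken from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy signals: the estimate is a scaled, shifted copy of the
# reference, so the two are perfectly linearly related.
reference = [math.sin(0.1 * t) for t in range(100)]
estimate = [0.5 * v + 0.2 for v in reference]
score = pearson_r(reference, estimate)
```

Because the coefficient is invariant to scaling and offset, it measures waveform shape agreement rather than amplitude accuracy.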
Successive Halving Based Online Ensemble Selection for Concept-Drift Adaptation
Pub Date : 2025-06-10 DOI: 10.1109/TAI.2025.3578305
Jobin Wilson;Santanu Chaudhury;Brejesh Lall
Ensemble learning is one of the most successful approaches for concept-drift adaptation due to its versatility and high predictive performance. However, a practical challenge in using ensembles for high-speed data stream mining is the associated large computational cost. In this article, we introduce a computationally efficient heterogeneous ensemble classifier named successive halving ensemble (SUHEN), which adapts to concept drift using online ensemble selection. We model ensemble selection as a fixed-budget best-arm identification bandit problem and solve it using the successive halving algorithm (SHA). SUHEN identifies a single best-performing member for a stream segment and utilizes it for training and prediction until a drift is detected. Upon detecting drift, SHA identifies the new best performer for the segment. As stream characteristics evolve, manually choosing a fixed SHA budget would be challenging. To this end, we extend SUHEN by posing budget selection as a hyperparameter tuning problem and solve it using meta-learning. Our evaluation on 20 benchmark datasets reveals that SUHEN provides accuracy statistically on par with state-of-the-art ensemble algorithms, while providing significant computational resource savings. This makes our proposal attractive for high-speed stream mining problems in resource-constrained settings.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 547-561. Journal Article.
Citations: 0
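The SUHEN abstract frames ensemble selection as fixed-budget best-arm identification solved with successive halving. A minimal sketch of generic successive halving follows, assuming each "arm" is a callable that returns one reward sample. In SUHEN the arms would presumably be ensemble members scored on stream instances, but that mapping, the function name, and the Bernoulli toy arms are assumptions made here for illustration.

```python
import math
import random


def successive_halving(arms, budget, rng):
    """Return the index of the arm judged best under a fixed pull budget.

    arms: list of callables; arms[i](rng) returns one reward sample.
    budget: total number of pulls to spread over all elimination rounds.
    """
    if not arms:
        raise ValueError("need at least one arm")
    survivors = list(range(len(arms)))
    rounds = max(1, math.ceil(math.log2(len(arms))))
    for _ in range(rounds):
        if len(survivors) == 1:
            break
        # Split the budget evenly over rounds, then over surviving arms.
        pulls = max(1, budget // (len(survivors) * rounds))
        means = {
            i: sum(arms[i](rng) for _ in range(pulls)) / pulls
            for i in survivors
        }
        # Keep the better half (rounded up) by empirical mean reward.
        survivors.sort(key=lambda i: means[i], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


# Hypothetical Bernoulli arms standing in for per-instance correctness.
rng = random.Random(0)
probs = [0.2, 0.5, 0.8, 0.4]
arms = [lambda r, p=p: 1.0 if r.random() < p else 0.0 for p in probs]
best = successive_halving(arms, budget=400, rng=rng)
```

With four arms and a budget of 400, the sketch spends 50 pulls per arm in the first round and 100 per arm in the second, so weak candidates are dropped cheaply and the budget concentrates on the close contenders.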
Event-Triggered Quantization-Based Predefined-Time Adaptive Fuzzy Control for Quadrotor Trajectory Tracking
Pub Date : 2025-06-10 DOI: 10.1109/TAI.2025.3578011
Zhimin Zhou;Lin Zhao
In this letter, a predefined-time adaptive fuzzy trajectory tracking control based on an event-triggered quantization framework is proposed for a quadrotor with inertial uncertainty, full-state constraints, and actuator saturation. First, a double-threshold event-triggered quantization mechanism is proposed to adaptively adjust the discretization degree of the control signals, reducing the communication burden while balancing the control accuracy. Subsequently, the computational complexity and filter error problems are solved by constructing the command filter and filter error compensation mechanism. The unknown nonlinear dynamics of the quadrotor are handled through the approximation capability of an adaptive fuzzy logic system. In addition, an auxiliary signal and a smooth approximation function are combined to cope with actuator saturation. Using Lyapunov theory, the predefined-time stability of the system under full-state constraints is proven. Finally, the validity and superiority of the proposed algorithm have been verified through the simulation example.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 596-605. Journal Article.
Citations: 0
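The abstract above describes an event-triggered quantization mechanism that reduces communication by transmitting control signals only when needed. The sketch below illustrates the general idea with a single fixed threshold and a uniform quantizer. It is not the paper's double-threshold mechanism, whose details the abstract does not give; `threshold`, `step`, and both function names are hypothetical.

```python
def quantize(u, step):
    """Uniform quantizer: round u to the nearest multiple of step."""
    return step * round(u / step)


def event_triggered_stream(signal, threshold, step):
    """Return a list of (sent, held_value) pairs, one per input sample.

    A new quantized value is transmitted only when the raw signal has moved
    more than `threshold` away from the last transmitted value; otherwise
    the receiver keeps holding the previous value.
    """
    held = None
    out = []
    for u in signal:
        if held is None or abs(u - held) > threshold:
            held = quantize(u, step)
            out.append((True, held))
        else:
            out.append((False, held))
    return out


# Hypothetical control samples: only three of five trigger a transmission.
samples = [0.0, 0.04, 0.3, 0.32, 1.0]
trace = event_triggered_stream(samples, threshold=0.1, step=0.25)
transmissions = sum(1 for sent, _ in trace if sent)
```

A larger threshold or a coarser step lowers the transmission count at the cost of a larger gap between the held value and the true control signal, which is the accuracy trade-off the abstract refers to.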
Using Adversarial Training to Improve Uncertainty Quantification
Pub Date : 2025-06-10 DOI: 10.1109/TAI.2025.3578004
Kuan Huang;Meng Xu;Yingfeng Wang
The success of adversarial attack methods suggests a small input change may mislead a trained machine-learning model. For example, changing one pixel of an image may cause the trained model to misclassify this updated image. Uncertainty quantification is crucial for detecting misclassifications; hence, precise uncertainty quantification, meaning uncertainty estimates that closely align with prediction correctness, is essential. We assume that misclassified samples should exhibit high uncertainty while correctly classified samples should exhibit low uncertainty. To evaluate the performance of uncertainty quantification, we investigate the task of uncertainty-based misclassification detection under adversarial attack conditions. Our findings suggest that existing uncertainty quantification methods are unable to accurately identify misclassified predictions resulting from adversarial attacks due to training issues. We propose a simple adversarial training strategy for improving uncertainty quantification. Our results show that adversarial training improves the reliability of uncertainty quantification by better aligning uncertainty with prediction correctness. Specifically, we observe consistent improvements in misclassification detection performance, measured by AUC-ROC and AUC-PR, across clean and adversarial samples.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 522-533. Journal Article.
Citations: 0
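The evaluation task described above, uncertainty-based misclassification detection scored by AUC-ROC, can be sketched as follows. Predictive entropy is used as the uncertainty score, which is one common choice and an assumption here, since the abstract does not name the uncertainty measure. The toy probabilities and labels are invented for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def auc_roc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical softmax outputs and ground-truth classes for four samples.
softmax_out = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4], [0.99, 0.01]]
true_class = [0, 1, 0, 0]

predicted = [max(range(len(p)), key=p.__getitem__) for p in softmax_out]
# Positive class for detection = "this prediction is wrong".
is_error = [int(p != t) for p, t in zip(predicted, true_class)]
uncertainty = [entropy(p) for p in softmax_out]
detection_auc = auc_roc(uncertainty, is_error)
```

An AUC near 1 means misclassified samples consistently receive higher uncertainty than correct ones, which matches the alignment the abstract assumes a good uncertainty estimate should have.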
Deep Variational Autoencoder-Based Parameter Learning of Bayesian Network With Multiple Latent Variables
Pub Date : 2025-06-06 DOI: 10.1109/TAI.2025.3577601
Xinran Wu;Kun Yue;Liang Duan;Hongbo Xie;Huashuai Liu
Intelligent systems could be increasingly powerful by applying probabilistic inferences over the dependence relations among observed and latent variables, which could be represented by the Bayesian network (BN) with multiple latent (BNML) variables. As the critical task in BNML construction, parameter learning is fulfilled by extending the classic EM algorithm in most of the existing methods, but the time complexity is exponential in the number of latent variables. To address this issue, we first propose to reduce the number of latent variables by training a vector quantized variational autoencoder (VQVAE). Specifically, we incorporate the initial probability parameters in conditional probability tables (CPTs) of BNML as the regularization term of VQVAE to guarantee that the probability parameters after reduction are similar (i.e., consistent) to those before reduction. Then, we incorporate efficient gradient calculations to augment the EM algorithm and propose an efficient algorithm for parameter learning of the BN with reduced latent (BNRL) variables. Finally, we present an efficient method for probabilistic inferences in BNRL by encoding evidence variables, decoding query variables, and updating query variable values via backpropagation. Experimental results on real and synthetic BNs demonstrate that our method outperforms the state-of-the-art methods on efficiency and effectiveness.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 497-511. Journal Article.
Citations: 0
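The abstract relies on a vector quantized variational autoencoder (VQVAE) to reduce the number of latent variables. The core quantization step of a generic VQ-VAE, mapping an encoder output to its nearest codebook entry, can be sketched as below. The codebook values are hypothetical, and the CPT-based regularization term the paper adds is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize_vector(z, codebook):
    """Return (index, code) of the codebook entry nearest to z."""
    if not codebook:
        raise ValueError("codebook must not be empty")
    index = min(
        range(len(codebook)), key=lambda k: squared_distance(z, codebook[k])
    )
    return index, codebook[index]


# Hypothetical 2-D codebook with three entries and one encoder output.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
index, code = quantize_vector([0.9, 0.8], codebook)
```

Because every continuous encoder output collapses onto one of a small number of discrete codes, the quantized latent space can be much smaller than the original set of latent configurations.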
Contrastive Learning Based Collaborative Modeling of Heterogeneous Data for Few-Shot Fault Diagnosis
Pub Date : 2025-06-06 DOI: 10.1109/TAI.2025.3577119
Kai Zhong;Hengchang Zhu;Xiaoming Zhang;Darong Huang;Min Han
Few-shot diagnosis has received extensive attention recently. Existing methods rarely consider the consistency within and between heterogeneous data, leading to suboptimal diagnosis performance. To address this issue, a contrastive-learning-based collaborative modeling method for few-shot diagnosis is proposed. First, a heterogeneous data enhancement workflow with distribution consistency assessment is designed to acquire sufficient industrial process information, which can also mitigate the inconsistency between enhanced data and original data. Following this, convolutional networks with customized structures are used to extract the multimodal features from heterogeneous signals. After that, the collaborative modeling and diagnosis module is devised through the joint optimization of contrastive loss and cross-entropy loss, which can shorten the distance between similar samples in feature space and retain cross-structure consistency. Finally, the effectiveness and superiority of the proposed method are substantiated through simulated and real-world cases.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 486-496. Journal Article.
Citations: 0
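The abstract mentions jointly optimizing a contrastive loss and a cross-entropy loss so that similar samples move closer in feature space. The sketch below combines a classic pairwise margin-based contrastive loss with cross-entropy through a weight `lam`. The specific loss form, the weight, and the margin are assumptions for illustration; the abstract does not state the paper's exact formulation.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def pair_contrastive(a, b, same, margin=1.0):
    """Pull same-class pairs together; push different-class pairs apart
    until they are at least `margin` away from each other."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return d ** 2 if same else max(0.0, margin - d) ** 2


def joint_loss(features, probs, labels, lam=0.5, margin=1.0):
    """Mean cross-entropy plus lam times the mean pairwise contrastive loss.

    Requires at least two samples so that one pair exists.
    """
    n = len(labels)
    ce = sum(cross_entropy(p, y) for p, y in zip(probs, labels)) / n
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    con = sum(
        pair_contrastive(features[i], features[j], labels[i] == labels[j], margin)
        for i, j in pairs
    ) / len(pairs)
    return ce + lam * con


# Hypothetical batch: two samples of class 0 and one of class 1.
labels = [0, 0, 1]
probs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
tight = joint_loss([[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]], probs, labels)
loose = joint_loss([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]], probs, labels)
```

In this toy batch the classifier outputs are identical in both calls, so the only difference between `tight` and `loose` comes from the contrastive term penalizing the spread of the two class-0 features.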
FG-KD: A Novel Forward Gradient-Based Framework for Teacher Knowledge Augmentation
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3576087
Yang Yang;Chao Wang;Lei Gong;Min Wu;Zhenghua Chen;Xuehai Zhou
Knowledge distillation has become increasingly popular for training compact neural network models that can achieve comparable performance to larger models. In order to improve the effectiveness of knowledge distillation, enhancing the quality of the teacher knowledge is a crucial aspect to consider. While existing efforts have predominantly focused on optimizing the structure of teacher models and refining training procedures, we argue that there is untapped potential in further enhancing knowledge distillation through the augmentation of the teacher knowledge itself. In this article, we introduce FG-KD, a novel forward gradient-based framework specifically designed for augmenting teacher knowledge in knowledge distillation. FG-KD comprises two fundamental components: a feature reconstructor and a relation-aware enhancer. Both components employ a forward gradient-based approach to unlock the latent potential for enhancing teachers’ knowledge, thereby providing an enriched foundation for knowledge distillation. The feature reconstructor operates at the feature level, enabling the optimization of the teacher knowledge by enhancing the encoding of high-dimensional spaces. On the other hand, the relation-aware enhancer operates at the logit level, with a focus on identifying and reinforcing the interclass and intraclass relationships within the teacher knowledge. Through extensive experiments conducted on image recognition tasks, we demonstrate the effectiveness of FG-KD in improving the performance of various knowledge distillation techniques, regardless of the specific teacher–student model combinations.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 439-454. Journal Article.
Citations: 0
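FG-KD is described as a way to enrich the teacher knowledge that feeds existing knowledge distillation techniques. For context, the sketch below shows the widely used temperature-softened distillation loss that such techniques build on. It does not implement FG-KD's feature reconstructor or relation-aware enhancer, and the logits and temperature are hypothetical values.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [5.0, 1.0, -2.0]
student = [2.0, 2.0, 0.0]
loss = kd_loss(student, teacher)
```

The higher the temperature, the more the loss reflects the teacher's relative ranking of non-target classes, which is the interclass information the logit-level component in the abstract aims to strengthen.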
Causal Disentanglement for Tackling Popularity Bias in Sequential Recommendation
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3575554
An-An Liu;Yadong Zhao;Xin Wen;Rihao Chang;Weizhi Nie
Recommender systems typically exhibit severe popularity bias, with a few highly popular items receiving excessive exposure. Most existing studies tackle this bias in static settings. However, they neglect the dynamic nature of real-world recommendation scenarios and lack a thorough analysis into the root causes of bias, which makes it challenging to accurately model and mitigate the dynamically changing popularity bias and capture genuine user preferences. To this end, we propose a causal disentanglement sequential recommendation model (CDSRec) based on time series analysis and hidden variable separation. Our model leverages Markov chains to analyze historical interaction data within sequential recommendations, capturing the dynamic variations of item popularity and user preferences. Employing causal inference, we disentangle the potential factors implicated in popularity bias. Specifically, user–item interactions are primarily driven by personalized demands and item popularity. Through empirical analysis from a temporal perspective, we reveal that popularity has both positive and negative impacts, and attribute them to stable intrinsic quality factors and dynamic external interference factors. We construct a causal directed acyclic graph to elucidate the temporal correlations among different factors. Subsequently, we utilize historical interaction sequences and item-related attributes as auxiliary information to explicitly disentangle these factors as hidden variables. By reformulating the objective function to optimize the sequential VAE framework, our model effectively mitigates the negative impact of external interference factors. Extensive experimental results on three real-world datasets demonstrate the superiority of our proposed model.
推荐系统通常表现出严重的人气偏差,一些非常受欢迎的项目会被过度曝光。大多数现有的研究都是在静态环境下解决这种偏见的。然而,他们忽视了现实世界推荐场景的动态性,缺乏对偏见根源的彻底分析,这使得准确建模和减轻动态变化的流行偏见并捕获真正的用户偏好变得具有挑战性。为此,我们提出了一种基于时间序列分析和隐变量分离的因果解纠缠顺序推荐模型(CDSRec)。我们的模型利用马尔可夫链来分析连续推荐中的历史交互数据,捕捉项目受欢迎程度和用户偏好的动态变化。采用因果推理,我们解开了受欢迎程度偏差的潜在因素。具体来说,用户与物品的交互主要是由个性化需求和物品受欢迎程度驱动的。通过时间视角的实证分析,我们发现人气既有正向影响,也有负向影响,并将其归因于稳定的内在品质因素和动态的外部干扰因素。我们构造了一个因果有向无环图来说明不同因素之间的时间相关性。随后,我们利用历史交互序列和项目相关属性作为辅助信息,明确地解开这些因素作为隐藏变量。通过重新制定目标函数来优化序列VAE框架,我们的模型有效地减轻了外部干扰因素的负面影响。在三个真实数据集上的大量实验结果证明了我们提出的模型的优越性。
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 426-438. https://doi.org/10.1109/TAI.2025.3575554
Citations: 0
pFedBL: Federated Bayesian Learning With Personalized Prior
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3576201
Xinhui Yu;Arvin Tashakori;Liang Zou;Z. Jane Wang
Most existing federated learning (FL) frameworks use deterministic models as the task model, which may overfit because each client holds only small-scale data. Since Bayesian learning (BL) can quantify the uncertainty associated with both model parameters and prediction outcomes, there have been efforts to integrate BL with FL, transforming the global objective into posterior approximation using Bayesian optimization. Variational inference is commonly used in such efforts: it utilizes the global distribution as the prior for optimizing local Bayesian neural networks (BNNs) and thus eliminates the need to assign specific prior distributions to clients. However, due to statistical heterogeneity across clients, the global distribution, which represents the collective knowledge of all clients, may not be a precise prior for an individual client. To address this concern, we propose a federated Bayesian learning framework with personalized priors (pFedBL), in which each client is assigned a local BNN. Specifically, we first introduce a KL-divergence-based distribution aggregation scheme to ensure the effectiveness of the global distribution. Meanwhile, under the mild assumption that the server has access to a general unlabeled dataset, the server uses the predictions and predictive uncertainty of these data, derived from local BNNs, to construct feature distributions. These distributions are then provided to clients for fine-tuning the global distribution, resulting in personalized priors. In addition, to ensure optimal integration of local and global data insights, we design an adaptive $\zeta$ strategy in the local objective function to balance the log-likelihood estimation term and the KL divergence term. We provide a theoretical analysis of the upper bound of the averaged generalization error for the proposed pFedBL, and experimental results demonstrate its effectiveness on three datasets under different problem settings.
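To make the balance between the log-likelihood term and the KL divergence term concrete, here is a minimal sketch. It is not the authors' implementation. It assumes diagonal Gaussian posteriors and priors, assumes the weight multiplies the KL term, and treats `zeta` as a plain input, because the abstract does not say how the adaptive strategy sets it. The function names `kl_diag_gaussians` and `local_objective` are hypothetical.

```python
import math


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(q || p) between two diagonal Gaussians.

    Each argument is a list with one entry per dimension; sigmas are
    standard deviations and must be positive.
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return total


def local_objective(log_likelihood, kl, zeta):
    """Loss to minimise: negative log-likelihood plus a zeta-weighted KL term.

    Assumption: zeta scales the KL term. The abstract only states that zeta
    balances the two terms, not where it enters the objective.
    """
    return -log_likelihood + zeta * kl


# Hypothetical usage: posterior N(0, 1) against a personalized prior N(1, 1).
kl = kl_diag_gaussians([0.0], [1.0], [1.0], [1.0])
loss = local_objective(log_likelihood=-2.0, kl=kl, zeta=0.1)
```

Under this assumed form, a larger `zeta` pulls the local posterior toward the prior, while a smaller one lets the client's own data dominate.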
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 455-470. https://doi.org/10.1109/TAI.2025.3576201
Citations: 0
Soft Parameter Sharing Model for Cross-Problem Generalization in Vehicle Routing Problems
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3576336
Yang Wang;Ya-Hui Jia;Wei-Neng Chen;Yi Mei
Neural combinatorial optimization (NCO) has achieved remarkable performance in solving individual vehicle routing problems (VRPs) by leveraging attention mechanisms. However, when generalizing across different problems, these methods perform poorly because the hard parameter sharing models they adopt are unable to capture the commonalities and peculiarities of different problems. To address this limitation, we propose a novel multitask NCO method called the soft parameter sharing model (SPSM), which incorporates multiple independent attention modules and a gating network. SPSM allows the model to learn both universal patterns and individualized requirements without explicitly designating any module as shared or task-specific. When solving a specific VRP, the gating network can decide the importance of the characteristics learned by each attention module. Additionally, we adopt maximum entropy reinforcement learning to maintain the diversity of the model during training, which can prevent the model from greedily favoring a few dominant tasks or only the training tasks. Experimental results demonstrate that SPSM significantly enhances zero-shot generalization performance across ten unseen VRP variants and real-world benchmark instances.
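The gating idea described above can be illustrated with a small sketch. It is not the SPSM architecture itself. It assumes the gating network emits one logit per attention module and that the module outputs are combined by a softmax-weighted sum. The names `softmax` and `gated_mixture` and the example vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def gated_mixture(module_outputs, gate_logits):
    """Blend per-module output vectors using softmax(gate_logits) as weights.

    module_outputs: list of equal-length vectors, one per attention module.
    gate_logits: one score per module (assumed to come from a gating network).
    Returns the mixed vector and the weights that produced it.
    """
    weights = softmax(gate_logits)
    dim = len(module_outputs[0])
    mixed = [0.0] * dim
    for w, out in zip(weights, module_outputs):
        for i in range(dim):
            mixed[i] += w * out[i]
    return mixed, weights


# Hypothetical usage: two modules, the gate prefers the first one.
mixed, weights = gated_mixture([[1.0, 0.0], [0.0, 1.0]], [2.0, -1.0])
```

Because no module is hard-wired as shared or task-specific, the weights alone determine how much each module contributes for a given problem, which is the sense in which the sharing is soft in this sketch.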
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 471-485. https://doi.org/10.1109/TAI.2025.3576336
Citations: 0