IEEE Transactions on Artificial Intelligence: Latest Articles

FG-KD: A Novel Forward Gradient-Based Framework for Teacher Knowledge Augmentation
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3576087
Yang Yang;Chao Wang;Lei Gong;Min Wu;Zhenghua Chen;Xuehai Zhou
Knowledge distillation has become increasingly popular for training compact neural network models that can achieve comparable performance to larger models. In order to improve the effectiveness of knowledge distillation, enhancing the quality of the teacher knowledge is a crucial aspect to consider. While existing efforts have predominantly focused on optimizing the structure of teacher models and refining training procedures, we argue that there is untapped potential in further enhancing knowledge distillation through the augmentation of the teacher knowledge itself. In this article, we introduce FG-KD, a novel forward gradient-based framework specifically designed for augmenting teacher knowledge in knowledge distillation. FG-KD comprises two fundamental components: a feature reconstructor and a relation-aware enhancer. Both components employ a forward gradient-based approach to unlock the latent potential for enhancing teachers’ knowledge, thereby providing an enriched foundation for knowledge distillation. The feature reconstructor operates at the feature level, enabling the optimization of the teacher knowledge by enhancing the encoding of high-dimensional spaces. On the other hand, the relation-aware enhancer operates at the logit level, with a focus on identifying and reinforcing the interclass and intraclass relationships within the teacher knowledge. Through extensive experiments conducted on image recognition tasks, we demonstrate the effectiveness of FG-KD in improving the performance of various knowledge distillation techniques, regardless of the specific teacher–student model combinations.
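For orientation, the logit-level signal that distillation methods build on can be sketched as follows. This is the generic temperature-softened distillation loss, not FG-KD's feature reconstructor or relation-aware enhancer, whose details the abstract does not give. The function names and the temperature value are assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation loss (assumed here for illustration); it is not
    the FG-KD objective.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

if __name__ == "__main__":
    teacher = [5.0, 2.0, 0.5]
    print(kd_loss(teacher, teacher))          # identical logits give zero loss
    print(kd_loss(teacher, [0.5, 2.0, 5.0]))  # mismatched logits give a positive loss
```

A higher temperature flattens both distributions, so the student also receives signal from the teacher's low-probability classes; teacher-side augmentation of the kind the abstract describes would change what the teacher logits contain before this loss is applied.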
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 439-454.
Citations: 0
Causal Disentanglement for Tackling Popularity Bias in Sequential Recommendation
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3575554
An-An Liu;Yadong Zhao;Xin Wen;Rihao Chang;Weizhi Nie
Recommender systems typically exhibit severe popularity bias, with a few highly popular items receiving excessive exposure. Most existing studies tackle this bias in static settings. However, they neglect the dynamic nature of real-world recommendation scenarios and lack a thorough analysis of the root causes of bias, which makes it challenging to accurately model and mitigate dynamically changing popularity bias and to capture genuine user preferences. To this end, we propose a causal disentanglement sequential recommendation model (CDSRec) based on time series analysis and hidden variable separation. Our model leverages Markov chains to analyze historical interaction data within sequential recommendations, capturing the dynamic variations of item popularity and user preferences. Employing causal inference, we disentangle the potential factors implicated in popularity bias. Specifically, user–item interactions are primarily driven by personalized demands and item popularity. Through empirical analysis from a temporal perspective, we reveal that popularity has both positive and negative impacts, and we attribute them to stable intrinsic quality factors and dynamic external interference factors, respectively. We construct a causal directed acyclic graph to elucidate the temporal correlations among different factors. Subsequently, we utilize historical interaction sequences and item-related attributes as auxiliary information to explicitly disentangle these factors as hidden variables. By reformulating the objective function to optimize the sequential VAE framework, our model effectively mitigates the negative impact of external interference factors. Extensive experimental results on three real-world datasets demonstrate the superiority of our proposed model.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 426-438.
Citations: 0
pFedBL: Federated Bayesian Learning With Personalized Prior
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3576201
Xinhui Yu;Arvin Tashakori;Liang Zou;Z. Jane Wang
Most existing federated learning (FL) frameworks use deterministic models as the task model, which may overfit because each client holds only small-scale data. Since Bayesian learning (BL) can quantify the uncertainty associated with both model parameters and prediction outcomes, there have been efforts to integrate BL with FL, in which the global objective is transformed into posterior approximation using Bayesian optimization. Variational inference is commonly used in such efforts; it utilizes the global distribution as the prior for the optimization of local Bayesian neural networks (BNNs) and thus eliminates the need to assign specific prior distributions to clients. However, due to statistical heterogeneity across clients, the global distribution, which represents the collective knowledge of all clients, may not be a precise prior for an individual client. To address this concern, we propose a federated Bayesian learning framework with personalized priors (pFedBL), in which each client is assigned a local BNN. Specifically, we first introduce a KL-divergence-based distribution aggregation scheme to ensure the effectiveness of the global distribution. Meanwhile, under the mild assumption that the server has access to a general unlabeled dataset, the server uses the predictions and predictive uncertainty for these data, derived from local BNNs, to construct feature distributions. These distributions are then provided to clients for fine-tuning the global distribution, resulting in personalized priors. In addition, to ensure optimal integration of local and global data insights, we design an adaptive $\zeta$ strategy in the local objective function to balance the log-likelihood estimation term and the KL divergence term. We provide a theoretical analysis of the upper bound of the averaged generalization error for the proposed pFedBL, and experimental results demonstrate its effectiveness on three datasets under different problem settings.
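The balance between the log-likelihood term and the KL divergence term can be illustrated with a small sketch. It assumes a diagonal Gaussian variational posterior and prior, for which the KL divergence has a closed form, and it treats the weight as a fixed scalar. The abstract does not say how the weight is adapted, so the names `gaussian_kl`, `local_objective`, and `zeta` are hypothetical and do not reproduce the pFedBL update rule.

```python
import math

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) for diagonal Gaussians, given per-parameter means and std devs."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return total

def local_objective(neg_log_likelihood, mu_q, sigma_q, mu_p, sigma_p, zeta):
    """Negative log-likelihood plus a zeta-weighted KL term to the prior.

    zeta is a fixed scalar here (an assumption); an adaptive strategy
    would set it per client or per round.
    """
    return neg_log_likelihood + zeta * gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)

if __name__ == "__main__":
    # Toy numbers: one weight whose posterior mean sits one unit away from the prior mean.
    print(local_objective(2.0, [0.0], [1.0], [1.0], [1.0], zeta=0.5))
```

A small `zeta` lets the local data dominate, while a large `zeta` pulls the local posterior toward the prior, which is why the quality of the prior matters under client heterogeneity.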
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 455-470.
Citations: 0
Soft Parameter Sharing Model for Cross-Problem Generalization in Vehicle Routing Problems
Pub Date : 2025-06-04 DOI: 10.1109/TAI.2025.3576336
Yang Wang;Ya-Hui Jia;Wei-Neng Chen;Yi Mei
Neural combinatorial optimization (NCO) has achieved remarkable performance in solving individual vehicle routing problems (VRPs) by leveraging attention mechanisms. However, these methods generalize poorly across different problems because the hard parameter sharing models they adopt cannot capture the commonalities and peculiarities of different problems. To address this limitation, we propose a novel multitask NCO method called the soft parameter sharing model (SPSM), which incorporates multiple independent attention modules and a gating network. SPSM allows the model to learn both universal patterns and individualized requirements without explicitly designating any module as shared or task-specific. When solving a specific VRP, the gating network can decide the importance of the characteristics learned by each attention module. Additionally, we adopt maximum entropy reinforcement learning to maintain the diversity of the model during training, which prevents the model from greedily favoring a few dominant tasks or only the training tasks. Experimental results demonstrate that SPSM significantly enhances zero-shot generalization performance across ten unseen VRP variants and real-world benchmark instances.
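The role of the gating network can be sketched in a few lines. The example assumes that each attention module has already produced an output vector and that the gate emits one score per module; the mixture is a softmax-weighted sum. This is a generic soft-gating pattern with hypothetical names, and it may differ from how SPSM wires the gate in practice.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gated_mixture(module_outputs, gate_scores):
    """Weight each module's output vector by its softmax gate and sum them."""
    gates = softmax(gate_scores)
    dim = len(module_outputs[0])
    return [sum(g * out[i] for g, out in zip(gates, module_outputs)) for i in range(dim)]

if __name__ == "__main__":
    outputs = [[1.0, 0.0], [0.0, 1.0]]         # toy outputs of two modules
    print(gated_mixture(outputs, [0.0, 0.0]))  # equal scores average the modules
    print(gated_mixture(outputs, [3.0, 0.0]))  # a larger score favors the first module
```

Because no module is hard-wired as shared or task-specific, the same set of modules can serve an unseen problem variant: only the gate scores need to change.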
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 471-485.
Citations: 0
Distributed Generative Model: A Data Synthesizing Framework for Multisource Heterogeneous Data
Pub Date : 2025-06-03 DOI: 10.1109/TAI.2025.3575548
Zuobin Xiong;Wei Li;Yingshu Li;Zhipeng Cai
Recent advances in generative artificial intelligence (AI) have influenced a broad range of areas, with successful applications across multiple domains, including computer vision, natural language processing, and the Internet of Things (IoT). However, many existing implementations rely on centralized architectures, which introduce security and privacy concerns while also increasing communication overhead. Little research has explored the development of distributed generative models, particularly in scenarios where training data originate from various heterogeneous sources. To fill this gap, this article introduces a distributed generative model framework aimed at enhancing data generation in hierarchical IoT systems. The proposed framework supports distributed data generation across three distinct scenarios: feature-related data, label-related data, and feature-label nonrelated data. Furthermore, both synchronous and asynchronous update mechanisms are incorporated to accommodate diverse application requirements within IoT environments. Comprehensive experiments using simulated, image, and tabular datasets are conducted to assess the performance of the proposed framework in comparison with state-of-the-art methods. The results indicate that the framework effectively produces high-quality synthetic data while preserving the integrity of downstream tasks. Beyond large language models (LLMs), these findings suggest that generative AI has the potential to transform data generation in distributed IoT systems and to be extended to a broader range of applications.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 399-411.
Citations: 0
Large Linguistic Models: Investigating LLMs’ Metalinguistic Abilities
Pub Date : 2025-06-03 DOI: 10.1109/TAI.2025.3575745
Gašper Beguš;Maksymilian Dąbkowski;Ryan Rhodes
The performance of large language models (LLMs) has recently improved to the point where models can perform well on many language tasks. We show here that—for the first time—the models can also generate valid metalinguistic analyses of language data. We outline a research program where the behavioral interpretability of LLMs on these tasks is tested via prompting. LLMs are trained primarily on text—as such, evaluating their metalinguistic abilities improves our understanding of their general capabilities and sheds new light on theoretical models in linguistics. We show that OpenAI’s [56] o1 vastly outperforms other models on tasks involving drawing syntactic trees and phonological generalization. We speculate that OpenAI o1’s unique advantage over other models may result from the model’s chain-of-thought mechanism, which mimics the structure of human reasoning used in complex cognitive tasks, such as linguistic analysis.
IEEE Transactions on Artificial Intelligence, vol. 6, no. 12, pp. 3453-3467. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11022724
Citations: 0
ICQ-TransE: LLM-Enhanced Image-Caption-Question Translating Embeddings for Knowledge-Based Visual Question Answering
Pub Date : 2025-06-03 DOI: 10.1109/TAI.2025.3575553
Heng Liu;Boyue Wang;Xiaoyan Li;Yanfeng Sun;Yongli Hu;Baocai Yin
In knowledge-based visual question answering (KB-VQA), the answer can be naturally represented by translating the embedding of the visual object referred to by the question according to a cross-modality relation embedding related to both the question and the image. Though this triplet representation of cross-modality knowledge is plausible and has proven effective, such methods often encounter two challenges: 1) the semantic gap between the image and the question makes it difficult to accurately embed the cross-modality relation; and 2) the visual objects in the question often have ambiguous references in the input image. To solve these challenges, we propose image-caption-question translating embeddings (ICQ-TransE), which more effectively model both the cross-modality relation and the head entity of visual objects. Specifically, for cross-modality relation embedding, the designed image-caption-question information transmission mechanism transmits the information flow from image to question through a caption bridge, where the caption has both visual content and textual form. With this bridge, cross-modality information can be fused more effectively, resulting in more precisely encoded relation embeddings. For the visual object embedding, instead of using a fixed number of visual regions as in previous methods, the visual regions most relevant to the question are dynamically selected. Experimental results on the challenging OK-VQA and KRVQA datasets verify the effectiveness of ICQ-TransE compared with multiple state-of-the-art methods for knowledge-based visual question answering. Our code will be available at https://github.com/cmcv2022/ICQ-TransE.
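The "translating embeddings" idea in the abstract follows the TransE pattern, in which a tail entity is expected to lie near the head embedding plus the relation embedding. The sketch below scores candidate answers with that rule on toy vectors. The embeddings, candidate names, and helper functions are assumptions for illustration and are not taken from the ICQ-TransE code.

```python
import math

def transe_score(head, relation, tail):
    """Negative L2 distance between head + relation and tail (higher is better)."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def best_answer(head, relation, candidates):
    """Return the candidate whose embedding best satisfies head + relation ~ tail."""
    return max(candidates, key=lambda name: transe_score(head, relation, candidates[name]))

if __name__ == "__main__":
    visual_object = [1.0, 0.0]    # toy head embedding (the object the question refers to)
    cross_modal_rel = [0.0, 1.0]  # toy relation embedding (from question and image)
    answers = {"answer_a": [1.0, 1.0], "answer_b": [0.0, 0.0]}
    print(best_answer(visual_object, cross_modal_rel, answers))
```

Under this view, the two challenges named in the abstract map onto the two inputs: a poorly embedded relation shifts the translation, and an ambiguous visual reference picks the wrong head.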
在基于知识的视觉问答(KB-VQA)中,根据与问题和图像相关的跨模态关系嵌入,通过翻译问题所引用的视觉对象嵌入,可以自然地表示答案。虽然跨模态知识的三元组表示是合理且有效的,但这些方法经常遇到两个挑战:1)图像和问题之间的语义差距使跨模态关系难以准确嵌入;2)问题中的视觉对象通常在输入图像中有模糊的引用。为了解决上述问题,我们提出了图像标题问题翻译嵌入(ICQ-TransE),它可以更有效地建模视觉对象的跨模态关系和头部实体。具体而言,对于跨模态关系嵌入,设计的图像-字幕-问题信息传递机制通过字幕桥将信息流从图像传递到问题,其中字幕同时具有视觉内容和文本形式。有了这个强大的桥梁,跨模态信息可以更有效地融合,从而产生更精确编码的关系嵌入。在视觉对象嵌入中,不像以前的方法那样使用固定数量的视觉区域,而是动态选择与问题最相关的视觉区域。在OK-VQA和KRVQA挑战性数据集上的实验结果验证了ICQ-TransE与多种最先进的基于知识的视觉问答方法的有效性。我们的代码可以在https://github.com/cmcv2022/ICQ-TransE上找到。
{"title":"ICQ-TransE: LLM-Enhanced Image-Caption-Question Translating Embeddings for Knowledge-Based Visual Question Answering","authors":"Heng Liu;Boyue Wang;Xiaoyan Li;Yanfeng Sun;Yongli Hu;Baocai Yin","doi":"10.1109/TAI.2025.3575553","DOIUrl":"https://doi.org/10.1109/TAI.2025.3575553","url":null,"abstract":"In knowledge-based visual question answering (KB-VQA), the answer can be naturally represented by translating visual object embedding referred by the question according to the cross-modality relation embedding related to both the question and the image. Though the triplet representation of cross-modality knowledge is plausible and proven effective, these methods often encounter two challenges: 1) The semantic gap between the image and the question makes it difficult to accurately embed the cross-modality relation; and 2) the visual objects in the question often have ambiguous references in the input image. To solve the above challenges, we propose the image-caption-question translating embeddings (ICQ-TransE), which more effectively models both the cross-modality relation and the head entity of visual objects. Specifically, for cross-modality relation embedding, the designed image-caption-question information transmission mechanism transmits the information flow from image to question through the caption bridge, where the caption simultaneously has the visual content and the textual form. With this powerful bridge, cross-modality information can be more effectively fused, resulting in more precisely encoded relation embeddings. For the visual object embedding, instead of using a fixed number of visual regions as the previous methods, the most relevant visual regions to the question are dynamically selected. Experimental results on OK-VQA and KRVQA challenging datasets verify the effectiveness of ICQ-TransE compared with multiple state-of-the-art methods for visual question answering with knowledge. 
Our code will be available at <uri>https://github.com/cmcv2022/ICQ-TransE</uri>.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"412-425"},"PeriodicalIF":0.0,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
IEEE Transactions on Artificial Intelligence Publication Information
Pub Date : 2025-06-02 DOI: 10.1109/TAI.2025.3569136
Citations: 0
Mitigating Bias in Opportunistic Screening for MACE with Causal Reasoning
Pub Date : 2025-05-08 DOI: 10.1109/tai.2025.3567961
Jialu Pi, Juan Maria Farina, Chieh-Ju Chao, Chadi Ayoub, Reza Arsanjani, Imon Banerjee

Mitigating population drift is vital for developing robust AI models for clinical use. While current methodologies focus on reducing demographic bias in disease predictions, they overlook the significant impact of chronic comorbidities. Addressing these complexities is essential to enhance predictive accuracy and reliability across diverse patient demographics, ultimately improving healthcare outcomes. We propose a causal reasoning framework to address selection bias in opportunistic screening for 1-year composite MACE risk using chest X-ray images. Trained on high-risk, primarily Caucasian patients (43% MACE event rate), the model was evaluated in a lower-risk emergency department setting (12.8% MACE event rate) and a relatively lower-risk external Asian patient population (23.81% MACE event rate) to assess selection bias effects. We benchmarked our approach against a high-performance disease classification model, a propensity score matching strategy, and a debiasing model for unknown biases. The causal+confounder framework achieved AUCs of 0.75 and 0.7 on the Shift and Shift external data, respectively, outperforming the baselines, and a comparable AUC of 0.7 on internal data despite penalties for confounders. It minimized disparities in confounding factors and surpassed traditional and state-of-the-art debiasing methods. Experimental data show that integrating causal reasoning and confounder adjustments in AI models enhances their effectiveness. This approach shows promise for creating fair and robust clinical decision support systems that account for population shifts, ultimately improving the reliability and ethical integrity of AI-driven clinical decision-making.
Citations: 0
IEEE Transactions on Artificial Intelligence Publication Information
Pub Date : 2025-04-30 DOI: 10.1109/TAI.2025.3557987
Citations: 0