
IEEE Transactions on Artificial Intelligence: Latest Articles

Demystifying MuZero Planning: Interpreting the Learned Model
Pub Date : 2025-07-22 DOI: 10.1109/TAI.2025.3591082
Hung Guei;Yan-Ru Ju;Wei-Yu Chen;Ti-Rong Wu
MuZero has achieved superhuman performance in various games by using a dynamics network to predict the environment dynamics for planning, without relying on simulators. However, the latent states learned by the dynamics network make its planning process opaque. This article aims to demystify MuZero’s model by interpreting the learned latent states. We incorporate observation reconstruction and state consistency into MuZero training and conduct an in-depth analysis to evaluate latent states across two board games: $9 \times 9$ Go and Gomoku, and three Atari games: Breakout, Ms. Pacman, and Pong. Our findings reveal that while the dynamics network becomes less accurate over longer simulations, MuZero still performs effectively by using planning to correct errors. Our experiments also show that the dynamics network learns better latent states in board games than in Atari games. These insights contribute to a better understanding of MuZero and offer directions for future research to improve the performance, robustness, and interpretability of the MuZero algorithm.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 1025-1036. Journal Article.
Citations: 0
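The MuZero abstract above reports that the dynamics network becomes less accurate over longer simulations. The following is a minimal sketch of that effect, assuming a toy scalar environment rather than anything from the paper: `true_step` and `learned_step` are hypothetical stand-ins for the real environment and the learned dynamics network, and the coefficients are chosen only for illustration.

```python
def unroll(step, state, actions):
    """Apply a one-step model repeatedly and return every visited state."""
    states = []
    for action in actions:
        state = step(state, action)
        states.append(state)
    return states


def true_step(state, action):
    # Hypothetical ground-truth environment: a scalar linear system.
    return 0.9 * state + action


def learned_step(state, action):
    # Hypothetical learned dynamics with a small coefficient error.
    return 0.95 * state + action


actions = [1.0] * 10
true_states = unroll(true_step, 1.0, actions)
model_states = unroll(learned_step, 1.0, actions)

# Absolute error between the model's prediction and the true state at each depth.
errors = [abs(t - m) for t, m in zip(true_states, model_states)]

for depth, err in enumerate(errors, start=1):
    print(f"unroll depth {depth:2d}: error {err:.4f}")
```

In this toy setting a small one-step error compounds with unroll depth, which is one reason a planner that re-plans from fresh observations at every move can still act well with an imperfect model.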
Multilabel Chest X-Ray Image Classification via Category Disentangled Causal Learning
Pub Date : 2025-07-22 DOI: 10.1109/TAI.2025.3591094
Qiang Li;Mengdi Liu;Rihao Chang;Weizhi Nie;Shaojin Bai;Anan Liu
Chest X-rays (CXR) are widely used to diagnose chest diseases. Since patients often suffer from multiple diseases simultaneously, it is crucial to identify multiple abnormalities in a single CXR image, which is defined as a multilabel classification task. Recent methods aim to improve performance by leveraging label co-occurrences as prior knowledge. However, these statistical co-occurrences often introduce spurious correlations, which reduce the reliability of the model, and data imbalance further amplifies the harm of such spurious correlations for rare disease diagnosis. In this study, we propose a category disentangled causal learning (CDCL) framework that considers both category-level and causal-level representations to provide robust and reliable CXR image diagnosis results. Specifically, we introduce a category attention (CA) mechanism to disentangle disease-specific features, enabling the model to effectively capture the discriminative features of each disease in the image. Additionally, we employ label embeddings to learn a set of discriminative features at the global category level, complementing CA to enhance the effectiveness of category disentanglement. Causal intervention is then applied to the disentangled features to guide the model in learning true causal relationships, mitigating the impact of spurious correlations. The proposed CDCL framework was evaluated on the ChestX-Ray14 and CheXpert datasets, achieving mean AUCs of 0.849 and 0.896, respectively. Ablation studies and visualization experiments demonstrated its competitiveness, particularly with significant improvements in rare disease identification.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 1048-1061. Journal Article.
Citations: 0
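The chest X-ray abstract above reports results as mean AUC over disease labels. As a rough illustration of how such a figure is commonly computed, the sketch below averages per-label AUCs using the pairwise-ranking definition. The toy labels and scores are invented for the example, and the exact averaging protocol used in the paper is an assumption here.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_auc(label_matrix, score_matrix):
    """Average the per-label AUCs; rows are images, columns are disease labels."""
    num_labels = len(label_matrix[0])
    aucs = []
    for j in range(num_labels):
        labels = [row[j] for row in label_matrix]
        scores = [row[j] for row in score_matrix]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / num_labels


# Toy data: 4 images, 2 hypothetical disease labels.
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.3, 0.4], [0.6, 0.7], [0.1, 0.5]]
result = mean_auc(y_true, y_score)
print(result)
```

Because each label contributes equally to the average, a rare disease with few positives weighs as much as a common one, which is why the metric is sensitive to rare-disease performance.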
Frequency-Domain Feature Reconstruction Network With Memory Units for Anomaly Detection of Fused Magnesium Furnaces
Pub Date : 2025-07-22 DOI: 10.1109/TAI.2025.3591089
Qiang Liu;Yuxin Wang;Chao Yang;Jialin An;Yiu-ming Cheung
Anomaly detection in the smelting process improves the operational safety of fused magnesium furnaces (FMFs). While generative models that fit complex data distributions well in the latent space offer an effective approach to anomaly detection, conventional generative models have difficulty adapting to visual interferences such as dynamic water mist, dust, and on-site lighting changes. To this end, this article establishes a new frequency-domain feature reconstruction network with memory units for anomaly detection of FMFs. This network utilizes high-frequency filtering to extract features in the frequency domain, suppressing the adverse effects of brightness variations caused by fluctuations in the furnace flame. Using the extracted frequency-domain features, wavelet sampling is integrated with memory units for reconstruction to eliminate interferences in the frequency domain while preserving anomalous features, thereby alleviating overgeneralization. Moreover, a new adaptive threshold calculation method is proposed for the anomaly detection of FMFs. Finally, the effectiveness of the proposed method is demonstrated using images collected from a real FMF.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 1037-1047. Journal Article.
Citations: 0
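The furnace abstract above relies on high-frequency filtering to suppress brightness variations. The sketch below shows the general idea on a one-dimensional row of pixel intensities, assuming a naive discrete Fourier transform and a hard cutoff. It is not the network described in the paper; `high_pass`, `cutoff`, and the sample values are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def high_pass(x, cutoff):
    """Zero every bin whose frequency index (or mirrored index) is below cutoff."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if min(k, n - k) < cutoff:
            spectrum[k] = 0
    return idft(spectrum)


row = [10.0, 12.0, 9.0, 30.0, 11.0, 10.0, 12.0, 9.0]
brighter = [v + 50.0 for v in row]  # same scene under a uniform brightness shift

filtered = high_pass(row, 1)
filtered_brighter = high_pass(brighter, 1)
print([round(v, 3) for v in filtered])
```

With `cutoff=1` only the zero-frequency bin is removed, so a uniform brightness offset disappears while the local spike in the row is preserved.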
Stragglers Reimagined: Explainability-Driven Adaptive Federated Learning for Resource Constrained IoMT System
Pub Date : 2025-07-18 DOI: 10.1109/TAI.2025.3590703
Riya Tapwal
In this article, we present FlexiFed, a framework designed to enhance federated learning (FL) by addressing the challenges of device inclusivity and data prioritization. FL systems typically exclude low-resource devices, known as stragglers, due to their limited computational power, leading to the loss of valuable and often unique data. Additionally, current FL systems lack transparency, making it difficult to prioritize the contributions of individual devices. FlexiFed overcomes these issues by enabling stragglers to share simplified outputs, such as predictions and key feature importance scores, instead of full model updates. This reduces their computational and communication burden. The framework integrates explainability techniques to identify and emphasize critical data, ensuring rare and significant contributions are prioritized during training. FlexiFed distinguishes itself from similar frameworks by combining hierarchical aggregation with explainability-driven prioritization, directly addressing the need for fairness and transparency in diverse and resource-constrained environments.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 1002-1011. Journal Article.
Citations: 0
Machine Learning-Driven Optimization of a Sensor Network for Accurate Pollutant Source Identification
Pub Date : 2025-07-18 DOI: 10.1109/TAI.2025.3590691
Sidi Mohammed Alaoui;Khalifa Djemal;Ehsan Sedgh Gooya;Amir Ali Feiz;Ayman Al Falou
Optimizing sensor networks for localizing atmospheric pollution sources and enhancing estimation accuracy remains a significant challenge in air pollution studies. To address this, various techniques have been recently developed. Among them, machine learning has demonstrated its ability to model and optimize complex problems, including sensor network optimization. To improve the localization of atmospheric pollution sources in air quality research activities, we propose, in this article, a machine learning-driven optimization of sensor networks method (ML-OSN). The method introduces a new combination of hierarchical agglomerative clustering and Siamese neural networks, thereby improving the prediction of similarities in pollutant concentrations across different wind directions and leading to an optimized sensor network. The proposed ML-OSN method was evaluated and compared with a standard clustering approach based on the Pearson correlation coefficient, using the augmented Indianapolis dataset. The resulting optimal sensor network configuration achieved broader spatial coverage and improved source estimation accuracy, reducing the error score to 1.34 compared with 1.44 obtained with the Pearson-based approach.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 973-985. Journal Article.
Citations: 0
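The sensor-network abstract above compares ML-OSN against a standard clustering approach based on the Pearson correlation coefficient. The sketch below illustrates only that kind of baseline, assuming average-linkage agglomerative clustering with distance 1 - r. The readings are invented, and the Siamese similarity model that the paper proposes is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def agglomerate(series, num_clusters):
    """Average-linkage agglomerative clustering with distance 1 - Pearson r."""
    clusters = [[i] for i in range(len(series))]

    def dist(i, j):
        return 1.0 - pearson(series[i], series[j])

    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist(i, j) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair of clusters and drop the absorbed one.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical concentration readings from four sensors over four time steps.
readings = [
    [1.0, 2.0, 3.0, 4.0],  # sensor 0
    [2.0, 4.1, 6.0, 8.2],  # sensor 1
    [4.0, 3.0, 2.0, 1.0],  # sensor 2
    [8.0, 6.1, 4.0, 2.1],  # sensor 3
]
groups = agglomerate(readings, 2)
print(groups)
```

Sensors that end up in the same cluster carry largely redundant information, so one plausible use of the grouping is to keep a single representative per cluster when choosing a reduced network.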
Contrastive Learning Feature Enhancement and High–Low Frequency Texture Interaction Networks for DIBR-Synthesized View Quality Assessment
Pub Date : 2025-07-18 DOI: 10.1109/TAI.2025.3590692
Chongchong Jin;Yuanhao Cai;Yeyao Chen;Ting Luo;Zhouyan He;Yang Song
Depth image-based rendering (DIBR) is a common method for synthesizing virtual views to achieve smooth transitions in immersive media, but its immature technology often introduces distortions, adversely affecting visual quality. Accurately assessing the quality of synthesized views is therefore crucial for monitoring and guiding the rendering process. To this end, this article proposes a no-reference deep learning-based quality assessment method for DIBR-synthesized views, which is primarily achieved by combining a contrastive learning feature enhancement network and a high–low frequency texture interaction network, abbreviated as CONTIN. Different from the traditional methods based on handcrafted feature extraction, the proposed method employs an end-to-end deep learning approach, fully exploiting the data characteristics and feature correlations. Specifically, to address the issue of sample expansion in existing deep learning methods, a contrastive sample database is first constructed by simulating various traditional and rendering distortions based on natural images, and training is performed on this database to obtain a contrastive learning feature enhancement network, which is used to extract contrastive features. Additionally, since contrastive learning tends to focus on learning abstract semantic-level features rather than pixel-level texture details, a wavelet transform decoupling is further applied to the synthetic distortion samples to construct a high–low frequency texture interaction network for extracting texture features. Finally, the two types of features are fused and regressed to generate the final quality score. Experimental results show that the proposed method achieves superior performance across three benchmark databases (namely, IRCCyN/IVC, IETR, and MCL-3D), with PLCC reaching 0.9404, 0.8380, and 0.9666, respectively, improvements of 0.0179, 0.0350, and 0.0175 over the best existing methods.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 986-1001. Journal Article.
Citations: 0
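The DIBR abstract above separates high- and low-frequency texture with a wavelet transform. The sketch below shows what such a decoupling looks like for one level of the Haar wavelet on a single row of samples. The wavelet family used by CONTIN is not stated in the abstract, so Haar is an assumption chosen for brevity, and `haar_split` and `haar_merge` are hypothetical helper names.

```python
import math


def haar_split(signal):
    """One level of the orthonormal Haar transform on an even-length sequence."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    evens, odds = signal[0::2], signal[1::2]
    low = [(a + b) * scale for a, b in zip(evens, odds)]   # local averages
    high = [(a - b) * scale for a, b in zip(evens, odds)]  # local differences
    return low, high


def haar_merge(low, high):
    """Invert haar_split, recovering the original samples."""
    scale = 1.0 / math.sqrt(2.0)
    signal = []
    for l, h in zip(low, high):
        signal.append((l + h) * scale)
        signal.append((l - h) * scale)
    return signal


row = [4.0, 4.0, 5.0, 9.0, 2.0, 2.0]
low, high = haar_split(row)
restored = haar_merge(low, high)
print(low, high)
```

Flat regions produce zero high-frequency coefficients while edges produce large ones, so texture detail ends up concentrated in the `high` band and can be processed separately from the smoother `low` band.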
Normalizing Flow-Based Fine-Grained Modeling for Unknown Gesture Rejection in Myoelectric Pattern Recognition
Pub Date : 2025-07-18 DOI: 10.1109/TAI.2025.3590706
Jingyang Jia;Le Wu;Shengcai Duan;Xun Chen
Gesture recognition systems based on surface electromyography (sEMG) exhibit high accuracy in laboratory settings. However, they often underperform in real-world applications due to the occurrence of unknown gestures not encountered during training. Prototype learning methods, which learn gesture prototypes and classify unknown gestures based on distances to these prototypes, effectively reject unknown gestures. However, relying solely on global feature distances may overlook subtle variations, weakening discrimination between similar features and reducing the model’s ability to identify unknown gestures resembling known ones. To address these limitations, we propose a fine-grained method that models the probability distribution of each feature point, enabling the detection of subtle differences in partial features. Specifically, we employ normalizing flows to capture detailed information at the feature-point level. This approach enhances the model’s capacity to recognize challenging unknown gestures that partially differ from known gesture patterns. In addition, we introduce synthetic unknown gestures generated by applying slight perturbations to known samples, simulating challenging unknown scenarios. We then design a novel loss function that pulls known gestures closer together while pushing synthetic unknown gestures further apart, creating a more robust rejection model. Extensive experiments on both custom and public datasets demonstrate that our method achieves an area under the curve (AUC) of 0.988 on the custom dataset and an average AUC of 0.984 and 0.782 on the two public datasets, CapgMyo-DBc and NinaproDB5, respectively. These results indicate that the proposed method provides a robust and practical solution for reliable myoelectric control in real-world applications.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 1012-1024. Journal Article.
Citations: 0
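The gesture-rejection abstract above uses normalizing flows to model feature distributions and reject unknown gestures. The sketch below shows the underlying change-of-variables computation with the simplest possible flow, a single per-dimension affine map onto a standard normal. This is far simpler than the fine-grained model in the paper; the feature vectors, the threshold rule, and the function names are all assumptions made for illustration.

```python
import math


def fit_affine_flow(samples):
    """Fit per-dimension shift and scale so z = (x - shift) / scale is roughly N(0, 1)."""
    dims = len(samples[0])
    n = len(samples)
    shift = [sum(s[d] for s in samples) / n for d in range(dims)]
    scale = [
        math.sqrt(sum((s[d] - shift[d]) ** 2 for s in samples) / n)
        for d in range(dims)
    ]
    return shift, scale


def log_likelihood(x, shift, scale):
    """Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|."""
    total = 0.0
    for xi, mu, sigma in zip(x, shift, scale):
        z = (xi - mu) / sigma
        total += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
    return total


# Hypothetical 2-D feature vectors extracted from known gestures.
known = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1], [0.9, 1.9]]
shift, scale = fit_affine_flow(known)

# Assumed rejection rule: anything less likely than the least likely known sample.
threshold = min(log_likelihood(s, shift, scale) for s in known)


def is_unknown(x):
    return log_likelihood(x, shift, scale) < threshold


print(is_unknown([2.0, 0.5]), is_unknown([1.05, 1.95]))
```

The `- math.log(sigma)` term is the log-determinant of the Jacobian of the map from x to z. Deeper flows stack more invertible layers and sum their log-determinants in the same way.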
Retraction Notice: Quantum-Assisted Activation for Supervised Learning in Healthcare-Based Intrusion Detection Systems
Pub Date : 2025-07-14 DOI: 10.1109/TAI.2025.3582067
Nikhil Laxminarayana;Nimish Mishra;Prayag Tiwari;Sahil Garg;Bikash K. Behera;Ahmed Farouk
N. Laxminarayana, N. Mishra, P. Tiwari, S. Garg, B. K. Behera, and A. Farouk, “Quantum-assisted activation for supervised learning in healthcare-based intrusion detection systems,” IEEE Transactions on Artificial Intelligence, vol. 5, no. 3, pp. 977–984, Mar. 2024.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, p. 606. Journal Article. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11080238
Citations: 0
Optimization for Community Detection in Multilayer Networks: A Comprehensive Review and Novel Taxonomy
Pub Date : 2025-07-10 DOI: 10.1109/TAI.2025.3586828
Randa Boukabene;Fatima Benbouzid-Si Tayeb
Community detection is a rapidly growing field, especially for multilayer networks (systems with multiple interaction types). While these networks offer great potential, analyzing them remains complex and underexplored. Recently, researchers have turned to optimization techniques to address these challenges. However, despite the diversity of approaches, no comprehensive study consolidates these advancements. To bridge this gap, this article provides a structured review of optimization techniques for community detection in multilayer networks, classifying methods by three criteria: resolution types, optimization types, and resolution methods. This effort seeks to bring clarity to the field, offering a unified perspective on existing methods, while also providing a foundation to inspire and guide future research directions.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 1185-1200.
Citations: 0
Zero-Parameter Attention Sharing Transformer for Joint Human Activity and Identity Recognition
Pub Date : 2025-07-08 DOI: 10.1109/TAI.2025.3586571
Shuokang Huang;Po-Yu Chen;Peilin Zhou;Kaihan Li;Julie A. McCann
WiFi-based human sensing is gaining popularity because it requires no additional devices and is less intrusive than cameras. Specifically, human features can be extracted from WiFi channel state information (CSI) to recognize human activities, identities, etc. However, most previous works rely on single-task learning models for recognition (e.g., recognizing either activities or identities alone). The lack of cross-task knowledge sharing restricts these models to task-specific features and poor generalization. Recent studies have applied multitask learning (MTL) to tackle this, but their cross-task sharing modules add vast numbers of extra parameters, which increase model complexity and reduce time efficiency. In this article, we propose a novel zero-parameter attention sharing transformer (ZAST) to efficiently recognize both activities and identities. In ZAST, a cross-task attention on attention (CAoA) mechanism computes the relevance of attention scores for cross-task knowledge sharing, as a new paradigm for lightweight MTL. To mitigate the perturbation caused by attention sharing, we formulate a multihead similarity loss (L-MS) for stable model training. We further equip ZAST with channelwise squeeze and excitation (CSE) that efficiently learns the channel correlations of CSI. Extensive experiments on four public datasets indicate that ZAST achieves state-of-the-art recognition performance with the lowest complexity and the highest efficiency.
IEEE Transactions on Artificial Intelligence, vol. 7, no. 2, pp. 960-972.
Citations: 0