
Latest publications in IEEE Transactions on Artificial Intelligence

A Deterministic–Probabilistic Approach to Neural Network Pruning
Pub Date : 2025-04-08 DOI: 10.1109/TAI.2025.3558718
Soumyadipta Banerjee;Jiaul H. Paik
Modern deep networks are highly over-parameterized, so training and testing such models in various applications is computationally intensive, with excessive memory and energy requirements. Network pruning aims to find smaller subnetworks within these dense networks that do not compromise test accuracy. In this article, we present a probabilistic and deterministic pruning methodology that determines the likelihood of retaining each weight parameter by modeling the layer-specific distribution of extreme values of the weights. Unlike existing pruning techniques, which require the sparsity level as an explicit input, our method finds the sparsity of each layer automatically. Experiments show that deterministic–probabilistic pruning consistently achieves high sparsity levels, ranging from 65% to 95%, while maintaining comparable or improved test accuracy across multiple datasets (MNIST, CIFAR-10, and Tiny ImageNet) and architectures (VGG-16, ResNet-18, and ResNet-50).
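The retention idea can be illustrated with a loose sketch (our own assumptions throughout: `prune_layer`, treating per-row maxima as the "extreme values," and a magnitude-ratio retention probability are illustrative stand-ins, not the paper's actual model):

```python
import numpy as np

def prune_layer(weights, rng=None):
    """Toy sketch: retain each weight with probability given by its
    magnitude relative to the extreme (maximum) magnitude in its row."""
    rng = rng or np.random.default_rng(0)
    mags = np.abs(weights)
    extremes = mags.max(axis=1, keepdims=True)      # per-row extreme values
    p_keep = mags / np.maximum(extremes, 1e-12)     # retention likelihood
    mask = rng.random(weights.shape) < p_keep
    return weights * mask, mask

rng = np.random.default_rng(42)
w = rng.normal(size=(64, 64))
pruned, mask = prune_layer(w, rng)
sparsity = 1.0 - mask.mean()   # sparsity emerges, it is not supplied
```

Note how the sparsity level falls out of the weight distribution rather than being passed in, mirroring the abstract's claim that per-layer sparsity is found automatically.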
IEEE Transactions on Artificial Intelligence, vol. 6, no. 10, pp. 2830–2839.
Citations: 0
IEEE Transactions on Artificial Intelligence Publication Information
Pub Date : 2025-03-31 DOI: 10.1109/TAI.2025.3551528
IEEE Transactions on Artificial Intelligence, vol. 6, no. 4, p. C2. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10946100
Citations: 0
Boosting 3-D Point Cloud Registration by Orthogonal Self-Ensemble Learning
Pub Date : 2025-03-30 DOI: 10.1109/TAI.2025.3575036
Mingzhi Yuan;Ao Shen;Yingfan Ma;Jie Du;Qiao Huang;Manning Wang
Deep learning has significantly advanced the development of point cloud registration. However, in recent years, some methods have relied on additional sensor information or complex network designs to improve registration performance, which incurs considerable computational overhead. These methods often struggle to strike a reasonable balance between computational cost and performance gains. To address this, we propose a plug-and-play orthogonal self-ensemble module designed to enhance registration performance with minimal additional overhead. Specifically, we design a novel ensemble learning strategy to mine the complementary information within the features extracted by previous methods. Unlike most ensemble learning methods, our method does not build multiple complex models for performance enhancement. Instead, it simply appends a lightweight dual-branch network to the features extracted by the original model, producing two more diverse feature sets. To further reduce redundancy between features and prevent degradation of the dual-branch network, we introduce an orthogonal constraint that ensures the features output by the two branches are more complementary. Finally, by concatenating the two sets of complementary features, the final enhanced features are obtained. Compared to the original features, these enhanced features thoroughly exploit the internal information and exhibit greater distinctiveness, leading to improved registration performance. To validate the effectiveness of our method, we plug it into GeoTransformer, obtaining consistent performance improvements across the 3DMatch, KITTI, and ModelNet40 datasets. Moreover, our method is compatible with other performance-enhancing methods. In conjunction with the overlap prior in PEAL, GeoTransformer achieves a new state-of-the-art performance.
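The orthogonal constraint can be conveyed with a toy penalty (a sketch under our own assumptions; `orthogonality_penalty` and the column-normalized cross-correlation are illustrative, not the paper's exact loss term):

```python
import numpy as np

def orthogonality_penalty(f1, f2):
    """Squared Frobenius norm of the cross-correlation between two branches'
    column-normalized features; zero means fully complementary branches."""
    f1 = f1 / (np.linalg.norm(f1, axis=0, keepdims=True) + 1e-12)
    f2 = f2 / (np.linalg.norm(f2, axis=0, keepdims=True) + 1e-12)
    return float(np.sum((f1.T @ f2) ** 2))

a = np.array([[1.0], [0.0]])                 # branch-1 feature column
b = np.array([[0.0], [1.0]])                 # branch-2 column, orthogonal to a
redundant = orthogonality_penalty(a, a)      # identical branches: penalized
complementary = orthogonality_penalty(a, b)  # orthogonal branches: near zero
```

Adding such a term to the training loss pushes the two branches toward encoding non-overlapping information, which is the stated purpose of the constraint.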
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 375–384.
Citations: 0
MALADY: Multiclass Active Learning With Auction Dynamics on Graphs
Pub Date : 2025-03-30 DOI: 10.1109/TAI.2025.3575038
Gokul Bhusal;Kevin Miller;Ekaterina Merkurjev
Active learning (AL) enhances the performance of machine learning (ML) methods, particularly in low label-rate scenarios, by judiciously selecting a limited number of unlabeled data points for labeling, with the goal of improving an underlying classifier. In this work, we introduce the multiclass AL with auction dynamics on graphs (MALADY) algorithm, which leverages auction dynamics on similarity graphs for efficient AL. In particular, the proposed algorithm wraps an AL loop around an efficient and effective semisupervised procedure: a similarity graph-based auction method consisting of upper- and lower-bound auctions that integrate class-size constraints. In addition, we introduce a novel AL acquisition function that incorporates the dual variable of the auction algorithm to measure classifier uncertainty, prioritizing queries near the decision boundaries between classes. Overall, the proposed method efficiently obtains accurate results using extremely small labeled sets containing just a few elements per class; this is crucial since labeled data are scarce in many applications. Moreover, the proposed technique can incorporate class-size information, which improves accuracy even further. Finally, through experiments on classification tasks across various datasets, we evaluate the performance of our proposed method and show that it exceeds that of comparison algorithms.
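A generic margin-style acquisition rule conveys the flavor of querying near decision boundaries (a hedged sketch: `margin_acquisition` uses plain top-two score margins, not the paper's dual-variable formulation):

```python
import numpy as np

def margin_acquisition(scores, labeled_idx, k=1):
    """Query the k unlabeled points with the smallest top-two score margin,
    i.e., those closest to a decision boundary between classes."""
    top2 = np.sort(scores, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    margin[list(labeled_idx)] = np.inf      # never re-query labeled points
    return np.argsort(margin)[:k]

scores = np.array([[0.90, 0.10],   # confident, but already labeled
                   [0.55, 0.45],   # near the boundary -> best query
                   [0.80, 0.20]])
query = margin_acquisition(scores, labeled_idx={0}, k=1)
```

MALADY replaces the raw class scores here with quantities derived from the auction's dual variables; the selection principle (small margin means high uncertainty) is the same.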
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 385–398.
Citations: 0
A Novel Multiscale Dynamic Graph Convolutional Network for Traffic Data Cognition
Pub Date : 2025-03-29 DOI: 10.1109/TAI.2025.3574655
Jiyao An;Zhaohui Pu;Qingqin Liu;Lei Zhang;Md Sohel Rana
This article investigates the traffic data cognitive modelling problem in real traffic scenes by fully exploiting the multiscale spatio-temporal dependence between traffic nodes, together with a novel dynamic graph convolutional network (GCN). Existing deep learning models face two practical problems: 1) current graph convolution operations typically aggregate information only from a given node's k-hop neighbors; and 2) the spatio-temporal heterogeneity of traffic data makes it difficult to model the similarity of traffic data patterns among nodes. In this article, we propose a novel hierarchical traffic data cognitive modelling framework called the multiscale spatio-temporal dynamic graph convolutional network (MSST-DGCN). First, a multiscale graph convolution module is constructed to expand the receptive field of convolutional operations by developing a novel cumulative concatenation mechanism over sub-GCNs. Meanwhile, two dynamic graphs are designed to model the spatio-temporal correlation among nodes from both a proximity and a long-term perspective through a novel Gaussian calculation strategy, efficiently representing the dynamic similarity of traffic data patterns. A series of qualitative evaluations shows that the model can perceive the traffic data pattern states of nodes. Finally, experiments on two real-world traffic datasets show that the proposed approach achieves state-of-the-art traffic data cognitive performance.
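Multiscale aggregation by cascading hops can be sketched as follows (our own minimal illustration; `multiscale_gcn_features` and the symmetric normalization are assumptions, not the MSST-DGCN module itself):

```python
import numpy as np

def multiscale_gcn_features(adj, x, scales=3):
    """Aggregate over 1..scales hops with a symmetrically normalized
    adjacency and concatenate the per-scale outputs (cumulative
    concatenation widens the receptive field without deeper layers)."""
    deg = adj.sum(axis=1)
    d = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    a_norm = adj * d[:, None] * d[None, :]
    outs, h = [], x
    for _ in range(scales):
        h = a_norm @ h              # one more hop of neighborhood
        outs.append(h)
    return np.concatenate(outs, axis=1)

# 4-node ring graph, 2 features per node.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
x = np.ones((4, 2))
feats = multiscale_gcn_features(adj, x, scales=3)
```

Each concatenated slice corresponds to one sub-GCN's hop depth, so downstream layers see 1-hop through k-hop views of every node at once.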
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 362–374.
Citations: 0
Malicious Clients and Contribution Co-Aware Federated Unlearning
Pub Date : 2025-03-28 DOI: 10.1109/TAI.2025.3556092
Yang Wang;Xue Li;Siguang Chen
Existing federated unlearning methods that eliminate the negative impact of malicious clients on the global model either rest on unreasonable assumptions (e.g., an auxiliary dataset) or fail to balance model performance and efficiency. To overcome these shortcomings, we propose a malicious clients and contribution co-aware federated unlearning (MCC-Fed) method. Specifically, we introduce a method for detecting malicious clients to reduce their impact on the global model. Next, we design a contribution-aware metric that accurately quantifies the negative impact of malicious clients on the global model by calculating their historical contribution ratios. Then, based on this metric, we propose a novel federated unlearning method in which benign clients use the contribution-aware metric as a regularization term to unlearn the influence of malicious clients, thereby restoring model performance. Experimental results demonstrate that our method effectively addresses excessive unlearning during the unlearning process, improves the efficiency of performance recovery, and enhances robustness against malicious clients. Compared to retraining, federated unlearning effectively removes malicious clients’ influence while reducing training costs.
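Detecting clients whose updates oppose the consensus direction can be sketched with cosine similarity (an assumption-laden toy, not MCC-Fed's actual detector; `flag_malicious` and its zero threshold are our inventions):

```python
import numpy as np

def flag_malicious(updates, threshold=0.0):
    """Flag clients whose flattened update direction opposes the
    average update across all clients (cosine similarity < threshold)."""
    mean_update = updates.mean(axis=0)
    denom = np.linalg.norm(updates, axis=1) * np.linalg.norm(mean_update) + 1e-12
    sims = (updates @ mean_update) / denom
    return [i for i, s in enumerate(sims) if s < threshold], sims

updates = np.array([[1.0, 0.0],
                    [1.0, 0.1],
                    [0.9, 0.0],
                    [-1.0, 0.0]])   # poisoned update pointing the wrong way
flagged, sims = flag_malicious(updates)
```

In MCC-Fed, such a detection signal would be combined with each client's historical contribution ratio before any unlearning regularization is applied.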
IEEE Transactions on Artificial Intelligence, vol. 6, no. 10, pp. 2848–2857.
Citations: 0
COSMIC: A Novel Contextualized Orientation Similarity Metric Incorporating Consistency for NLG Assessment
Pub Date : 2025-03-27 DOI: 10.1109/TAI.2025.3574292
Hadi Al Khansa;Mariette Awad
The field of natural language generation (NLG) has undergone remarkable expansion, largely enabled by enhanced model architectures, affordable computing, and the availability of large datasets. With NLG systems finding increasing adoption across many applications, the imperative to evaluate their performance has grown accordingly. However, relying solely on human evaluation is not scalable. To address this challenge, it is important to explore more scalable evaluation methodologies that can ensure the continued development and efficacy of NLG systems. Presently, only a few automated evaluation metrics are commonly used, with BLEU and ROUGE the predominant choices. Yet these metrics have been criticized for their limited correlation with human judgment, their focus on surface-level similarity, and their tendency to overlook semantic nuances. While transformer-based metrics have been introduced to capture semantic similarity, our study reveals scenarios in which even these metrics fail. Considering these limitations, we propose and validate a novel metric called “COSMIC,” which combines contradiction detection with contextual embedding similarity. To illustrate these limitations and showcase the performance of COSMIC, we conducted a case study using a fine-tuned LLAMA model to transform questions and short answers into declarative sentences, a task that, despite its significance in generating natural language inference datasets, has received little exploration since 2018. Results show that COSMIC captures contradictions between the reference and generated text while remaining highly correlated with embedding similarity when the two are consistent and semantically similar. BLEU, ROUGE, and most transformer-based metrics fail to identify contradictions.
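One plausible way to combine embedding similarity with contradiction detection (purely our toy combination; `cosmic_score`, its `alpha` down-weighting, and the hand-wired inputs are assumptions, not the published formula):

```python
import math

def cosine_similarity(u, v):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)) + 1e-12)

def cosmic_score(similarity, contradiction_prob, alpha=1.0):
    """Hypothetical blend: keep the embedding similarity when the texts
    are consistent, suppress it as contradiction probability rises."""
    return similarity * (1.0 - contradiction_prob) ** alpha

consistent = cosmic_score(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 0.0)
contradictory = cosmic_score(0.9, 1.0)   # high similarity but contradicted
```

The point the abstract makes is exactly this failure mode: a contradictory candidate can still embed close to the reference, so similarity alone overrates it unless a consistency signal intervenes.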
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 332–346.
Citations: 0
IT2-ENFIS: Interval Type-2 Exclusionary Neuro-Fuzzy Inference System, an Attempt Toward Trustworthy Regression Learning
Pub Date : 2025-03-27 DOI: 10.1109/TAI.2025.3574299
Chuan Xue;Jianli Gao;Zhou Gu
As machine learning technologies progress and are increasingly applied to critical and sensitive fields, the reliability issues of earlier technologies are becoming more evident. For the new generation of machine learning solutions, trustworthiness frequently takes precedence over performance when evaluating their applicability for specific applications. This manuscript introduces the IT2-ENFIS neuro-fuzzy model, a robust and trustworthy single-network solution specifically designed for data regression tasks affected by substantial label noise and outliers. The primary architecture applies interval type-2 fuzzy logic and the Sugeno inference engine. A meta-heuristic gradient-based optimizer (GBO), the Huber loss function, and the Cauchy M-estimator are employed for robust learning. IT2-ENFIS demonstrates superior performance on noise-contaminated datasets and excels in real-world scenarios, with excellent generalization capability and interpretability.
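The abstract names the Huber loss for robust learning; its standard form (the textbook definition, not code from the paper) is quadratic near zero and linear in the tails:

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss: quadratic for |r| <= delta, linear beyond,
    so large-residual outliers contribute only linearly."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)

small = huber_loss(0.5)    # inside the quadratic zone
large = huber_loss(-3.0)   # outlier, linear zone
```

This linear tail is what limits the influence of label noise and outliers on the fitted regression model, the scenario IT2-ENFIS targets.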
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 347–361.
Citations: 0
Modeling Deep Unfolded Quantum Machine Learning Framework
Pub Date : 2025-03-26 DOI: 10.1109/TAI.2025.3573303
Shanika Iroshi Nanayakkara;Shiva Raj Pokhrel
Quantum machine learning models, like quantum neural networks (QNNs) and quantum support vector classifiers (QSVCs), often struggle with overfitting, slow convergence, and suboptimal generalization across datasets. This article explores the advantages of integrating deep unfolding techniques into quantum models and develops a framework comprising deep unfolded variational quantum classifiers (DVQC), deep unfolded quantum neural networks (DQNN), and deep unfolded QSVC (DQSVC). Our unfolding transforms quantum circuit training into a sequence of learnable layers, each layer representing an optimization step that concurrently updates both circuit parameters and QNN hyperparameters. The proposed framework significantly improves training and test accuracy by dynamically adjusting the learning rate, perturbations, and similar hyperparameters, particularly on complex datasets such as genomic and breast cancer data. Our experiments show that the proposed DVQC and DQNN outperform the baseline VQC and QNN, achieving 90% training accuracy and up to 20% higher test accuracy on genomic and ad hoc datasets. DQSVC achieves 100% accuracy on the ad hoc dataset and 97% on the genomic dataset, surpassing the 90% test accuracy of traditional QSVC. Our implementation details will be publicly available.
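Deep unfolding in its simplest form turns an iterative optimizer into a fixed stack of layers with per-layer learnable hyperparameters (a classical-only sketch under our assumptions, with no quantum circuit; in the paper those step sizes would themselves be trained):

```python
def unfolded_descent(grad_fn, x0, step_sizes):
    """Each 'layer' is one gradient step with its own step size; in deep
    unfolding these per-layer step sizes are learned rather than fixed."""
    x = x0
    for lr in step_sizes:
        x = x - lr * grad_fn(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
grad = lambda x: 2.0 * (x - 3.0)
x_final = unfolded_descent(grad, 0.0, [0.4, 0.3, 0.2, 0.1])
```

Because the number of layers is fixed, the whole pipeline is differentiable end to end, which is what lets circuit parameters and hyperparameters be updated concurrently.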
IEEE transactions on artificial intelligence, vol. 7, no. 1, pp. 321–331.
Citations: 0
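Deep unfolding, as used in the abstract above, treats each iteration of an optimizer as a network layer with its own learnable hyperparameters — here, a per-layer step size. A minimal classical toy of the unfolded forward pass, purely illustrative and not the paper's quantum-circuit implementation:

```python
def unfolded_descent(theta0, grad_fn, step_sizes):
    """Forward pass of a deep-unfolded optimizer: K gradient steps,
    one 'layer' per step, each with its own step size. In deep
    unfolding the step sizes themselves would be trained."""
    theta = theta0
    for eta in step_sizes:
        theta = theta - eta * grad_fn(theta)
    return theta

# Toy objective f(theta) = (theta - 3)^2 with gradient 2 * (theta - 3).
grad = lambda t: 2.0 * (t - 3.0)
theta_final = unfolded_descent(0.0, grad, step_sizes=[0.4, 0.3, 0.2])
# theta_final approaches the minimizer theta* = 3 after three layers.
```

Because the number of layers is fixed, the per-layer step sizes (and, in the paper's setting, circuit parameters and other hyperparameters) can be optimized end to end rather than hand-tuned.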
Nuclei Segmentation Using Multiheaded U-Net and Shearlet-Based Unsharp Masking
Pub Date : 2025-03-26 DOI: 10.1109/TAI.2025.3572849
Shivam Mishra;Amit Vishwakarma;Anil Kumar
Automated nuclei segmentation is an important technique for understanding and analyzing cellular characteristics; it eases computer-aided digital pathology and is useful for disease diagnosis. However, the task is difficult because of diversity in nucleus size, blurry boundaries, and variation across imaging modalities. A convolutional neural network (CNN)-based multiheaded U-Net (M-UNet) framework has been proposed to address these issues. The architecture uses filters of different kernel sizes for its multiple heads to extract multiresolution features of an image. A shearlet-based unsharp masking (SBUM) method is proposed for preprocessing; it primarily emphasizes features such as contours, boundaries, and minute details of the source image. In this article, a hybrid loss function is formulated, comprising intersection over union (IOU) loss and Dice loss along with binary cross-entropy loss. The optimization algorithm minimizes this hybrid loss, and higher metric values during the testing phase indicate better segmentation performance in the spatial domain. The proposed method yields superior segmentation images and quantitative results compared to state-of-the-art nuclei segmentation techniques. The proposed technique attains IOU, F1-score, accuracy, and precision values of 0.8325, 0.9086, 0.9651, and 0.9001, respectively.
IEEE transactions on artificial intelligence, vol. 7, no. 1, pp. 297–307.
Citations: 0
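The hybrid loss described in the abstract above sums binary cross-entropy, Dice, and IOU terms. A minimal NumPy sketch follows; the equal weighting of the three terms and the `eps` smoothing constant are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def hybrid_loss(pred, target, eps=1e-7):
    """Hybrid segmentation loss: binary cross-entropy + Dice + IOU.
    `pred` holds probabilities in [0, 1]; `target` is binary."""
    pred = np.clip(pred, eps, 1.0 - eps)        # avoid log(0)
    bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    inter = np.sum(pred * target)
    dice = 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)
    union = np.sum(pred) + np.sum(target) - inter
    iou = 1.0 - (inter + eps) / (union + eps)
    return bce + dice + iou

# A perfect prediction drives all three terms toward zero.
perfect = hybrid_loss(np.array([1.0, 0.0, 1.0]), np.array([1.0, 0.0, 1.0]))
```

Combining an overlap-based term (Dice/IOU) with a pixelwise term (cross-entropy) is a common way to balance region agreement against per-pixel calibration in segmentation training.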