Pub Date: 2025-04-08 | DOI: 10.1109/TAI.2025.3558718
Soumyadipta Banerjee;Jiaul H. Paik
Modern deep networks are highly over-parameterized, so training and testing such models in various applications are computationally intensive, with excessive memory and energy requirements. Network pruning aims to find smaller subnetworks within these dense networks that do not compromise test accuracy. In this article, we present a deterministic–probabilistic pruning methodology that determines the likelihood of retaining each weight parameter by modeling the layer-specific distribution of extreme values of the weights. Our method automatically finds the sparsity of each layer, unlike existing pruning techniques, which require the sparsity as an explicit input. Experiments in the present work show that deterministic–probabilistic pruning consistently achieves high sparsity levels, ranging from 65% to 95%, while maintaining comparable or improved test accuracy across multiple datasets, such as MNIST, CIFAR-10, and Tiny ImageNet, on architectures including VGG-16, ResNet-18, and ResNet-50.
{"title":"A Deterministic–Probabilistic Approach to Neural Network Pruning","authors":"Soumyadipta Banerjee;Jiaul H. Paik","doi":"10.1109/TAI.2025.3558718","DOIUrl":"https://doi.org/10.1109/TAI.2025.3558718","url":null,"abstract":"Modern deep networks are highly over-parameterized. Thus, training and testing such models in various applications are computationally intensive with excessive memory and energy requirements. Network pruning aims to find smaller subnetworks from within these dense networks that do not compromise on the test accuracy. In this article, we present a probabilistic and deterministic pruning methodology which determines the likelihood of retention of the weight parameters by modeling the layer-specific distribution of extreme values of the weights. Our method automatically finds the sparsity in each layer, unlike existing pruning techniques which require an explicit input of the sparsity information. Experiments in the present work show that deterministic–probabilistic pruning consistently achieves high sparsity levels, ranging from 65 to 95%, while maintaining comparable or improved testing accuracy across multiple datasets such as MNIST, CIFAR-10, and Tiny ImageNet, on architectures including VGG-16, ResNet-18, and ResNet-50.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 10","pages":"2830-2839"},"PeriodicalIF":0.0,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145196043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
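To make the retention idea above concrete, here is a toy sketch of probabilistic magnitude-based pruning in which each layer's retention probabilities are scaled by that layer's own extreme (largest-magnitude) weight, so a sparsity level emerges per layer instead of being supplied as input. The exponential mapping and the `beta` sharpness knob are illustrative assumptions, not the paper's actual extreme-value model.

```python
import numpy as np

def retention_probabilities(weights, beta=4.0):
    """Map each weight's magnitude to a retention probability using the
    layer's own extreme value as the scale (illustrative only)."""
    mags = np.abs(weights).ravel()
    scale = mags.max() + 1e-12          # layer-specific extreme value
    return 1.0 - np.exp(-beta * mags / scale)

def prune_layer(weights, beta=4.0, seed=0):
    """Sample a binary keep-mask from the retention probabilities, so the
    layer's sparsity is a byproduct rather than a user-supplied target."""
    rng = np.random.default_rng(seed)
    p = retention_probabilities(weights, beta).reshape(weights.shape)
    mask = rng.random(weights.shape) < p
    return weights * mask, mask

w = np.array([[0.01, -0.8], [0.5, -0.02]])
pruned, mask = prune_layer(w)
```

Weights near the layer's extreme are almost always kept, while tiny weights are almost always dropped, which is the qualitative behavior the abstract describes.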
Pub Date: 2025-03-30 | DOI: 10.1109/TAI.2025.3575036
Mingzhi Yuan;Ao Shen;Yingfan Ma;Jie Du;Qiao Huang;Manning Wang
Deep learning has significantly advanced the development of point cloud registration. However, in recent years, some methods have relied on additional sensor information or complex network designs to improve registration performance, which incurs considerable computational overhead. These methods often struggle to strike a reasonable balance between computational cost and performance gains. To address this, we propose a plug-and-play orthogonal self-ensemble module designed to enhance registration performance with minimal additional overhead. Specifically, we design a novel ensemble learning strategy to mine the complementary information within the extracted features of previous methods. Unlike most ensemble learning methods, our method does not set multiple complex models for performance enhancement. Instead, it only cascades a lightweight dual-branch network after the features extracted by the original model to obtain two sets of features with more diversity. To further reduce redundancy between features and prevent the degradation of the dual-branch network, we introduce an orthogonal constraint that ensures the features output by the two branches are more complementary. Finally, by concatenating the two sets of complementary features, the final enhanced features are obtained. Compared to the original features, these enhanced features thoroughly exploit the internal information and exhibit greater distinctiveness, leading to improved registration performance. To validate the effectiveness of our method, we plug it into GeoTransformer, resulting in consistent performance improvements across 3DMatch, KITTI, and ModelNet40 datasets. Moreover, our method is compatible with other performance-enhancing methods. In conjunction with the overlap prior in PEAL, GeoTransformer achieves a new state-of-the-art performance.
{"title":"Boosting 3-D Point Cloud Registration by Orthogonal Self-Ensemble Learning","authors":"Mingzhi Yuan;Ao Shen;Yingfan Ma;Jie Du;Qiao Huang;Manning Wang","doi":"10.1109/TAI.2025.3575036","DOIUrl":"https://doi.org/10.1109/TAI.2025.3575036","url":null,"abstract":"Deep learning has significantly advanced the development of point cloud registration. However, in recent years, some methods have relied on additional sensor information or complex network designs to improve registration performance, which incurs considerable computational overhead. These methods often struggle to strike a reasonable balance between computational cost and performance gains. To address this, we propose a plug-and-play orthogonal self-ensemble module designed to enhance registration performance with minimal additional overhead. Specifically, we design a novel ensemble learning strategy to mine the complementary information within the extracted features of previous methods. Unlike most ensemble learning methods, our method does not set multiple complex models for performance enhancement. Instead, it only cascades a lightweight dual-branch network after the features extracted by the original model to obtain two sets of features with more diversity. To further reduce redundancy between features and prevent the degradation of the dual-branch network, we introduce an orthogonal constraint that ensures the features output by the two branches are more complementary. Finally, by concatenating the two sets of complementary features, the final enhanced features are obtained. Compared to the original features, these enhanced features thoroughly exploit the internal information and exhibit greater distinctiveness, leading to improved registration performance. To validate the effectiveness of our method, we plug it into GeoTransformer, resulting in consistent performance improvements across 3DMatch, KITTI, and ModelNet40 datasets. 
Moreover, our method is compatible with other performance-enhancing methods. In conjunction with the overlap prior in PEAL, GeoTransformer achieves a new state-of-the-art performance.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"375-384"},"PeriodicalIF":0.0,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
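The orthogonal constraint described in the abstract can be illustrated with a minimal penalty term: the squared Frobenius norm of the cross-Gram matrix between the two branches' feature matrices, which vanishes exactly when every feature direction of one branch is orthogonal to every feature direction of the other. The paper's exact formulation may differ; this is the generic form of such a constraint.

```python
import numpy as np

def orthogonality_penalty(f1, f2):
    """Penalize correlation between the two branches' features: the
    squared Frobenius norm of F1^T F2 is zero exactly when the columns
    of F1 are orthogonal to the columns of F2."""
    gram = f1.T @ f2                 # (d1, d2) cross-correlation matrix
    return float(np.sum(gram ** 2))  # squared Frobenius norm

# Orthogonal branch outputs incur zero penalty; aligned ones do not.
f_a = np.array([[1.0], [0.0]])
f_b = np.array([[0.0], [1.0]])
```

Adding such a term to the training loss pushes the dual-branch outputs toward complementarity, so concatenating them yields more diverse features than either branch alone.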
Pub Date: 2025-03-30 | DOI: 10.1109/TAI.2025.3575038
Gokul Bhusal;Kevin Miller;Ekaterina Merkurjev
Active learning (AL) enhances the performance of machine learning (ML) methods, particularly in low-label-rate scenarios, by judiciously selecting a limited number of unlabeled data points for labeling, with the goal of improving the performance of an underlying classifier. In this work, we introduce the multiclass AL with auction dynamics on graphs (MALADY) algorithm, which leverages an auction dynamics technique on similarity graphs for efficient AL. In particular, the proposed algorithm wraps an AL loop around an efficient and effective semisupervised procedure: a similarity graph-based auction method consisting of upper- and lower-bound auctions that integrate class size constraints. In addition, we introduce a novel AL acquisition function that incorporates the dual variable of the auction algorithm to measure the uncertainty of the classifier and prioritize queries near the decision boundaries between classes. Overall, the proposed method can efficiently obtain accurate results using extremely small labeled sets containing just a few elements per class; this is crucial since labeled data are scarce in many applications. Moreover, the proposed technique can incorporate class size information, which improves accuracy even further. Last, using experiments on classification tasks and various datasets, we evaluate the performance of our proposed method and show that it exceeds that of comparison algorithms.
{"title":"MALADY: Multiclass Active Learning With Auction Dynamics on Graphs","authors":"Gokul Bhusal;Kevin Miller;Ekaterina Merkurjev","doi":"10.1109/TAI.2025.3575038","DOIUrl":"https://doi.org/10.1109/TAI.2025.3575038","url":null,"abstract":"Active learning (AL) enhances the performance of machine learning (ML) methods, particularly in low-label rate scenarios, by judiciously selecting a limited number of unlabeled data points for labeling, with the goal of improving the performance of an underlying classifier. In this work, we introduce the multiclass AL with auction dynamics on graphs (MALADY) algorithm, which leverages an auction dynamics technique on similarity graphs for efficient AL. In particular, the proposed algorithm incorporates an AL loop using as its underlying semisupervised procedure an efficient and effective similarity graph-based auction method consisting of upper and lower bound auctions that integrate class size constraints. In addition, we introduce a novel AL acquisition function that incorporates the dual variable of the auction algorithm to measure the uncertainty in the classifier to prioritize queries near the decision boundaries between different classes. Overall, the proposed method can efficiently obtain accurate results using extremely small labeled sets containing just a few elements per class; this is crucial since labeled data are scarce for many applications. Moreover, the proposed technique can incorporate class size information, which improves accuracy even further. 
Last, using experiments on classification tasks and various datasets, we evaluate the performance of our proposed method and show that it exceeds that of comparison algorithms.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"385-398"},"PeriodicalIF":0.0,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
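The acquisition function above derives uncertainty from the auction algorithm's dual variables; as a stand-in for readers unfamiliar with that machinery, the classic margin rule below captures the same intuition of querying points nearest a decision boundary. It is only an illustrative proxy, not the paper's acquisition function.

```python
import numpy as np

def margin_acquisition(scores):
    """Stand-in acquisition rule: the smaller the gap between the top
    two class scores, the closer the point sits to a decision boundary,
    so it is queried first. (MALADY instead measures uncertainty via
    the auction's dual variables.)"""
    srt = np.sort(scores, axis=1)
    margin = srt[:, -1] - srt[:, -2]   # top-1 minus top-2 score
    return np.argsort(margin)          # smallest margins queried first

scores = np.array([[0.90, 0.05, 0.05],   # confident prediction
                   [0.40, 0.35, 0.25]])  # near a class boundary
order = margin_acquisition(scores)
```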
Pub Date: 2025-03-29 | DOI: 10.1109/TAI.2025.3574655
Jiyao An;Zhaohui Pu;Qingqin Liu;Lei Zhang;Md Sohel Rana
This article investigates the traffic data cognitive modelling problem in real traffic scenes by fully utilizing the multiscale spatio-temporal dependence between multiple traffic nodes, along with a novel dynamic graph convolutional network (GCN). Recent deep learning models are weighed down by two practical problems: 1) existing graph convolution operations typically aggregate information only from the given k-hop neighbors; and 2) it is unclear how to model the similarity of traffic data patterns among nodes given the spatio-temporal heterogeneity of traffic data. In this article, we propose a novel hierarchical traffic data cognitive modelling framework called the multiscale spatio-temporal dynamic graph convolutional network (MSST-DGCN). A multiscale graph convolution module is first constructed to expand the receptive field of convolutional operations by developing a novel sub-GCN cumulative concatenation mechanism. Meanwhile, two specified dynamic graphs are designed to model the spatio-temporal correlation among nodes from both proximity and long-term perspectives through a novel Gaussian calculation strategy, which efficiently represents and cognizes the dynamic similarity of traffic data patterns. A series of qualitative evaluations shows that the model can perceive the traffic data pattern states of nodes. Finally, experiments on two real-world traffic datasets show that the proposed approach achieves state-of-the-art traffic data cognitive performance.
{"title":"A Novel Multiscale Dynamic Graph Convolutional Network for Traffic Data Cognition","authors":"Jiyao An;Zhaohui Pu;Qingqin Liu;Lei Zhang;Md Sohel Rana","doi":"10.1109/TAI.2025.3574655","DOIUrl":"https://doi.org/10.1109/TAI.2025.3574655","url":null,"abstract":"This article investigates traffic data cognitive modelling problem in real traffic scene by fully utilizing multiscale spatio-temporal dependence between multiple traffic nodes, along with a novel dynamic graph convolutional network (GCN). Most recently, the deep learning network model is weighed down by some practical problems focused on as follows: 1) The existing graph convolution operations typically aggregate information from the given k-hop neighbors; and 2) How to model the similarity of traffic data patterns among these nodes given the spatio-temporal heterogeneity of traffic data. In this article, we propose a novel hierarchical traffic data cognitive modelling framework called multiscale spatio-temporal dynamic graph convolutional network architecture (MSST-DGCN). And, a multiscale graph convolution module is first constructed to expand the receptive field of convolutional operations, by developing a novel sub-GCNs cumulative concatenation mechanism. Meanwhile, two specified dynamic graphs are designed to model the spatio-temporal correlation among these nodes from both a proximity and long-term perspective through a novel Gaussian calculation strategy, which are efficiently able to represent/cognize the dynamic similarity of traffic data patterns. Through a series of qualitative evaluations, the present model has the ability to perceive the traffic data pattern states of nodes. 
At last, two real world traffic datasets experiments are developed to show that the proposed approach achieves state-of-the-art traffic data cognitive performance.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"362-374"},"PeriodicalIF":0.0,"publicationDate":"2025-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
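The "Gaussian calculation strategy" for the dynamic graphs can be pictured with the generic Gaussian-kernel adjacency below, where the edge weight between two traffic nodes decays with the squared distance of their current feature vectors. The paper's exact formulation (and its proximity vs. long-term variants) may differ; this shows only the basic kernel.

```python
import numpy as np

def gaussian_adjacency(x, sigma=1.0):
    """Dynamic graph sketch: edge weight A[i, j] = exp(-||x_i - x_j||^2
    / (2 sigma^2)), so nodes with similar traffic features are strongly
    connected and dissimilar ones are nearly disconnected."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

x = np.array([[0.0], [0.1], [5.0]])   # node 2 behaves very differently
A = gaussian_adjacency(x)
```

Recomputing `A` from the current features at each step is what makes the graph "dynamic": similarity is tracked as traffic patterns evolve.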
Pub Date: 2025-03-28 | DOI: 10.1109/TAI.2025.3556092
Yang Wang;Xue Li;Siguang Chen
Existing federated unlearning methods for eliminating the negative impact of malicious clients on the global model either rely on unreasonable assumptions (e.g., an auxiliary dataset) or fail to balance model performance and efficiency. To overcome these shortcomings, we propose a malicious clients and contribution co-aware federated unlearning (MCC-Fed) method. Specifically, we introduce a method for detecting malicious clients to reduce their impact on the global model. Next, we design a contribution-aware metric, which accurately quantifies the negative impact of malicious clients on the global model by calculating their historical contribution ratio. Then, based on this metric, we propose a novel federated unlearning method in which benign clients use the contribution-aware metric as a regularization term to unlearn the influence of malicious clients, thereby restoring model performance. Experimental results demonstrate that our method effectively addresses the issue of excessive unlearning during the unlearning process, improves the efficiency of performance recovery, and enhances robustness against malicious clients. Federated unlearning effectively removes malicious clients' influence while reducing training costs compared to retraining.
{"title":"Malicious Clients and Contribution Co-Aware Federated Unlearning","authors":"Yang Wang;Xue Li;Siguang Chen","doi":"10.1109/TAI.2025.3556092","DOIUrl":"https://doi.org/10.1109/TAI.2025.3556092","url":null,"abstract":"Existing federated unlearning methods to eliminate the negative impact of malicious clients on the global model are influenced by unreasonable assumptions (e.g., an auxiliary dataset) or fail to balance model performance and efficiency. To overcome these shortcomings, we propose a malicious clients and contribution co-aware federated unlearning (MCC-Fed) method. Specifically, we introduce a method for detecting malicious clients to reduce their impact on the global model. Next, we design a contribution-aware metric, which accurately quantifies the negative impact of malicious clients on the global calculating their historical contribution ratio. Then, based on this metric, we propose a novel federated unlearning method in which benign clients use the contribution-aware metric as a regularization term to unlearn the influence of malicious clients, and restoring model performance. Experimental results demonstrate that our method effectively addresses the issue of excessive unlearning during the unlearning process, improves the efficiency of performance recovery, and enhances robustness against malicious clients. 
Federated unlearning effectively removes malicious clients’ influence while reducing training costs compared to retraining.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 10","pages":"2848-2857"},"PeriodicalIF":0.0,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145196041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
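One hypothetical way to read "historical contribution ratio" is each client's share of the accumulated update magnitude over training rounds, sketched below. The paper's metric is defined differently in detail; this toy version only illustrates how a per-client ratio can be accumulated from history and later plugged into a regularization term.

```python
import numpy as np

def contribution_ratios(update_norms_history):
    """Hypothetical contribution metric: each client's share of the
    total update magnitude accumulated over all rounds. Clients whose
    updates dominated training receive a larger ratio, which an
    unlearning regularizer could then weight more heavily."""
    totals = np.asarray(update_norms_history).sum(axis=0)  # per client
    return totals / totals.sum()

# rounds x clients matrix of update norms
history = [[1.0, 3.0],
           [1.0, 3.0]]
ratios = contribution_ratios(history)
```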
Pub Date: 2025-03-27 | DOI: 10.1109/TAI.2025.3574292
Hadi Al Khansa;Mariette Awad
The field of natural language generation (NLG) has undergone remarkable expansion, largely enabled by enhanced model architectures, affordable computing, and the availability of large datasets. With NLG systems finding increasing adoption across many applications, the imperative to evaluate their performance has grown accordingly. However, relying solely on human evaluation is not scalable. To address this challenge, it is important to explore more scalable evaluation methodologies that can ensure the continued development and efficacy of NLG systems. Presently, only a few automated evaluation metrics are commonly utilized, with BLEU and ROUGE being the predominant choices. Yet, these metrics have faced criticism for their limited correlation with human judgment, their focus on surface-level similarity, and their tendency to overlook semantic nuances. While transformer-based metrics have been introduced to capture semantic similarity, our study reveals scenarios where even these metrics fail. Considering these limitations, we propose and validate a novel metric called "COSMIC," which combines contradiction detection with contextual embedding similarity. To illustrate these limitations and showcase the performance of COSMIC, we conducted a case study using a fine-tuned LLAMA model to transform questions and short answers into declarative sentences. This task, despite its significance in generating natural language inference datasets, has not received widespread exploration since 2018.
{"title":"COSMIC: A Novel Contextualized Orientation Similarity Metric Incorporating Consistency for NLG Assessment","authors":"Hadi Al Khansa;Mariette Awad","doi":"10.1109/TAI.2025.3574292","DOIUrl":"https://doi.org/10.1109/TAI.2025.3574292","url":null,"abstract":"The field of natural language generation (NLG) has undergone remarkable expansion, largely enabled by enhanced model architectures, affordable computing, and the availability of large datasets. With NLG systems finding increasing adoption across many applications, the imperative to evaluate their performance has grown exponentially. However, relying solely on human evaluation for evaluation is nonscalable. To address this challenge, it is important to explore more scalable evaluation methodologies that can ensure the continued development and efficacy of NLG systems. Presently, only a few automated evaluation metrics are commonly utilized, with BLEU and ROUGE being the predominant choices. Yet, these metrics have faced criticism for their limited correlation with human judgment, their focus on surface-level similarity, and their tendency to overlook semantic nuances. While transformer metrics have been introduced to capture semantic similarity, our study reveals scenarios where even these metrics fail. Considering these limitations, we propose and validate a novel metric called “COSMIC,” which incorporates contradiction detection with contextual embedding similarity. To illustrate these limitations and showcase the performance of COSMIC, we conducted a case study using a fine-tuned LLAMA model to transform questions and short answers into declarative sentences. This task, despite its significance in generating natural language inference datasets, has not received widespread exploration since 2018. 
Results show that COSMIC can capture cases of contradiction between the reference and generated text while staying highly correlated with embeddings similarity when the reference and generated text are consistent and semantically similar. BLEU, ROUGE, and most transformer-based metrics demonstrate an inability to identify contradictions.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"332-346"},"PeriodicalIF":0.0,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
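A scoring rule in the spirit of COSMIC can be sketched as follows: start from the cosine similarity of the contextual embeddings and suppress it when a contradiction detector (e.g., an NLI model) flags the pair. The multiplicative combination and the `p_contradiction` input are assumptions for illustration, not the paper's actual formula.

```python
import numpy as np

def cosmic_like_score(ref_emb, gen_emb, p_contradiction):
    """Illustrative combination: embedding cosine similarity, damped by
    the probability that the generated text contradicts the reference.
    A contradicting pair can thus score low even when its embeddings
    are superficially similar."""
    cos = ref_emb @ gen_emb / (np.linalg.norm(ref_emb) * np.linalg.norm(gen_emb))
    return float(cos * (1.0 - p_contradiction))

ref = np.array([1.0, 0.0])
gen = np.array([1.0, 0.0])
```

This is exactly the failure mode the abstract highlights for BLEU, ROUGE, and plain embedding metrics: without the contradiction term, "the drug helps" and "the drug does not help" can score as near-identical.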
Pub Date: 2025-03-27 | DOI: 10.1109/TAI.2025.3574299
Chuan Xue;Jianli Gao;Zhou Gu
As machine learning technologies progress and are increasingly applied to critical and sensitive fields, the reliability issues of earlier technologies are becoming more evident. For the new generation of machine learning solutions, trustworthiness frequently takes precedence over performance when evaluating their applicability for specific applications. This manuscript introduces the IT2-ENFIS neuro-fuzzy model, a robust and trustworthy single-network solution specifically designed for data regression tasks affected by substantial label noise and outliers. The primary architecture applies interval type-2 fuzzy logic and the Sugeno inference engine. A meta-heuristic gradient-based optimizer (GBO), the Huber loss function, and the Cauchy M-estimator are employed for robust learning. IT2-ENFIS demonstrates superior performance on noise-contaminated datasets and excels in real-world scenarios, with excellent generalization capability and interpretability.
{"title":"IT2-ENFIS: Interval Type-2 Exclusionary Neuro-Fuzzy Inference System, an Attempt Toward Trustworthy Regression Learning","authors":"Chuan Xue;Jianli Gao;Zhou Gu","doi":"10.1109/TAI.2025.3574299","DOIUrl":"https://doi.org/10.1109/TAI.2025.3574299","url":null,"abstract":"As machine learning technologies progress and are increasingly applied to critical and sensitive fields, the reliability issues of earlier technologies are becoming more evident. For the new generation of machine learning solutions, trustworthiness frequently takes precedence over performance when evaluating their applicability for specific applications. This manuscript introduces the IT2-ENFIS neuro-fuzzy model, a robust and trustworthy single-network solution specifically designed for data regression tasks affected by substantial label noise and outliers. The primary architecture applies interval type-2 fuzzy logic and the Sugeno inference engine. A meta-heuristic gradient-based optimizer (GBO), the Huber loss function, and the Cauchy M-estimator are employed for robust learning. IT2-ENFIS demonstrates superior performance on noise-contaminated datasets and excels in real-world scenarios, with excellent generalization capability and interpretability.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"347-361"},"PeriodicalIF":0.0,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
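Of the robust-learning ingredients named in the abstract, the Huber loss is the most standard: it is quadratic for small residuals and linear for large ones, so outliers and noisy labels cannot dominate the gradient. The sketch below is the textbook definition, not the paper's full IT2-ENFIS training objective.

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Huber loss: 0.5 r^2 for |r| <= delta, delta (|r| - delta/2)
    otherwise. Large residuals grow linearly instead of quadratically,
    which is what makes fitting robust to label noise and outliers."""
    r = np.abs(residual)
    quad = 0.5 * r ** 2
    lin = delta * (r - 0.5 * delta)
    return np.where(r <= delta, quad, lin)

# A small residual is penalized quadratically, a large one linearly.
small = float(huber_loss(0.5))   # 0.5 * 0.25 = 0.125
large = float(huber_loss(3.0))   # 1.0 * (3.0 - 0.5) = 2.5
```

The Cauchy M-estimator mentioned alongside it plays the same role with an even heavier-tailed penalty.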
Pub Date: 2025-03-26 | DOI: 10.1109/TAI.2025.3573303
Shanika Iroshi Nanayakkara;Shiva Raj Pokhrel
Quantum machine learning models, such as quantum neural networks (QNNs) and quantum support vector classifiers (QSVCs), often struggle with overfitting, slow convergence, and suboptimal generalization across various datasets. This article explores the advantages of integrating deep unfolding techniques into quantum models and develops a framework comprising deep unfolded variational quantum classifiers (DVQC), deep unfolded quantum neural networks (DQNN), and deep unfolded QSVC (DQSVC). Our novel unfolding transforms quantum circuit training into a sequence of learnable layers, with each layer representing an optimization step that concurrently updates both the circuit parameters and the QNN hyperparameters. The proposed framework significantly improves training and test accuracy by dynamically adjusting the learning rate, perturbations, and similar hyperparameters, particularly on complex datasets such as the genomic and breast cancer datasets. Our evaluations and experiments show that the proposed DVQC and DQNN outperform the baseline VQC and QNN, achieving 90% training accuracy and up to 20% higher test accuracy on the genomic and ad hoc datasets. DQSVC achieves 100% accuracy on the ad hoc dataset and 97% on the genomic dataset, surpassing the 90% test accuracy of the traditional QSVC.
{"title":"Modeling Deep Unfolded Quantum Machine Learning Framework","authors":"Shanika Iroshi Nanayakkara;Shiva Raj Pokhrel","doi":"10.1109/TAI.2025.3573303","DOIUrl":"https://doi.org/10.1109/TAI.2025.3573303","url":null,"abstract":"Quantum machine learning models, like quantum neural networks (QNN) and quantum support vector classifiers (QSVC), often struggle with overfitting, slow convergence, and suboptimal generalization across various datasets. This article explores the advantages of integrating deep unfolding techniques into quantum models and develops a framework focusing on deep unfolded variational quantum classifiers (DVQC), deep unfolded quantum neural networks (DQNN), and deep unfolded QSVC (DQSVC). Our novel unfolding transforms quantum circuit training into a sequence of learnable layers, with each layer representing an optimization step that concurrently renews both circuit parameters and QNN hyperparameters. The proposed framework significantly improves training and test accuracy by dynamically adjusting learning rate, perturbations, and other similar hyperparameters, particularly on complex datasets like genomic and breast cancer. Our evaluation and experiment show that proposed DVQC and DQNN outperform baseline VQC and QNN, achieving 90% training accuracy and up to 20% higher test accuracy on genomic and adhoc datasets. DQSVC achieves 100% accuracy on adhoc and 97% on genomic datasets, surpassing the 90% test accuracy of traditional QSVC. 
Our implementation details will be publicly available.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"321-331"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
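The core idea of deep unfolding, independent of the quantum setting, is to lay out a fixed number of optimization steps as "layers", each carrying its own learnable hyperparameter. The classical sketch below uses a per-layer step size as that hyperparameter; in the paper these per-layer quantities are trained jointly with the circuit parameters, which this toy does not attempt.

```python
def unfolded_gradient_descent(grad_fn, x0, step_sizes):
    """Deep unfolding sketch: each entry of `step_sizes` is one 'layer'
    of the unfolded optimizer, i.e. one gradient step with its own
    (in principle learnable) step-size hyperparameter."""
    x = x0
    for eta in step_sizes:        # one layer per optimization step
        x = x - eta * grad_fn(x)
    return x

# Minimize f(x) = x^2; the three step sizes stand in for learned
# per-layer hyperparameters.
grad = lambda x: 2.0 * x
x_final = unfolded_gradient_descent(grad, 4.0, [0.4, 0.3, 0.2])
```

Because the number of layers is fixed, the whole optimizer becomes a differentiable network whose hyperparameters can themselves be trained end to end.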
Pub Date: 2025-03-26 | DOI: 10.1109/TAI.2025.3572849
Shivam Mishra;Amit Vishwakarma;Anil Kumar
Automated nuclei segmentation is an important technique for understanding and analyzing cellular characteristics; it eases computer-aided digital pathology and is useful for disease diagnosis. However, the task is difficult because of the diversity in nuclei size, blurry boundaries, and the variety of imaging modalities. A convolutional neural network (CNN)-based multiheaded U-Net (M-UNet) framework is proposed to address these issues. The architecture uses filters of different kernel sizes in its multiple heads to extract multiresolution features of an image. A shearlet-based unsharp masking (SBUM) method is proposed for preprocessing, which primarily emphasizes features such as contours, boundaries, and minute details of the source image. In this article, a hybrid loss function is formulated that combines intersection over union (IoU) loss and Dice loss with binary cross entropy loss; the optimization algorithm minimizes this hybrid loss, and higher metric values during the testing phase indicate better segmentation performance in the spatial domain. The proposed method yields superior segmentation images and quantitative results compared to state-of-the-art nuclei segmentation techniques.
{"title":"Nuclei Segmentation Using Multiheaded U-Net and Shearlet-Based Unsharp Masking","authors":"Shivam Mishra;Amit Vishwakarma;Anil Kumar","doi":"10.1109/TAI.2025.3572849","DOIUrl":"https://doi.org/10.1109/TAI.2025.3572849","url":null,"abstract":"An automated nuclei segmentation is an important technique for understanding and analyzing cellular characteristics that ease computer-aided digital pathology and are useful for disease diagnosis. However, this task is difficult because of the diversity in nuclei size, blurry boundaries, and several imaging modalities. A convolutional neural network (CNN)-based multiheaded U-Net (M-UNet) framework has been proposed to address such issues. This architecture uses filters of different kernel sizes for multiple heads to extract multiresolution features of an image. Shearlet-based unsharp masking (SBUM) method is proposed for preprocessing, which primarily emphasizes features like contours, boundaries, and minute details of the source image. In this article, a hybrid loss function is formulated, which includes intersection over union (IOU) loss and Dice loss along with binary cross entropy loss. The hybrid loss function is tried to be minimized by the optimization algorithm, and the higher metrics values during the testing phase represent better segmentation performance in the spatial domain. The proposed method yields superior segmentation images and quantitative findings as compared to the state-of-the-art nuclei segmentation techniques. 
The proposed technique attains IOU, F1Score, accuracy, and precision values of 0.8325, 0.9086, 0.9651, and 0.9001, respectively.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 1","pages":"297-307"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
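The hybrid loss named in the abstract (BCE + Dice + IoU) can be sketched directly from the standard definitions of its three terms. The equal weights `w` and the smoothing constant `eps` are assumptions; the paper may combine the terms differently.

```python
import numpy as np

def hybrid_loss(pred, target, eps=1e-7, w=(1.0, 1.0, 1.0)):
    """Sketch of a BCE + Dice + IoU segmentation loss on soft masks.
    `pred` holds probabilities in [0, 1]; `target` holds {0, 1} labels."""
    p, t = pred.ravel(), target.ravel()
    # Binary cross entropy term.
    bce = -np.mean(t * np.log(p + eps) + (1 - t) * np.log(1 - p + eps))
    # Dice term: 1 - (2|P∩T|) / (|P| + |T|).
    inter = np.sum(p * t)
    dice = 1.0 - (2 * inter + eps) / (np.sum(p) + np.sum(t) + eps)
    # IoU term: 1 - |P∩T| / |P∪T|.
    union = np.sum(p) + np.sum(t) - inter
    iou = 1.0 - (inter + eps) / (union + eps)
    return w[0] * bce + w[1] * dice + w[2] * iou

pred = np.array([0.9, 0.1, 0.8, 0.2])
target = np.array([1.0, 0.0, 1.0, 0.0])
```

Combining an overlap-based term (Dice/IoU) with a pixelwise term (BCE) is a common way to handle the class imbalance between small nuclei and large backgrounds.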