Pub Date: 2025-03-17 | DOI: 10.1109/TAI.2025.3550913
Aranyak Maity;Ayan Banerjee;Sandeep K. S. Gupta
Errors in artificial intelligence (AI)-enabled autonomous systems (AASs) whose cause and effect are both unknown to the human operator at the time they occur are referred to as "unknown-unknown" errors. This article introduces a methodology for preemptively identifying unknown-unknown errors in AASs that arise from unpredictable human interactions and complex real-world usage scenarios, which can lead to critical safety incidents through unsafe shifts in operational data distributions. We posit that AASs functioning in human-in-the-loop and human-in-the-plant modes must adhere to established physical laws, even when unknown-unknown errors occur. Our approach constructs physics-guided models from operational data and couples them with conformal inference to assess structural breaks in the underlying model caused by violations of physical laws, thereby enabling early detection of such errors before unsafe shifts in the operational data distribution occur. Validation across diverse contexts, including zero-day vulnerabilities in autonomous vehicles, hardware failures in artificial pancreas systems, and design deficiencies in aircraft maneuvering characteristics augmentation systems (MCAS), demonstrates our framework's efficacy in preempting unsafe data distribution shifts due to unknown-unknowns. This methodology not only advances unknown-unknown error detection in AASs but also sets a new benchmark for integrating physics-guided models and machine learning to ensure system safety.
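The abstract's core mechanism, a conformal test for structural breaks against a physics-guided model, can be sketched minimally. This is an illustration, not the authors' implementation: the "physics model" here is a hypothetical linear law y = a·x fit by least squares, and the nonconformity score is the residual magnitude; the paper's models and scores are richer.

```python
import numpy as np

def fit_physics_model(x, y):
    """Least-squares slope for the assumed physical law y = a * x."""
    return float(np.dot(x, y) / np.dot(x, x))

def conformal_p_value(calib_scores, new_score):
    """Split-conformal p-value: fraction of calibration nonconformity
    scores at least as large as the new score, with the +1 correction."""
    n = len(calib_scores)
    return (np.sum(calib_scores >= new_score) + 1) / (n + 1)

def detect_break(x_cal, y_cal, x_new, y_new, alpha=0.05):
    """Flag a suspected structural break (law violation) at level alpha."""
    a = fit_physics_model(x_cal, y_cal)
    calib_scores = np.abs(y_cal - a * x_cal)   # residuals under the law
    new_score = abs(y_new - a * x_new)
    p = conformal_p_value(calib_scores, new_score)
    return p < alpha, p
```

A new observation that obeys the fitted law yields a large p-value, while one that violates it yields a p-value near 1/(n+1) and is flagged before the operational distribution itself has visibly shifted.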
Title: Detection of Unknown-Unknowns in Human-in-Loop Human-in-Plant Safety Critical Systems
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2526-2541
Pub Date: 2025-03-15 | DOI: 10.1109/TAI.2025.3567434
Md. Ashikur Rahman;Md. Mamun Ali;Kawsar Ahmed;Imran Mahmud;Francis M. Bui;Li Chen;Mohammad Ali Moni
The blood–brain barrier (BBB) acts as a vital barrier between the bloodstream and the central nervous system (CNS), preventing many chemicals from entering the brain. This barrier significantly hinders the treatment of neurological and CNS disorders. Blood–brain barrier penetrating peptides (3BPPs) can cross this barrier, easing entry into the brain, and are therefore essential for treating CNS and neurological diseases and disorders. Computational techniques are being explored because traditional laboratory tests for 3BPP identification are costly and time-consuming. In this work, we introduce a novel technique for 3BPP prediction with a hybrid deep learning model. Our proposed model, Deep3BPP, leverages LSA, a word embedding method, for peptide sequence feature extraction and integrates a CNN with an LSTM (CNN-LSTM) for the final prediction model. Deep3BPP achieves a remarkable accuracy of 97.42%, a Kappa value of 0.9257, and an MCC of 0.9362. These findings indicate a more efficient and cost-effective method of identifying 3BPPs, with important implications for researchers in the pharmaceutical and medical industries. Thus, this work offers insights that can advance both scientific research and overall human well-being.
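One common reading of "LSA over peptide sequences" is a singular value decomposition of a k-mer count matrix; the sketch below illustrates that reading only. The helper names (`kmer_counts`, `lsa_embed`) and the choice k = 2 are assumptions, not the Deep3BPP pipeline.

```python
import numpy as np
from itertools import product

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def kmer_counts(seqs, k=2):
    """Rows: sequences; columns: counts of each length-k amino acid k-mer."""
    vocab = ["".join(p) for p in product(AMINO, repeat=k)]
    index = {km: i for i, km in enumerate(vocab)}
    mat = np.zeros((len(seqs), len(vocab)))
    for r, s in enumerate(seqs):
        for i in range(len(s) - k + 1):
            km = s[i:i + k]
            if km in index:
                mat[r, index[km]] += 1
    return mat

def lsa_embed(count_matrix, dim=8):
    """LSA-style embedding: keep the top-`dim` left singular directions."""
    u, s, _ = np.linalg.svd(count_matrix, full_matrices=False)
    d = min(dim, len(s))
    return u[:, :d] * s[:d]
```

Under this construction, sequences with similar k-mer composition land close together in the low-dimensional LSA space, which is what makes such features usable as input to a downstream CNN-LSTM classifier.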
Title: Deep3BPP: Identification of Blood–Brain Barrier Penetrating Peptides Using Word Embedding Feature Extraction Method and CNN-LSTM
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 562-570
Pub Date: 2025-03-15 | DOI: 10.1109/TAI.2025.3570282
H. M. Dipu Kabir
Multitask learning is a popular approach to training high-performing neural networks with improved generalization. In this article, we propose a background class that achieves improved generalization at lower computational cost than multitask learning, helping researchers and organizations with limited computation power. We also present a methodology for selecting background images and discuss potential future improvements. We apply our approach to several datasets and achieve improved generalization with much lower computation. Through the class activation mappings (CAMs) of the trained models, we observe that models trained with the proposed methodology tend to look at a bigger picture. Applying a vision transformer with the proposed background class, we achieve state-of-the-art (SOTA) performance on the CIFAR-10C, Caltech-101, and CINIC-10 datasets.
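The background-class idea described above can be sketched in a few lines: out-of-task images are appended to the training set under one extra label, and that extra logit is dropped at evaluation time. This is a minimal illustration under assumed names, not the paper's exact recipe.

```python
import numpy as np

def add_background_class(train_x, train_y, bg_x, n_classes):
    """Append background samples labeled with the new class index n_classes."""
    bg_y = np.full(len(bg_x), n_classes)
    x = np.concatenate([train_x, bg_x])
    y = np.concatenate([train_y, bg_y])
    return x, y, n_classes + 1   # the model is trained with one extra class

def predict_foreground(logits, n_classes):
    """At test time, ignore the background logit and predict among the
    original classes only."""
    return np.argmax(logits[:, :n_classes], axis=1)
```

The design point is that the extra class absorbs "none of the above" evidence during training while leaving the evaluation protocol on the original label set unchanged.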
Title: Reduction of Class Activation Uncertainty With Background Information
IEEE Transactions on Artificial Intelligence, vol. 7, no. 1, pp. 571-585
Pub Date: 2025-03-13 | DOI: 10.1109/TAI.2025.3551669
Anindita Mohanta;Sourav Dey Roy;Niharika Nath;Abhijit Datta;Mrinal Kanti Bhowmik
Cancer is one of the most severe diseases, affecting the lives of many people in the modern world. Among the various types of cancer, cervical cancer is one of the most frequently occurring cancers in the female population. In most cases, doctors and practitioners can typically identify cervical cancer only in its later stages. As the disease progresses, planning cancer therapy and improving patient survival rates become very difficult. As a result, diagnosing cervical cancer in its initial stages is imperative for arranging proper therapy and surgery. In this article, we present a survey of automatic computerized methods for diagnosing cervical abnormalities based on microscopic imaging modalities. The survey was conducted by defining a novel taxonomy of the surveyed techniques based on the approaches they use. We also discuss the challenges and subchallenges associated with automatic cervical cancer diagnosis based on microscopic imaging modalities. Additionally, we survey the various public and private datasets used by the research community for developing new methods, and we compare the performance of published papers. The article concludes by suggesting possible research directions in these fields.
Title: A Comprehensive Survey on Diagnostic Microscopic Imaging Modalities, Challenges, Taxonomy, and Future Directions for Cervical Abnormality Detection and Grading
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2354-2383
Pub Date: 2025-03-10 | DOI: 10.1109/TAI.2025.3549740
Jaime S. Cardoso;Ricardo P. M. Cruz;Tomé Albuquerque
In many real-world prediction tasks, the class labels contain information about the relative order between labels that is not captured by commonly used loss functions such as multicategory cross-entropy. In ordinal regression, many works have incorporated ordinality into models and loss functions by promoting unimodality of the probability output. However, current approaches are based on heuristics, particularly nonparametric ones, which are still insufficiently explored in the literature. We analyze the set of unimodal distributions in the probability simplex, establishing fundamental properties and giving new perspectives on the ordinal regression problem. Two contributions are then proposed to incorporate the preference for unimodal distributions into the predictive model: 1) UnimodalNet, a new architecture that by construction ensures the output is a unimodal distribution, and 2) Wasserstein regularization, a new loss term that relies on the notion of projection onto a set to promote unimodality. Experiments show that the new architecture achieves top performance, while the proposed loss term is very competitive and maintains high unimodality.
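To make "unimodal by construction" concrete: a classical way (predating this paper, and not its UnimodalNet architecture) to guarantee a unimodal output over K ordinal classes is a binomial head, p(k) = C(K-1, k) q^k (1-q)^(K-1-k), driven by a single sigmoid output q. The sketch below pairs that with a simple unimodality check.

```python
import numpy as np
from math import comb

def binomial_head(q, n_classes):
    """Unimodal-by-construction distribution over {0, ..., n_classes-1},
    parameterized by a single probability q in (0, 1)."""
    k = np.arange(n_classes)
    pk = np.array([comb(n_classes - 1, i) for i in k], dtype=float)
    pk *= q ** k * (1 - q) ** (n_classes - 1 - k)
    return pk / pk.sum()

def is_unimodal(p):
    """True if p rises to a single peak and then falls (ties allowed)."""
    seen_drop = False
    for step in np.diff(p):
        if step < 0:
            seen_drop = True
        elif step > 0 and seen_drop:
            return False
    return True
```

A binomial distribution over the classes is always unimodal, so the constraint holds by construction; the trade-off is that one scalar q controls both location and spread, which is exactly the kind of rigidity richer architectures aim to relax.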
Title: Unimodal Distributions for Ordinal Regression
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2498-2509
Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10918699
Pub Date: 2025-03-08 | DOI: 10.1109/TAI.2025.3566926
Pavana Pradeep Kumar;Krishna Kant;Francesco Di Rienzo;Carlo Vallati
Correct pose/posture is crucial in most human activities, and increasingly so when using computer screens of many form factors. In this article, we build a spatiotemporal reasoning infrastructure on top of standard computer vision (CV) algorithms to provide an alternative method for tracking correct posture that is both much more accurate and faster than pure deep learning (DL) methods. We use CV to determine 2-D human stick-model poses from RGB images, which are further enhanced using depth information (from an RGB-D camera) to determine relevant angles and compare them against the standards. By applying our method to two very different posture applications (knowledge worker and taekwondo), we show that it outperforms all others, including machine learning, deep learning, and time-series-based prediction. The superior performance appears not only in estimation accuracy but also in estimation speed.
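The angle-versus-standard comparison the abstract describes reduces to elementary geometry on stick-model keypoints. A minimal sketch, with made-up threshold values and keypoint names (the paper's standards and joints will differ):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at vertex b, formed by keypoints a-b-c
    (e.g., shoulder-elbow-wrist from a stick model)."""
    v1 = np.asarray(a, float) - np.asarray(b, float)
    v2 = np.asarray(c, float) - np.asarray(b, float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def posture_ok(angle_deg, lo=80.0, hi=100.0):
    """Compare a joint angle against a hypothetical standard range."""
    return lo <= angle_deg <= hi
```

The same computation works unchanged on 3-D keypoints once the RGB-D depth channel lifts the 2-D stick model into space, which is where the depth enhancement mentioned above comes in.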
Title: Video-Based Human-Posture Monitoring From RGB-D Cameras
IEEE Transactions on Artificial Intelligence, vol. 6, no. 12, pp. 3406-3416
Pub Date: 2025-03-07 | DOI: 10.1109/TAI.2025.3549396
Akshay Jain;Shiv Ram Dubey;Satish Kumar Singh;KC Santosh;Bidyut Baran Chaudhuri
Convolutional neural networks (CNNs) have made remarkable strides; however, they remain susceptible to vulnerabilities, particularly to image perturbations that humans can easily recognize. This weakness, often termed an "attack," underscores the limited robustness of CNNs and the need for research into fortifying their resistance against such manipulations. This study introduces a novel nonuniform illumination (NUI) attack technique in which images are subtly altered using varying NUI masks. Extensive experiments are conducted on widely used datasets, including CIFAR10, TinyImageNet, CalTech256, and NWPU-RESISC45, focusing on image classification with 12 different NUI masks. The resilience of VGG, ResNet, MobilenetV3-small, InceptionV3, and EfficientNet_b0 models against NUI attacks is evaluated. Our results show a substantial decline in the CNN models' classification accuracy when subjected to NUI attacks, due to changes in the image pixel value distribution, indicating their vulnerability under NUI. To mitigate this, a defense strategy is proposed that incorporates NUI-attacked images, generated through the new NUI transformation, into the training set. The results demonstrate a significant enhancement in CNN model performance when confronted with perturbed images affected by NUI attacks. This strategy seeks to bolster CNN models' resilience against NUI attacks. A comparative study with other attack techniques shows the effectiveness of the NUI attack and defense technique.1
1. The code is available at https://github.com/Akshayjain97/Non-Uniform_Illumination
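For intuition, one plausible NUI mask is a horizontal brightness gradient multiplied into the image. This is illustrative only; the paper's 12 masks are defined in the linked repository and will differ in form.

```python
import numpy as np

def linear_illumination_mask(h, w, low=0.5, high=1.5):
    """Brightness scale growing linearly from `low` (left) to `high` (right)."""
    ramp = np.linspace(low, high, w)
    return np.tile(ramp, (h, 1))

def apply_nui(image, mask):
    """Apply a nonuniform illumination mask to an HxW(xC) image in [0, 255].
    The perturbation is plainly visible to humans, yet shifts the pixel
    value distribution the classifier was trained on."""
    if image.ndim == 3:
        mask = mask[:, :, None]   # broadcast one mask over all channels
    return np.clip(image * mask, 0, 255).astype(image.dtype)
```

The defense described above amounts to running such a transformation over the training set and retraining on the union of clean and masked images.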
Title: Non-uniform Illumination Attack for Fooling Convolutional Neural Networks
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2476-2485
Pub Date: 2025-03-07 | DOI: 10.1109/TAI.2025.3549406
In this work, a novel neural network architecture called MalaNet is proposed for the detection and diagnosis of malaria, an infectious disease that poses a major global health challenge. The proposed architecture is inspired by small-world network principles, which generally involve the introduction of new links. A small-world neural network is realized by establishing new connections, thereby reducing the average path length and increasing the clustering coefficient. These characteristics are known to enhance interconnectivity and improve feature propagation within the network. In the context of malaria diagnosis, they can enhance detection accuracy and enable better generalization in scenarios with limited data availability. Broadly, two variants of MalaNet are proposed in this work. First, a small-world-inspired feed-forward neural network (FNN) is developed for symptom and categorical feature-based diagnosis, providing an accessible solution when blood smear images are unavailable. Second, a small-world-inspired convolutional neural network (CNN) is developed for precise and automated diagnosis when blood smear images are available. Both variants of MalaNet are rigorously validated using the National Institutes of Health Malaria dataset, a clinical dataset from Federal Polytechnic Ilaro Medical Centre, Nigeria, and the APTOS dataset. Comparative results against several state-of-the-art neural network models in the literature demonstrate MalaNet's superior performance, generalization capability, and computational efficiency. The small-world neural network architecture proposed in this work enhances feature learning, diagnostic accuracy, and adaptability in limited-data and resource-constrained settings, motivating its application in disease diagnosis where timely and accurate results are critical.
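The small-world property the abstract invokes, that a few shortcut links sharply reduce average path length, can be demonstrated on a plain graph. This sketch illustrates the graph-theoretic principle only, not MalaNet's actual layer wiring.

```python
import random
from collections import deque

def ring_lattice(n, k=2):
    """Ring of n nodes, each linked to its k nearest neighbors per side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k + 1):
            adj[i].add((i + d) % n)
            adj[(i + d) % n].add(i)
    return adj

def avg_path_length(adj):
    """Mean shortest-path length over all ordered node pairs (via BFS)."""
    n = len(adj)
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += n - 1
    return total / pairs

def add_shortcuts(adj, m, seed=0):
    """Add m random long-range links (the small-world 'rewiring' step)."""
    rng = random.Random(seed)
    nodes = list(adj)
    added = 0
    while added < m:
        a, b = rng.sample(nodes, 2)
        if b not in adj[a]:
            adj[a].add(b); adj[b].add(a); added += 1
    return adj
```

In a neural network the analogue is skip-style connections between otherwise distant layers, which is how shortened paths translate into improved feature propagation.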
Title: MalaNet: A Small World Inspired Neural Network for Automated Malaria Diagnosis
Authors: Shubham Dwivedi;Kartikeya Pandey;Kumar Shubham;Om Jee Pandey;Achyut Mani Tripathi;Tushar Sandhan;Rajesh M. Hegde
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2486-2497
Pub Date: 2025-03-07 | DOI: 10.1109/TAI.2025.3567431
Jinchao Han;Yan Yan;Baoxian Zhang
With the increasing utilization of unmanned aerial vehicles (UAVs) in military operations, multi-UAV air combat is emerging as one of the most important modes of future warfare. Achieving intelligent cooperative maneuver policies under the limited information sharing imposed by communication constraints among UAVs is crucial for winning air combat. In this article, we formulate the communication-constrained multi-UAV air combat problem as a Markov game and propose a novel sparse inferred intention sharing multiagent reinforcement learning (SIIS-MARL) algorithm to improve the winning rate of multi-UAV air combat. Our algorithm contains the following designs: an intention inference module that enables each UAV to infer teammates' intentions through a theory of mind (ToM) network for improved cooperation, and an attention-based sparse transmission mechanism that uses the inferred intentions and encoded embeddings to learn communication weights for teammates, enabling efficient sparsity in communication without a performance penalty. Simulation results validate the effectiveness of the proposed algorithm compared with existing work.
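A loose sketch of attention-gated sparse communication follows, with all shapes and the top-k gating rule assumed; SIIS-MARL's actual mechanism, with ToM-inferred intentions feeding the weights, is richer than this.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())   # shift for numerical stability
    return e / e.sum()

def sparse_attention_mix(query, keys, messages, keep=2):
    """Attend over teammates, keep only the `keep` strongest links,
    renormalize the surviving weights, and mix their messages."""
    scores = keys @ query                 # one attention score per teammate
    w = softmax(scores)
    cutoff = np.sort(w)[-keep]
    w = np.where(w >= cutoff, w, 0.0)     # sparsify: drop weak channels
    w = w / w.sum()
    return w, w @ messages
```

The zeroed weights are the communication savings: a UAV only transmits to (or receives from) teammates whose links survive the cut, which is how sparsity is obtained without discarding the most informative messages.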
Title: Towards Efficient Multi-UAV Air Combat: An Intention Inference and Sparse Transmission Based Multiagent Reinforcement Learning Algorithm
IEEE Transactions on Artificial Intelligence, vol. 6, no. 12, pp. 3441-3452