Accurately predicting stock prices remains a major challenge in financial analytics due to the complexity and noise inherent in market data. Feature selection plays a critical role in improving both computational efficiency and predictive performance. In this study, we introduce a novel hybrid framework that integrates metaheuristic feature-selection algorithms with an enhanced transformer-based prediction model fine-tuned using temporal embedding and adaptive attention pruning. We evaluate and compare the effectiveness of three nature-inspired metaheuristic algorithms, namely the bat algorithm (BAT), gray wolf optimization (GWO), and beluga whale optimization (BWO), for selecting the most informative features from a time-series stock dataset. After feature selection, the optimal subsets are fed into our modified transformer equipped with temporal embeddings and adaptive attention pruning. Extensive experiments conducted on the Bharat Heavy Electricals Limited (BHEL) dataset show that the proposed hybrid framework outperforms traditional methods in terms of predictive accuracy. Among the evaluated approaches, the combination of BWO and the fine-tuned transformer achieves the best performance, yielding a test RMSE of 0.0030 and a test MAPE of 0.0108, demonstrating the superiority of BWO in identifying relevant features. This work provides a comprehensive comparative analysis of hybrid metaheuristic–deep learning models for stock price prediction and offers a foundation for integrating more explainable and scalable AI techniques into financial forecasting.
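For illustration only, the sketch below shows the general wrapper-style feature selection that metaheuristics such as BWO perform: a population of binary feature masks is scored by the validation error of a predictor and iteratively improved. A ridge regressor stands in for the paper's fine-tuned transformer, the update rule is a generic move-toward-best step rather than the actual BWO/GWO/BAT dynamics, and the synthetic data replaces the BHEL dataset.

```python
# Minimal sketch of wrapper feature selection with a population of binary masks.
# Assumptions: ridge regression as a surrogate fitness model, a generic update rule.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def fitness(mask, X_tr, y_tr, X_va, y_va):
    """Validation RMSE of a surrogate model on the selected feature subset."""
    if mask.sum() == 0:                      # penalise empty subsets
        return np.inf
    model = Ridge().fit(X_tr[:, mask], y_tr)
    pred = model.predict(X_va[:, mask])
    return np.sqrt(mean_squared_error(y_va, pred))

def select_features(X, y, pop_size=20, iters=50, flip_rate=0.1):
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
    n_feat = X.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5          # random binary masks
    scores = np.array([fitness(m, X_tr, y_tr, X_va, y_va) for m in pop])
    best = pop[scores.argmin()].copy()
    for _ in range(iters):
        # Move candidates toward the current best and apply random bit flips;
        # a real BWO/GWO/BAT implementation replaces this update rule.
        new_pop = np.where(rng.random(pop.shape) < 0.5, best, pop)
        new_pop ^= rng.random(pop.shape) < flip_rate
        new_scores = np.array([fitness(m, X_tr, y_tr, X_va, y_va) for m in new_pop])
        improved = new_scores < scores
        pop[improved], scores[improved] = new_pop[improved], new_scores[improved]
        best = pop[scores.argmin()].copy()
    return best, scores.min()

# Toy usage with synthetic data (the paper uses the BHEL stock dataset).
X = rng.normal(size=(500, 15))
y = X[:, 0] * 0.8 + X[:, 3] * 0.5 + rng.normal(scale=0.1, size=500)
mask, rmse = select_features(X, y)
print("selected features:", np.flatnonzero(mask), "val RMSE:", round(rmse, 4))
```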
{"title":"A Hybrid Framework for Stock Price Forecasting Using Metaheuristic Feature Selection Approaches and Transformer Models Enhanced by Temporal Embedding and Attention Pruning","authors":"Amirhossein Malakouti Semnani, Sohrab Kordrostami, Amirhossein Refahi Sheikhani, Mohammad Hossein Moattar","doi":"10.1002/ail2.70018","DOIUrl":"https://doi.org/10.1002/ail2.70018","url":null,"abstract":"<p>Accurately predicting stock prices remains a major challenge in financial analytics due to the complexity and noise inherent in market data. Feature selection plays a critical role in improving both computational efficiency and predictive performance. In this study, we introduce a novel hybrid framework that integrates metaheuristic feature-selection algorithms with an enhanced transformer-based prediction model fine-tuned using temporal embedding and adaptive attention pruning. We evaluate and compare the effectiveness of three nature-inspired metaheuristic algorithms: bat algorithm (BAT), gray wolf optimization (GWO), and beluga whale optimization (BWO) for selecting the most informative features from a time-series stock dataset. After feature selection, the optimal subsets are fed into our modified transformer equipped with temporal embeddings and adaptive attention pruning. Extensive experiments conducted on the Bharat Heavy Electricals Limited (BHEL) dataset show that the proposed hybrid framework outperforms traditional methods in terms of predictive accuracy. Among the evaluated approaches, the combination of BWO and the fine-tuned transformer achieves the best performance, yielding a Test RMSE of 0.0030 and a Test MAPE of 0.0108, demonstrating the superiority of BWO in identifying relevant features. This work provides a comprehensive comparative analysis of hybrid metaheuristic–deep learning models for stock price prediction and offers a foundation for integrating more explainable and scalable AI techniques into financial forecasting.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70018","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145986811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Artificial Intelligence (AI) is playing an increasingly vital role in the Industrial Internet of Things (IIoT), enabling predictive analytics, real-time monitoring, and autonomous operations across industries such as manufacturing, logistics, and energy. However, widespread adoption is hindered by technological, organizational, and infrastructural challenges. This paper examines the adoption, application, and challenges of AI in IIoT environments, focusing on implementation domains, adoption drivers, enabling technologies, and key barriers. We conducted a Systematic Literature Review (SLR) following the PRISMA methodology. Peer-reviewed English-language journal articles published between 2018 and 2025 were sourced from ScienceDirect, Web of Science (WoS), Scopus, IEEE Xplore, Springer, Google Scholar, Elsevier, and Taylor & Francis. After applying inclusion criteria and screening procedures, 46 relevant journal articles were included for analysis. Key AI applications identified include predictive maintenance, anomaly detection, real-time monitoring, autonomous process control, and smart supply chains. Adoption is facilitated by external enablers such as 5G infrastructure and regulatory support, and by internal factors such as organizational readiness and workforce skills. Challenges include data quality issues, cybersecurity risks, legacy system integration, and limited model scalability. Technologies such as edge computing, cloud platforms, and federated learning are instrumental in mitigating these challenges. While adoption is growing, significant barriers remain. AI has the potential to drive operational efficiency and innovation in IIoT, provided these constraints are addressed. This paper offers a comprehensive taxonomy of AI applications and proposes a framework of adoption factors, offering valuable insights for researchers, practitioners, and policymakers involved in AI-driven industrial transformation.
{"title":"Unlocking IIoT Potential: A Systematic Review of AI Applications, Adoption Drivers, and Implementation Barriers","authors":"Tinashe Magara, Mampilo Phahlane","doi":"10.1002/ail2.70017","DOIUrl":"https://doi.org/10.1002/ail2.70017","url":null,"abstract":"<p>Artificial Intelligence (AI) is playing an increasingly vital role in the Industrial Internet of Things (IIoT), enabling predictive analytics, real-time monitoring, and autonomous operations across industries such as manufacturing, logistics, and energy. However, widespread adoption is hindered by technological, organizational, and infrastructural challenges. This paper examines the adoption, application, and challenges of AI–IIoT environments, focusing on implementation domains, adoption drivers, enabling technologies, and key barriers. We conducted a Systematic Literature Review (SLR using PRISMA). Peer-reviewed English-language journal articles published between 2018 and 2025 were sourced from ScienceDirect, Web of Science (WoS), Scopus, IEEE Xplore, Springer, Google Scholar, Elsevier, and Taylor & Francis. After applying inclusion criteria and screening procedures, 46 relevant journal articles were included for analysis. Key AI applications identified include predictive maintenance, anomaly detection, real-time monitoring, autonomous process control, and smart supply chains. Adoption is facilitated by external enablers 5G infrastructure, regulatory support, and internal factors, organizational readiness, and workforce skills. Challenges include data quality issues, cybersecurity risks, legacy system integration, and limited model scalability. Technologies such as edge computing, cloud platforms, and federated learning are instrumental in mitigating these challenges. While adoption is growing, significant barriers remain. AI has the potential to drive operational efficiency and innovation in IIoT, provided these constraints are addressed. This paper offers a comprehensive taxonomy of AI applications and proposes a framework of adoption factors, offering valuable insights for researchers, practitioners, and policymakers involved in AI-driven industrial transformation.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70017","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145963919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reinforcement learning (RL) has been shown to be effective for simple automated cyber defence (ACD) tasks. However, there are limitations to these approaches that prevent them from being deployed onto real-world hardware. Trained RL policies often have limited transferability across even small changes to the environment setup, and instability during training can prevent optimal learning, a problem that only grows as the environment scales and increases in complexity. This work addresses these limitations with a zero-shot transfer approach based on multi-agent RL. The task is partitioned into smaller per-machine subtasks, with each agent learning the solution to its local problem. These local agents are independent of the network scale and can therefore be transferred to larger networks by mapping the agents to machines in the new network. Initial experiments show that this transfer method is effective for direct application to a number of ACD tasks. Its performance is also robust to changes in network activity and attack scenario, and the approach reduces the effect of network scale on performance.
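The sketch below illustrates the transfer idea only: local per-machine policies that act on machine-local observations can be re-mapped onto a larger, unseen network at inference time. The LocalPolicy class, its rule-based act method, and the cyclic assignment are stand-ins for the paper's trained agents and environment, which are not reproduced here.

```python
# Schematic zero-shot transfer of per-machine defence policies to a larger network.
# Assumptions: LocalPolicy is a stand-in for a trained agent; observations are toy dicts.
from dataclasses import dataclass
import random

@dataclass
class LocalPolicy:
    """Stand-in for a trained per-machine RL policy."""
    name: str

    def act(self, local_obs: dict) -> str:
        # A real agent would map the observation through a trained network;
        # here we simply restore compromised machines and otherwise monitor.
        return "restore" if local_obs.get("compromised") else "monitor"

def zero_shot_transfer(policies, new_network):
    """Assign existing local policies to machines in a larger, unseen network."""
    return {machine: policies[i % len(policies)]        # reuse policies cyclically
            for i, machine in enumerate(new_network)}

def defend_step(assignment, observations):
    return {m: policy.act(observations[m]) for m, policy in assignment.items()}

# Policies trained on a 3-machine network, deployed on a 6-machine network.
trained = [LocalPolicy(f"agent_{i}") for i in range(3)]
big_network = [f"host_{i}" for i in range(6)]
obs = {m: {"compromised": random.random() < 0.3} for m in big_network}
mapping = zero_shot_transfer(trained, big_network)
print(defend_step(mapping, obs))
```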
{"title":"Multi-Agent Reinforcement Learning for Cyber Defence Transferability and Scalability","authors":"Andrew Thomas, Matthew Yates, Oliver Osborne","doi":"10.1002/ail2.70015","DOIUrl":"https://doi.org/10.1002/ail2.70015","url":null,"abstract":"<p>Reinforcement learning (RL) has shown to be effective for simple automated cyber defence (ACD) type tasks. However, there are limitations to these approaches that prevent them from being deployed onto real-world hardware. Trained RL policies will often have limited transferability across even small changes to the environment setup. Instability during training can prevent optimal learning, a problem that only increases as the environment scales and grows in complexity. This work looks at addressing these limitations with a zero-shot transfer approach based on multi-agent RL. This is achieved by partitioning the task into smaller network machine subtasks, where agents learn the solution to the local problem. These local agents are independent of the network scale and can therefore be transferred to larger networks by mapping the agents to machines in the new network. Initial experiments show that this transfer method is effective for direct application to a number of ACD tasks. It is also shown that its performance is robust to changes in network activity, attack scenario and reduces the effects of network scale on performance.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70015","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145846061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reinforcement learning (RL) solutions have shown considerable promise for automating the defense of networks to cyber attacks. However, a limitation to their real world deployment is the sample efficiency and generalizability of RL agents. This means that even small changes to attack types require a new agent to be trained from scratch. Meta-learning for RL aims to improve the sample efficiency of training agents by encoding pre-training information that assists fast adaptation. This work focuses on two key meta-learning approaches, MAML and ML3, representing differing approaches to encoding meta learning knowledge. Both approaches are limited to sets of environments that use the same action and observation space. To overcome this, we also present an extension to ML3, Gen ML3, that removes this requirement by training the learned loss on the reward information only. Experiments have been conducted on a distribution of network setups based on the PrimAITE environment. All approaches demonstrated improvements in sample efficiency against a PPO baseline for a range of automated cyber defense (ACD) tasks. We also show effective meta-learning across network topologies with Gen ML3.
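To make the inner/outer-loop structure of MAML concrete, here is a toy first-order MAML sketch on sinusoid regression; it is not the paper's ACD setup, does not cover the ML3/Gen ML3 learned-loss variants, and all hyperparameters are illustrative.

```python
# Toy first-order MAML: an inner gradient step adapts to each sampled task, and the
# outer step updates the meta-initialisation so that one inner step already works well.
import math
import torch

def sample_task():
    amp = torch.rand(1) * 4 + 1
    phase = torch.rand(1) * math.pi
    def task(n):
        x = torch.rand(n, 1) * 10 - 5
        return x, amp * torch.sin(x + phase)
    return task

net = torch.nn.Sequential(torch.nn.Linear(1, 40), torch.nn.ReLU(), torch.nn.Linear(40, 1))
meta_opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inner_lr, loss_fn = 0.01, torch.nn.MSELoss()

for step in range(500):
    meta_opt.zero_grad()
    for _ in range(4):                            # meta-batch of tasks
        task = sample_task()
        x_s, y_s = task(10)                       # support set
        x_q, y_q = task(10)                       # query set
        # Inner adaptation step (first-order: gradients are not back-propagated
        # through the inner update, unlike full second-order MAML).
        grads = torch.autograd.grad(loss_fn(net(x_s), y_s), net.parameters())
        adapted = [p - inner_lr * g for p, g in zip(net.parameters(), grads)]
        # Manually run the two-layer network with the adapted parameters.
        h = torch.relu(x_q @ adapted[0].t() + adapted[1])
        q_pred = h @ adapted[2].t() + adapted[3]
        loss_fn(q_pred, y_q).backward()           # accumulates meta-gradients into net
    meta_opt.step()
```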
{"title":"Meta Reinforcement Learning for Automated Cyber Defence","authors":"Andrew Thomas, Nick Tillyer","doi":"10.1002/ail2.70009","DOIUrl":"https://doi.org/10.1002/ail2.70009","url":null,"abstract":"<p>Reinforcement learning (RL) solutions have shown considerable promise for automating the defense of networks to cyber attacks. However, a limitation to their real world deployment is the sample efficiency and generalizability of RL agents. This means that even small changes to attack types require a new agent to be trained from scratch. Meta-learning for RL aims to improve the sample efficiency of training agents by encoding pre-training information that assists fast adaptation. This work focuses on two key meta-learning approaches, MAML and ML3, representing differing approaches to encoding meta learning knowledge. Both approaches are limited to sets of environments that use the same action and observation space. To overcome this, we also present an extension to ML3, Gen ML3, that removes this requirement by training the learned loss on the reward information only. Experiments have been conducted on a distribution of network setups based on the PrimAITE environment. All approaches demonstrated improvements in sample efficiency against a PPO baseline for a range of automated cyber defense (ACD) tasks. We also show effective meta-learning across network topologies with Gen ML3.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70009","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145750880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. P. Jayathunga, M. Ramashini, Juliana Zaini, R. Müller, Liyanage C. De Silva
This research investigates the flying kinematics of bats, drawing inspiration from nature to benefit humans in achieving vision-independent flight. The project uses 50 cameras mounted inside a specially designed tunnel to analyze bat flying behavior. However, the simultaneous recording of all cameras generates numerous unnecessary image frames, prompting the need for a highly accurate filter. The study focuses on developing this filter using deep learning techniques to classify images based on the bat's location within each frame. We employed widely used one-stage object detection algorithms: YOLOv4, YOLOv4-tiny, and YOLOv5. Notably, YOLOv4, although the older of the versions we adapted, fulfills the intended objective of the study with better accuracy on two different datasets and a more effective learning process. We then examined the tunable hyperparameters and augmentation techniques of the YOLOv4 model and adapted them when tuning YOLOv5, assessing the impact of these hyperparameters and data augmentation techniques on the performance of YOLOv5L. Based on the results, we adopted the hyperparameters and data augmentation techniques that have a positive impact on the YOLOv5 model. The improved YOLOv5L achieved better performance, with a mean average precision (mAP) of 99.3%, where the average precision (AP) of each class scored more than 99%, along with an auto-anchor detection mechanism.
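As a rough illustration of the frame-filtering step, the sketch below runs a YOLOv5 detector over recorded frames and keeps only those containing a detection above a confidence threshold. It assumes the public ultralytics/yolov5 torch.hub entry point; the weights file path, directories, and threshold are hypothetical and not taken from the paper.

```python
# Illustrative frame filter: keep only frames where the detector finds a bat.
# Assumptions: custom-trained weights at "bat_yolov5l.pt" (hypothetical path).
import glob
import shutil
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="bat_yolov5l.pt")
model.conf = 0.5                              # confidence threshold for keeping a frame

def filter_frames(frame_dir: str, keep_dir: str) -> int:
    kept = 0
    for frame_path in sorted(glob.glob(f"{frame_dir}/*.jpg")):
        results = model(frame_path)           # AutoShape inference on a single image
        detections = results.xyxy[0]          # tensor rows: [x1, y1, x2, y2, conf, class]
        if len(detections) > 0:               # at least one detection above model.conf
            shutil.copy(frame_path, keep_dir)
            kept += 1
    return kept

print("frames kept:", filter_frames("raw_frames", "filtered_frames"))
```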
{"title":"Improved YOLOv5 for Efficient Elimination of Unwanted Video Frames From High-Speed Video Arrays Capturing Bat's Kinematics","authors":"D. P. Jayathunga, M. Ramashini, Juliana Zaini, R. Müller, Liyanage C. De Silva","doi":"10.1002/ail2.70014","DOIUrl":"https://doi.org/10.1002/ail2.70014","url":null,"abstract":"<p>This research aims to investigate the flying kinematics of bats, drawing inspiration from nature to benefit humans in achieving vision-independent flights. The project uses 50 cameras mounted inside a specially designed tunnel to analyze bat flying behavior. However, the simultaneous recording of all cameras generates numerous unnecessary image frames, prompting the need for a highly accurate filter. The study focuses on developing this filter using deep learning techniques to classify images based on the bat's location within each frame. We employed widely used one-stage object detection algorithms: YOLOv4, YOLOv4-tiny, and YOLOv5. Notably, YOLOv4, even though it is the older version we adapted, fulfills the intended objective of the study with better accuracy, for two different datasets, along with a higher learning process. Then, we closely looked at the tunable hyperparameters and augmentation techniques of the YOLOv4 model and adapted them in hyperparameter tuning on YOLOv5. Then we examined the impact of these hyperparameters and data augmentation techniques on the performance of YOLOv5L. Based on the result, we adapted the hyperparameters and data augmentation techniques, which have a positive impact on the YOLOv5 model. Improved YOLOv5L achieved better performance with a mean average precision (mAP) of 99.3% where the average precision (AP) of each classification scored more than 99% along with an auto anchor detection mechanism.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70014","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145626476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Kwame Amissah, Leonard Mensah Boante, Solomon Mensah, Ebenezer Owusu, Justice Kwame Appati
This study introduces a dynamically memory-adjusted whale optimization algorithm (DMA-WOA) for feature selection in polycystic ovary syndrome (PCOS) diagnosis. To overcome the standard WOA's limitations in balancing exploration and exploitation, DMA-WOA incorporates adaptive memory control to improve convergence stability and computational efficiency. In DMA-WOA, adaptive control dynamics adjust the memory size and its influence based on population diversity and fitness change, enabling consistent convergence in high-dimensional clinical data. The framework was evaluated on the only publicly available PCOS electronic health records dataset using diverse classifiers, including SVM, RF, LR, MLP, RNN, LSTM, GRU, TabTransformer, and TabNet. Results showed that DMA-WOA achieved superior accuracy, generalization, and runtime efficiency compared to baseline and standard WOA approaches, while comparative analysis with existing metaheuristics confirmed its enhanced optimization robustness and diagnostic reliability.
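The snippet below is a schematic of the adaptive-memory idea only, not the authors' exact update rules: memory grows while the population is diverse and still improving, and shrinks as the search converges. The thresholds, step sizes, and the sphere-function toy example are assumptions.

```python
# Schematic adaptive memory control for a population-based optimizer.
# Assumptions: memory entries are (solution, fitness) pairs; thresholds are illustrative.
import numpy as np

def adapt_memory(memory, population, fitness, prev_best,
                 min_size=5, max_size=30, div_thresh=0.1, imp_thresh=1e-3):
    """Return the updated memory and its new target size."""
    diversity = population.std(axis=0).mean()        # spread of the swarm
    improvement = prev_best - fitness.min()          # recent gain in best fitness
    if diversity > div_thresh and improvement > imp_thresh:
        size = min(len(memory) + 2, max_size)        # still exploring: remember more
    else:
        size = max(len(memory) - 2, min_size)        # converging: keep memory small
    # Merge the current best candidates into memory and keep the fittest entries.
    entries = memory + [(population[i].copy(), fitness[i]) for i in np.argsort(fitness)[:3]]
    entries.sort(key=lambda e: e[1])
    return entries[:size], size

# Toy usage on a random population minimising the sphere function.
rng = np.random.default_rng(1)
pop = rng.normal(size=(20, 8))
fit = (pop ** 2).sum(axis=1)
memory, size = adapt_memory([], pop, fit, prev_best=np.inf)
print("memory size:", size, "best stored fitness:", round(memory[0][1], 3))
```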
{"title":"Dynamic Memory-Augmented Whale Optimization Algorithm (DMA-WOA) as Feature Descriptor for Polycystic Ovary Syndrome Detection","authors":"Daniel Kwame Amissah, Leonard Mensah Boante, Solomon Mensah, Ebenezer Owusu, Justice Kwame Appati","doi":"10.1002/ail2.70016","DOIUrl":"https://doi.org/10.1002/ail2.70016","url":null,"abstract":"<p>This study introduces a dynamically memory-adjusted whale optimization algorithm (DMA-WOA) for feature selection in polycystic ovary syndrome (PCOS) diagnosis. To overcome the standard WOA's limitations in balancing exploration and exploitation, DMA-WOA incorporated adaptive memory control to improve convergence stability and computational efficiency. In DMA-WOA adaptive control dynamics adjusted memory size and influence based on population diversity and fitness change, enabling consistent convergence in high-dimensional clinical data. The framework was evaluated on the only publicly available PCOS electronic health records dataset using diverse classifiers, including SVM, RF, LR, MLP, RNN, LSTM, GRU, TabTransformer, and TabNet. Results showed that DMA-WOA achieved superior accuracy, generalization, and runtime efficiency compared to baseline and standard WOA approaches, while comparative analysis with existing metaheuristics confirmed its enhanced optimization robustness and diagnostic reliability.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70016","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145625968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nixson Okila, Andrew Katumba, Joyce Nakatumba-Nabende, Sudi Murindanyi, Jonathan Serugunda, Cosmas Mwikirize, Samuel Bugeza, Anthony Oriekot, Juliet Bosa, Eva Nabawanuka
Timely and accurate diagnosis of lung diseases is critical for reducing related morbidity and mortality. Lung ultrasound (LUS) has emerged as a useful point-of-care tool for evaluating various lung conditions. However, interpreting LUS images remains challenging due to operator-dependent variability, low image quality, and limited availability of experts in many regions. In this study, we present a lightweight and efficient deep learning model, ParSE-CNN, alongside fine-tuned versions of VGG-16, InceptionV3, Xception, and Vision Transformer architectures, to classify LUS images into three categories: COVID-19, other lung pathology, and healthy lung. Models were trained using data from public sources and Ugandan healthcare facilities, and evaluated on a held-out Ugandan dataset. Fine-tuned VGG-16 achieved the highest classification performance with 98% accuracy, 97% precision, 98% recall, and a 97% F1-score. ParSE-CNN yielded a competitive accuracy of 95%, precision of 94%, recall of 95%, and F1-score of 97% while offering a 58.3% faster inference time (0.006 s vs. 0.014 s) and a lower parameter count (5.18 M vs. 10.30 M) than VGG-16. To enhance input quality, we developed a preprocessing pipeline, and to improve interpretability, we employed Grad-CAM heatmaps, which showed high alignment with radiologically relevant features. Finally, ParSE-CNN was integrated into a mobile LUS workflow with a PC backend, enabling real-time AI-assisted diagnosis at the point of care in low-resource settings.
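The ParSE-CNN architecture itself is not reproduced here; as a loose illustration of the lightweight-model idea, the sketch below builds a generic depthwise-separable CNN with a three-class head and times a single forward pass, in the spirit of the reported per-image inference figures. The block sizes and input resolution are assumptions.

```python
# Generic lightweight 3-class CNN for LUS frames (COVID-19 / other pathology / healthy),
# with a rough single-image inference timing. Not the authors' ParSE-CNN.
import time
import torch
import torch.nn as nn

def sep_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),   # depthwise conv
        nn.Conv2d(in_ch, out_ch, 1),                           # pointwise conv
        nn.BatchNorm2d(out_ch), nn.ReLU(), nn.MaxPool2d(2))

class SmallLUSNet(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            sep_block(3, 32), sep_block(32, 64), sep_block(64, 128),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = SmallLUSNet().eval()
x = torch.randn(1, 3, 224, 224)                 # one preprocessed ultrasound frame
with torch.no_grad():
    start = time.perf_counter()
    logits = model(x)
print("predicted class:", logits.argmax(1).item(),
      "inference time (s):", round(time.perf_counter() - start, 4))
```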
{"title":"Automated AI-Based Lung Disease Classification Using Point-of-Care Ultrasound","authors":"Nixson Okila, Andrew Katumba, Joyce Nakatumba-Nabende, Sudi Murindanyi, Jonathan Serugunda, Cosmas Mwikirize, Samuel Bugeza, Anthony Oriekot, Juliet Bosa, Eva Nabawanuka","doi":"10.1002/ail2.70012","DOIUrl":"https://doi.org/10.1002/ail2.70012","url":null,"abstract":"<p>Timely and accurate diagnosis of lung diseases is critical for reducing related morbidity and mortality. Lung ultrasound (LUS) has emerged as a useful point-of-care tool for evaluating various lung conditions. However, interpreting LUS images remains challenging due to operator-dependent variability, low image quality, and limited availability of experts in many regions. In this study, we present a lightweight and efficient deep learning model, ParSE-CNN, alongside fine-tuned versions of VGG-16, InceptionV3, Xception, and Vision Transformer architectures, to classify LUS images into three categories: COVID-19, other lung pathology, and healthy lung. Models were trained using data from public sources and Ugandan healthcare facilities, and evaluated on a held-out Ugandan dataset. Fine-tuned VGG-16 achieved the highest classification performance with 98% accuracy, 97% precision, 98% recall, and a 97% F1-score. ParSE-CNN yielded a competitive accuracy of 95%, precision of 94%, recall of 95%, and F1-score of 97% while offering a 58.3% faster inference time (0.006 s vs. 0.014 s) and a lower parameter count (5.18 M vs. 10.30 M) than VGG-16. To enhance input quality, we developed a preprocessing pipeline, and to improve interpretability, we employed Grad-CAM heatmaps, which showed high alignment with radiologically relevant features. Finally, ParSE-CNN was integrated into a mobile LUS workflow with a PC backend, enabling real-time AI-assisted diagnosis at the point of care in low-resource settings.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70012","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145469949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jia Yun Chua, Argyrios Zolotas, Miguel Arana-Catania
Remote sensing has become a vital tool across sectors such as urban planning, environmental monitoring, and disaster response. Although the volume of data generated has increased significantly, traditional vision models are often constrained by the requirement for extensive domain-specific labelled data and their limited ability to understand the context within complex environments. Vision Language Models (VLMs) offer a complementary approach by integrating visual and textual data; however, their application to remote sensing remains underexplored, particularly given their generalist nature. This work investigates the combination of vision models and VLMs to enhance image analysis in remote sensing, with a focus on aircraft detection and scene understanding. The integration of YOLO with VLMs such as LLaVA, ChatGPT, and Gemini aims to achieve more accurate and contextually aware image interpretation. Performance is evaluated on both labelled and unlabelled remote sensing data, as well as on degraded image scenarios that are crucial for remote sensing. The findings show an average MAE improvement of 48.46% across models for aircraft detection and counting, especially in challenging conditions, in both raw and degraded scenarios. A 6.17% improvement in CLIPScore for comprehensive understanding of remote sensing images is obtained. The proposed approach combining traditional vision models and VLMs paves the way for more advanced and efficient remote sensing image analysis, especially in few-shot learning scenarios.
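One plausible way to fuse a detector with a VLM, sketched below for illustration: the detector proposes candidate aircraft boxes, and the VLM is asked to confirm the count given the image and a textual summary of the detections. The query_vlm function is a hypothetical placeholder for whichever VLM endpoint (LLaVA, ChatGPT, Gemini) is used, and the generic COCO-pretrained YOLOv5 weights stand in for the paper's detector.

```python
# Schematic detector + VLM fusion for aircraft counting (not the authors' pipeline).
# Assumptions: public ultralytics/yolov5 torch.hub weights; query_vlm is a placeholder.
import torch

detector = torch.hub.load("ultralytics/yolov5", "yolov5s")   # generic pretrained weights

def query_vlm(image_path: str, prompt: str) -> str:
    """Hypothetical stand-in: wire this to an actual VLM API."""
    raise NotImplementedError("connect to the chosen VLM endpoint")

def count_aircraft(image_path: str) -> str:
    results = detector(image_path)
    # Keep only aircraft candidates (COCO class index 4 is "airplane").
    boxes = [b for b in results.xyxy[0].tolist() if int(b[5]) == 4]
    summary = "; ".join(
        f"box ({b[0]:.0f},{b[1]:.0f},{b[2]:.0f},{b[3]:.0f}) conf {b[4]:.2f}" for b in boxes)
    prompt = (
        "A detector found these candidate aircraft in the remote sensing image: "
        f"{summary or 'none'}. Considering the image itself, how many aircraft "
        "are actually present? Reply with a single number.")
    return query_vlm(image_path, prompt)
```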
{"title":"Efficient Few-Shot Learning in Remote Sensing: Fusing Vision and Vision-Language Models","authors":"Jia Yun Chua, Argyrios Zolotas, Miguel Arana-Catania","doi":"10.1002/ail2.70010","DOIUrl":"https://doi.org/10.1002/ail2.70010","url":null,"abstract":"<p>Remote sensing has become a vital tool across sectors such as urban planning, environmental monitoring, and disaster response. Although the volume of data generated has increased significantly, traditional vision models are often constrained by the requirement for extensive domain-specific labelled data and their limited ability to understand the context within complex environments. Vision Language Models offer a complementary approach by integrating visual and textual data; however, their application to remote sensing remains underexplored, particularly given their generalist nature. This work investigates the combination of vision models and VLMs to enhance image analysis in remote sensing, with a focus on aircraft detection and scene understanding. The integration of YOLO with VLMs such as LLaVA, ChatGPT, and Gemini aims to achieve more accurate and contextually aware image interpretation. Performance is evaluated on both labelled and unlabelled remote sensing data, as well as degraded image scenarios that are crucial for remote sensing. The findings show an average MAE improvement of 48.46% across models in the accuracy of aircraft detection and counting, especially in challenging conditions, in both raw and degraded scenarios. A 6.17% improvement in CLIPScore for comprehensive understanding of remote sensing images is obtained. The proposed approach combining traditional vision models and VLMs paves the way for more advanced and efficient remote sensing image analysis, especially in few-shot learning scenarios.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145469550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate, early detection of foliar diseases in crops is necessary for maintaining food security, minimizing economic losses, and cultivating sustainable agriculture. Among staple crops, potato is highly vulnerable to lethal diseases such as Early Blight and Late Blight, which can drastically affect both the quality and the quantity of the yield. Conventional diagnostic procedures based on visual observation and/or laboratory examination are frequently tedious, time-consuming, and susceptible to error. To address these problems, we propose a novel deep learning architecture using a customized convolutional neural network (CNN) for classifying potato leaf images into three distinct classes: Early Blight, Late Blight, and Healthy. The model is trained on a selective and heavily augmented subset of the PlantVillage dataset containing 11,593 images and is further optimized using regularization techniques such as dropout and batch normalization. The architecture is designed to balance performance and computational efficiency, making it suitable for real-world agricultural scenarios. To increase interpretability and trust, we use Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize the leaf regions that contribute most to the model's predictions. The experimental results show superior performance: the proposed model reaches 99.14% accuracy and near-perfect precision, recall, and F1-scores across all classes. Grad-CAM visualizations confirm that the model attends to biologically meaningful regions associated with the disease symptoms. In addition, we perform comparative analyses against recent state-of-the-art models and demonstrate that the proposed approach outperforms them in accuracy and interpretability.
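For readers unfamiliar with Grad-CAM, the sketch below shows the core computation: hook the last convolutional block, weight its activations by the spatially pooled gradients of the predicted class, and normalise the result into a heatmap. A torchvision ResNet-18 stands in for the authors' custom CNN, which is not reproduced here.

```python
# Minimal Grad-CAM via forward/backward hooks on the last convolutional stage.
# Assumptions: ResNet-18 as a stand-in backbone, random input in place of a leaf image.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None).eval()
target_layer = model.layer4
activations, gradients = {}, {}
target_layer.register_forward_hook(lambda m, i, o: activations.update(v=o))
target_layer.register_full_backward_hook(lambda m, gi, go: gradients.update(v=go[0]))

def grad_cam(image: torch.Tensor) -> torch.Tensor:
    logits = model(image)
    model.zero_grad()
    class_idx = logits[0].argmax()
    logits[0, class_idx].backward()                            # gradient of the top class
    weights = gradients["v"].mean(dim=(2, 3), keepdim=True)    # global-average-pool grads
    cam = F.relu((weights * activations["v"]).sum(dim=1))      # weighted activation map
    cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:], mode="bilinear")
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalise to [0, 1]

heatmap = grad_cam(torch.randn(1, 3, 224, 224))
print("heatmap shape:", tuple(heatmap.shape))                  # (1, 1, 224, 224)
```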
{"title":"An Explainable and Lightweight CNN Framework for Robust Potato Leaf Disease Classification Using Grad-CAM Visualization","authors":"MD Jiabul Hoque, Md. Saiful Islam","doi":"10.1002/ail2.70011","DOIUrl":"https://doi.org/10.1002/ail2.70011","url":null,"abstract":"<p>For identifying foliar diseases in crops at an early stage, accurate detection is necessary in maintaining food security, minimizing economic losses, and cultivating sustainable agriculture. In staple crops, potato is highly vulnerable to lethal diseases like Early Blight and Late Blight that can drastically affect both the quality and the quantity of the yield. Conventional diagnostic procedures using visual observation and/or laboratory examinations are frequently tedious, time-consuming, and susceptible to error. To address these problems, in this research, we propose a novel deep learning architecture using a customized convolutional neural network (CNN) for classifying potato leaf images into three distinct classes, namely Early Blight, Late Blight and Healthy. The model is trained on a selective and heavily augmented subset of the PlantVillage dataset containing 11,593 images and further optimized using regularization techniques like dropout and batch normalization. The system architecture is intended to keep the tradeoff between performance and computational efficiency, so as to fit real-world agricultural scenarios. To increase interpretability and improve trust, we use the Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize the regions in space of the leaves that most contribute to the prediction of the model. The experimental results show superior performance and the proposed model reaches 99.14% accuracy and close-to-perfect precision, recall and F1-scores in all of the classes. Grad-CAM visualizations validate that the model is robust in attending to biologically meaningful regions for the disease symptoms. In addition, we perform comparative analyses against recent state-of-the-art models, and demonstrate that the proposed approach outperforms the others in accuracy and interpretability.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70011","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145385007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ross O'Driscoll, Claudia Hagen, Joe Bater, James Adams
Cyber-attacks pose a security threat to military command and control networks, Intelligence, Surveillance, and Reconnaissance (ISR) systems, and civilian critical national infrastructure. The use of artificial intelligence and autonomous agents in these attacks increases the scale, range, and complexity of this threat and the subsequent disruption they cause. Autonomous Cyber Defence (ACD) agents aim to mitigate this threat by responding at machine speed and at the scale required to address the problem. Additionally, they reduce the burden on the limited number of human cyber experts available to respond to an attack. Sequential decision-making algorithms such as Deep Reinforcement Learning (RL) provide a promising route to create ACD agents. These algorithms typically focus on a single objective, such as minimising the intrusion of red agents on the network, by using a handcrafted weighted sum of rewards. This approach removes the ability to adapt the model during inference and fails to address the many competing objectives present when operating and protecting these networks. Conflicting objectives, such as restoring a machine from a back-up image, must be carefully balanced against the cost of the associated down-time or the disruption to network traffic or services that might result. Instead of pursuing a Single-Objective RL (SORL) approach, here we present a simple example of a multi-objective network defense game that requires consideration of both defending the network against red agents and maintaining the critical functionality of green agents. Two Multi-Objective Reinforcement Learning (MORL) algorithms, namely Multi-Objective Proximal Policy Optimization (MOPPO) and Pareto-Conditioned Networks (PCN), are used to create two trained ACD agents whose performance is compared on our multi-objective cyber defense game. The benefits and limitations of MORL ACD agents in comparison to SORL ACD agents are discussed based on the investigations of this game.
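The toy snippet below illustrates only the vector-reward formulation that motivates MORL over a fixed weighted sum: each defensive action yields a reward vector over two objectives (containing red agents vs. keeping green-agent services available), and different preference weights select different actions at inference time without retraining. The actions and reward values are invented for illustration and are not the MOPPO/PCN training code.

```python
# Toy vector-reward structure for a two-objective cyber defence decision.
# Assumptions: action set and reward vectors are illustrative, not from the paper.
import numpy as np

# Reward vectors per action: [red-agent containment, green-agent availability].
ACTION_REWARDS = {
    "do_nothing":     np.array([-1.0,  1.0]),   # intrusion spreads, services stay up
    "isolate_host":   np.array([ 0.6, -0.4]),   # contains the red agent, disrupts traffic
    "restore_backup": np.array([ 1.0, -1.0]),   # removes the intrusion, costs downtime
}

def best_action(preference: np.ndarray) -> str:
    """Pick the action maximising a linear scalarisation of the reward vector."""
    return max(ACTION_REWARDS, key=lambda a: float(preference @ ACTION_REWARDS[a]))

# The same reward structure serves different operators' priorities at inference time.
print(best_action(np.array([0.8, 0.2])))   # security-focused   -> restore_backup
print(best_action(np.array([0.3, 0.7])))   # availability-focused -> do_nothing
```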
{"title":"Multi-Objective Reinforcement Learning for Automated Resilient Cyber Defence","authors":"Ross O'Driscoll, Claudia Hagen, Joe Bater, James Adams","doi":"10.1002/ail2.70007","DOIUrl":"https://doi.org/10.1002/ail2.70007","url":null,"abstract":"<p>Cyber-attacks pose a security threat to military command and control networks, Intelligence, Surveillance, and Reconnaissance (ISR) systems, and civilian critical national infrastructure. The use of artificial intelligence and autonomous agents in these attacks increases the scale, range, and complexity of this threat and the subsequent disruption they cause. Autonomous Cyber Defence (ACD) agents aimto mitigate this threat by responding at machine speed and at the scale required to address the problem. Additionally, they reduce the burden on the limited number of human cyber experts available to respond to an attack. Sequential decision-making algorithms such as Deep Reinforcement Learning (RL) provide a promising route to create ACD agents. These algorithms focus on a single objectivesuch as minimising the intrusion of red agents on the network, by using a handcrafted weighted sum of rewards. This approach removes the ability to adapt the model during inference, and fails to address the many competing objectivespresent when operating and protecting these networks. Conflicting objectives, such as restoring a machine from a back-up image, must be carefully balanced with the cost of associated down-time or the disruption to network traffic or services that might result. Instead of pursuing a Single-Objective RL (SORL) approach, here we present a simple example of a multi-objective network defense game that requires consideration of both defending the network against red-agents and maintaining the critical functionality of green-agents. Two Multi-Objective Reinforcement Learning (MORL) algorithms, namely Multi-Objective Proximal Policy Optimization (MOPPO) and Pareto-Conditioned Networks (PCN), are used to create two trained ACD agents whose performance is compared on our Multi-Objective Cyber Defense game. The benefits and limitations of MORL ACD agents in comparison to SORL ACD agents are discussed based on the investigations of this game.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.70007","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}