Pub Date: 2026-03-01. Epub Date: 2025-11-19. DOI: 10.1016/j.mlwa.2025.100796
"Rethinking data-efficient artificial intelligence for low-resource settings"
Ronald Katende
Machine Learning with Applications, vol. 23, Article 100796

Recent advances in AI have been driven by data abundance and computational scale, assumptions that rarely hold in low-resource environments. We examine how constraints in data, compute, connectivity, and institutional capacity reshape what effective AI should be. Using a structured mixed-methods review and a PRISMA-inspired protocol covering more than 300 studies, we compare data-efficient approaches, physics-informed models, few-shot and self-supervised learning, parameter-efficient fine-tuning, TinyML, and federated learning, and evaluate them across deployment axes (data needs, compute footprint, latency, robustness, interpretability, and maintenance). Across health, agriculture, climate, and education, we show that lean, operator-informed, and locally validated methods often outperform conventional large-scale models under real constraints. We argue that data-efficient AI is not a stopgap but a foundational paradigm for equitable and sustainable innovation, and we provide a decision matrix and a research-policy agenda to guide practitioners and funders in low-resource settings.
Pub Date: 2026-03-01. DOI: 10.1016/j.mlwa.2026.100856
"Hybrid deep learning for anti-money laundering: Unsupervised detection of emerging schemes via feature fusion and explainable artificial intelligence"
Cosmas Ochieng Kungu, Kennedy Senagi, Evans Omondi
Machine Learning with Applications, vol. 23, Article 100856

Traditional rule-based anti-money laundering (AML) transaction monitoring systems suffer from high false-positive rates and rigidity in detecting complex emerging risks. This limitation has prompted changes to the Financial Action Task Force (FATF) Recommendation 16, mandating the use of advanced systems for detecting money laundering schemes in cross-border payments. This study developed a hybrid framework that integrates VAE-learned behavioural latent factors, GNN-captured relational network signals, and rule-based heuristics for enhanced anomaly detection. The model was evaluated on 54,258 real-world cross-border transaction records from an East African commercial bank. The One-Class SVM, optimised via a rigorous grid search, outperformed the Isolation Forest and Local Outlier Factor benchmarks, achieving a precision of 99.63% in the top 5% of prioritised alerts. Independent validation by a Kenyan financial institution confirmed a batch processing speed of 1000 transactions per second on standard computer hardware (Intel Core i7, 16 GB RAM) and efficient high-priority alert triage, key requirements for deployment in financial institutions. Shapley additive explanations (SHAP) analysis further clarified how individual features contributed to model performance. These results demonstrate that integrating rule-based features with deep-learning embeddings improves compliance efficiency and offers a proven pathway for resource-constrained financial institutions to meet the FATF regulatory demands arriving in 2030.
Pub Date: 2026-03-01. Epub Date: 2025-12-27. DOI: 10.1016/j.mlwa.2025.100815
"Algorithmic red teaming approaches to secure LLMs"
Shaurya Jauhari
Machine Learning with Applications, vol. 23, Article 100815

Algorithmic red teaming for Large Language Models (LLMs) is a crucial practice for proactively ensuring their safety and robustness. This process involves using an LLM as an adversary to test the vulnerabilities of a target LLM, which is essential for identifying and mitigating potential security risks before the model is deployed. Automated methodologies, which surpass the constraints of human creativity, utilize a triad of models: an attacker, a target, and a judge. This primer provides a concise summary and comparison of several state-of-the-art algorithmic red-teaming approaches, including TAP, PAIR, Crescendo, and AutoDAN-Turbo. The goal of these techniques, such as prompt injection and jailbreaking, is to push LLMs beyond their intended safe behavior. Critically, the non-deterministic nature of LLMs presents a key challenge when they are utilized as assessors or judges, potentially rendering evaluations unreliable. The paper stresses that red teaming is not a one-time exercise and is particularly vital for AI agents that use LLMs as components, as a single failure can lead to significant public scrutiny.
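The attacker/target/judge triad described in the red-teaming primer above can be sketched as a simple loop. All three model calls here are hypothetical stand-in functions (no real LLM API is invoked); the control flow loosely follows iterative-refinement methods such as PAIR, under the assumption of a 1-10 judge score with 10 meaning jailbreak.

```python
def attacker(goal: str, history: list) -> str:
    # A real attacker LLM would rewrite the jailbreak prompt using the
    # feedback in `history`; this stand-in just counts attempts.
    return f"[attempt {len(history)}] {goal}"

def target(prompt: str) -> str:
    # Stand-in target: refuses until the attacker has iterated enough,
    # emulating a prompt that eventually slips past the safety filter.
    return "Here is how..." if "[attempt 3]" in prompt else "I refuse."

def judge(goal: str, response: str) -> int:
    # A judge LLM typically scores 1-10; its non-determinism is exactly
    # the reliability concern the primer raises.
    return 10 if response.startswith("Here is how") else 1

def red_team(goal: str, max_turns: int = 5):
    """Run the attacker/target/judge loop until a jailbreak or turn limit."""
    history = []
    for _ in range(max_turns):
        prompt = attacker(goal, history)
        response = target(prompt)
        score = judge(goal, response)
        history.append((prompt, response, score))
        if score >= 10:          # judge declares a successful jailbreak
            return prompt, history
    return None, history

prompt, history = red_team("describe a restricted procedure")
if prompt:
    print(f"jailbreak found after {len(history)} turns")
```

Because the loop stops on the judge's verdict, a noisy judge directly changes how many vulnerabilities a red-teaming run reports, which is why the primer treats judge reliability as a first-class problem.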
Pub Date: 2026-03-01. Epub Date: 2025-12-24. DOI: 10.1016/j.mlwa.2025.100819
"Explainable deepfake detection: A multi-model framework with human-interpretable rationales for legal investigation purposes"
Nitu Bharati, Patrick Wong, Soraya Kouadri Mostéfaoui, Dhouha Kbaier, Jan Collie
Machine Learning with Applications, vol. 23, Article 100819

The growing spread of deepfake images, combined with the sophistication of the machine learning tools and techniques used to produce them, poses serious threats to the integrity of information, individual privacy, and public trust. Detecting these deepfake images for legal investigation purposes requires advanced detection mechanisms that not only achieve high accuracy but also provide transparent and understandable explanations of the decisions made.

This paper presents a new framework for deepfake detection that not only pursues accuracy but, more crucially, prioritises the explainability of detection, a critical need in legal investigation contexts such as policing and digital forensics. The framework combines advanced machine learning models, an explainable AI (XAI) component, and three commonly used image processing methods to detect and explain manipulations in deepfake images of human faces. Four independently trained CNN models were developed for the original and processed images, and through decision fusion achieved an overall detection accuracy of 97%. Moreover, the framework achieved an F1 score of 92% on a hidden test dataset used in the UK Home Office’s Deepfake Detection Challenge 2024, placing it third among the competing teams in the image deepfake category. Shapley values were also used to identify the facial features that influenced the models’ detection decisions. This information enabled us to home in on areas of the face where features are more likely to occur in deepfake images. Through Bayes’ theorem, we present a human-understandable detection method that achieves 85% detection accuracy on the test images while maintaining explainability of the detection rationales. Our work demonstrates that combining machine learning, image processing, and XAI with human-understandable rationales yields a demonstrably effective and practical deepfake detection system that could significantly streamline criminal investigations in policing and digital forensics. Future research will explore the interplay between psychological factors and the acceptance and trust of such frameworks, and will extend the framework with additional image processing techniques to enhance detection accuracy.
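The Bayes'-theorem step mentioned in the deepfake abstract above can be illustrated as a posterior update over facial-feature cues. This is a hedged sketch: it assumes a naive-Bayes combination of conditionally independent cues, and the cue probabilities below are made-up illustrations, not the paper's measured frequencies.

```python
import math

def posterior_fake(prior: float, cues: list) -> float:
    """Update P(fake) from observed cues via Bayes' theorem in log-odds form.

    cues: list of (P(cue | fake), P(cue | real)) pairs for each observed
    facial-feature cue; cues are assumed conditionally independent.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for p_fake, p_real in cues:
        log_odds += math.log(p_fake / p_real)  # add each cue's log likelihood ratio
    return 1.0 / (1.0 + math.exp(-log_odds))

# Two hypothetical cues that occur more often in deepfakes, e.g. blending
# artefacts around the eyes and inconsistent skin texture.
cues = [(0.8, 0.2), (0.7, 0.3)]
p = posterior_fake(prior=0.5, cues=cues)
print(f"P(fake | cues) = {p:.3f}")
```

The appeal of this form for forensic work is that each cue's contribution is a single interpretable likelihood ratio, so an investigator can see exactly which facial evidence moved the verdict.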
Pub Date: 2026-03-01. Epub Date: 2025-12-03. DOI: 10.1016/j.mlwa.2025.100811
"Enhancing skin cancer diagnosis using late discrete wavelet transform and new swarm-based optimizers"
Ramin Mousa, Saeed Chamani, Mohammad Morsali, Mohammad Kazzazi, Parsa Hatami, Soroush Sarabi
Machine Learning with Applications, vol. 23, Article 100811

Skin cancer (SC) is a life-threatening disease for which early diagnosis is critical to effective treatment and survival. While deep learning (DL) has advanced skin cancer diagnosis (SCD), current methods generally yield suboptimal accuracy and efficiency because of difficulties in extracting multiscale features from dermoscopic images and in optimizing complex model parameters through efficient exploration of the hyperparameter space. To address this, we propose an approach that integrates a late Discrete Wavelet Transform (DWT) with pre-trained convolutional neural networks (CNNs) and swarm-based optimization. The late DWT decomposes CNN-extracted feature maps into low- and high-frequency components to improve the detection of subtle lesion patterns, while a self-attention mechanism further refines this by weighting feature importance, focusing on relevant diagnostic information. To refine hyperparameters, three novel swarm-based optimizers – Modified Gorilla Troops Optimizer (MGTO), Improved Gray Wolf Optimization (IGWO), and Fox Optimization (FOX) – are employed to search the hyperparameter space and fine-tune the model for superior performance. Experiments on the ISIC-2016 and ISIC-2017 datasets show improved classification performance over existing methods, with an accuracy gain of at least 1%. The proposed framework thus offers a reliable and effective way to diagnose skin cancer automatically.
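The "late DWT" idea from the skin-cancer abstract above, decomposing a CNN feature map into low- and high-frequency bands, can be sketched with a one-level Haar transform in NumPy. This stand-in shows the decomposition only; a real pipeline would use a wavelet library and feed the bands into the attention and classification head.

```python
import numpy as np

def haar_1d(x: np.ndarray, axis: int):
    """One-level orthonormal Haar transform along one axis."""
    even = np.take(x, range(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, range(1, x.shape[axis], 2), axis=axis)
    low = (even + odd) / np.sqrt(2)   # approximation (low-frequency) band
    high = (even - odd) / np.sqrt(2)  # detail (high-frequency) band
    return low, high

def haar_2d(fmap: np.ndarray):
    """Split a 2-D feature map into LL, LH, HL, HH sub-bands."""
    low_r, high_r = haar_1d(fmap, axis=0)
    ll, lh = haar_1d(low_r, axis=1)
    hl, hh = haar_1d(high_r, axis=1)
    return ll, lh, hl, hh

fmap = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 feature map
ll, lh, hl, hh = haar_2d(fmap)
print(ll.shape)  # each sub-band has half the spatial resolution
```

Because the Haar transform is orthonormal, the four sub-bands preserve the feature map's total energy, so the split loses no information while separating smooth structure (LL) from the fine detail (LH/HL/HH) where subtle lesion patterns live.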
Pub Date: 2026-03-01. Epub Date: 2026-01-09. DOI: 10.1016/j.mlwa.2026.100839
"A new framework for input variable selection based on the gamma test machine learning performance in quantile prediction of flow duration curves"
Arezoo Shafiei Bafti, Mehdi Vafakhah, Vahid Moosavi, Hadi Khosravi Farsani
Machine Learning with Applications, vol. 23, Article 100839

Predicting streamflow in ungauged watersheds is a key hydrological challenge, commonly addressed through flow duration curve (FDC) regionalization. Although machine learning (ML) models are widely applied, their accuracy depends critically on both the algorithm and the selection of input variables. This research develops a systematic, quantile-aware ML framework to assess how input selection strategies affect FDC prediction. We evaluate three Gamma Test-based approaches (full variable set, classified variables, and expert opinion) combined with five ML techniques: Adaptive Neuro-Fuzzy Inference System (ANFIS), Support Vector Regression (SVR), Multivariate Adaptive Regression Splines (MARS), Random Forest (RF), and Boosted Regression Trees (BRT). The analysis uses data from 130 hydrometric stations across the Caspian Sea watershed. Results demonstrate that predictive performance varies not only by model but also significantly with flow quantile and input strategy. The ANFIS model enhanced with Fuzzy C-Means clustering (FCM) consistently delivered the highest accuracy. Specifically, low, medium, and high flows were best predicted by the full variable set (Q90, R² = 0.94, improved by 623%), the classified variable approach (Q50, R² = 0.86, improved by 207.14%), and the expert opinion approach (Q2, R² = 0.86, improved by 207.14%), respectively. This confirms that no single ML configuration is optimal for all conditions, underscoring the necessity of flow-regime-specific variable selection for robust FDC regionalization in data-scarce areas. Accordingly, for similar watersheds, we recommend the following ANFIS-FCM configurations: the full variable set for low-flow prediction, the classified variable approach for medium-flow prediction, and the expert opinion approach for high-flow prediction.
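The Q90/Q50/Q2 targets in the FDC abstract above are exceedance quantiles: Qp is the flow equalled or exceeded p% of the time, so Q90 characterises low flows, Q50 the median, and Q2 high flows. A small sketch with synthetic flows (standing in for gauged daily streamflow records) makes the convention concrete.

```python
import numpy as np

def fdc_quantile(flows: np.ndarray, p: float) -> float:
    """Flow exceeded p percent of the time (exceedance probability p/100)."""
    # Exceedance probability p maps to the (1 - p/100) ordinary quantile.
    return float(np.quantile(flows, 1.0 - p / 100.0))

rng = np.random.default_rng(42)
# Lognormal flows roughly mimic the skewed distribution of daily streamflow;
# ~10 years of daily records, purely illustrative.
flows = rng.lognormal(mean=1.0, sigma=0.8, size=3650)

q90, q50, q2 = (fdc_quantile(flows, p) for p in (90, 50, 2))
print(f"Q90={q90:.2f}  Q50={q50:.2f}  Q2={q2:.2f}")
```

Note the ordering this convention implies, Q90 < Q50 < Q2, which is why the study can assign a different input-selection strategy to each flow regime.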
Pub Date: 2026-03-01. DOI: 10.1016/j.mlwa.2025.100806
"AdAPT: Advertisement detector adaptation under newspaper domain shift with null-based pseudo-labeling"
Faeze Zakaryapour Sayyad, Tobias Pettersson, Seyed Jalaleddin Mousavirad, Irida Shallari, Mattias O’Nils
Machine Learning with Applications, vol. 23, Article 100806

Detecting advertisements in digitized newspapers is a key step in large-scale media analytics and digital archiving. However, variations in layout, typography, and advertisement design across publishers and time periods cause significant domain shifts that reduce the generalization ability of supervised detectors. This paper presents AdAPT, a confidence-guided pseudo-labeling pipeline for unsupervised domain adaptation in advertisement detection. The proposed method leverages both advertisement-free (Null) and advertisement-containing pages from unlabeled target domains to generate reliable pseudo-labels. By retraining a YOLO-based detector using labeled source data combined with filtered pseudo-labeled target samples, AdAPT achieves robust adaptation without requiring manual annotation. Experiments conducted on two unseen newspapers (Adresseavisen and iTromsø) demonstrate that Null-based pseudo-labeling provides the most stable and accurate adaptation, yielding up to 38% error reduction compared to the baseline. The results highlight AdAPT as a simple, scalable, and annotation-efficient solution for maintaining high-performance advertisement detection across diverse newspaper collections.
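The confidence-guided pseudo-labeling step at the heart of the AdAPT abstract above can be sketched as a simple data-assembly routine: labeled source pages are merged with target-domain pages whose detections pass a confidence filter, while known ad-free ("Null") pages join as background-only negatives. The record layout and threshold below are illustrative assumptions, not the paper's exact pipeline; a real implementation would carry YOLO bounding boxes.

```python
def build_adaptation_set(source_pages, target_pages, threshold=0.8):
    """Merge labeled source data with confidence-filtered target pseudo-labels."""
    adapted = list(source_pages)  # keep all labeled source pages
    for page in target_pages:
        if page["is_null"]:
            # Null pages carry a guaranteed label: no advertisements at all,
            # so they become reliable background-only training samples.
            adapted.append({"image": page["image"], "boxes": []})
        else:
            # Keep only detections confident enough to act as pseudo-labels.
            boxes = [b for b in page["detections"] if b["conf"] >= threshold]
            if boxes:
                adapted.append({"image": page["image"], "boxes": boxes})
    return adapted

source = [{"image": "src_001.png", "boxes": [{"conf": 1.0}]}]
target = [
    {"image": "tgt_null.png", "is_null": True},
    {"image": "tgt_ads.png", "is_null": False,
     "detections": [{"conf": 0.93}, {"conf": 0.41}]},
]
adapted = build_adaptation_set(source, target)
print(len(adapted))  # the low-confidence 0.41 detection is dropped
```

The Null pages are what make this stable: they inject target-domain appearance with a label that cannot be wrong, which matches the abstract's finding that Null-based pseudo-labeling adapts most reliably.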
Pub Date: 2026-03-01. Epub Date: 2025-12-29. DOI: 10.1016/j.mlwa.2025.100831
"Real-time wheat growth stage detection via improved Swin transformer for edge devices"
Xianyuan Zhu
Machine Learning with Applications, vol. 23, Article 100831

Accurate identification of crop growth stages is crucial for precision agriculture and automated field management. This study designed and developed an improved Swin Transformer-based detection system for wheat growth stages, with an emphasis on real-time deployment on embedded edge devices. Specifically, we incorporate a Progressive Transfer Learning strategy to ensure robust generalization on agricultural data and introduce an Ordinal Regression Loss to effectively mitigate misclassifications in transitional growth stages. The proposed approach integrates a hierarchical Transformer backbone with an optimized deployment pipeline for the NVIDIA Jetson Orin NX, supporting gallery images, video streams, and live camera inputs. Experimental evaluation demonstrated that the system achieves consistently high recognition accuracy (above 93%) while maintaining real-time performance (above 12 FPS) across different modes, with moderate power consumption (6–8 W). Compared with baseline CNNs (ResNet-50, MobileNetV3) and Transformer models (ViT), the proposed design achieves a favorable balance among accuracy, efficiency, and robustness. These results suggest that the system can contribute to practical agricultural monitoring and provide a step toward intelligent control strategies in precision farming.
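The ordinal regression loss mentioned in the wheat abstract above can be sketched with a threshold encoding: stage k among K ordered stages becomes k ones followed by zeros over K-1 binary threshold tasks (the CORAL-style formulation), so confusing adjacent stages costs less than skipping stages. This is one common construction, assumed here for illustration; the paper's exact loss may differ.

```python
import numpy as np

def ordinal_targets(stage: int, num_stages: int) -> np.ndarray:
    """Encode ordinal stage k as k ones over the K-1 threshold tasks."""
    return (np.arange(num_stages - 1) < stage).astype(float)

def ordinal_loss(logits: np.ndarray, stage: int) -> float:
    """Mean binary cross-entropy over the K-1 'is stage > t?' tasks."""
    t = ordinal_targets(stage, len(logits) + 1)
    p = 1.0 / (1.0 + np.exp(-logits))  # per-threshold sigmoid probabilities
    return float(-np.mean(t * np.log(p) + (1 - t) * np.log(1 - p)))

# Threshold logits leaning toward stage 2 of 4 ordered growth stages.
logits = np.array([4.0, 1.0, -3.0])
near = ordinal_loss(logits, stage=2)  # true stage matches the prediction
far = ordinal_loss(logits, stage=0)   # true stage far from the prediction
print(f"near={near:.3f}  far={far:.3f}")
```

The loss grows with ordinal distance between prediction and truth, which is exactly the property that softens misclassifications in transitional growth stages, where one-hot cross-entropy would punish an adjacent-stage error as harshly as a gross one.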
Pub Date : 2026-03-01Epub Date: 2026-01-03DOI: 10.1016/j.mlwa.2026.100836
Sean T Miller , Keaton A Logan , Ricardo Anderson , Patricia E Cowell , Curtis Busby-Earle , Lisa-Dionne Morris
Machine Learning (ML) has demonstrated strong predictive capabilities in healthcare, often surpassing human performance in pattern recognition and decision-making. However, many high-performing models lack interpretability, which is critical in clinical settings where understanding and trusting predictions is essential. To address this, we propose a Multi-Perspective machine learning framework (MPML) that combines established base classifiers with a structured perspective-based design and an interpretability pipeline. MPML organises features into meaningful subsets, or perspectives, enabling both global and instance-level interpretability. Unlike traditional ensemble methods such as Bagging, Boosting, and Random Forest, MPML delivers significantly higher-quality predictions across all evaluation metrics while maintaining a transparent structure. Applied to a heart disease dataset, MPML not only improves predictive accuracy but also provides detailed, accessible explanations for individual patient outcomes, advancing the potential for practical and ethical deployment of ML in healthcare.
{"title":"Multi-perspective machine learning MPML: A high-performance and interpretable ensemble method for heart disease prediction","authors":"Sean T Miller , Keaton A Logan , Ricardo Anderson , Patricia E Cowell , Curtis Busby-Earle , Lisa-Dionne Morris","doi":"10.1016/j.mlwa.2026.100836","DOIUrl":"10.1016/j.mlwa.2026.100836","url":null,"abstract":"<div><div>Machine Learning (ML) has demonstrated strong predictive capabilities in healthcare, often surpassing human performance in pattern recognition and decision-making. However, many high-performing models lack interpretability, which is critical in clinical settings where understanding and trusting predictions is essential. To address this, we propose a Multi-Perspective machine learning framework (MPML) that combines established base classifiers with a structured perspective-based design and an interpretability pipeline. MPML organises features into meaningful subsets, or perspectives, enabling both global and instance-level interpretability. Unlike traditional ensemble methods such as Bagging, Boosting, and Random Forest, MPML delivers significantly higher-quality predictions across all evaluation metrics while maintaining a transparent structure. Applied to a heart disease dataset, MPML not only improves predictive accuracy but also provides detailed, accessible explanations for individual patient outcomes, advancing the potential for practical and ethical deployment of ML in healthcare.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100836"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145924639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
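The core of the MPML idea above — train one base classifier per feature subset ("perspective"), then aggregate their votes, so each vote is attributable to a named group of features — can be sketched as follows. The pairing of column indices with predictors, the perspective names, and the majority-vote aggregation are illustrative assumptions, not the paper's exact pipeline.

```python
def mpml_predict(rows, perspective_models):
    """Majority vote across per-perspective base classifiers.

    perspective_models: list of (column_indices, predict_fn) pairs, one per
    feature perspective. Each predict_fn maps the feature subset of a single
    sample to a class label, so a prediction can be traced back to the
    perspectives that voted for it.
    """
    predictions = []
    for row in rows:
        votes = [predict([row[i] for i in cols])
                 for cols, predict in perspective_models]
        # majority vote across perspectives; ties go to the smaller label
        winner = max(set(votes), key=lambda c: (votes.count(c), -c))
        predictions.append(winner)
    return predictions
```

In a real setting the stub `predict_fn`s would be trained classifiers; keeping the per-perspective votes around is what enables instance-level explanations ("flagged by the vitals and labs perspectives").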
Background/Objectives
Social-emotional learning (SEL) plays a crucial role in special education, yet current assessment approaches rely heavily on subjective teacher observation, which can be time-consuming and difficult to standardize. Music provides a meaningful medium for evaluating emotional competencies, creating an opportunity for artificial intelligence to support more objective and scalable SEL assessment.
Methods
We propose a lightweight social-emotional music classification model, termed LSEL, designed to identify three SEL-related competencies: Empathetic Perspective-Taking, Outlook, and Problem-Solving. LSEL utilizes 40×128 mel-frequency cepstral coefficients (MFCCs) as input to capture core spectral–temporal characteristics relevant to SEL perception. Moreover, we provide an open-source SEM dataset for domain experts, comprising 591 samples (194 Empathetic, 214 Outlook, and 183 Perspective-Taking), to analyze LSEL performance.
Results
LSEL reached an average accuracy of 96.55 % and an mAP of 99.29 % across experiments. Cohen’s κ averaged 94.32 % and R² reached 94.15 %, indicating high consistency with the ground truth. Per-category accuracies were similarly stable: 96.95 % for Empathetic Perspective-Taking, 95.16 % for Outlook, and 95.36 % for Problem-Solving.
Conclusions
The lightweight LSEL framework offers an effective and robust solution for social-emotional music classification, supporting objective SEL assessment in educational contexts. The release of the SEM dataset further provides a valuable resource for advancing AI-assisted SEL research.
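The agreement metrics reported in the Results — Cohen's κ alongside per-category accuracy — can be computed with a few lines of Python. This is a generic sketch: `per_class_accuracy` here means per-true-class recall, which may differ from the paper's exact definition.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Agreement between predictions and labels, corrected for chance."""
    n = len(y_true)
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # chance agreement under independent marginal distributions
    p_expected = sum(true_counts[k] * pred_counts[k]
                     for k in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    classes = sorted(set(y_true))
    return {k: sum(t == k and p == k for t, p in zip(y_true, y_pred)) /
               sum(t == k for t in y_true)
            for k in classes}
```

Unlike raw accuracy, κ stays low when a classifier merely tracks the class marginals, which is why it is a useful complement on a dataset whose three categories are roughly but not exactly balanced.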
{"title":"LSEL: A lightweight deep learning model for social-emotional classification of classical music","authors":"Yuan-Jin Lin , Yu-Chi Chou , Shan-Ken Chien , Pen-Chiang Chao , Kuang-Kai Yeh , Yen-Chia Peng , Chen-Hao Tsao , Chih-Yun Chen , Shih-Lun Chen , Kuo-Chen Li , Wei-Chen Tu","doi":"10.1016/j.mlwa.2025.100832","DOIUrl":"10.1016/j.mlwa.2025.100832","url":null,"abstract":"<div><h3>Background/Objectives</h3><div>Social-emotional learning (SEL) plays a crucial role in special education, yet current assessment approaches rely heavily on subjective teacher observation, which can be time-consuming and difficult to standardize. Music provides a meaningful medium for evaluating emotional competencies, creating an opportunity for artificial intelligence to support more objective and scalable SEL assessment.</div></div><div><h3>Methods</h3><div>We propose a lightweight social-emotional music classification model, termed LSEL, designed to identify three SEL-related competencies: Empathetic Perspective-Taking, Outlook, and Problem-Solving. LSEL utilizes 40×128 mel-frequency cepstral coefficients (MFCCs) as input to capture core spectral–temporal characteristics relevant to SEL perception. Moreover, we provide an open-source SEM dataset for domain experts, comprising 591 samples (194 Empathetic, 214 Outlook, and 183 Perspective-Taking), to analyze LSEL performance.</div></div><div><h3>Results</h3><div>LSEL reached an average accuracy of 96.55 % and an mAP of 99.29 % across experiments. Cohen’s κ averaged 94.32 % and R² reached 94.15 %, indicating high consistency with the ground truth. Per-category accuracies were similarly stable: 96.95 % for Empathetic Perspective-Taking, 95.16 % for Outlook, and 95.36 % for Problem-Solving.</div></div><div><h3>Conclusions</h3><div>The lightweight LSEL framework offers an effective and robust solution for social-emotional music classification, supporting objective SEL assessment in educational contexts. The release of the SEM dataset further provides a valuable resource for advancing AI-assisted SEL research.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100832"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145925171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}