Pub Date: 2026-03-01 | Epub Date: 2025-11-25 | DOI: 10.1016/j.mlwa.2025.100799
Jeonghoe Lee, Lin Cai
Predicting stock prices is crucial for making informed investment decisions as stock markets significantly influence the global economy. Although previous studies have explored feature importance methods for stock price prediction, comprehensive comparisons of those methods have been limited. This study aims to provide a detailed comparison of different feature importance methods for selecting technical indicators to predict stock prices. Specifically, this research analyzed financial data from the 11 sectors of the NASDAQ. A moving window forecasting framework was implemented to dynamically capture the evolving patterns in financial markets over time. Model-specific feature importance methods were compared with model-agnostic approaches. Multiple machine learning algorithms, including Random Forest (RF) and Multi-layer Neural Networks (MNNs), were employed to forecast stock prices. Additionally, extensive hyperparameter tuning was conducted to improve model explainability, contributing to the field of Explainable Artificial Intelligence (XAI). The results highlight the predictive effectiveness of different feature importance methods in selecting optimal technical indicators, thereby offering valuable insights for enhancing stock price forecasting accuracy and model transparency. In summary, this research offers a comprehensive comparison of feature importance methods, emphasizing their application in the selection of technical indicators in a dynamic, rolling prediction setting.
Title: Comparing model-specific and model-agnostic features importance methods using machine learning with technical indicators: A NASDAQ sector-based study. Machine Learning with Applications, vol. 23, Article 100799.
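The moving-window setup described in the abstract can be sketched in a few lines: retrain on a sliding window and read off a model-specific importance at each step. Everything below — the synthetic "indicators", window and stride sizes — is illustrative, not the paper's data or configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, window, stride = 300, 100, 50
# Toy "technical indicators" and a target series partly driven by the first one.
X = rng.normal(size=(n, 3))
y = 0.7 * X[:, 0] - 0.2 * X[:, 1] + 0.05 * rng.normal(size=n)

importances = []
for start in range(0, n - window, stride):        # slide the training window
    Xw, yw = X[start:start + window], y[start:start + window]
    rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(Xw, yw)
    importances.append(rf.feature_importances_)   # model-specific importance

mean_imp = np.mean(importances, axis=0)           # average across windows
```

A model-agnostic alternative would swap `feature_importances_` for, e.g., permutation importance computed on a held-out slice of each window.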
Pub Date: 2026-03-01 | Epub Date: 2025-12-09 | DOI: 10.1016/j.mlwa.2025.100818
Sara Fanati Rashidi , Maryam Olfati , Seyedali Mirjalili , Crina Grosan , Jan Platoš , Vaclav Snášel
This study integrates Data Envelopment Analysis (DEA) with Machine Learning (ML) to address key limitations of traditional DEA in identifying reference sets for inefficient Decision-Making Units (DMUs). In DEA, inefficient units are evaluated against benchmark units; however, some benchmarks may be inappropriate or even outliers, which can distort the efficiency frontier. Moreover, when a new DMU is added, the entire model must be recalculated, resulting in high computational costs for large datasets. To overcome these issues, we propose a hybrid approach that combines Fuzzy C-Means (FCM) and Possibilistic Fuzzy C-Means (PFCM) clustering. By leveraging Euclidean distance and membership degrees, the method identifies closer and more relevant reference units, while a sensitivity threshold is introduced to control the number of benchmarks according to practical requirements. The effectiveness of the proposed method is validated on two datasets: a banking dataset and a banknote authentication dataset with 1,372 samples. Results show that the reference sets derived from this ML-based framework achieve 71.6%–98.3% agreement with DEA, while overcoming two major drawbacks: (1) sensitivity to dataset size and (2) inclusion of inappropriate reference units. Furthermore, statistical analyses, including confidence intervals and McNemar’s test, confirm the robustness and practical significance of the findings.
Title: A hybrid DEA–fuzzy clustering approach for accurate reference set identification. Machine Learning with Applications, vol. 23, Article 100818.
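The membership-degree idea the abstract leans on can be illustrated with the standard fuzzy c-means membership formula for fixed centers; the toy points, centers, and fuzzifier `m` below are invented for the demo and are not the paper's DEA setup.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Standard FCM membership degrees u[i, k] for fixed cluster centers."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)  # (n, c)
    d = np.maximum(d, 1e-12)              # guard against zero distance
    power = 2.0 / (m - 1.0)
    # u[i, k] = 1 / sum_j (d[i, k] / d[i, j]) ** power; rows sum to 1.
    u = 1.0 / ((d[:, :, None] / d[:, None, :]) ** power).sum(axis=2)
    return u

X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
u = fcm_memberships(X, centers)
# Reference candidates for point 0 would be drawn from its highest-membership cluster.
cluster_of_0 = int(u[0].argmax())
```

In a DEA context, units with high membership in the same cluster as an inefficient DMU (and small Euclidean distance to it) would be the preferred benchmarks.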
Pub Date: 2026-03-01 | Epub Date: 2026-01-14 | DOI: 10.1016/j.mlwa.2026.100844
João Montrezol , Hugo S. Oliveira , Hélder P. Oliveira
With the rise of Transformers, Vision Transformers (ViTs) have become a new standard in visual recognition, leading to the development of numerous architectures with diverse designs and applications. This survey identifies 22 key ViT and hybrid CNN–ViT models, along with 5 leading Convolutional Neural Network (CNN) models, selected for their architectural novelty, benchmark relevance, and overall impact. The models are organised using a defined taxonomy of CNN-based, pure Transformer-based, and hybrid architectures. We analyse their main components, training methods, and computational characteristics, assessing performance using reported results on standard benchmarks such as ImageNet and CIFAR, along with our own training and fine-tuning evaluations on specific imaging datasets. Beyond accuracy, we examine real-world deployment issues by analysing the trade-offs between accuracy and efficiency in embedded, mobile, and clinical settings. The results indicate that modern CNNs remain very competitive in resource-limited environments, while advanced ViT variants perform well after large-scale pretraining, especially in areas with high variability. Hybrid CNN–ViT architectures, in turn, tend to offer the best balance between accuracy, data efficiency, and computational cost. This survey establishes a consolidated benchmark and reference framework for understanding the evolution, capabilities, and practical applicability of contemporary vision architectures.
Title: Decoding vision transformer variations for image classification: A guide to performance and usability. Machine Learning with Applications, vol. 23, Article 100844.
In many domains, including online education, healthcare, security, and human–computer interaction, facial emotion recognition (FER) is essential. Real-world FER remains difficult because of factors such as head poses, occlusions, illumination shifts, and demographic diversity. Engagement detection systems, which are essential in virtual learning platforms, are severely challenged by these factors. In this article, we propose ExpressNet-MoE, a novel hybrid deep learning architecture that combines Convolutional Neural Networks (CNNs) with a Mixture of Experts (MoE) framework to address these challenges. The proposed model dynamically selects the most relevant expert networks for each input, thereby improving generalization and adaptability across diverse datasets. Our methodology involves training ExpressNet-MoE independently on several benchmark datasets after preprocessing facial images using BlazeFace for face detection and alignment. To maintain class distribution, stratified sampling is used to divide each dataset into training and testing sets. The model improves emotion recognition accuracy by utilizing multi-scale feature extraction to capture both global and local facial features. ExpressNet-MoE includes multiple CNN-based feature extractors, a MoE module for adaptive feature selection, and a residual network backbone for deep feature learning. To demonstrate the efficacy of the proposed model, we evaluated it on four widely used datasets — AffectNet-7, AffectNet-8, RAF-DB, and FER-2013 — and compared it with current state-of-the-art methods. Our model achieves accuracies of 74.40% ± 0.45 on AffectNet-7, 71.98% ± 0.66 on AffectNet-8, 83.41% ± 1.06 on RAF-DB, and 67.05% ± 2.08 on FER-2013. Overall, the findings indicate that adaptive expert selection and multi-scale feature extraction significantly enhance the robustness of facial emotion recognition across diverse real-world conditions and can support end-to-end emotion recognition systems in practical settings.
Reproducible code and results are publicly available at https://github.com/DeeptimaanB/ExpressNet-MoE.
Title: ExpressNet-MoE: A hybrid deep neural network for emotion recognition. Authors: Deeptimaan Banerjee, Prateek Gothwal, Ashis Kumer Biswas. Pub Date: 2026-03-01 | DOI: 10.1016/j.mlwa.2025.100830. Machine Learning with Applications, vol. 23, Article 100830.
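The gating mechanism at the heart of a mixture of experts can be sketched in numpy: a gate network scores the experts per input and their outputs are blended by the softmax of those scores. The linear "experts", shapes, and random weights below are toy stand-ins, not ExpressNet-MoE's CNN components.

```python
import numpy as np

rng = np.random.default_rng(1)
n_experts, d_in, d_out = 3, 8, 4
x = rng.normal(size=(d_in,))                           # one input feature vector

W_experts = rng.normal(size=(n_experts, d_out, d_in))  # one linear "expert" each
W_gate = rng.normal(size=(n_experts, d_in))            # gate scores each expert

gate_logits = W_gate @ x
gate = np.exp(gate_logits - gate_logits.max())
gate /= gate.sum()                                     # softmax over experts

expert_outs = W_experts @ x                            # (n_experts, d_out)
y = gate @ expert_outs                                 # gated blend of experts
```

A sparse variant would zero all but the top-k gate entries before blending, so only the selected experts are evaluated.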
Heart disease, a major cause of death worldwide, accounts for millions of deaths each year. This makes it critical to detect heart disease at an earlier stage so that a treatment plan, including medications and counseling, can be started. Machine learning (ML) algorithms trained on large datasets have made it possible to predict heart disease more effectively. Traditional machine learning approaches provide statistical correlations but often lack explicit integration of clinical knowledge, which limits their usefulness in real-world scenarios. This paper investigates the use of a Large Language Model (LLM) combined with Retrieval-Augmented Generation (RAG) to derive clinically grounded feature relevance from medical guidelines. A curated corpus of medical guidelines and practice protocols from internationally approved organizations was used to build the RAG pipeline. The features were ranked by the RAG-powered LLM, and the most important features were selected and used in a Support Vector Machine (SVM) with a custom kernel. A custom formulation combining linear and nonlinear functions was explored as an auxiliary modeling component. This enables the model to retain the clinical importance of the features and the transparency of a linear term while capturing complex interactions through a polynomial function. The approach is evaluated on the UCI Heart Disease dataset, which includes data from Cleveland, Hungary, Switzerland, and the VA Medical Center in Long Beach. The study was conducted in two parts: one using the Cleveland data alone and one using the full dataset covering all four regions. This integration of statistical learning with LLM-driven reasoning supports cardiovascular risk assessment in a clinically informed manner and helps identify clinically relevant features for the learning process.
On the Cleveland dataset, the model achieved an accuracy of 95%, an F1 score of 0.936, and an AUC-ROC of 0.973, although, given the small size of that dataset, performance was comparable to traditional models and to the unweighted kernel. When applied to the combined dataset covering the entire UCI collection, the model achieved an accuracy of 93.3%, an F1 score of 0.923, and an AUC-ROC of 0.961. Statistical testing showed that the weighted and unweighted kernels performed similarly, suggesting that the primary contribution arises from clinically guided feature selection rather than kernel weighting. The combination of statistical methods and LLM reasoning improves both the effectiveness and the clarity of predictions, supporting the development of clinically informed AI systems for cardiovascular risk assessment. The paper also includes a comparative study of logistic regression, decision tree, random forest, gradient boosting, and SVMs with RBF, sigmoid, linear, and polynomial kernels.
Title: Enhanced Heart disease prediction using LLM ranked feature selection, Dynamic custom Kernel. Authors: Nikesh P.L., Sebastian Terence, Anishin Raj, Jude Immaculate, Deepak Mishra. Pub Date: 2026-03-01 | DOI: 10.1016/j.mlwa.2026.100860. Machine Learning with Applications, vol. 23, Article 100860.
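The linear-plus-polynomial kernel idea can be sketched with scikit-learn's support for callable kernels. The 0.5/0.5 mix, the uniform feature weights, and the synthetic data below are assumptions for illustration, not the paper's LLM-derived weights or the UCI data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=120, n_features=5, random_state=0)
w = np.ones(X.shape[1])        # stand-in for guideline-derived feature weights

def mixed_kernel(A, B):
    """Gram matrix: 0.5 * feature-weighted linear + 0.5 * degree-2 polynomial."""
    lin = (A * w) @ (B * w).T
    poly = (A @ B.T + 1.0) ** 2
    return 0.5 * lin + 0.5 * poly

clf = SVC(kernel=mixed_kernel).fit(X, y)
train_acc = clf.score(X, y)
```

The weighted linear term keeps per-feature relevance inspectable, while the polynomial term captures pairwise feature interactions.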
Pub Date: 2026-03-01 | Epub Date: 2025-11-25 | DOI: 10.1016/j.mlwa.2025.100797
Nicholas Maurer, Mohammed Abdallah
This work presents a novel adaptive framework for soft error mitigation in space-based systems, designed to resolve the fundamental conflict between system performance and radiation protection. By leveraging a Long Short-Term Memory (LSTM) model to predict real-time solar particle flux, our approach dynamically enables or disables software-based mitigation techniques. This contrasts with the static, "always-on" methods of existing systems, offering a significant improvement in computational efficiency. The proposed LSTM model was trained on NASA solar particle flux data, achieving a mean average error of 7.65e-6, demonstrating its high accuracy in predicting nonlinear particle events. Our simulation, which applies this predictive model to a tiered system of redundant processing, checkpointing, and watchdog timers, shows a substantial reduction in overhead. During the 18,414-second test period, the combined adaptive mitigation methods introduced only 20.75–51.6 s of overhead, representing a 99.4% reduction in overhead compared to continuous, static mitigation. This research's primary contribution is a demonstrated proof-of-concept for an intelligent, self-adaptive system that can maintain high reliability while drastically improving performance. This approach provides a pathway for utilizing more cost-effective commercial-off-the-shelf (COTS) processors in radiation-intensive environments.
Title: Machine learning based adaptive soft error mitigation efficiency. Machine Learning with Applications, vol. 23, Article 100797.
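The adaptive on/off policy the abstract describes reduces, at its core, to thresholding a predicted flux series and paying mitigation overhead only in the flagged intervals. The flux values, threshold, and per-interval cost below are made up for illustration; the paper's LSTM predictor is replaced by a fixed list.

```python
# Stand-in for LSTM flux predictions, one value per scheduling interval.
predicted_flux = [1e-7, 2e-7, 5e-6, 8e-6, 3e-7, 1e-7]
THRESHOLD = 1e-6              # hypothetical flux level that triggers mitigation
MITIGATION_COST_S = 10.0      # hypothetical per-interval overhead when enabled

# Enable redundancy/checkpointing only when predicted flux is high.
enabled = [f >= THRESHOLD for f in predicted_flux]
adaptive_overhead = MITIGATION_COST_S * sum(enabled)
static_overhead = MITIGATION_COST_S * len(predicted_flux)   # "always-on" baseline
savings = 1.0 - adaptive_overhead / static_overhead
```

With only two of six intervals flagged, the adaptive policy pays a third of the static overhead; the paper's 99.4% figure corresponds to far rarer high-flux intervals.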
Pub Date: 2026-03-01 | Epub Date: 2025-11-26 | DOI: 10.1016/j.mlwa.2025.100800
Ali Asghari
As an unsupervised learning method, clustering is a critical technique in artificial intelligence for organizing raw data into meaningful groups. In this process, data is partitioned so that members of the same cluster are internally similar while remaining maximally distant from other clusters. Clustering has been widely applied across disciplines, including business analytics, healthcare, and economics. Extracting practical knowledge from large datasets relies on an effective clustering technique. Processing speed, especially for large datasets, handling noisy data and outliers, and ensuring high accuracy are the main challenges in clustering. These problems are especially significant in contemporary applications, where heterogeneous and inherently noisy datasets are prevalent. The proposed approach, TQC (Tree-Queue Clustering), addresses these problems by combining the Trees Social Relation (TSR) algorithm with the Queue Learning (QL) algorithm. The TSR method focuses on accelerating clustering, while the QL algorithm enhances clustering accuracy. The approach first divides the data into smaller groups; then, by efficiently computing group memberships, TSR's migration process grows the clusters progressively. By handling noise and outliers, the QL algorithm avoids local optima and improves clustering efficiency. This hybrid approach ensures the formation of high-quality clusters and accelerates convergence. The method is validated on several real-world datasets of varying sizes and properties. Experimental results, evaluated using five performance metrics — MICD, ARI, NMI, ET, and ODR — and compared with eight state-of-the-art algorithms, demonstrate the proposed method's superior performance in both speed and accuracy.
Title: TQC: An intelligent clustering approach for large-scale, noisy, and imbalanced data. Machine Learning with Applications, vol. 23, Article 100800.
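Two of the metrics named in the abstract, ARI and NMI, are available directly in scikit-learn; a toy illustration on invented labelings (unrelated to the paper's datasets or the TQC algorithm):

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

truth = [0, 0, 0, 1, 1, 1]
pred_perfect = [1, 1, 1, 0, 0, 0]   # same partition as truth, just relabeled
pred_poor = [0, 1, 0, 1, 0, 1]      # partition unrelated to truth

# Both metrics are invariant to label permutation, so a relabeled perfect
# clustering still scores 1.0.
ari_perfect = adjusted_rand_score(truth, pred_perfect)
nmi_perfect = normalized_mutual_info_score(truth, pred_perfect)
ari_poor = adjusted_rand_score(truth, pred_poor)
```

ARI is chance-adjusted (random labelings score near 0, possibly negative), which makes it the stricter of the two on imbalanced data.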
Pub Date : 2026-03-01Epub Date: 2025-12-30DOI: 10.1016/j.mlwa.2025.100833
Chandramohan Abhishek , Nadimpalli Raghukiran
The present research showcases a machine-interactive approach for making decisions using a pre-trained natural language processing (NLP) model. The method is developed for 4D (4-dimensional) printing technique selection, as a plurality of variables is involved, such as process, material, design, and sequence selections. Due to the availability of numerous options, arriving at a preferred choice of technique requires expertise and time. The developed method provides this assistance from a single source. The approach incorporates bidirectional encoder representations from transformers (BERT), which accommodates parallel meanings of user requests, such as synonyms and adjectives, among others. The closed-loop system is programmed with a set of 7 prompts. It also introduces additional affirmation prompts to handle both ambiguous phrasing and out-of-scope requests in order to receive a meaningful recommendation from the machine. The rule-governed technique (a lightweight rule set) guides the selection of the conforming request during each prompt. The inference-based approach takes user requests, performs objective classification using BERT according to selected criteria, then dynamically filters the data, and recommends suggestions, with an inference time of 0.79 s. The modified model also establishes multi-level relationships among prompts for text classification. k-fold validation reached the highest possible accuracy upon training with optimal hyperparameters. The fine-tuned method developed in a Python environment can be generalized to other systems. The present research demonstrates the possibility of adapting an openly accessible model for developing a decision-assistance system with minimal personal computational resources.
{"title":"Machine-interactive decision-assistance using a pre-trained natural language processing model for 4D printing technique selection","authors":"Chandramohan Abhishek , Nadimpalli Raghukiran","doi":"10.1016/j.mlwa.2025.100833","DOIUrl":"10.1016/j.mlwa.2025.100833","url":null,"abstract":"<div><div>The present research showcases a machine-interactive approach for making decisions using a pre-trained natural language processing (NLP) model. The method is developed for 4D (4-dimensional) printing technique selection, as a plurality of variables is involved, such as process, material, design, and sequence selections. Due to the availability of numerous options, arriving at a preferred choice of technique requires expertise and time. The developed method provides this assistance from a single source. The approach incorporates bidirectional encoder representations from transformers (BERT), which accommodates parallel meanings of user requests, such as synonyms and adjectives, among others. The closed-loop system is programmed with a set of 7 prompts. It also introduces additional affirmation prompts to handle both ambiguous phrasing and out-of-scope requests in order to receive a meaningful recommendation from the machine. The rule-governed technique (a lightweight rule set) guides the selection of the conforming request during each prompt. The inference-based approach takes user requests, performs objective classification using BERT according to selected criteria, then dynamically filters the data, and recommends suggestions, with an inference time of 0.79 s. The modified model also establishes multi-level relationships among prompts for text classification. k-fold validation reached the highest possible accuracy upon training with optimal hyperparameters. The fine-tuned method developed in a Python environment can be generalized to other systems. 
The present research demonstrates the possibility of adapting an openly accessible model for developing a decision-assistance system with minimal personal computational resources.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100833"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145925172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
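The classify-then-filter stage described above (BERT assigns a request to criteria, a lightweight rule set then narrows the candidate techniques) can be sketched as follows. This is a hypothetical illustration only: the technique names, attributes, and the mocked classification output are not from the paper, which uses a fine-tuned BERT model rather than a hand-built dict.

```python
# Hypothetical sketch of the rule-governed filtering stage: once a user
# request has been classified into criteria (mocked here as a plain dict,
# standing in for BERT's output), candidates are filtered rule by rule.
# All technique names and attributes below are illustrative.
CANDIDATES = [
    {"technique": "FDM-SMP",      "material": "polymer",   "stimulus": "heat"},
    {"technique": "DIW-hydrogel", "material": "hydrogel",  "stimulus": "moisture"},
    {"technique": "SLA-LCE",      "material": "elastomer", "stimulus": "light"},
]

def filter_candidates(criteria, candidates=CANDIDATES):
    """Keep candidates matching every specified criterion; unset (None) keys are ignored."""
    return [c for c in candidates
            if all(c.get(k) == v for k, v in criteria.items() if v is not None)]

# Example: a request classified as wanting a heat-responsive polymer process.
print(filter_candidates({"material": "polymer", "stimulus": "heat"}))
```

In the paper's closed-loop design, an empty result at any prompt would presumably trigger the affirmation prompts mentioned above rather than returning nothing.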
Conducting prior patent searches before developing technologies and filing patent applications in companies or universities is essential for understanding technological trends among competitors and academic institutions, as well as for increasing the likelihood of obtaining patent rights. In these searches, it is important not only to include relevant keywords in the search queries but also to incorporate related terms retrieved from a thesaurus. To support this, methods using word embeddings for automatically extracting such synonyms have recently been proposed. However, patent documents often contain unique expressions and compound terms, such as specialized technical terminology and abstract conceptual terms, which are difficult to accurately capture using existing large language models trained at the token level.
In this study, we investigate a method for extracting synonyms from patent documents by embedding the definition sentences that explain technical terms. The experimental results demonstrate that the proposed method achieves more precise synonym extraction than conventional word embedding approaches, and it can contribute to the expansion of existing thesauri.
Thus, this research is expected to improve the recall of prior art searches and support the automatic extraction of technical elements for identifying technological trends.
{"title":"Synonym extraction from Japanese patent documents using term definition sentences","authors":"Koji Marusaki , Seiya Kawano , Asahi Hentona , Hirofumi Nonaka","doi":"10.1016/j.mlwa.2026.100848","DOIUrl":"10.1016/j.mlwa.2026.100848","url":null,"abstract":"<div><div>Conducting prior patent searches before developing technologies and filing patent applications in companies or universities is essential for understanding technological trends among competitors and academic institutions, as well as for increasing the likelihood of obtaining patent rights. In these searches, it is important not only to include relevant keywords in the search queries but also to incorporate related terms retrieved from a thesaurus. To support this, methods using word embeddings for automatically extracting such synonyms have recently been proposed. However, patent documents often contain unique expressions and compound terms, such as specialized technical terminology and abstract conceptual terms, which are difficult to accurately capture using existing large language models trained at the token level.</div><div>In this study, we investigate a method for extracting synonyms from patent documents by embedding the definition sentences that explain technical terms. 
The experimental results demonstrate that the proposed method achieves more precise synonym extraction than conventional word embedding approaches, and it can contribute to the expansion of existing thesauri.</div><div>Thus, this research is expected to improve the recall of prior art searches and support the automatic extraction of technical elements for identifying technological trends.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100848"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146077087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
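The core idea of the paper above, comparing terms by embedding their definition sentences rather than the terms themselves, can be sketched with a toy example. A bag-of-words vector stands in for the sentence encoder actually used, and the terms and definitions below are illustrative, not from the paper's Japanese patent corpus:

```python
# Toy sketch: rank candidate synonyms by cosine similarity of their
# *definition sentence* embeddings. A bag-of-words Counter stands in for
# a real sentence encoder; terms and definitions are illustrative.
import math
from collections import Counter

def embed(sentence):
    return Counter(sentence.lower().replace(",", "").split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

definitions = {
    "latch circuit": "a circuit that holds one bit of state until reset",
    "flip-flop":     "a circuit that holds one bit of state, changing on a clock edge",
    "rectifier":     "a device that converts alternating current to direct current",
}

def nearest_synonym(term):
    query = embed(definitions[term])
    others = [(t, cosine(query, embed(d))) for t, d in definitions.items() if t != term]
    return max(others, key=lambda ts: ts[1])[0]

print(nearest_synonym("latch circuit"))  # → flip-flop
```

The point the toy makes is the paper's motivation: "latch circuit" and "flip-flop" share almost no surface form, so token-level embeddings of the terms alone would miss the relation, while their definitions overlap heavily.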
Pub Date : 2026-03-01Epub Date: 2025-12-29DOI: 10.1016/j.mlwa.2025.100828
Jungmin Eom , Minjun Kang , Myungkeun Yoon , Nikil Dutt , Jinkyu Kim , Jaekoo Lee
Deep learning-based medical AI systems are increasingly deployed for disease diagnosis in decentralized healthcare environments where data are siloed across hospitals and IoT devices and cannot be freely shared due to strict privacy and security regulations. However, most existing continual learning and distributed learning approaches either assume centrally aggregated data or overlook incremental clinical changes, leading to catastrophic forgetting when applied to real-world medical data streams.
This paper introduces a novel healthcare-specific framework that integrates continual learning and distributed learning methods to utilize medical AI models effectively by addressing the practical constraints of the healthcare and medical ecosystem, such as data privacy, security, and changing clinical environments. Through the proposed framework, medical clients, such as hospital devices and IoT-based smart devices, can collaboratively train deep learning-based models on distributed computing resources without sharing sensitive data. Additionally, by considering incremental characteristics in medical environments such as mutations, new diseases, and abnormalities, the proposed framework can improve the disease diagnosis of medical AI models in actual clinical scenarios.
We propose Privacy-preserving Rehearsal-based Continual Split Learning (PRCSL), a healthcare-specific continual split learning framework that combines differential-privacy-based exemplar sharing, a mutual information alignment (MIA) module to correct representation shifts induced by noisy exemplars, and a parameter-free nearest-mean-of-exemplars (NME) classifier to mitigate task-recency bias under non-IID data distributions. Across eight benchmark datasets, including four MedMNIST subsets, HAM10000, CCH5000, CIFAR-100, and SVHN, PRCSL achieves competitive performance compared with representative continual learning baselines in terms of average accuracy and average forgetting. In particular, PRCSL achieves up to 3.62 percentage points higher average accuracy than the best baseline. These results indicate that PRCSL enables privacy-preserving, communication-efficient, and continually adaptable medical AI in realistic decentralized clinical and IoT-enabled ecosystems. Our code is publicly available at our repository.
{"title":"PRCSL: A privacy-preserving continual split learning framework for decentralized medical diagnosis","authors":"Jungmin Eom , Minjun Kang , Myungkeun Yoon , Nikil Dutt , Jinkyu Kim , Jaekoo Lee","doi":"10.1016/j.mlwa.2025.100828","DOIUrl":"10.1016/j.mlwa.2025.100828","url":null,"abstract":"<div><div>Deep learning-based medical AI systems are increasingly deployed for disease diagnosis in decentralized healthcare environments where data are siloed across hospitals and IoT devices and cannot be freely shared due to strict privacy and security regulations. However, most existing continual learning and distributed learning approaches either assume centrally aggregated data or overlook incremental clinical changes, leading to catastrophic forgetting when applied to real-world medical data streams.</div><div>This paper introduces a novel healthcare-specific framework that integrates continual learning and distributed learning methods to utilize medical AI models effectively by addressing the practical constraints of the healthcare and medical ecosystem, such as data privacy, security, and changing clinical environments. Through the proposed framework, medical clients, such as hospital devices and IoT-based smart devices, can collaboratively train deep learning-based models on distributed computing resources without sharing sensitive data. 
Additionally, by considering incremental characteristics in medical environments such as mutations, new diseases, and abnormalities, the proposed framework can improve the disease diagnosis of medical AI models in actual clinical scenarios.</div><div>We propose Privacy-preserving Rehearsal-based Continual Split Learning (PRCSL), a healthcare-specific continual split learning framework that combines differential-privacy-based exemplar sharing, a mutual information alignment (MIA) module to correct representation shifts induced by noisy exemplars, and a parameter-free nearest-mean-of-exemplars (NME) classifier to mitigate task-recency bias under non-IID data distributions. Across eight benchmark datasets, including four MedMNIST subsets, HAM10000, CCH5000, CIFAR-100, and SVHN, PRCSL achieves competitive performance compared with representative continual learning baselines in terms of average accuracy and average forgetting. In particular, PRCSL achieves up to 3.62 percentage points higher average accuracy than the best baseline. These results indicate that PRCSL enables privacy-preserving, communication-efficient, and continually adaptable medical AI in realistic decentralized clinical and IoT-enabled ecosystems. Our code is publicly available at our repository.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100828"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145976863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
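The parameter-free nearest-mean-of-exemplars (NME) classifier named in the PRCSL abstract admits a compact sketch: each class is represented by the mean of its stored exemplar embeddings, and a query is assigned to the class with the nearest mean. The toy below uses random vectors in place of learned embeddings and omits PRCSL's split-learning and differential-privacy machinery entirely:

```python
# Minimal sketch of a nearest-mean-of-exemplars (NME) classifier, the
# parameter-free component PRCSL uses to mitigate task-recency bias.
# Exemplar "embeddings" here are random toy vectors; the paper's
# split-learning and DP-based exemplar sharing are not reproduced.
import numpy as np

rng = np.random.default_rng(0)

# Toy exemplar embeddings: two classes clustered around different centers.
exemplars = {
    0: rng.normal(loc=0.0, scale=0.1, size=(5, 8)),
    1: rng.normal(loc=1.0, scale=0.1, size=(5, 8)),
}

# One mean vector per class; no trainable parameters.
class_means = {c: e.mean(axis=0) for c, e in exemplars.items()}

def nme_predict(x):
    """Assign x to the class whose exemplar mean is closest in Euclidean distance."""
    return min(class_means, key=lambda c: np.linalg.norm(x - class_means[c]))

query = np.full(8, 0.9)   # lies near class 1's center
print(nme_predict(query))
```

Because the decision rule depends only on stored class means, adding a new class never re-weights old ones, which is why NME-style classifiers are a common remedy for the recency bias of a trained linear head in class-incremental settings.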