
Machine Learning with Applications: Latest Publications

Comparing model-specific and model-agnostic features importance methods using machine learning with technical indicators: A NASDAQ sector-based study
IF 4.9 Pub Date: 2026-03-01 Epub Date: 2025-11-25 DOI: 10.1016/j.mlwa.2025.100799
Jeonghoe Lee, Lin Cai
Predicting stock prices is crucial for making informed investment decisions as stock markets significantly influence the global economy. Although previous studies have explored feature importance methods for stock price prediction, comprehensive comparisons of those methods have been limited. This study aims to provide a detailed comparison of different feature importance methods for selecting technical indicators to predict stock prices. Specifically, this research analyzed financial data from the 11 sectors of the NASDAQ. A moving window forecasting framework was implemented to dynamically capture the evolving patterns in financial markets over time. Model-specific feature importance methods were compared with model-agnostic approaches. Multiple machine learning algorithms, including Random Forest (RF) and Multi-layer Neural Networks (MNNs), were employed to forecast stock prices. Additionally, extensive hyperparameter tuning was conducted to improve model explainability, contributing to the field of Explainable Artificial Intelligence (XAI). The results highlight the predictive effectiveness of different feature importance methods in selecting optimal technical indicators, thereby offering valuable insights for enhancing stock price forecasting accuracy and model transparency. In summary, this research offers a comprehensive comparison of feature importance methods, emphasizing their application in the selection of technical indicators in a dynamic, rolling prediction setting.
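The distinction the title draws can be made concrete with scikit-learn: a fitted Random Forest exposes a model-specific (impurity-based) importance, while permutation importance is model-agnostic and works on any fitted estimator. A minimal sketch on synthetic regression data (the features here merely stand in for technical indicators; nothing below reproduces the paper's setup):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic stand-in for a table of technical indicators: 10 features,
# only 3 of which actually drive the target.
X, y = make_regression(n_samples=400, n_features=10, n_informative=3,
                       random_state=0)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Model-specific: impurity-based importances built into the fitted forest.
specific = rf.feature_importances_

# Model-agnostic: permutation importance, i.e. the drop in score when each
# feature is shuffled in turn; applicable to any fitted estimator.
agnostic = permutation_importance(rf, X, y, n_repeats=5,
                                  random_state=0).importances_mean

top_specific = np.argsort(specific)[-3:]
top_agnostic = np.argsort(agnostic)[-3:]
```

On clean synthetic data the two rankings tend to agree; the paper's question is precisely when and how they diverge on real market data.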
Citations: 0
A hybrid DEA–fuzzy clustering approach for accurate reference set identification
IF 4.9 Pub Date: 2026-03-01 Epub Date: 2025-12-09 DOI: 10.1016/j.mlwa.2025.100818
Sara Fanati Rashidi, Maryam Olfati, Seyedali Mirjalili, Crina Grosan, Jan Platoš, Vaclav Snášel
This study integrates Data Envelopment Analysis (DEA) with Machine Learning (ML) to address key limitations of traditional DEA in identifying reference sets for inefficient Decision-Making Units (DMUs). In DEA, inefficient units are evaluated against benchmark units; however, some benchmarks may be inappropriate or even outliers, which can distort the efficiency frontier. Moreover, when a new DMU is added, the entire model must be recalculated, resulting in high computational costs for large datasets. To overcome these issues, we propose a hybrid approach that combines Fuzzy C-Means (FCM) and Possibilistic Fuzzy C-Means (PFCM) clustering. By leveraging Euclidean distance and membership degrees, the method identifies closer and more relevant reference units, while a sensitivity threshold is introduced to control the number of benchmarks according to practical requirements. The effectiveness of the proposed method is validated on two datasets: a banking dataset and a banknote authentication dataset with 1,372 samples. Results show that the reference sets derived from this ML-based framework achieve 71.6%–98.3% agreement with DEA, while overcoming two major drawbacks: (1) sensitivity to dataset size and (2) inclusion of inappropriate reference units. Furthermore, statistical analyses, including confidence intervals and McNemar’s test, confirm the robustness and practical significance of the findings.
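The Fuzzy C-Means step at the core of the hybrid can be sketched in a few lines of NumPy: alternate between membership-weighted center updates and the standard inverse-distance membership update. This is a minimal textbook FCM, not the paper's full DEA–FCM–PFCM pipeline; the data and parameters are illustrative.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    """Minimal Fuzzy C-Means: returns centers and membership matrix U.

    U[i, k] is the degree to which sample i belongs to cluster k;
    each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m                                   # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # Standard FCM membership update: u_ik ∝ d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

# Two well-separated blobs as toy data.
X = np.vstack([np.random.default_rng(1).normal(0, 0.2, (50, 2)),
               np.random.default_rng(2).normal(5, 0.2, (50, 2))])
centers, U = fuzzy_c_means(X, c=2)
```

Hard labels follow from `U.argmax(axis=1)`; the graded memberships themselves are what a method like the one proposed can exploit to pick closer, more relevant reference units.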
Citations: 0
Comparison of input selection methods for neural networks applied to complex fluid dynamic inverse problem
IF 4.9 Pub Date: 2026-03-01 Epub Date: 2026-01-12 DOI: 10.1016/j.mlwa.2026.100842
Jaume Luis-Gómez, Guillem Monrós-Andreu, Sergio Iserte, Sergio Chiva, Raúl Martínez-Cuenca
Efficient identification of informative inputs is critical when training Machine Learning (ML) surrogates on large, multi-sensor datasets. In this paper, we benchmark several input selection methods from the literature alongside new methods proposed here. A baseline method based on expert-driven (human) selection is used as a reference. All methods are evaluated on a challenging inverse problem, in which Computational Fluid Dynamics (CFD) simulations are used to train a Deep Neural Network (DNN) to infer unknown momentum source terms from discrete velocity measurements. The proposed methodology does not explicitly depend on the geometry of the domain and is therefore transferable to other problems involving sparse sensor measurements, although domain-specific validation may still be required. The results show that four input selection methods reduce the number of inputs to as few as five, with minimal impact on the mean average predictive error. This corresponds to a forty-fold reduction relative to the original number of inputs. Analysis of the top four inputs shows that each method selects different locations, indicating that multiple combinations can yield similar accurate results. The top four methods significantly outperform the baseline method based on human selection. This study demonstrates that input selection methods reduce computational costs during both training and inference stages. They also lower experimental demands by identifying high-value sensor locations, thereby reducing the number of required sampling points. These findings suggest that input selection methods should be considered standard practice in ML applications with complex scenarios constrained by limited experimental data.
Citations: 0
From tensors to novelties: Low-dimensional representations for anomaly detection in multispectral imagery
IF 4.9 Pub Date: 2026-03-01 Epub Date: 2026-01-31 DOI: 10.1016/j.mlwa.2026.100858
Anthony Chan Chan
Anomaly detection in multispectral imagery must cope with high-dimensional inputs, scarce labeled anomalies and operational constraints. Tensor decompositions offer a structured way to compress such data, but their impact on anomaly detection performance and cost is not well quantified. This work studies how low-rank tensor representations affect one-class detectors on multispectral imagery.
Two decomposition strategies are evaluated as feature extractors: a global CP (PARAFAC) model fitted on training tiles and a per-tile Tucker model. The resulting coefficients are used to train one-class support vector machines, autoencoders and isolation forests on typical examples only, and anomalies are identified through their detector scores. The study uses multispectral Mastcam images from the Mars Science Laboratory Curiosity rover, a dataset with rare labeled novelties.
Experiments cover two evaluation regimes (randomized across subclasses and subclass-specific) and four training sizes n ∈ {500, 1000, 1500, 3000}. Under randomized sampling, CP and Tucker improve ROC–AUC over PCA for OC-SVM by approximately 10 to 15 percent at n ≤ 1500, while autoencoder gains span approximately 2 to 13 percent depending on decomposition and sample size. In subclass-specific tests for structured subclasses (drill-hole, DRT, dump-pile), CP and Tucker yield larger improvements at n ≤ 1500, with absolute ROC–AUC increases over PCA ranging from approximately 14 to 56 percent, whereas for visually homogeneous subclasses (bedrock, broken rock, float, veins) decompositions rarely improve over PCA and can reduce performance. Computationally, CP can require peak RSS memory above 50 GB, whereas Tucker often remains below 10 GB (8.67 GB vs 52.53 GB in the reported runs, an 84% reduction), albeit with longer runtimes. Overall, the results indicate that tensor decompositions are most valuable as selective enhancements to PCA in multispectral anomaly detection pipelines when multiway structure is informative and training data are limited.
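A per-tile Tucker-style compression can be sketched in plain NumPy via sequentially truncated HOSVD: unfold the tile along each mode, keep the top singular vectors, and project. This is an illustrative stand-in for the paper's per-tile Tucker model; the tile size and ranks below are arbitrary, and the flattened core would then feed a one-class detector.

```python
import numpy as np

def hosvd_coefficients(tile, ranks):
    """Compress a tile (H x W x bands) to a small Tucker core via
    sequentially truncated HOSVD; returns the flattened core."""
    core = tile
    for mode, r in enumerate(ranks):
        # Unfold along `mode` and take the top-r left singular vectors.
        unfolded = np.moveaxis(core, mode, 0).reshape(core.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
        U_r = U[:, :r]
        # Project the tensor onto the truncated factor along this mode.
        core = np.moveaxis(
            np.tensordot(U_r.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core.ravel()

# Toy multispectral tile: 16 x 16 pixels, 6 bands -> 48 coefficients.
tile = np.random.default_rng(0).random((16, 16, 6))
coeffs = hosvd_coefficients(tile, ranks=(4, 4, 3))
```

Here 1536 raw values compress to 48 coefficients; in the paper's pipeline such low-dimensional representations are the inputs to OC-SVMs, autoencoders, and isolation forests.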
Citations: 0
Machine learning based adaptive soft error mitigation efficiency
IF 4.9 Pub Date: 2026-03-01 Epub Date: 2025-11-25 DOI: 10.1016/j.mlwa.2025.100797
Nicholas Maurer, Mohammed Abdallah
This work presents a novel adaptive framework for soft error mitigation in space-based systems, designed to resolve the fundamental conflict between system performance and radiation protection. By leveraging a Long Short-Term Memory (LSTM) model to predict real-time solar particle flux, our approach dynamically enables or disables software-based mitigation techniques. This contrasts with the static, "always-on" methods of existing systems, offering a significant improvement in computational efficiency. The proposed LSTM model was trained on NASA solar particle flux data, achieving a mean average error of 7.65e-6, demonstrating its high accuracy in predicting nonlinear particle events. Our simulation, which applies this predictive model to a tiered system of redundant processing, checkpointing, and watchdog timers, shows a substantial reduction in overhead. During the 18,414-second test period, the combined adaptive mitigation methods introduced only 20.75–51.6 s of overhead, representing a 99.4 % reduction in overhead compared to continuous, static mitigation. This research's primary contribution is a demonstrated proof-of-concept for an intelligent, self-adaptive system that can maintain high reliability while drastically improving performance. This approach provides a pathway for utilizing more cost-effective commercial-off-the-shelf (COTS) processors in radiation-intensive environments.
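The gating idea, paying mitigation overhead only on steps where the predicted particle flux crosses a threshold, can be illustrated with a toy overhead calculation. The threshold, per-step cost, and flux trace below are made up for illustration and are not taken from the paper.

```python
def mitigation_overhead(predicted_flux, threshold, per_step_cost):
    """Adaptive gating: pay the mitigation cost only on steps where the
    predicted flux meets or exceeds the threshold; compare against an
    always-on (static) baseline that pays on every step."""
    adaptive = sum(per_step_cost for f in predicted_flux if f >= threshold)
    static = per_step_cost * len(predicted_flux)
    return adaptive, static

# Quiet background with one short solar-event burst (toy trace).
flux = [0.1] * 95 + [5.0] * 5
adaptive, static = mitigation_overhead(flux, threshold=1.0, per_step_cost=0.5)
```

For this toy trace the adaptive scheme pays 2.5 units of overhead against 50 for always-on mitigation, a 95% reduction; the paper reports the same effect at scale, with the LSTM flux forecast supplying `predicted_flux`.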
Citations: 0
TQC: An intelligent clustering approach for large-scale, noisy, and imbalanced data
IF 4.9 Pub Date: 2026-03-01 Epub Date: 2025-11-26 DOI: 10.1016/j.mlwa.2025.100800
Ali Asghari
As an unsupervised learning method, clustering is a critical technique in artificial intelligence for organizing raw data into meaningful groups. In this process, data is partitioned based on the internal similarity of members within the same cluster and the maximum external distance from other clusters. Beyond business analytics, healthcare, economics, and other fields, clustering has been widely applied across disciplines. Extracting practical knowledge from large datasets relies on an effective clustering technique. Processing speed, especially for large datasets, handling noisy data and outliers, and ensuring high accuracy are the main challenges in clustering. These problems are especially significant in contemporary applications, where heterogeneous and inherently noisy datasets are prevalent. Combining the Trees Social Relation Algorithm (TSR) with the Queue Learning (QL) algorithm, the proposed approach, TQC (Tree-Queue Clustering), addresses these problems. While the QL algorithm enhances clustering accuracy, the TSR method focuses on accelerating clustering. The suggested approach first divides the data into smaller groups. Then, by effectively computing group memberships, TSR's migration process causes clusters to develop progressively. Handling noise and outliers helps the QL algorithm prevent local optima and improve clustering efficiency. This hybrid approach ensures the formation of high-quality clusters and accelerates convergence. The suggested method is validated across several real-world datasets of varying sizes and properties. Experimental results, evaluated using five performance metrics — MICD, ARI, NMI, ET, and ODR — and compared with eight state-of-the-art algorithms, demonstrate the proposed method's superior performance in both speed and accuracy.
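Two of the five reported evaluation metrics, ARI and NMI, are standard external clustering measures available in scikit-learn. The toy check below (labels are illustrative, not the paper's data) shows why they are useful for comparing algorithms: both are invariant to relabeling and score 1.0 when the predicted partition matches the ground truth up to a permutation of cluster IDs.

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Ground truth vs a prediction that is identical up to a relabeling
# of cluster IDs (0->1, 1->2, 2->0).
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pred  = [1, 1, 1, 2, 2, 2, 0, 0, 0]

ari = adjusted_rand_score(truth, pred)
nmi = normalized_mutual_info_score(truth, pred)
```

This invariance is what makes such metrics suitable for benchmarking a method like TQC against competing algorithms across datasets of varying size and noise.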
Citations: 0
ExpressNet-MoE: A hybrid deep neural network for emotion recognition
IF 4.9 Pub Date: 2026-03-01 Epub Date: 2026-01-02 DOI: 10.1016/j.mlwa.2025.100830
Deeptimaan Banerjee, Prateek Gothwal, Ashis Kumer Biswas
In many domains, including online education, healthcare, security, and human–computer interaction, facial emotion recognition (FER) is essential. Real-world FER is still difficult because of factors like head positions, occlusions, illumination shifts, and demographic diversity. Engagement detection systems, which are essential in virtual learning platforms, are severely challenged by these factors. In this article, we propose ExpressNet-MoE, a novel hybrid deep learning architecture that combines Convolutional Neural Networks (CNNs) with a Mixture of Experts (MoE) framework to address these challenges. The proposed model dynamically selects the most relevant expert networks for each input, thereby improving generalization and adaptability across diverse datasets. Our methodology involves training ExpressNet-MoE independently on several benchmark datasets after preprocessing facial images using BlazeFace for face detection and alignment. To maintain class distribution, stratified sampling is used to divide each dataset into training and testing groups. Our model improves the accuracy of emotion recognition by utilizing multi-scale feature extraction to capture both global and local facial features. ExpressNet-MoE includes numerous CNN-based feature extractors, a MoE module for adaptive feature selection, and finally a residual network backbone for deep feature learning. To demonstrate the efficacy of our proposed model, we evaluated it on four widely used datasets: AffectNet7, AffectNet8, RAF-DB, and FER-2013, and compared it with current state-of-the-art methods. Our model achieves accuracies of 74.40% ± 0.45 on AffectNet7, 71.98% ± 0.66 on AffectNet8, 83.41% ± 1.06 on RAF-DB, and 67.05% ± 2.08 on FER-2013.
Overall, the findings indicate that adaptive expert selection and multi-scale feature extraction significantly enhance the robustness of facial emotion recognition across diverse real-world conditions, and they suggest how the model may be used to develop end-to-end emotion recognition systems in practical settings. Reproducible code and results are publicly accessible at https://github.com/DeeptimaanB/ExpressNet-MoE.
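As a rough illustration of the adaptive expert selection this abstract describes, the sketch below shows generic softmax gating over a pool of experts. The dimensions, the linear "experts", and the function names are illustrative assumptions, not the authors' architecture:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_forward(x, gate_w, experts):
    """Weight each expert's output by a softmax gate computed from the input.

    x: (d,) feature vector; gate_w: (n_experts, d); experts: callables d -> d_out.
    """
    gate = softmax(gate_w @ x)                   # (n_experts,) routing weights
    outputs = np.stack([f(x) for f in experts])  # (n_experts, d_out)
    return gate @ outputs                        # gate-weighted combination

rng = np.random.default_rng(0)
# Two toy linear "experts" mapping 3-dim features to 2-dim outputs
experts = [lambda x, W=rng.normal(size=(2, 3)): W @ x for _ in range(2)]
gate_w = rng.normal(size=(2, 3))
y = moe_forward(np.array([1.0, -0.5, 0.2]), gate_w, experts)
print(y.shape)
```

In a trained MoE, `gate_w` is learned jointly with the experts so that inputs are routed to the specialists best suited to them.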
Machine learning with applications, vol. 23, Article 100830.
Citations: 0
Machine-interactive decision-assistance using a pre-trained natural language processing model for 4D printing technique selection
IF 4.9 Pub Date : 2026-03-01 Epub Date: 2025-12-30 DOI: 10.1016/j.mlwa.2025.100833
Chandramohan Abhishek , Nadimpalli Raghukiran
The present research showcases a machine-interactive approach for making decisions using a pre-trained natural language processing (NLP) model. The method is developed for 4D (four-dimensional) printing technique selection, as a plurality of variables is involved, such as process, material, design, and sequence selections. Because numerous options are available, arriving at a preferred choice of technique requires expertise and time. The developed method provides this assistance from a single source. The approach incorporates bidirectional encoder representations from transformers (BERT), which accommodates parallel meanings of user requests, such as synonyms and adjectives, among others. The closed-loop system is programmed with a set of 7 prompts. It also introduces additional affirmation prompts to navigate both ambiguous phrasing and out-of-scope detection in order to receive a meaningful recommendation from the machine. The rule-governed technique (a lightweight rule set) guides the selection of the conformable request during each prompt. The inference-based approach takes user requests, performs objective classification using BERT according to selected criteria, then dynamically filters the data and recommends suggestions, with an inference time of 0.79 s. The modified model also establishes multi-level relationships among prompts for text classification. k-fold validation reached its highest accuracy when trained with optimal hyperparameters. The fine-tuned method, developed in a Python environment, can be generalized to other systems. The present research demonstrates the possibility of adapting an openly accessible model to develop a decision-assistance system with minimal personal computational resources.
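The core loop the abstract describes — classify a free-text request against a prompt's options, then filter accordingly — can be sketched as below. The option texts are hypothetical, and a deterministic bag-of-words vector stands in for the fine-tuned BERT sentence encoder the paper actually uses:

```python
import numpy as np

# Hypothetical options for one prompt of a technique-selection dialogue;
# in the paper's pipeline a fine-tuned BERT encoder would replace bow() below.
OPTIONS = {
    "material": "select a shape memory polymer material",
    "process":  "select a fused deposition modeling print process",
}

VOCAB = {w: i for i, w in enumerate(sorted(
    {t for desc in OPTIONS.values() for t in desc.split()}))}

def bow(text):
    # Deterministic bag-of-words vector as a stand-in sentence embedding
    v = np.zeros(len(VOCAB))
    for tok in text.lower().split():
        if tok in VOCAB:
            v[VOCAB[tok]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def classify_request(request):
    # Route the user's free-text request to the closest option by cosine similarity
    q = bow(request)
    return max(OPTIONS, key=lambda name: float(bow(OPTIONS[name]) @ q))

choice = classify_request("which polymer material should I use")
print(choice)
```

A real implementation would add the affirmation prompts and out-of-scope detection described above on top of this routing step.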
Machine learning with applications, vol. 23, Article 100833.
Citations: 0
Synonym extraction from Japanese patent documents using term definition sentences
IF 4.9 Pub Date : 2026-03-01 Epub Date: 2026-01-21 DOI: 10.1016/j.mlwa.2026.100848
Koji Marusaki , Seiya Kawano , Asahi Hentona , Hirofumi Nonaka
Conducting prior patent searches before developing technologies and filing patent applications in companies or universities is essential for understanding technological trends among competitors and academic institutions, as well as for increasing the likelihood of obtaining patent rights. In these searches, it is important not only to include relevant keywords in the search queries but also to incorporate related terms retrieved from a thesaurus. To support this, methods using word embeddings for automatically extracting such synonyms have recently been proposed. However, patent documents often contain unique expressions and compound terms, such as specialized technical terminology and abstract conceptual terms, which are difficult to accurately capture using existing large language models trained at the token level.
In this study, we investigate a method for extracting synonyms from patent documents by embedding the definition sentences that explain technical terms. The experimental results demonstrate that the proposed method achieves more precise synonym extraction than conventional word embedding approaches, and it can contribute to the expansion of existing thesauri.
Thus, this research is expected to improve the recall of prior art searches and support the automatic extraction of technical elements for identifying technological trends.
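The idea of comparing terms via embeddings of their definition sentences, rather than the terms themselves, can be sketched as follows. The toy definitions are invented, and a bag-of-words vector stands in for the sentence encoder the paper uses:

```python
import numpy as np

# Invented term definitions; a patent thesaurus would supply these sentences
definitions = {
    "CPU": "a processor that executes program instructions",
    "central processing unit": "a processor that executes program instructions of a computer",
    "GPU": "a processor specialized for parallel graphics computation",
}

vocab = {w: i for i, w in enumerate(sorted(
    {t for d in definitions.values() for t in d.lower().split()}))}

def bow(text):
    # Deterministic bag-of-words stand-in for a definition-sentence embedding
    v = np.zeros(len(vocab))
    for t in text.lower().split():
        if t in vocab:
            v[vocab[t]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

vecs = {term: bow(d) for term, d in definitions.items()}

def synonym_candidates(term, k=1):
    # Rank other terms by cosine similarity of their definition embeddings
    q = vecs[term]
    scored = sorted(((float(v @ q), t) for t, v in vecs.items() if t != term),
                    reverse=True)
    return [t for _, t in scored[:k]]

print(synonym_candidates("CPU"))
```

Because abbreviations and their expansions tend to share definitions even when the surface strings share no tokens, definition-level similarity can surface synonym pairs that token-level embeddings miss.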
Machine learning with applications, vol. 23, Article 100848.
Citations: 0
PRCSL: A privacy-preserving continual split learning framework for decentralized medical diagnosis
IF 4.9 Pub Date : 2026-03-01 Epub Date: 2025-12-29 DOI: 10.1016/j.mlwa.2025.100828
Jungmin Eom , Minjun Kang , Myungkeun Yoon , Nikil Dutt , Jinkyu Kim , Jaekoo Lee
Deep learning-based medical AI systems are increasingly deployed for disease diagnosis in decentralized healthcare environments where data are siloed across hospitals and IoT devices and cannot be freely shared due to strict privacy and security regulations. However, most existing continual learning and distributed learning approaches either assume centrally aggregated data or overlook incremental clinical changes, leading to catastrophic forgetting when applied to real-world medical data streams.
This paper introduces a novel healthcare-specific framework that integrates continual learning and distributed learning methods to utilize medical AI models effectively by addressing the practical constraints of the healthcare and medical ecosystem, such as data privacy, security, and changing clinical environments. Through the proposed framework, medical clients, such as hospital devices and IoT-based smart devices, can collaboratively train deep learning-based models on distributed computing resources without sharing sensitive data. Additionally, by considering incremental characteristics in medical environments such as mutations, new diseases, and abnormalities, the proposed framework can improve the disease diagnosis of medical AI models in actual clinical scenarios.
We propose Privacy-preserving Rehearsal-based Continual Split Learning (PRCSL), a healthcare-specific continual split learning framework that combines differential-privacy-based exemplar sharing, a mutual information alignment (MIA) module to correct representation shifts induced by noisy exemplars, and a parameter-free nearest-mean-of-exemplars (NME) classifier to mitigate task-recency bias under non-IID data distributions. Across eight benchmark datasets, including four MedMNIST subsets, HAM10000, CCH5000, CIFAR-100, and SVHN, PRCSL achieves competitive performance compared with representative continual learning baselines in terms of average accuracy and average forgetting. In particular, PRCSL achieves up to 3.62 percentage points higher average accuracy than the best baseline. These results indicate that PRCSL enables privacy-preserving, communication-efficient, and continually adaptable medical AI in realistic decentralized clinical and IoT-enabled ecosystems. Our code is publicly available at our repository.
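Two of the ingredients named above — Gaussian noising of shared exemplars and the parameter-free nearest-mean-of-exemplars classifier — can be sketched with toy 2-D features. The noise scale and data are illustrative, not the authors' calibrated implementation:

```python
import numpy as np

def dp_exemplar(x, sigma=0.1, rng=None):
    """Gaussian-mechanism-style noising of an exemplar feature before sharing.
    sigma is illustrative; a real deployment calibrates it to an (eps, delta) budget."""
    rng = rng or np.random.default_rng(0)
    return x + rng.normal(scale=sigma, size=x.shape)

class NMEClassifier:
    """Nearest-mean-of-exemplars: classify by the closest class mean.
    No trained weights, so it avoids task-recency bias in the output layer."""
    def __init__(self):
        self.means = {}
    def fit(self, feats, labels):
        for c in set(labels):
            cls = np.stack([f for f, y in zip(feats, labels) if y == c])
            self.means[c] = cls.mean(axis=0)
    def predict(self, x):
        return min(self.means, key=lambda c: np.linalg.norm(x - self.means[c]))

# Toy exemplar features for two hypothetical diagnosis classes
feats = [np.array([0.0, 0.0]), np.array([0.2, 0.1]),
         np.array([5.0, 5.0]), np.array([4.8, 5.1])]
labels = ["benign", "benign", "lesion", "lesion"]
clf = NMEClassifier()
clf.fit(feats, labels)
print(clf.predict(np.array([4.5, 4.9])))
```

In the framework above, exemplars would be noised with `dp_exemplar` before leaving a client, and the class means would be computed over those shared, privacy-protected features.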
Machine learning with applications, vol. 23, Article 100828.
Citations: 0