
Machine learning with applications: latest articles

SAFE AI metrics: An integrated approach
IF 4.9 Pub Date: 2025-12-15 DOI: 10.1016/j.mlwa.2025.100821
Paolo Giudici, Vasily Kolesnikov
We contribute to the field of AI governance with the development of a unified compliance metric that integrates three key dimensions of SAFE Artificial Intelligence: Security, Accuracy, and Explainability. While these aspects are typically assessed in isolation, the proposed approach integrates them into a single, interpretable metric grounded in a consistent mathematical structure. To develop an integrated framework, the outputs of machine learning models are evaluated under three risk dimensions that correspond to different input data perturbations: data removal (for accuracy), data poisoning (for security), and feature removal (for explainability). Experiments with the methodology on both real and simulated datasets show that the integrated metric improves compliance monitoring and enables a consistent evaluation of AI risks.
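As a rough illustration of the perturbation-based evaluation this abstract describes, the sketch below scores a scikit-learn classifier under three simple perturbations (training-data removal, label-flip poisoning, and feature removal) and averages the normalized results into one number. The perturbation choices, the label-flip stand-in for poisoning, and the plain mean aggregation are assumptions for illustration only, not the authors' metric.

```python
import numpy as np
from sklearn.base import clone
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def fit_and_score(model, X_tr, y_tr, X_te, y_te):
    """Refit a fresh copy of the model and score it on the untouched test set."""
    m = clone(model).fit(X_tr, y_tr)
    return accuracy_score(y_te, m.predict(X_te))

def safe_style_metric(model, X_tr, y_tr, X_te, y_te, frac=0.2):
    """Illustrative (not the paper's exact) integration of three perturbation tests."""
    base = fit_and_score(model, X_tr, y_tr, X_te, y_te)
    n, d = X_tr.shape

    # Accuracy dimension: remove a random fraction of the training rows.
    keep = rng.choice(n, size=int(n * (1 - frac)), replace=False)
    s_removal = fit_and_score(model, X_tr[keep], y_tr[keep], X_te, y_te)

    # Security dimension: shuffle labels on a random fraction of rows (crude poisoning stand-in).
    y_pois = np.array(y_tr).copy()
    hit = rng.choice(n, size=int(n * frac), replace=False)
    y_pois[hit] = rng.permutation(y_pois[hit])
    s_poison = fit_and_score(model, X_tr, y_pois, X_te, y_te)

    # Explainability dimension: drop one feature at a time and average the effect.
    per_feature = [fit_and_score(model, np.delete(X_tr, j, axis=1), y_tr,
                                 np.delete(X_te, j, axis=1), y_te) for j in range(d)]
    s_feature = float(np.mean(per_feature))

    # Normalize each perturbed score by the unperturbed baseline and average.
    parts = np.clip(np.array([s_removal, s_poison, s_feature]) / base, 0.0, 1.0)
    return {"accuracy": float(parts[0]), "security": float(parts[1]),
            "explainability": float(parts[2]), "integrated": float(parts.mean())}
```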
Citations: 0
Adaptive multi-domain uncertainty quantification for digital twin water forecasting
IF 4.9 Pub Date: 2025-12-13 DOI: 10.1016/j.mlwa.2025.100812
Mohammadhossein Homaei, Mehran Tarif, Pablo García Rodríguez, Mar Ávila, Andrés Caro
Machine learning (ML) models are often used to predict demand in digital twins (DTs) of water distribution systems (WDS). However, most models do not provide uncertainty estimation, and this makes risk evaluation limited. In this work, we introduce the first systematic framework for hierarchical uncertainty transfer in regional water networks; until now, no such method existed for DTs of regional water systems. We propose Adaptive Multi-Village Conformal Prediction (AMV-CP), a method that preserves theoretical guarantees while allowing uncertainty information to be transferred between villages that are similar in structure but different in operation. The main ideas are: (i) village-adaptive conformity scores that capture local patterns, (ii) a meta-learning algorithm that reduces calibration cost by 88.6%, and (iii) regime-aware calibration that maintains 94.2% coverage across seasonal changes. We use eight years of data from six villages with 6174 users in one regional network. The results show a theoretical basis for cross-village transfer and 95.1% empirical coverage (target was 95%), at a real-time rate of 120 predictions per second. Early multi-step tests also show 93.7% coverage for 24-hour horizons, with controlled trade-offs. This framework is the first systematic method for controlled uncertainty transfer in infrastructure DTs, with theoretical guarantees under ϕ-mixing and practical deployment. Our multi-village tests demonstrate the value of meta-learning for uncertainty estimation and establish a base method that can be applied to other hierarchical infrastructure systems. The system is validated in a Mediterranean rural network, but generalization to other climates, urban settings, and cascading systems needs further empirical study.
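AMV-CP builds on conformal prediction, whose basic split-conformal step is easy to state: compute absolute-residual conformity scores on a held-out calibration set and widen point forecasts by a finite-sample-corrected quantile. The sketch below shows only this generic step under an assumed `predict` callable; the village-adaptive scores, meta-learned calibration, and regime awareness described above are not reproduced here.

```python
import numpy as np

def split_conformal_interval(predict, X_cal, y_cal, X_new, alpha=0.05):
    """Generic split-conformal prediction intervals with roughly (1 - alpha) coverage.

    predict : callable mapping an array of inputs to point forecasts.
    (AMV-CP adds village-adaptive scores and cross-village transfer on top of this.)
    """
    # Conformity scores on the held-out calibration set: absolute residuals.
    scores = np.abs(y_cal - predict(X_cal))

    # Finite-sample corrected quantile of the calibration scores.
    n = len(scores)
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, q_level, method="higher")

    # Symmetric interval around each new point forecast.
    y_hat = predict(X_new)
    return y_hat - q, y_hat + q

# Usage with any fitted regressor, e.g. model.predict from scikit-learn:
# lo, hi = split_conformal_interval(model.predict, X_cal, y_cal, X_new, alpha=0.05)
# empirical_coverage = np.mean((y_new >= lo) & (y_new <= hi))
```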
Citations: 0
DefMoN: A reproducible framework for theory-grounded synthetic data generation in affective AI
IF 4.9 Pub Date: 2025-12-13 DOI: 10.1016/j.mlwa.2025.100817
Ryan SangBaek Kim
Modern NLP systems excel when labels are abundant but struggle with high-inference constructs that are costly to annotate and risky to synthesize without constraints. We present the Defensive Motivational Node framework, henceforth DefMoN (formerly DMN), which operationalizes Vaillant's hierarchy of ego defenses and Plutchik's psychoevolutionary emotions into a controllable generative process for text. We release DMN-Syn v1.0 — a quadri-lingual (EN/KO/FR/KA) corpus of 300 theory-constrained utterances — together with a complete, versioned research compendium (data, code, seeds, QC manifests, and evaluation scripts) archived on Zenodo (Kim, 2025). The full package is permanently available at https://doi.org/10.5281/zenodo.17101927.
On the modeling side, we treat defense recognition as 10-way sentence classification and fine-tune a multilingual Transformer (XLM-R) only on DMN-Syn v1. In-domain performance is high (EN macro-F1 = 0.97, MCC = 0.96; KO macro-F1 = 0.96), and zero-shot transfer is strong (EN→KO macro-F1 = 0.81). When evaluated on a small, anonymized real-world benchmark, the model reaches macro-F1 = 0.62 with zero real training data, then rises to 0.76 with only k = 64 supervised examples per class. Human annotators on that same benchmark agree with each other at κ = 0.68, α = 0.66. This shows that DefMoN is not a turnkey classifier, but a theory-grounded primer that enables data-efficient alignment to a schema-coded human benchmark without large-scale annotation. We additionally quantify reliability—reporting ECE/MCE and coverage–performance curves for selective prediction—and show robustness under group-aware splits (template/scenario disjoint) and cue ablations, establishing structural coherence in the absence of large-scale human trials.
Beyond raw scores, we foreground auditability. Each instance in DMN-Syn v1 carries fixed seeds, grouped splits, and guardrails against label leakage and construct drift; validators, manifests, and code are released for byte-exact reproduction. The results support theory-constrained synthesis as a practical middle path between costly expert labeling and unconstrained LLM generation, particularly in low-resource and cross-lingual settings. By using psychological theory as an explicit generative constraint rather than a post-hoc explanation, DefMoN reframes synthetic data work as the operationalization of machine-learning constructs. The framework (i) standardizes guardrails that minimize bias amplification and drift, (ii) provides small but theory-dense corpora that train reliable, uncertainty-aware classifiers, and (iii) ships auditable artifacts (seeds, manifests, validators) that enable reproduction and extension to new defenses, languages, and dialogue-level settings.
Terminology and branding: earlier versions and repositories used the acronym DMN for Defensive Motivational Node; to avoid confusion with the neuroscience "default mode network", this paper adopts DefMoN for the framework as a whole, while legacy dataset and repository names (e.g., DMN-Syn v1) are retained for continuity and reproducibility with prior work. Throughout the paper, DefMoN denotes the overall framework and methodology, DMN-Syn v1 the released dataset and its artifacts, and "DMN node" a single (defense, emotion, scenario) tuple within that dataset. This split is intentional.
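The modeling setup described above (10-way sentence classification with a fine-tuned XLM-R) maps naturally onto the Hugging Face transformers API. The outline below is a hedged sketch: the `xlm-roberta-base` checkpoint, tokenization settings, and training hyperparameters are assumptions, and the dataset objects stand in for the released DMN-Syn v1 splits rather than reproducing the paper's pipeline.

```python
import numpy as np
from sklearn.metrics import f1_score, matthews_corrcoef
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "xlm-roberta-base"   # assumed checkpoint; the abstract only says "XLM-R"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=10)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"macro_f1": f1_score(labels, preds, average="macro"),
            "mcc": matthews_corrcoef(labels, preds)}

# `train_ds` / `eval_ds` are assumed to be datasets.Dataset objects with
# "text" and "label" columns built from the DMN-Syn v1 release:
# train_ds = train_ds.map(tokenize, batched=True)
# eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="defense-clf", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
# trainer = Trainer(model=model, args=args, train_dataset=train_ds,
#                   eval_dataset=eval_ds, compute_metrics=compute_metrics)
# trainer.train(); trainer.evaluate()
```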
{"title":"DefMoN: A reproducible framework for theory-grounded synthetic data generation in affective AI","authors":"Ryan SangBaek Kim","doi":"10.1016/j.mlwa.2025.100817","DOIUrl":"10.1016/j.mlwa.2025.100817","url":null,"abstract":"&lt;div&gt;&lt;div&gt;Modern NLP systems excel when labels are abundant but struggle with &lt;em&gt;high-inference&lt;/em&gt; constructs that are costly to annotate and risky to synthesize without constraints. We present the &lt;em&gt;Defensive Motivational Node&lt;/em&gt; framework, henceforth DefMoN (formerly DMN), which &lt;em&gt;operationalizes&lt;/em&gt; Vaillant’s hierarchy of ego defenses and Plutchik’s psychoevolutionary emotions into a controllable generative process for text. We release &lt;strong&gt;DMN-Syn v1.0&lt;/strong&gt; — a quadri-lingual (EN/KO/FR/KA) corpus of 300 theory-constrained utterances — together with a complete, versioned research compendium (data, code, seeds, QC manifests, and evaluation scripts) archived on Zenodo (Kim, 2025). The full package is permanently available at &lt;span&gt;&lt;span&gt;https://doi.org/10.5281/zenodo.17101927&lt;/span&gt;&lt;svg&gt;&lt;path&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;.&lt;/div&gt;&lt;div&gt;On the modeling side, we treat defense recognition as 10-way sentence classification and fine-tune a multilingual Transformer (XLM-R) &lt;em&gt;only&lt;/em&gt; on DMN-Syn v1. In-domain performance is high (EN macro-&lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;mn&gt;97&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;, MCC &lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;mn&gt;96&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;; KO macro-&lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;mn&gt;96&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;), and zero-shot transfer is strong (EN&lt;span&gt;&lt;math&gt;&lt;mo&gt;→&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;KO macro-&lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;mn&gt;81&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;). When evaluated on a small, anonymized real-world benchmark, the model reaches Macro &lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;mn&gt;62&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt; with &lt;em&gt;zero&lt;/em&gt; real training data, then rises to 0.76 with only &lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;64&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt; supervised examples per class. 
Human annotators on that same benchmark agree with each other at &lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;mi&gt;κ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;mn&gt;68&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;, &lt;span&gt;&lt;math&gt;&lt;mrow&gt;&lt;mi&gt;α&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;mn&gt;66&lt;/mn&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;. This shows that DefMoN is not a turnkey classifier, but a &lt;em&gt;theory-grounded primer&lt;/em&gt; that enables data-efficient adaptation toward human-level ambiguity &lt;em&gt;theory-grounded primer&lt;/em&gt; that enables data-efficient alignment to &lt;em&gt;schema-coded human benchmark&lt;/em&gt; without large-scale annotation. We additionally quantify &lt;em&gt;reliability&lt;/em&gt;—reporting ECE/MCE and coverage–performance curves for selective prediction—and show robustness under group-aware splits (template/scenario disjoint) and cue ablations, establishing structural coherence in the absence of large-scale human trials.&lt;/div&gt;&lt;div&gt;Beyond raw scores, we foreground &lt;em&gt;auditability&lt;/em&gt;. Each instance in &lt;strong&gt;DMN-Syn v1&lt;/st","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100817"},"PeriodicalIF":4.9,"publicationDate":"2025-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Deep learning and the geometry of compactness in stability and generalization
IF 4.9 Pub Date: 2025-12-13 DOI: 10.1016/j.mlwa.2025.100820
Mohammad Meysami, Ali Lotfi, Sehar Saleem
Deep learning models often continue to generalize well even when they have far more parameters than available training examples. This observation naturally leads to two questions: why does training remain stable, and why do the resulting predictors generalize at all? To address these questions, we return to the classical Extreme Value Theorem and interpret modern training as optimization over compact sets in parameter space or function space. Our main results show that continuity, together with coercive or Lipschitz-based regularization, guarantees the existence of minimizers and uniform control of the excess risk by bounding rare high-loss events. We apply this framework to weight decay, gradient penalties, and spectral normalization, and we introduce simple diagnostics that monitor compactness in parameter space, representation space, and function space. Experiments on synthetic examples, standard image datasets (MNIST, CIFAR-10, Tiny ImageNet), and the UCI Adult tabular task are consistent with the theory: mild regularization leads to smoother optimization, reduced variation across random seeds, and better robustness and calibration while preserving accuracy. Taken together, these results highlight compactness as a practical geometric guideline for training stable and reliable deep networks.
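The regularizers named in the abstract (weight decay, spectral normalization) and a simple parameter-space compactness diagnostic can be combined in a few lines of PyTorch. The layer sizes, hyperparameters, and random data below are placeholders for illustration, not the authors' experimental setup.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

# Small MLP whose linear layers carry spectral normalization (Lipschitz control).
model = nn.Sequential(
    spectral_norm(nn.Linear(32, 64)), nn.ReLU(),
    spectral_norm(nn.Linear(64, 10)),
)

# Weight decay acts as the coercive penalty that keeps parameters in a bounded set.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()

def parameter_norm(m: nn.Module) -> float:
    """Compactness diagnostic: global L2 norm over all parameters."""
    return torch.sqrt(sum((p ** 2).sum() for p in m.parameters())).item()

# One illustrative training step on random data.
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(f"loss={loss.item():.3f}  ||theta||={parameter_norm(model):.2f}")
```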
Citations: 0
Spectrogram-Based Deep Learning Models for Acoustic Identification of Honey Bees in Complex Environmental Noises
IF 4.9 Pub Date: 2025-12-12 DOI: 10.1016/j.mlwa.2025.100807
Muhammad Anus Khan, Bilal Hassan Khan, Shafiq ur Rehman Khan, Ali Raza, Asif Raza, Shehzad Ashraf Chaudhry
The rapid decline of honey bee populations presents an urgent ecological and agricultural concern, demanding innovative and scalable monitoring solutions. This study proposes a deep learning-based system for non-invasive classification of honey bee buzzing sounds to distinguish bee activity from complex environmental noise—a fundamental challenge for real-world acoustic monitoring. Traditional machine learning models using features like Mel Frequency Cepstral Coefficients (MFCCs) and spectral statistics performed well on curated datasets but failed under natural conditions due to overlapping acoustic signatures and inconsistent recordings.
To address this gap, we built a diverse dataset combining public bee audio with recordings from the Honeybee Research Center at the National Agricultural Research Centre (NARC), Pakistan, covering a variety of recording devices and natural environments. Audio signals were converted into mel spectrograms and chromagrams, enabling pattern learning via pre-trained convolutional neural networks. Among tested architectures—EfficientNetB0, ResNet50, and MobileNetV2—MobileNetV2 achieved the highest generalization, with 95.29% accuracy on spectrograms and over 90% on chromagrams under an 80% confidence threshold.
Data augmentation improved robustness to noise, while transfer learning enhanced adaptability. This work forms part of a broader project to develop a mobile application for real-time hive health monitoring in natural environments, where distinguishing bee buzzing from other sounds is the crucial first step. Beyond binary classification, the proposed approach offers potential for detecting hive health issues through acoustic patterns, supporting early interventions and contributing to global bee conservation efforts.
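As a minimal sketch of the two steps described above, converting audio to log-mel spectrograms and fine-tuning a pre-trained MobileNetV2, the snippet below uses librosa and torchvision. The file path, normalization, and two-class head are assumptions for illustration and do not reproduce the study's exact preprocessing or training configuration.

```python
import librosa
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

def audio_to_logmel(path, sr=22050, n_mels=128):
    """Load a clip and return a log-scaled mel spectrogram (n_mels x frames)."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

# Pre-trained MobileNetV2 with a new 2-way head (bee buzzing vs. environmental noise).
model = mobilenet_v2(weights="IMAGENET1K_V1")
model.classifier[1] = nn.Linear(model.last_channel, 2)

def spectrogram_to_batch(spec):
    """Standardize, stack the single channel to 3 channels, and add a batch dim."""
    x = torch.tensor(spec, dtype=torch.float32)
    x = (x - x.mean()) / (x.std() + 1e-6)
    return x.unsqueeze(0).repeat(3, 1, 1).unsqueeze(0)   # (1, 3, n_mels, frames)

# Example with a hypothetical file path:
# logits = model(spectrogram_to_batch(audio_to_logmel("hive_clip.wav")))
# prob_bee = torch.softmax(logits, dim=1)[0, 1]
```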
Citations: 0
A hybrid DEA–fuzzy clustering approach for accurate reference set identification
IF 4.9 Pub Date: 2025-12-09 DOI: 10.1016/j.mlwa.2025.100818
Sara Fanati Rashidi, Maryam Olfati, Seyedali Mirjalili, Crina Grosan, Jan Platoš, Vaclav Snášel
This study integrates Data Envelopment Analysis (DEA) with Machine Learning (ML) to address key limitations of traditional DEA in identifying reference sets for inefficient Decision-Making Units (DMUs). In DEA, inefficient units are evaluated against benchmark units; however, some benchmarks may be inappropriate or even outliers, which can distort the efficiency frontier. Moreover, when a new DMU is added, the entire model must be recalculated, resulting in high computational costs for large datasets. To overcome these issues, we propose a hybrid approach that combines Fuzzy C-Means (FCM) and Possibilistic Fuzzy C-Means (PFCM) clustering. By leveraging Euclidean distance and membership degrees, the method identifies closer and more relevant reference units, while a sensitivity threshold is introduced to control the number of benchmarks according to practical requirements. The effectiveness of the proposed method is validated on two datasets: a banking dataset and a banknote authentication dataset with 1,372 samples. Results show that the reference sets derived from this ML-based framework achieve 71.6%–98.3% agreement with DEA, while overcoming two major drawbacks: (1) sensitivity to dataset size and (2) inclusion of inappropriate reference units. Furthermore, statistical analyses, including confidence intervals and McNemar’s test, confirm the robustness and practical significance of the findings.
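Fuzzy C-Means is the core clustering step of the proposed hybrid: each DMU receives graded memberships that, together with Euclidean distances, guide reference-unit selection. The sketch below is a bare-bones FCM iteration in NumPy with assumed parameters; it omits the possibilistic (PFCM) variant and the DEA integration.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, iters=100, tol=1e-5, seed=0):
    """Minimal Fuzzy C-Means: returns (centers, membership matrix U of shape n x c)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.dirichlet(np.ones(c), size=n)            # random fuzzy memberships, rows sum to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (dist ** (2 / (m - 1)))        # standard FCM membership update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Reference-set idea (illustrative only): for an inefficient DMU i, keep efficient DMUs
# that share its dominant cluster with membership above a sensitivity threshold, e.g.
# cluster_of = U.argmax(axis=1)
# candidates = [j for j in efficient_idx if U[j, cluster_of[i]] > 0.5]
```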
Citations: 0
Automatic discovery of robust risk groups from limited survival data across biomedical modalities
IF 4.9 Pub Date: 2025-12-08 DOI: 10.1016/j.mlwa.2025.100814
Ethar Alzaid, George Wright, Mark Eastwood, Piotr Keller, Fayyaz Minhas
Survival prediction from medical data is often constrained by scarce labels, limiting the effectiveness of fully supervised models. In addition, most existing approaches produce deterministic risk scores without conveying reliability, which hinders interpretability and clinical trustworthiness. To address these challenges, we introduce T-SURE, a transductive survival ranking and risk-stratification framework that learns jointly from labeled and unlabeled patients to reduce dependence on large annotated cohorts. It also estimates a rejection score that identifies high-uncertainty cases, enabling selective abstention when confidence is low. T-SURE generates a single risk score that enables (1) patient ranking based on survival risk, (2) automatic assignment to risk groups, and (3) optional rejection of uncertain predictions. We extensively evaluated the model on pan-cancer datasets from The Cancer Genome Atlas (TCGA), using gene expression profiles, whole slide images, pathology reports, and clinical information. The model outperformed existing approaches in both ranking and risk stratification, especially in the limited-labeled-data regime. It also showed consistent improvements in performance as uncertain samples were rejected, while maintaining statistically significant stratification across datasets. T-SURE integrates as a reliable component within computational pathology pipelines by guiding risk-specific therapeutic and monitoring decisions and flagging ambiguous or rare cases via a high rejection score for further investigation. To support reproducibility, the full implementation of T-SURE is publicly available at: (Anonymized).
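The three outputs listed above (risk ranking, automatic risk-group assignment, and optional rejection of uncertain cases) can be expressed in a few lines of NumPy once a model has produced a risk score and a rejection score per patient. The quantile-based grouping and the 10% rejection budget below are assumptions for illustration; this is a schematic of the interface, not the T-SURE model.

```python
import numpy as np

def stratify_with_rejection(risk, rejection, n_groups=3, reject_frac=0.1):
    """Rank patients by risk, bin them into risk groups, and abstain on the
    most uncertain fraction (highest rejection scores)."""
    risk = np.asarray(risk, dtype=float)
    rejection = np.asarray(rejection, dtype=float)

    order = np.argsort(-risk)                        # (1) ranking, highest risk first
    edges = np.quantile(risk, np.linspace(0, 1, n_groups + 1)[1:-1])
    group = np.digitize(risk, edges)                 # (2) quantile-based risk groups

    threshold = np.quantile(rejection, 1 - reject_frac)
    accepted = rejection < threshold                 # (3) abstain on high-uncertainty cases
    return order, group, accepted

# Example with random scores standing in for model outputs:
rng = np.random.default_rng(1)
risk, rej = rng.normal(size=200), rng.uniform(size=200)
order, group, accepted = stratify_with_rejection(risk, rej)
print(f"retained {accepted.mean():.0%} of patients across {group.max() + 1} risk groups")
```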
Citations: 0
Leveraging multimodal large language models to extract mechanistic insights from biomedical visuals: A case study on COVID-19 and neurodegenerative diseases
IF 4.9 Pub Date: 2025-12-06 DOI: 10.1016/j.mlwa.2025.100816
Elizaveta Popova, Marc Jacobs, Martin Hofmann-Apitius, Negin Sadat Babaiha

Background

The COVID-19 pandemic has intensified concerns about its long-term neurological impact, with evidence linking SARS-CoV-2 infection to neurodegenerative diseases (NDDs) such as Alzheimer’s (AD) and Parkinson’s (PD). Patients with these conditions face higher risk of severe COVID-19 outcomes and may experience accelerated cognitive or motor decline following infection. Proposed mechanisms—including neuroinflammation, blood–brain barrier (BBB) disruption, and abnormal protein aggregation—closely mirror core features of neurodegenerative pathology. However, current knowledge remains fragmented across text, figures, and pathway diagrams, limiting integration into computational models that could reveal systemic patterns.

Results

To address this gap, we applied GPT-4 Omni (GPT-4o), a multimodal large language model (LLM), to extract mechanistic insights from biomedical figures. Over 10,000 images were retrieved through targeted searches on COVID-19 and neurodegeneration; after automated and manual filtering, a curated subset was analyzed. GPT-4o extracted biological relationships as semantic triples, grouped into six mechanistic categories—including microglial activation and barrier disruption—using ontology-guided similarity and assembled into a Neo4j knowledge graph (KG). Accuracy was evaluated against a gold-standard dataset of expert-annotated images using Biomedical Bidirectional Encoder Representations from Transformers (BioBERT)–based semantic matching. This evaluation enabled prompt tuning, threshold optimization, and hyperparameter assessment. Results show that GPT-4o successfully recovers both established and novel mechanisms, yielding interpretable outputs that illuminate complex biological links between SARS-CoV-2 and neurodegeneration.

Conclusions

This study demonstrates the potential of multimodal LLMs to mine biomedical visual data at scale. By complementing text mining and integrating figure-derived knowledge, our framework advances understanding of COVID-19–related neurodegeneration and supports future translational research.
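For the knowledge-graph assembly step mentioned in the Results, a minimal way to load subject-relation-object triples into Neo4j with the official neo4j Python driver is sketched below. The connection settings, node label, and example triple are placeholders rather than the study's actual schema.

```python
from neo4j import GraphDatabase

# Placeholder connection settings; replace with a real Neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def add_triple(tx, subj, rel, obj, category):
    """MERGE one (subject)-[relation]->(object) triple tagged with its mechanistic category."""
    tx.run(
        "MERGE (s:Entity {name: $subj}) "
        "MERGE (o:Entity {name: $obj}) "
        "MERGE (s)-[r:RELATES {type: $rel}]->(o) "
        "SET r.category = $category",
        subj=subj, rel=rel, obj=obj, category=category,
    )

# Example triple of the kind described in the text (hypothetical content).
triples = [("SARS-CoV-2", "activates", "microglia", "microglial activation")]

with driver.session() as session:
    for subj, rel, obj, cat in triples:
        session.execute_write(add_triple, subj, rel, obj, cat)
driver.close()
```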
Citations: 0
Estimation of the remaining charge retention time of an electric vehicle battery
IF 4.9 Pub Date: 2025-12-05 DOI: 10.1016/j.mlwa.2025.100813
Chourik Fousseni, Martin Otis, Khaled Ziane
Accurately estimating the remaining driving time (RDT) of an electric vehicle (EV) battery is essential for optimizing energy management and enhancing user experience. However, traditional estimation methods do not adequately account for the influence of temperature, driving characteristics, and vehicle driving time, leading to less accurate predictions and suboptimal range management. To address these limitations, this study presents a method for estimating the remaining charge retention time by integrating temperature and driving characteristics, which refines predictions and improves model reliability. Furthermore, data from the National Big Data Alliance for New Energy Vehicles (NDANEV) were employed to develop predictive models based on machine learning (ML). The ML models compared in this study are Linear Regression, LSTM, Random Forest (RF), Prophet, LightGBM, and XGBoost. Model performance was evaluated using mean absolute error (MAE), root mean square error (RMSE), the coefficient of determination (R²), and prediction runtime. The results show that the R² values for Prophet, Random Forest, LSTM, XGBoost, and LightGBM are 0.91, 0.94, 0.95, 0.94, and 0.94, respectively. This suggests that XGBoost outperforms the other models, providing the most accurate estimate of the remaining driving time. In addition, the result confirms that considering driving characteristics and ambient temperature improves the reliability and robustness of estimations. These advancements contribute to more efficient energy management and optimized charging strategies.
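The evaluation loop described above (fit a gradient-boosted regressor, then score it with MAE, RMSE, and R²) is straightforward with xgboost and scikit-learn, as in the hedged sketch below. The synthetic features and hyperparameters are placeholders; this is not the NDANEV pipeline.

```python
import numpy as np
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Synthetic stand-in for features such as temperature, driving style, and trip duration.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
y = 30 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=2, size=2000)   # "remaining time" target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=6)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)

mae = mean_absolute_error(y_te, pred)
rmse = np.sqrt(mean_squared_error(y_te, pred))
r2 = r2_score(y_te, pred)
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  R2={r2:.3f}")
```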
Citations: 0
Advancing marine mammal monitoring: Large-scale UAV delphinidae datasets and robust motion tracking for group size estimation
IF 4.9 Pub Date: 2025-12-04 DOI: 10.1016/j.mlwa.2025.100808
Leonardo Viegas Filipe, João Canelas, Mário Vieira, Francisco Correia da Fonseca, André Cid, Joana Castro, Inês Machado
Reliable estimates of dolphin abundance are essential for conservation and impact assessment, yet manual analysis of aerial surveys is time-consuming and difficult to scale. This paper presents an end-to-end pipeline for automatic dolphin counting from unmanned aerial vehicle (UAV) video that combines modern object detection and multi-object tracking. We construct a large detection dataset of 64,705 images with 225,305 dolphin bounding boxes and a tracking dataset of 54,274 frames with 207,850 boxes and 603 unique tracks, derived from UAV line-transect surveys. Using these data, we train a YOLO11-based detector that achieves a precision of approximately 0.93 across a range of sea states. For tracking, we adopt BoT-SORT and tune its parameters with a genetic algorithm using a multi-metric objective, reducing ID fragmentation by about 29% relative to default settings. Recent YOLO-based cetacean detectors trained on UAV imagery of beluga whales report precision/recall around 0.92/0.92 for adults and 0.94/0.89 for calves, but rely on DeepSORT tracking whose MOTA remains below 0.5 and must be boosted to roughly 0.7 with post-hoc trajectory post-processing. In this context, our pipeline offers competitive detection performance, substantially larger and fully documented detection and tracking benchmarks, and GA-optimized tracking without manual post-processing. Applied to dolphin group counting, the full pipeline attains a mean absolute error of 1.24 on a held-out validation set, demonstrating that UAV-based automated counting can support robust, scalable monitoring of coastal dolphin populations.
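A rough version of the detection-plus-tracking pipeline (YOLO detections handed to BoT-SORT, with group size taken as the number of distinct track IDs) can be run through the ultralytics package as sketched below. The weights file, video path, and confidence threshold are placeholders, and the genetic-algorithm tuning of tracker parameters is not shown.

```python
from ultralytics import YOLO

# Placeholder weights; the study trains its own YOLO11-based dolphin detector.
model = YOLO("yolo11n.pt")

# Run detection plus BoT-SORT tracking over a UAV clip (hypothetical path).
results = model.track(source="uav_transect.mp4", tracker="botsort.yaml",
                      conf=0.5, stream=True)

track_ids = set()
for frame in results:
    # frame.boxes.id holds the per-detection track IDs when tracking is active.
    if frame.boxes is not None and frame.boxes.id is not None:
        track_ids.update(int(i) for i in frame.boxes.id.tolist())

# A crude group-size estimate: number of unique track identities seen in the clip.
print(f"estimated group size: {len(track_ids)}")
```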
Citations: 0