
Machine learning with applications: latest articles

Spectrogram-Based Deep Learning Models for Acoustic Identification of Honey Bees in Complex Environmental Noises
IF 4.9 Pub Date : 2025-12-12 DOI: 10.1016/j.mlwa.2025.100807
Muhammad Anus Khan, Bilal Hassan Khan, Shafiq ur Rehman Khan, Ali Raza, Asif Raza, Shehzad Ashraf Chaudhry
The rapid decline of honey bee populations presents an urgent ecological and agricultural concern, demanding innovative and scalable monitoring solutions. This study proposes a deep learning-based system for non-invasive classification of honey bee buzzing sounds to distinguish bee activity from complex environmental noise—a fundamental challenge for real-world acoustic monitoring. Traditional machine learning models using features like Mel Frequency Cepstral Coefficients (MFCCs) and spectral statistics performed well on curated datasets but failed under natural conditions due to overlapping acoustic signatures and inconsistent recordings.
To address this gap, we built a diverse dataset combining public bee audio with recordings from the Honeybee Research Center at the National Agricultural Research Centre (NARC), Pakistan, capturing various devices and natural environments. Audio signals were converted into mel spectrograms and chromograms, enabling pattern learning via pre-trained convolutional neural networks. Among tested architectures—EfficientNetB0, ResNet50, and MobileNetV2—MobileNetV2 achieved the highest generalization, with 95.29% accuracy on spectrograms and over 90% on chromograms under an 80% confidence threshold.
Data augmentation improved robustness to noise, while transfer learning enhanced adaptability. This work forms part of a broader project to develop a mobile application for real-time hive health monitoring in natural environments, where distinguishing bee buzzing from other sounds is the crucial first step. Beyond binary classification, the proposed approach offers potential for detecting hive health issues through acoustic patterns, supporting early interventions and contributing to global bee conservation efforts.
Machine learning with applications, Volume 23, Article 100807
Citations: 0
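The honey bee abstract above reports accuracy "under an 80% confidence threshold" without saying how the threshold is applied. One possible reading, sketched here as an assumption rather than the authors' procedure, is that a prediction counts only when the top softmax probability reaches the threshold. The function name `accuracy_at_threshold` and the toy probabilities are hypothetical.

```python
def accuracy_at_threshold(probs, labels, threshold=0.8):
    """Accuracy over predictions whose top-class probability meets the threshold.

    probs  : list of per-class probability lists, one per audio clip
    labels : list of true class indices
    Returns (accuracy_on_accepted, fraction_accepted).
    """
    accepted = 0
    correct = 0
    for p, y in zip(probs, labels):
        top = max(range(len(p)), key=lambda k: p[k])  # index of the top class
        if p[top] >= threshold:
            accepted += 1
            correct += int(top == y)
    if accepted == 0:
        return 0.0, 0.0
    return correct / accepted, accepted / len(labels)


# Toy softmax outputs for the classes [noise, bee]; the values are made up.
probs = [[0.05, 0.95], [0.40, 0.60], [0.90, 0.10], [0.15, 0.85]]
labels = [1, 1, 0, 0]
acc, coverage = accuracy_at_threshold(probs, labels)
print(f"accuracy on accepted clips: {acc:.3f}, coverage: {coverage:.2f}")
```

Reporting coverage next to accuracy shows how many clips the threshold discards, which matters when comparing spectrogram and chromogram inputs.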
A hybrid DEA–fuzzy clustering approach for accurate reference set identification
IF 4.9 Pub Date : 2025-12-09 DOI: 10.1016/j.mlwa.2025.100818
Sara Fanati Rashidi, Maryam Olfati, Seyedali Mirjalili, Crina Grosan, Jan Platoš, Vaclav Snášel
This study integrates Data Envelopment Analysis (DEA) with Machine Learning (ML) to address key limitations of traditional DEA in identifying reference sets for inefficient Decision-Making Units (DMUs). In DEA, inefficient units are evaluated against benchmark units; however, some benchmarks may be inappropriate or even outliers, which can distort the efficiency frontier. Moreover, when a new DMU is added, the entire model must be recalculated, resulting in high computational costs for large datasets. To overcome these issues, we propose a hybrid approach that combines Fuzzy C-Means (FCM) and Possibilistic Fuzzy C-Means (PFCM) clustering. By leveraging Euclidean distance and membership degrees, the method identifies closer and more relevant reference units, while a sensitivity threshold is introduced to control the number of benchmarks according to practical requirements. The effectiveness of the proposed method is validated on two datasets: a banking dataset and a banknote authentication dataset with 1,372 samples. Results show that the reference sets derived from this ML-based framework achieve 71.6%–98.3% agreement with DEA, while overcoming two major drawbacks: (1) sensitivity to dataset size and (2) inclusion of inappropriate reference units. Furthermore, statistical analyses, including confidence intervals and McNemar’s test, confirm the robustness and practical significance of the findings.
Machine learning with applications, Volume 23, Article 100818
Citations: 0
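The DEA abstract above combines Euclidean distance, membership degrees, and a sensitivity threshold to pick reference units. The sketch below illustrates that idea with the standard Fuzzy C-Means membership formula only; the selection rule, the function names, the fixed cluster centers, and the toy units are assumptions for illustration, not the paper's algorithm (which also uses PFCM).

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard Fuzzy C-Means membership of one point in each cluster center."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        zero = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [z / sum(zero) for z in zero]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


def select_reference_units(target, efficient_units, centers, threshold=0.6, m=2.0):
    """Hypothetical rule: keep efficient units that share the target's dominant
    cluster with membership >= threshold, ordered by distance to the target."""
    target_u = fcm_memberships(target, centers, m)
    cluster = max(range(len(centers)), key=lambda k: target_u[k])
    chosen = []
    for name, features in efficient_units.items():
        if fcm_memberships(features, centers, m)[cluster] >= threshold:
            chosen.append((math.dist(target, features), name))
    return [name for _, name in sorted(chosen)]


# Toy data: two cluster centers and three efficient units (all values made up).
centers = [(1.0, 1.0), (5.0, 5.0)]
efficient = {"A": (1.2, 0.9), "B": (2.0, 2.0), "C": (5.1, 4.8)}
refs = select_reference_units((1.5, 1.5), efficient, centers)
print(refs)  # ['A', 'B']
```

Raising `threshold` shrinks the reference set, which is one way a sensitivity threshold could control the number of benchmarks.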
Automatic discovery of robust risk groups from limited survival data across biomedical modalities
IF 4.9 Pub Date : 2025-12-08 DOI: 10.1016/j.mlwa.2025.100814
Ethar Alzaid, George Wright, Mark Eastwood, Piotr Keller, Fayyaz Minhas
Survival prediction from medical data is often constrained by scarce labels, limiting the effectiveness of fully supervised models. In addition, most existing approaches produce deterministic risk scores without conveying reliability, which hinders interpretability and clinical trustworthiness. To address these challenges, we introduce T-SURE, a transductive survival ranking and risk-stratification framework that learns jointly from labeled and unlabeled patients to reduce dependence on large annotated cohorts. It also estimates a rejection score that identifies high-uncertainty cases, enabling selective abstention when confidence is low. T-SURE generates a single risk score that enables (1) patient ranking based on survival risk, (2) automatic assignment to risk groups, and (3) optional rejection of uncertain predictions. We extensively evaluated the model on pan-cancer datasets from The Cancer Genome Atlas (TCGA), using gene expression profiles, whole slide images, pathology reports, and clinical information. The model outperformed existing approaches in both ranking and risk stratification, especially in the limited labeled data regimen. It also showed consistent improvements in performance as uncertain samples were rejected, while maintaining statistically significant stratification across datasets. T-SURE integrates as a reliable component within computational pathology pipelines by guiding risk-specific therapeutic and monitoring decisions and flagging ambiguous or rare cases via a high rejection score for further investigation. To support reproducibility, the full implementation of T-SURE is publicly available at: (Anonymized).
Machine learning with applications, Volume 23, Article 100814
Citations: 0
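The T-SURE abstract above describes risk-group assignment with optional rejection of uncertain predictions. The sketch below shows how such outputs might be consumed downstream; the median cutoff, the `reject_above` parameter, and the toy scores are assumptions for illustration and do not reproduce how T-SURE learns its risk or rejection scores.

```python
from statistics import median


def stratify_with_rejection(risk, rejection, reject_above=0.7):
    """Assign 'high'/'low' risk groups, abstaining on high-uncertainty patients.

    risk      : one risk score per patient (higher means worse prognosis)
    rejection : one rejection score per patient (higher means less certain)
    """
    kept = [r for r, q in zip(risk, rejection) if q <= reject_above]
    cutoff = median(kept) if kept else 0.0  # assumed cutoff: median of kept scores
    groups = []
    for r, q in zip(risk, rejection):
        if q > reject_above:
            groups.append("rejected")
        elif r > cutoff:
            groups.append("high")
        else:
            groups.append("low")
    return groups


# Toy scores for five patients (made up).
risk = [0.9, 0.2, 0.6, 0.1, 0.5]
rejection = [0.1, 0.2, 0.9, 0.3, 0.4]
groups = stratify_with_rejection(risk, rejection)
print(groups)  # ['high', 'low', 'rejected', 'low', 'high']
```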
Leveraging multimodal large language models to extract mechanistic insights from biomedical visuals: A case study on COVID-19 and neurodegenerative diseases
IF 4.9 Pub Date : 2025-12-06 DOI: 10.1016/j.mlwa.2025.100816
Elizaveta Popova, Marc Jacobs, Martin Hofmann-Apitius, Negin Sadat Babaiha

Background

The COVID-19 pandemic has intensified concerns about its long-term neurological impact, with evidence linking SARS-CoV-2 infection to neurodegenerative diseases (NDDs) such as Alzheimer’s (AD) and Parkinson’s (PD). Patients with these conditions face higher risk of severe COVID-19 outcomes and may experience accelerated cognitive or motor decline following infection. Proposed mechanisms—including neuroinflammation, blood–brain barrier (BBB) disruption, and abnormal protein aggregation—closely mirror core features of neurodegenerative pathology. However, current knowledge remains fragmented across text, figures, and pathway diagrams, limiting integration into computational models that could reveal systemic patterns.

Results

To address this gap, we applied GPT-4 Omni (GPT-4o), a multimodal large language model (LLM), to extract mechanistic insights from biomedical figures. Over 10,000 images were retrieved through targeted searches on COVID-19 and neurodegeneration; after automated and manual filtering, a curated subset was analyzed. GPT-4o extracted biological relationships as semantic triples, grouped into six mechanistic categories—including microglial activation and barrier disruption—using ontology-guided similarity and assembled into a Neo4j knowledge graph (KG). Accuracy was evaluated against a gold-standard dataset of expert-annotated images using Biomedical Bidirectional Encoder Representations from Transformers (BioBERT)–based semantic matching. This evaluation enabled prompt tuning, threshold optimization, and hyperparameter assessment. Results show that GPT-4o successfully recovers both established and novel mechanisms, yielding interpretable outputs that illuminate complex biological links between SARS-CoV-2 and neurodegeneration.

Conclusions

This study demonstrates the potential of multimodal LLMs to mine biomedical visual data at scale. By complementing text mining and integrating figure-derived knowledge, our framework advances understanding of COVID-19–related neurodegeneration and supports future translational research.
Machine learning with applications, Volume 23, Article 100816
Citations: 0
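The abstract above evaluates extracted semantic triples against expert annotations using BioBERT-based semantic matching with a tuned threshold. The sketch below shows only the general shape of threshold-based triple matching. It swaps in a simple token-overlap (Jaccard) similarity as a stand-in for BioBERT embeddings, and the function names, threshold, and example triples are hypothetical.

```python
def token_jaccard(a, b):
    """Token-overlap similarity; a stand-in for an embedding-based similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def triple_similarity(pred, gold):
    """Average similarity over subject, predicate, and object."""
    return sum(token_jaccard(p, g) for p, g in zip(pred, gold)) / 3


def match_triples(predicted, gold, threshold=0.6):
    """Greedy one-to-one matching; returns (precision, recall)."""
    unmatched = list(gold)
    hits = 0
    for pred in predicted:
        best = max(unmatched, key=lambda g: triple_similarity(pred, g), default=None)
        if best is not None and triple_similarity(pred, best) >= threshold:
            hits += 1
            unmatched.remove(best)  # each gold triple can be matched once
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Toy triples written for illustration only, not taken from the paper.
gold = [
    ("SARS-CoV-2", "induces", "microglial activation"),
    ("neuroinflammation", "disrupts", "blood-brain barrier"),
]
predicted = [
    ("SARS-CoV-2", "induces", "microglial activation"),
    ("neuroinflammation", "damages", "blood-brain barrier"),
    ("tau", "aggregates in", "neurons"),
]
precision, recall = match_triples(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Sweeping `threshold` and recomputing precision and recall is one simple way to carry out the kind of threshold optimization the abstract mentions.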
Estimation of the remaining charge retention time of an electric vehicle battery
IF 4.9 Pub Date : 2025-12-05 DOI: 10.1016/j.mlwa.2025.100813
Chourik Fousseni, Martin Otis, Khaled Ziane
Accurately estimating the remaining driving time (RDT) of an electric vehicle (EV) battery is essential for optimizing energy management and enhancing user experience. However, traditional estimation methods do not adequately account for the influence of temperature, driving characteristics and vehicle driving time, leading to less accurate predictions and suboptimal range management. To address these limitations, this study presents a method for estimating the remaining charge retention time by integrating temperature and driving characteristics, which refines predictions and improves model reliability. Furthermore, data from the National Big Data Alliance for New Energy Vehicles (NDANEV) were employed to develop a predictive model based on machine learning (ML) models. The different ML models compared in this study are Linear Regression, LSTM, RF, Prophet, LightGBM, and XGBoost. The model performance was evaluated using mean absolute error (MAE), root mean square error (RMSE), the coefficient of determination (R2) and the prediction runtime to assess the prediction accuracy. The results show that the R2 values for Prophet, Random Forest, LSTM, XGBoost, and LightGBM are 0.91, 0.94, 0.95, 0.94, and 0.94 respectively. This suggests that XGBoost outperforms the other models, providing the most accurate estimate of the remaining driving time. In addition, the result confirms that considering driving characteristics and ambient temperature improves the reliability and robustness of estimations. These advancements contribute to more efficient energy management and optimized charging strategies.
Machine learning with applications, Volume 23, Article 100813
Citations: 0
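The electric vehicle abstract above compares models by MAE, RMSE, and R2. For readers who want the definitions in executable form, here is a minimal sketch using the usual formulas; the function name and the sample driving-time values (in minutes) are made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical remaining driving times in minutes.
y_true = [120.0, 90.0, 60.0, 30.0]
y_pred = [110.0, 95.0, 60.0, 35.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.3f}")
```

Because R2 values of 0.94 and 0.95 are close, the error metrics and runtime carry much of the weight when ranking models.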
Advancing marine mammal monitoring: Large-scale UAV delphinidae datasets and robust motion tracking for group size estimation
IF 4.9 Pub Date : 2025-12-04 DOI: 10.1016/j.mlwa.2025.100808
Leonardo Viegas Filipe, João Canelas, Mário Vieira, Francisco Correia da Fonseca, André Cid, Joana Castro, Inês Machado
Reliable estimates of dolphin abundance are essential for conservation and impact assessment, yet manual analysis of aerial surveys is time-consuming and difficult to scale. This paper presents an end-to-end pipeline for automatic dolphin counting from unmanned aerial vehicle (UAV) video that combines modern object detection and multi-object tracking. We construct a large detection dataset of 64,705 images with 225,305 dolphin bounding boxes and a tracking dataset of 54,274 frames with 207,850 boxes and 603 unique tracks, derived from UAV line-transect surveys. Using these data, we train a YOLO11-based detector that achieves a precision of approximately 0.93 across a range of sea states. For tracking, we adopt BoT-SORT and tune its parameters with a genetic algorithm using a multi-metric objective, reducing ID fragmentation by about 29% relative to default settings. Recent YOLO-based cetacean detectors trained on UAV imagery of beluga whales report precision/recall around 0.92/0.92 for adults and 0.94/0.89 for calves, but rely on DeepSORT tracking whose MOTA remains below 0.5 and must be boosted to roughly 0.7 with post-hoc trajectory post-processing. In this context, our pipeline offers competitive detection performance, substantially larger and fully documented detection and tracking benchmarks, and GA-optimized tracking without manual post-processing. Applied to dolphin group counting, the full pipeline attains a mean absolute error of 1.24 on a held-out validation set, demonstrating that UAV-based automated counting can support robust, scalable monitoring of coastal dolphin populations.
Machine learning with applications, Volume 23, Article 100808
Citations: 0
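The dolphin-counting abstract above links tracking quality (ID fragmentation) to counting error (mean absolute error). The abstract does not state how counts are derived from tracks, so the sketch below assumes the simplest rule, counting distinct track IDs per clip, to show why fragmented IDs inflate the estimate. All names and data are hypothetical.

```python
def count_from_tracks(frames):
    """Estimate group size as the number of distinct track IDs in a clip.

    frames : list of frames, each a list of track IDs visible in that frame
    """
    ids = set()
    for frame in frames:
        ids.update(frame)
    return len(ids)


def mean_absolute_error(estimates, truths):
    """Mean absolute difference between estimated and true counts."""
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(truths)


# Clip A: three dolphins tracked consistently.
clip_a = [[1, 2], [1, 2, 3], [2, 3]]
# Clip B: two dolphins, but one track fragments (ID 8 reappears as ID 9).
clip_b = [[7], [7, 8], [7, 9], [9]]

estimates = [count_from_tracks(clip_a), count_from_tracks(clip_b)]
truths = [3, 2]
mae = mean_absolute_error(estimates, truths)
print(estimates, mae)  # [3, 3] 0.5
```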
Predictive modeling of error categories in English-Slovak machine translation using automatic evaluation metrics
IF 4.9 Pub Date : 2025-12-04 DOI: 10.1016/j.mlwa.2025.100810
Dasa Munkova, Lucia Benkova, Michal Munk, Lubomir Benko, Petr Hajek
This paper presents a language-specific adaptation for automatic identification of machine translation (MT) errors using a comprehensive set of open-source evaluation metrics. The approach focuses on the English–Slovak translation direction, addressing challenges posed by Slovak’s highly inflectional and low-resource nature. Predictive models were developed for five key error categories (Predication, Modal and communication sentence framework, Syntactic-semantic correlativeness, Compound/complex sentences, and Lexical semantics) by employing forward stepwise regression and validated through bootstrapping techniques. The models estimate the probability of error occurrence in MT segments, demonstrating stable and comparable performance across training and test datasets, as measured by Somers’ D. While human expert evaluation remains essential for verifying flagged segments, the proposed approach significantly reduces evaluator workload by prioritizing likely error-containing segments. This methodology offers a scalable and adaptable framework for MT quality assessment across languages and text styles, with potential to improve automated translation evaluation and post-editing processes.
Machine learning with applications, vol. 23, Article 100810
Citations: 0
Enhancing skin cancer diagnosis using late discrete wavelet transform and new swarm-based optimizers
IF 4.9 Pub Date : 2025-12-03 DOI: 10.1016/j.mlwa.2025.100811
Ramin Mousa, Saeed Chamani, Mohammad Morsali, Mohammad Kazzazi, Parsa Hatami, Soroush Sarabi
Skin cancer (SC) is a life-threatening disease for which early diagnosis is critical to effective treatment and survival. While deep learning (DL) has advanced skin cancer diagnosis (SCD), current methods generally yield suboptimal accuracy and efficiency because it is difficult both to extract multiscale features from dermoscopic images and to optimize complex model parameters by exploring the hyperparameter space efficiently. To address this, we propose an approach integrating late Discrete Wavelet Transform (DWT) with pre-trained convolutional neural networks (CNNs) and swarm-based optimization. The late DWT decomposes CNN-extracted feature maps into low- and high-frequency components to improve the detection of subtle lesion patterns, while a self-attention mechanism refines the result further by weighting feature importance and focusing on relevant diagnostic information. To tune hyperparameters, three novel swarm-based optimizers are employed to search the hyperparameter space and fine-tune the model: Modified Gorilla Troops Optimizer (MGTO), Improved Gray Wolf Optimization (IGWO), and Fox Optimization (FOX). Compared with existing methods, experiments on the ISIC-2016 and ISIC-2017 datasets show improved classification performance, with an accuracy gain of at least 1%. The proposed framework thus offers a reliable and effective way to diagnose skin cancer automatically.
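To illustrate what decomposing a feature map into low- and high-frequency components means, here is a minimal sketch of a single-level 2D discrete wavelet transform. The abstract does not name the wavelet family, so the Haar wavelet is an assumption, as are the function name `haar_dwt2` and the toy feature map. Naming of the three detail sub-bands also varies between libraries.

```python
def haar_dwt2(fmap):
    """Single-level 2D Haar DWT of one feature map (list of rows, even height and width).

    Returns (ll, lh, hl, hh): one low-frequency approximation and three
    high-frequency detail sub-bands, each half the input size along both axes.
    """
    height, width = len(fmap), len(fmap[0])
    if height % 2 or width % 2:
        raise ValueError("feature map height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, height, 2):
        rows = ([], [], [], [])
        for j in range(0, width, 2):
            a, b = fmap[i][j], fmap[i][j + 1]
            c, d = fmap[i + 1][j], fmap[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # scaled sum of the 2x2 block (low-pass)
            rows[1].append((a + b - c - d) / 2)  # top row minus bottom row
            rows[2].append((a - b + c - d) / 2)  # left column minus right column
            rows[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh


# Hypothetical 2x4 feature map standing in for a CNN activation channel.
feature_map = [[4, 2, 1, 1],
               [2, 0, 1, 1]]
ll, lh, hl, hh = haar_dwt2(feature_map)
print(ll, lh, hl, hh)
```

In this toy input the flat right-hand block produces zero detail coefficients, while the left-hand block, which contains a gradient, produces non-zero ones. That is the sense in which the high-frequency sub-bands isolate fine local variation.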
Machine learning with applications, vol. 23, Article 100811
Citations: 0
ISO-DeTr: A novel detection transformer for industrial small object detection
IF 4.9 Pub Date : 2025-12-02 DOI: 10.1016/j.mlwa.2025.100809
Faisal Saeed, Anand Paul
Effectively detecting and assessing real-time structural and ecological parameters in contemporary manufacturing environments poses significant challenges, particularly in identifying minute objects within product images. The swift evolution of the industrial sector underscores the need for intelligent manufacturing environments that uphold stringent product quality standards. However, accelerating production heightens the risk of defective products. This research addresses the challenges inherent in small object detection within industrial contexts, proposing an innovative detection transformer model tailored to modern manufacturing environments. The proposed model integrates a feature-enhanced multi-head self-attention block (FEMSA), which merges a cross-channel communication network with multiple multi-head self-attention (MSA) components to refine image features. A query proposal network is also introduced within the detection transformer framework to select high-ranking proposals using Intersection over Union (IoU) and Non-Maximum Suppression (NMS) algorithms. In extensive experiments on custom industrial small objects, the proposed model outperforms existing models based on Non-Maximum Suppression and transformers. By tackling the challenges associated with small object detection, the model contributes to the dynamic synchronization between virtual and physical manufacturing realms, enhancing quality control in industrial production.
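For readers unfamiliar with the two algorithms named in the abstract, the following is a generic sketch of Intersection over Union and greedy Non-Maximum Suppression. It is not the paper's query proposal network. The `(x1, y1, x2, y2)` box format, the 0.5 threshold, and the sample boxes are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # A box survives only if it does not overlap any already-kept box too strongly.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical proposals: the first two overlap heavily, the third is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

With these inputs the second box is suppressed because its IoU with the first (81/119) exceeds the threshold, so the two distinct proposals remain.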
Machine learning with applications, vol. 23, Article 100809
Citations: 0
AdAPT: Advertisement detector adaptation under newspaper domain shift with null-based pseudo-labeling
IF 4.9 Pub Date : 2025-12-01 DOI: 10.1016/j.mlwa.2025.100806
Faeze Zakaryapour Sayyad, Tobias Pettersson, Seyed Jalaleddin Mousavirad, Irida Shallari, Mattias O’Nils
Detecting advertisements in digitized newspapers is a key step in large-scale media analytics and digital archiving. However, variations in layout, typography, and advertisement design across publishers and time periods cause significant domain shifts that reduce the generalization ability of supervised detectors. This paper presents AdAPT, a confidence-guided pseudo-labeling pipeline for unsupervised domain adaptation in advertisement detection. The proposed method leverages both advertisement-free (Null) and advertisement-containing pages from unlabeled target domains to generate reliable pseudo-labels. By retraining a YOLO-based detector using labeled source data combined with filtered pseudo-labeled target samples, AdAPT achieves robust adaptation without requiring manual annotation. Experiments conducted on two unseen newspapers (Adresseavisen and iTromsø) demonstrate that Null-based pseudo-labeling provides the most stable and accurate adaptation, yielding up to 38% error reduction compared to the baseline. The results highlight AdAPT as a simple, scalable, and annotation-efficient solution for maintaining high-performance advertisement detection across diverse newspaper collections.
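The abstract describes confidence-guided filtering of pseudo-labels but gives no thresholds or selection rule. The sketch below shows one plausible rule and is an assumption throughout: a target page becomes a Null pseudo-label when the detector finds nothing above a low threshold, an advertisement pseudo-label when every detection is highly confident, and is discarded otherwise. The name `build_pseudo_labels`, the thresholds `low` and `high`, and the sample detections are all hypothetical.

```python
def build_pseudo_labels(detections, low=0.1, high=0.8):
    """Filter detector output on unlabeled target pages into pseudo-labels.

    detections maps a page id to a list of (box, confidence) pairs.
    Returns a dict of page id -> list of boxes; an empty list marks a Null
    (advertisement-free) page. Pages with mid-confidence detections are dropped.
    """
    pseudo = {}
    for page_id, dets in detections.items():
        confident = [box for box, conf in dets if conf >= high]
        uncertain = [box for box, conf in dets if low <= conf < high]
        if uncertain:
            continue  # ambiguous page: too risky to use as a training label
        pseudo[page_id] = confident  # detections below `low` are ignored as noise
    return pseudo


# Hypothetical detector output on four unlabeled target-domain pages.
detections = {
    "page1": [((0, 0, 100, 50), 0.95)],   # confident advertisement
    "page2": [((0, 0, 100, 50), 0.05)],   # only noise, treated as Null
    "page3": [((0, 0, 100, 50), 0.50)],   # ambiguous, discarded
    "page4": [],                          # nothing detected, treated as Null
}
print(build_pseudo_labels(detections))
```

The retained pages would then be mixed with the labeled source data to retrain the detector, which is the step the abstract attributes to AdAPT.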
Machine learning with applications, vol. 23, Article 100806
Citations: 0