COVID-19 detection from optimized features of breathing audio signals using explainable ensemble machine learning

Results in Control and Optimization (IF 3.2, JCR Q3, Mathematics). Pub Date: 2025-02-15. DOI: 10.1016/j.rico.2025.100538
Shafrin Sultana, A. B. M. Aowlad Hossain, Jahangir Alam
Volume 18, Article 100538. Full text: https://www.sciencedirect.com/science/article/pii/S2666720725000244
Citations: 0

Abstract

The automatic detection of COVID-19 from smartphone-recorded breathing signals, in a ubiquitous and non-invasive way, holds great promise. However, accurate detection is challenging because breathing signals are noisy and non-stationary, lack distinguishable features, and come from imbalanced COVID/non-COVID data. This paper proposes an explainable ensemble-learning framework for COVID-19 detection that extracts features from breathing signals through multiresolution analysis. First, we extract 165-dimensional features from the decomposed coefficients of a signal processed with a two-level discrete wavelet transform (DWT). From these, 27 optimized features are selected using Recursive Feature Elimination with Cross-Validation (RFECV). The level-2 DWT approximation coefficients retain frequencies in the 0–150 Hz range, aligning with human breathing frequencies. For the detection task, we use an ensemble of decision tree, random forest, gradient boosting, and XGBoost classifiers combined by majority voting. A balanced and augmented dataset is prepared from the publicly available Coswara dataset. The results show that the ensemble improves accuracy over the individual models. Further, we explore the model's interpretability using Shapley additive explanations (SHAP) values, finding that the model places primary importance on features such as the RMS value, higher pitch of the short-time Fourier transform, and higher-frequency components of the Mel spectrogram, which align well with known COVID-related breathing characteristics. A comparison with related works demonstrates the effectiveness of the proposed feature extraction and ensemble framework, which achieves an accuracy of 97.5% and a specificity of 95.24%. These findings can potentially support smartphone-based COVID-19 detection applications using breathing signals.
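The two-level decomposition and the majority-voting step described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not name the wavelet, so the Haar wavelet is assumed here for simplicity. The 1200 Hz sample rate is a hypothetical value chosen only because it makes the nominal level-2 approximation band end at 150 Hz. The tie-breaking rule for four voters is also an assumption, and all function names are illustrative. Feature selection (RFECV) and SHAP analysis are not shown.

```python
import math


def haar_dwt(signal):
    """One level of an orthonormal Haar DWT (assumed wavelet): (approximation, detail)."""
    samples = list(signal)
    if len(samples) % 2:
        samples.append(samples[-1])  # pad odd-length input by repeating the last sample
    scale = 1 / math.sqrt(2)
    approx = [(samples[i] + samples[i + 1]) * scale for i in range(0, len(samples), 2)]
    detail = [(samples[i] - samples[i + 1]) * scale for i in range(0, len(samples), 2)]
    return approx, detail


def two_level_dwt(signal):
    """Two-level decomposition: returns the sub-bands (cA2, cD2, cD1)."""
    cA1, cD1 = haar_dwt(signal)
    cA2, cD2 = haar_dwt(cA1)
    return cA2, cD2, cD1


def approx_band_hz(sample_rate_hz, level):
    """Nominal upper edge of the approximation band after `level` decompositions."""
    return sample_rate_hz / 2 ** (level + 1)


def rms(values):
    """Root-mean-square value, one of the features the abstract highlights."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def majority_vote(predictions, tie_break=1):
    """Hard voting over binary labels (1 = COVID, 0 = non-COVID).

    `predictions` holds one label list per classifier. With an even number of
    voters a tie is possible; resolving it to `tie_break` is an assumption.
    """
    voted = []
    for labels in zip(*predictions):
        ones = sum(labels)
        zeros = len(labels) - ones
        if ones == zeros:
            voted.append(tie_break)
        else:
            voted.append(1 if ones > zeros else 0)
    return voted


if __name__ == "__main__":
    breath = [0.0, 1.0, 0.0, -1.0, 0.5, 0.5, -0.5, 0.25]  # toy signal, not real data
    cA2, cD2, cD1 = two_level_dwt(breath)
    print(len(cA2), len(cD2), len(cD1))
    print(approx_band_hz(1200, 2))  # 150.0 under the hypothetical 1200 Hz rate
    # Four hypothetical classifiers voting on three samples.
    print(majority_vote([[1, 0, 1], [1, 0, 0], [0, 0, 1], [1, 1, 0]]))
```

In practice a wavelet library and trained classifiers would replace these toy functions. The sketch only shows how sub-band coefficients arise and how per-model labels are combined.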
Source journal: Results in Control and Optimization (Mathematics, Control and Optimization)
CiteScore: 3.00
Self-citation rate: 0.00%
Articles published: 51
Review time: 91 days