An explainable and accurate transformer-based deep learning model for wheeze classification utilizing real-world pediatric data.

Scientific Reports (IF 3.8; Zone 2, multidisciplinary journals; JCR Q1, Multidisciplinary Sciences) Pub Date: 2025-02-15 DOI: 10.1038/s41598-025-89533-9
Beom Joon Kim, Jeong Hyeon Mun, Dae Hwan Hwang, Dong In Suh, Changwon Lim, Kyunghoon Kim
Scientific Reports, vol. 15(1), 5656. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11829976/pdf/

Abstract

Auscultation is the practice of listening to sounds from a patient's body, usually with a stethoscope, to diagnose disease. The stethoscope allows non-invasive, real-time assessment, which makes it well suited to diagnosing respiratory diseases and to first aid. However, interpreting respiratory sounds accurately with a stethoscope is subjective and requires considerable clinical expertise. To overcome this limitation, researchers are actively developing deep learning models that interpret breath sounds recorded with electronic stethoscopes. Most recent studies in this area have focused on CNN-based respiratory sound classification models. However, such CNN models are limited in their ability to interpret conditions that require longer input lengths and more detailed context. Therefore, in the present work, we apply the Transformer-based Audio Spectrogram Transformer (AST) model to data from our own clinical practice. This prospective study included children who visited the pediatric departments of two university hospitals in South Korea from 2019 to 2020. A pediatric pulmonologist recorded breath sounds, and a pediatric breath sound dataset was constructed through double-blind verification. We then developed a deep learning model by applying the pre-trained weights of the AST model to our data, which comprised 194 wheezes and 531 other respiratory sounds. We compared the proposed model with a previously published CNN-based model and also tested it on existing datasets. To ensure the reliability of the proposed model, we visualized the classification process using Score-Class Activation Mapping (Score-CAM). Our model achieved an accuracy of 91.1%, an area under the curve (AUC) of 86.6%, a precision of 88.2%, a recall of 76.9%, and an F1-score of 82.2%. Overall, the proposed transformer-based model detected wheezing with high accuracy, and its decision-making process was verified to be reliable. The deep learning model developed and described in this study is expected to help accurately diagnose pediatric respiratory diseases in real-world clinical practice.
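
The reported metrics can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the paper: the function name `f1_score` and the variable names are illustrative. It recomputes the F1-score as the harmonic mean of the reported precision and recall, and derives the class balance from the reported counts of 194 wheezes and 531 other sounds. The abstract does not state how the data were split for evaluation, so the majority-class baseline here is computed on the full dataset and is only a rough reference point for the 91.1% accuracy.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
precision = 0.882
recall = 0.769
f1 = f1_score(precision, recall)

# Class counts as reported in the abstract.
n_wheeze = 194
n_other = 531
n_total = n_wheeze + n_other

# Share of wheeze recordings in the whole dataset.
wheeze_share = n_wheeze / n_total

# Accuracy of a trivial classifier that always predicts "other".
# Assumption: computed on the full dataset, since the evaluation
# split is not given in the abstract.
majority_baseline_accuracy = n_other / n_total

print(f"F1 from reported precision and recall: {f1:.3f}")
print(f"Wheeze share of dataset: {wheeze_share:.3f}")
print(f"Majority-class baseline accuracy: {majority_baseline_accuracy:.3f}")
```

The recomputed F1 rounds to the reported 82.2%. Because wheezes make up roughly a quarter of the recordings, accuracy alone can look favorable, which is why precision, recall, and F1 are reported alongside it.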

Source journal: Scientific Reports (Natural Science Disciplines)
CiteScore: 7.50
Self-citation rate: 4.30%
Articles published: 19567
Review time: 3.9 months
Journal description: We publish original research from all areas of the natural sciences, psychology, medicine and engineering. Scientific Reports has a 2-year impact factor of 4.380 (2021) and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).
• Engineering: Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.
• Physical sciences: Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature, often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.
• Earth and environmental sciences: Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.
• Biological sciences: Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms, from microorganisms and animals to plants.
• Health sciences: The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.