Detecting Cough Recordings in Crowdsourced Data Using CNN-RNN

R. Sharan, Hao Xiong, S. Berkovsky
Published in: 2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI)
Publication date: 2022-09-27
DOI: 10.1109/BHI56158.2022.9926896 (https://doi.org/10.1109/BHI56158.2022.9926896)
Citations: 2

Abstract

The sound of a cough is an important indicator of the condition of the respiratory system, and automatic cough sound evaluation can aid the diagnosis of respiratory diseases. Several groups around the world have recently used large crowdsourced cough sound datasets to develop cough classification models. However, not all recordings in these datasets contain cough sounds, so it is important to screen the recordings for the presence of cough sounds before developing such models. This work proposes a deep learning method for screening crowdsourced audio recordings for cough sounds. The proposed approach divides each audio recording into overlapping frames and converts each frame into a mel-spectrogram representation. A convolutional neural network pretrained for audio classification is then trained to learn the spectral characteristics of cough and non-cough frames from their mel-spectrogram representations. It is combined with a recurrent neural network that learns the dependencies across the sequence of frames. The proposed method is evaluated on 400 crowdsourced audio recordings, manually annotated as cough or non-cough, and achieves an accuracy of 0.9800 (AUC of 0.9973) in classifying cough and non-cough recordings. The trained network is then used to analyze the remaining audio recordings in the dataset and identifies only about 67% of them as containing usable cough sounds. This shows the need to exercise caution when using crowdsourced cough data.
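The framing and mel-spectrogram steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the sample rate, frame length, overlap, FFT size, or number of mel bands, so every value below (16 kHz, 1 s frames with 50% overlap, `n_fft=512`, `n_mels=64`) is an assumption chosen only for illustration, and the helper names `frame_signal`, `mel_filterbank`, and `log_mel_spectrogram` are hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames, zero-padding the tail."""
    x = np.asarray(x, dtype=np.float64)
    if len(x) <= frame_len:
        n_frames = 1
    else:
        n_frames = 1 + int(np.ceil((len(x) - frame_len) / hop_len))
    padded_len = frame_len + (n_frames - 1) * hop_len
    x = np.pad(x, (0, padded_len - len(x)))
    # Row i holds samples [i * hop_len, i * hop_len + frame_len).
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return x[idx]


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2."""
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, ctr, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_spectrogram(frame, sr, n_fft=512, hop=256, n_mels=64):
    """Log-mel spectrogram of one frame, shaped (n_mels, n_windows)."""
    windows = frame_signal(frame, n_fft, hop) * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(windows, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10).T


# Placeholder input: 2 s of noise standing in for a crowdsourced recording.
sr = 16000  # assumed sample rate
audio = np.random.default_rng(0).standard_normal(2 * sr)

# Assumed framing: 1 s frames with 50% overlap.
frames = frame_signal(audio, frame_len=sr, hop_len=sr // 2)
specs = np.stack([log_mel_spectrogram(f, sr) for f in frames])
print(frames.shape, specs.shape)
```

The resulting array has one mel-spectrogram per frame, in temporal order. In a pipeline of the kind the abstract describes, each spectrogram would be passed through the convolutional network, and the per-frame outputs would form the sequence consumed by the recurrent network.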