Poster: Vggish Embeddings Based Audio Classifiers to Improve Parkinson's Disease Diagnosis

2020 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE). Pub Date: 2020-12-01. DOI: 10.1145/3384420.3431775

Sruthi Kurada, Abhinav Kurada


Citations: 2

Abstract



The absence of highly predictive and readily applicable biomarkers for Parkinson's disease (PD) significantly hinders the diagnosis and subsequent monitoring of the condition. Since up to 90% of PD patients exhibit speech aberrations, however, the use of patient voice as a rapid diagnostic measure has shown significant promise. Past research towards creating voice-based automated diagnostic tools has relied on expert handcrafted audio feature sets that capture patient articulation, phonation, and prosody properties. Not only is there a limited consensus on the ideal contents of a PD audio diagnostic feature set, but also manually selected features may not fully exploit the predictive power of the underlying data. In this study, we demonstrate the benefit of employing VGGish embeddings, a more generalizable and higher throughput feature extraction strategy, for voice-based PD diagnosis. Our top VGGish-based model achieved 87% accuracy for detecting PD and significantly outperformed models trained on multiple handcrafted feature sets, a mel-frequency cepstral coefficient set, as well as an ImageNet pretrained convolutional neural network extraction strategy. VGGish models were also highly competitive with clinically determined UPDRS III–18 speech deterioration ratings for PD diagnosis. These results demonstrate the potential of VGGish embeddings for creating fast and accurate voice-based PD classification models.
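The abstract does not say how frame-level VGGish embeddings were pooled or which classifier was trained on them, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each recording has already been converted to a sequence of 128-dimensional VGGish frame embeddings, averages them into one vector per recording, and fits a nearest-centroid classifier. The mean pooling, the classifier, the helper names, and the synthetic data are all illustrative choices rather than the authors' pipeline, and the printed accuracy says nothing about real patients.

```python
import random

EMBEDDING_DIM = 128  # VGGish emits one 128-dimensional vector per ~0.96 s audio frame


def mean_pool(frames):
    """Average a list of equal-length vectors into one fixed-length vector."""
    n = len(frames)
    return [sum(frame[d] for frame in frames) / n for d in range(EMBEDDING_DIM)]


def fit_centroids(vectors, labels):
    """Compute one mean vector per class label (nearest-centroid classifier)."""
    centroids = {}
    for label in set(labels):
        members = [v for v, y in zip(vectors, labels) if y == label]
        centroids[label] = mean_pool(members)
    return centroids


def predict(centroids, vector):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], vector))


def synthetic_recording(rng, shift, n_frames=10):
    """Hypothetical stand-in for VGGish output: random frames, NOT patient data."""
    return [[rng.gauss(shift, 1.0) for _ in range(EMBEDDING_DIM)]
            for _ in range(n_frames)]


rng = random.Random(0)

# Assumed labels: 0 = control, 1 = PD. The 0.5 class shift is artificial.
train = [(synthetic_recording(rng, 0.5 * y), y) for y in (0, 1) for _ in range(20)]
centroids = fit_centroids([mean_pool(f) for f, _ in train], [y for _, y in train])

held_out = [(synthetic_recording(rng, 0.5 * y), y) for y in (0, 1) for _ in range(10)]
accuracy = sum(predict(centroids, mean_pool(f)) == y for f, y in held_out) / len(held_out)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

In practice the synthetic_recording helper would be replaced by embeddings extracted from real voice recordings, and the nearest-centroid step by whichever classifier is being evaluated. The point of the sketch is only that a pretrained embedding gives every recording a fixed-length feature vector without hand-selecting articulation, phonation, or prosody features.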
