Screening Voice Disorders: Acoustic Voice Quality Index, Cepstral Peak Prominence, and Machine Learning.

Impact factor: 1.1; Zone 4 (Medicine); JCR Q3 (Audiology & Speech-Language Pathology); Folia Phoniatrica et Logopaedica; Pub Date: 2025-02-21; DOI: 10.1159/000544852
Ahmed M Yousef, Adrián Castillo-Allendes, Mark L Berardi, Juliana Codino, Adam D Rubin, Eric J Hunter
Citations: 0

Abstract

Introduction: The Acoustic Voice Quality Index (AVQI) and smoothed Cepstral Peak Prominence (CPPs) have been reported to effectively support the assessment of voice quality in persons seeking voice care across many languages. This study aims to evaluate the diagnostic accuracy of these two measures in detecting voice disorders in American English speakers and to compare their performance with that of machine learning (ML) models.

Methods: This retrospective study included a cohort of 187 participants: 138 patients with clinically diagnosed voice disorders and 49 vocally healthy individuals. Each participant completed two voicing tasks, sustaining the vowel [a:] and producing a running speech sample, and the two recordings were then concatenated. The concatenated samples were analyzed with VOXplot software to obtain AVQI-3 (version 03.01) and CPPs. Additionally, four ML models (Random Forest (RF), k-Nearest Neighbors (k-NN), Support Vector Machine (SVM), and Decision Tree (DT)) were trained for comparison. The diagnostic accuracy of the two measures and the models was assessed using several evaluation metrics, including the receiver operating characteristic (ROC) curve and the Youden index.
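To make the cutoff-selection step concrete, the following is a minimal sketch of how a Youden-index cutoff can be derived from a single acoustic score. It is not the authors' pipeline: the function name `youden_cutoff`, the `higher_is_positive` flag, and the toy data are all hypothetical and chosen only for illustration. The flag reflects the usual convention that higher AVQI values and lower CPPs values indicate poorer voice quality.

```python
from typing import Sequence, Tuple


def youden_cutoff(
    scores: Sequence[float],
    labels: Sequence[int],
    higher_is_positive: bool = True,
) -> Tuple[float, float, float, float]:
    """Return (cutoff, sensitivity, specificity, J) that maximizes Youden's J.

    labels: 1 = voice disorder, 0 = vocally healthy.
    higher_is_positive: True if larger scores suggest a disorder (AVQI-like),
    False if smaller scores suggest a disorder (CPPs-like).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    best = None
    # Every observed score is tried as a candidate cutoff (one ROC point each).
    for cutoff in sorted(set(scores)):
        if higher_is_positive:
            predicted = [s >= cutoff for s in scores]
        else:
            predicted = [s <= cutoff for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        tn = sum(1 for p, y in zip(predicted, labels) if not p and y == 0)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0  # Youden's J statistic
        # Strict comparison keeps the lowest cutoff when several tie.
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best


# Hypothetical toy scores, NOT data from the study.
toy_scores = [1.0, 1.2, 1.4, 1.6, 2.0, 2.5, 3.0, 0.9]
toy_labels = [0, 0, 1, 0, 1, 1, 1, 0]
cutoff, sens, spec, j = youden_cutoff(toy_scores, toy_labels)
print(f"cutoff={cutoff}, sensitivity={sens:.2f}, specificity={spec:.2f}, J={j:.2f}")
```

On real data, the same scan would be run once per measure, with `higher_is_positive=False` for a CPPs-like score.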

Results: A cutoff score of 1.54 for AVQI-3 (55% sensitivity, 80% specificity) and a cutoff of 14.35 dB for CPPs (65% sensitivity, 78% specificity) were identified for detecting voice disorders. Set against an average ML sensitivity of 89% and specificity of 55%, CPPs offered the best balance between sensitivity and specificity, outperforming AVQI-3 and nearly matching the average ML performance.
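One way to read the comparison is through Youden's J (sensitivity + specificity − 1), computed here from the percentages reported in the abstract. This is a sketch for orientation only: the abstract does not state how "balance" was quantified, so the `balance_gap` helper (the absolute difference between sensitivity and specificity) is an assumption introduced for illustration.

```python
# Sensitivity and specificity as reported in the abstract.
reported = {
    "AVQI-3 (cutoff 1.54)": (0.55, 0.80),
    "CPPs (cutoff 14.35 dB)": (0.65, 0.78),
    "ML models (average)": (0.89, 0.55),
}


def youden_j(sens: float, spec: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sens + spec - 1.0


def balance_gap(sens: float, spec: float) -> float:
    """Hypothetical balance measure: smaller means more even sens/spec."""
    return abs(sens - spec)


summary = {
    name: (round(youden_j(se, sp), 2), round(balance_gap(se, sp), 2))
    for name, (se, sp) in reported.items()
}

for name, (j, gap) in summary.items():
    print(f"{name}: J = {j:.2f}, |sens - spec| = {gap:.2f}")
```

Under these assumptions, CPPs reaches J = 0.43 against 0.44 for the ML average and 0.35 for AVQI-3, while showing the smallest gap between sensitivity and specificity (0.13 versus 0.34 and 0.25).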

Conclusions: Machine learning shows great potential for supporting voice disorder diagnostics, especially as models become more generalizable and easier to interpret. However, current tools like AVQI-3 and CPPs remain more practical and accessible for clinical use in evaluating voice quality than commonly implemented models. CPPs, in particular, offers distinct advantages for identifying voice disorders, making it a recommended and feasible choice for clinics with limited resources.

Source journal: Folia Phoniatrica et Logopaedica
Subject categories: Audiology & Speech-Language Pathology; Otorhinolaryngology
CiteScore: 2.30
Self-citation rate: 10.00%
Articles published: 28
Review time: >12 weeks
About the journal: Published since 1947, Folia Phoniatrica et Logopaedica provides a forum for international research on the anatomy, physiology, and pathology of structures of the speech, language, and hearing mechanisms. Original papers published in this journal report new findings on basic function, assessment, management, and test development in communication sciences and disorders, as well as experiments designed to test specific theories of speech, language, and hearing function. High-quality review papers are also welcomed.