Development and Validation of a Machine Learning Method Using Vocal Biomarkers for Identifying Frailty in Community-Dwelling Older Adults: Cross-Sectional Study.
Taehwan Kim, Jung-Yeon Choi, Myung Jin Ko, Kwang-Il Kim
{"title":"Development and Validation of a Machine Learning Method Using Vocal Biomarkers for Identifying Frailty in Community-Dwelling Older Adults: Cross-Sectional Study.","authors":"Taehwan Kim, Jung-Yeon Choi, Myung Jin Ko, Kwang-Il Kim","doi":"10.2196/57298","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>The two most commonly used methods to identify frailty are the frailty phenotype and the frailty index. However, both methods have limitations in clinical application. In addition, methods for measuring frailty have not yet been standardized.</p><p><strong>Objective: </strong>We aimed to develop and validate a classification model for predicting frailty status using vocal biomarkers in community-dwelling older adults, based on voice recordings obtained from the picture description task (PDT).</p><p><strong>Methods: </strong>We recruited 127 participants aged 50 years and older and collected clinical information through a short form of the Comprehensive Geriatric Assessment scale. Voice recordings were collected with a tablet device during the Korean version of the PDT, and we preprocessed audio data to remove background noise before feature extraction. Three artificial intelligence (AI) models were developed for identifying frailty status: SpeechAI (using speech data only), DemoAI (using demographic data only), and DemoSpeechAI (combining both data types).</p><p><strong>Results: </strong>Our models were trained and evaluated on the basis of 5-fold cross-validation for 127 participants and compared. The SpeechAI model, using deep learning-based acoustic features, outperformed in terms of accuracy and area under the receiver operating characteristic curve (AUC), 80.4% (95% CI 76.89%-83.91%) and 0.89 (95% CI 0.86-0.92), respectively, while the model using only demographics showed an accuracy of 67.96% (95% CI 67.63%-68.29%) and an AUC of 0.74 (95% CI 0.73-0.75). The SpeechAI model outperformed the model using only demographics significantly in AUC (t4=8.705 [2-sided]; P<.001). The DemoSpeechAI model, which combined demographics with deep learning-based acoustic features, showed superior performance (accuracy 85.6%, 95% CI 80.03%-91.17% and AUC 0.93, 95% CI 0.89-0.97), but there was no significant difference in AUC between the SpeechAI and DemoSpeechAI models (t4=1.057 [2-sided]; P=.35). Compared with models using traditional acoustic features from the openSMILE toolkit, the SpeechAI model demonstrated superior performance (AUC 0.89) over traditional methods (logistic regression: AUC 0.62; decision tree: AUC 0.57; random forest: AUC 0.66).</p><p><strong>Conclusions: </strong>Our findings demonstrate that vocal biomarkers derived from deep learning-based acoustic features can be effectively used to predict frailty status in community-dwelling older adults. The SpeechAI model showed promising accuracy and AUC, outperforming models based solely on demographic data or traditional acoustic features. Furthermore, while the combined DemoSpeechAI model showed slightly improved performance over the SpeechAI model, the difference was not statistically significant. 
These results suggest that speech-based AI models offer a noninvasive, scalable method for frailty detection, potentially streamlining assessments in clinical and community settings.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e57298"},"PeriodicalIF":3.1000,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11756832/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Medical Informatics","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.2196/57298","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
Abstract
Background: The two most commonly used methods to identify frailty are the frailty phenotype and the frailty index. However, both methods have limitations in clinical application. In addition, methods for measuring frailty have not yet been standardized.
Objective: We aimed to develop and validate a classification model for predicting frailty status using vocal biomarkers in community-dwelling older adults, based on voice recordings obtained from the picture description task (PDT).
Methods: We recruited 127 participants aged 50 years and older and collected clinical information through a short form of the Comprehensive Geriatric Assessment scale. Voice recordings were collected with a tablet device during the Korean version of the PDT, and we preprocessed audio data to remove background noise before feature extraction. Three artificial intelligence (AI) models were developed for identifying frailty status: SpeechAI (using speech data only), DemoAI (using demographic data only), and DemoSpeechAI (combining both data types).
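As an illustration of the kind of pipeline described above, the following Python sketch (not the authors' implementation) shows how a PDT recording could be denoised and converted into a deep acoustic embedding, and how the three model inputs (speech only, demographics only, combined) could be assembled. The spectral-gating noise reduction, the wav2vec 2.0 encoder, and the demographic fields are assumptions for illustration; the paper does not specify the preprocessing method or pretrained encoder used.

import numpy as np
import librosa
import noisereduce as nr
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

def extract_speech_embedding(wav_path: str) -> np.ndarray:
    # Load the recording at 16 kHz mono, the rate expected by wav2vec 2.0.
    waveform, sr = librosa.load(wav_path, sr=16_000, mono=True)
    # Suppress stationary background noise with spectral gating (assumed method).
    waveform = nr.reduce_noise(y=waveform, sr=sr)
    # Encode with a pretrained speech model and mean-pool over time frames.
    extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
    encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
    inputs = extractor(waveform, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state   # shape (1, frames, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()       # fixed-length vector (768,)

def build_model_inputs(wav_path: str, demographics: dict) -> dict:
    # Hypothetical demographic fields; the study's actual variables come from
    # the short-form Comprehensive Geriatric Assessment.
    demo = np.array([demographics["age"], demographics["sex"]], dtype=float)
    speech = extract_speech_embedding(wav_path)
    return {
        "SpeechAI": speech,                              # speech data only
        "DemoAI": demo,                                  # demographic data only
        "DemoSpeechAI": np.concatenate([speech, demo]),  # combined input
    }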
Results: Our models were trained, evaluated, and compared using 5-fold cross-validation on the 127 participants. The SpeechAI model, which used deep learning-based acoustic features, achieved an accuracy of 80.4% (95% CI 76.89%-83.91%) and an area under the receiver operating characteristic curve (AUC) of 0.89 (95% CI 0.86-0.92), whereas the demographics-only model achieved an accuracy of 67.96% (95% CI 67.63%-68.29%) and an AUC of 0.74 (95% CI 0.73-0.75). The SpeechAI model significantly outperformed the demographics-only model in AUC (t4=8.705 [2-sided]; P<.001). The DemoSpeechAI model, which combined demographics with deep learning-based acoustic features, performed best (accuracy 85.6%, 95% CI 80.03%-91.17%; AUC 0.93, 95% CI 0.89-0.97), but its AUC did not differ significantly from that of the SpeechAI model (t4=1.057 [2-sided]; P=.35). The SpeechAI model (AUC 0.89) also outperformed models built on traditional acoustic features from the openSMILE toolkit (logistic regression: AUC 0.62; decision tree: AUC 0.57; random forest: AUC 0.66).
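The evaluation protocol reported above can be outlined with the following Python sketch: stratified 5-fold cross-validation, one AUC per fold, and a paired 2-sided t-test across folds (4 degrees of freedom, matching the reported t4 statistics). The feature matrices, labels, and logistic-regression classifier are placeholders, not the authors' data or models.

import numpy as np
from scipy.stats import ttest_rel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
n = 127                                # sample size reported in the study
X_speech = rng.normal(size=(n, 768))   # placeholder deep acoustic embeddings
X_demo = rng.normal(size=(n, 5))       # placeholder demographic features
y = rng.integers(0, 2, size=n)         # placeholder frailty labels

def fold_aucs(X, y, n_splits=5):
    # One AUC per fold from a simple stand-in classifier.
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    aucs = []
    for train_idx, test_idx in skf.split(X, y):
        clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
        scores = clf.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], scores))
    return np.array(aucs)

auc_speech = fold_aucs(X_speech, y)
auc_demo = fold_aucs(X_demo, y)

# Paired 2-sided t-test over the 5 per-fold AUCs (4 degrees of freedom).
t_stat, p_value = ttest_rel(auc_speech, auc_demo)
print(f"speech AUC {auc_speech.mean():.2f} vs demographics AUC {auc_demo.mean():.2f}; "
      f"t4={t_stat:.3f}, P={p_value:.3f}")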
Conclusions: Our findings demonstrate that vocal biomarkers derived from deep learning-based acoustic features can be effectively used to predict frailty status in community-dwelling older adults. The SpeechAI model showed promising accuracy and AUC, outperforming models based solely on demographic data or traditional acoustic features. Furthermore, while the combined DemoSpeechAI model showed slightly improved performance over the SpeechAI model, the difference was not statistically significant. These results suggest that speech-based AI models offer a noninvasive, scalable method for frailty detection, potentially streamlining assessments in clinical and community settings.
Journal introduction:
JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, eHealth infrastructures, and implementation. It has a focus on applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry, and health informatics professionals.
Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals rather than consumers/citizens, which is the focus of JMIR), publishes even faster, and also allows papers that are more technical or more formative than those published in the Journal of Medical Internet Research.