Detection of Mild Cognitive Impairment From Non-Semantic, Acoustic Voice Features: The Framingham Heart Study

IF 5 Q1 GERIATRICS & GERONTOLOGY JMIR Aging Pub Date : 2024-08-22 DOI:10.2196/55126
Huitong Ding, Adrian Lister, Cody Karjadi, Rhoda Au, Honghuang Lin, Brian Bischoff, Phillip H Hwang
JMIR Aging, volume 7, e55126. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11377909/pdf/
Citations: 0

Abstract

Detection of Mild Cognitive Impairment From Non-Semantic, Acoustic Voice Features: The Framingham Heart Study.

Background: With the aging global population and the rising burden of Alzheimer disease and related dementias (ADRDs), there is a growing focus on identifying mild cognitive impairment (MCI) to enable timely interventions that could potentially slow down the onset of clinical dementia. The production of speech by an individual is a cognitively complex task that engages various cognitive domains. The ease of audio data collection highlights the potential cost-effectiveness and noninvasive nature of using human speech as a tool for cognitive assessment.

Objective: This study aimed to construct a machine learning pipeline that incorporates speaker diarization, feature extraction, feature selection, and classification to identify a set of acoustic features derived from voice recordings that exhibit strong MCI detection capability.

Methods: The study included 100 MCI cases and 100 cognitively normal controls matched for age, sex, and education from the Framingham Heart Study. Participants' spoken responses on neuropsychological tests were recorded, and the recorded audio was processed to identify segments of each participant's voice from recordings that included voices of both testers and participants. A comprehensive set of 6385 acoustic features was then extracted from these voice segments using OpenSMILE and Praat software. Subsequently, a random forest model was constructed to classify cognitive status using the features that exhibited significant differences between the MCI and cognitively normal groups. The MCI detection performance of various audio lengths was further examined.
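The screening step described in Methods (keeping only features that differ significantly between the MCI and cognitively normal groups) can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not name the statistical test or threshold, so the Mann-Whitney U test (normal approximation, no tie correction), `alpha=0.05`, the function names, and the synthetic data are all assumptions. In the study, the surviving features would then be passed to a random forest classifier, which is not shown here.

```python
import math
import random


def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney U p-value via the normal approximation.

    Assumption: the source does not say which test was used; this is one
    common choice for comparing a feature between two groups.
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    # Assign average ranks so tied values share the same rank.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_a - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


def screen_features(cases, controls, alpha=0.05):
    """Return indices of feature columns whose group difference has p < alpha."""
    selected = []
    for j in range(len(cases[0])):
        p = mann_whitney_p([row[j] for row in cases],
                           [row[j] for row in controls])
        if p < alpha:
            selected.append(j)
    return selected


# Synthetic stand-in data (not the study's): 100 cases and 100 controls,
# 5 features, with only feature 0 shifted between the groups.
rng = random.Random(0)
controls = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(100)]
cases = [[rng.gauss(1.5 if j == 0 else 0, 1) for j in range(5)]
         for _ in range(100)]
selected = screen_features(cases, controls)
print("features passed to the classifier:", selected)
```

With 6385 candidate features, an uncorrected threshold would admit many features by chance, so a real pipeline would likely pair this screen with a multiple-testing correction or a further selection step; the abstract does not specify which.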

Results: An optimal subset of 29 features was identified that resulted in an area under the receiver operating characteristic curve of 0.87, with a 95% CI of 0.81-0.94. The most important acoustic feature for MCI classification was the number of filled pauses (importance score=0.09, P=3.10E-08). There was no substantial difference in the performance of the model trained on the acoustic features derived from different lengths of voice recordings.
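For readers unfamiliar with how an area under the receiver operating characteristic curve and its confidence interval can be obtained, the sketch below computes the AUC from classifier scores and a percentile bootstrap interval around it. The abstract does not state how the reported 95% CI was derived, so the bootstrap, the function names, and the synthetic scores are assumptions for illustration only.

```python
import random


def auc(scores_pos, scores_neg):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case score and a control score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for the AUC, resampling each group separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        pos = rng.choices(scores_pos, k=len(scores_pos))
        neg = rng.choices(scores_neg, k=len(scores_neg))
        stats.append(auc(pos, neg))
    stats.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]


# Synthetic classifier scores (not the study's): 100 cases, 100 controls.
rng = random.Random(1)
case_scores = [rng.gauss(1.0, 1.0) for _ in range(100)]
control_scores = [rng.gauss(0.0, 1.0) for _ in range(100)]

point = auc(case_scores, control_scores)
ci_low, ci_high = bootstrap_auc_ci(case_scores, control_scores, n_boot=200)
print(f"AUC = {point:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

Resampling cases and controls separately keeps the group sizes fixed at 100 each, which mirrors the matched design described in Methods.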

Conclusions: This study showcases the potential of monitoring changes to nonsemantic and acoustic features of speech as a way of early ADRD detection and motivates future opportunities for using human speech as a measure of brain health.

Source journal: JMIR Aging
Subject area: Social Sciences - Health (social science)
CiteScore: 6.50
Self-citation rate: 4.10%
Articles published: 71
Review time: 12 weeks
Latest articles from this journal
Functional Monitoring of Patients With Knee Osteoarthritis Based on Multidimensional Wearable Plantar Pressure Features: Cross-Sectional Study.
Social Robots and Sensors for Enhanced Aging at Home: Mixed Methods Study With a Focus on Mobility and Socioeconomic Factors.
Subjective Cognitive Concerns are Associated with Worse Performance on Mobile-App Based Cognitive Assessment: An Observational Study in Cognitively Normal Older Adults.
Examining Whether Patient Portal and Video Visit Use Differs by Race and Ethnicity Among Older Adults in a US Integrated Health Care Delivery System: Cross-Sectional Electronic Health Record and Survey-Based Study.
Exploring the Landscape of Standards and Guidelines in AgeTech Design and Development: Scoping Review and Thematic Analysis.