Creating a Spanish Speech Corpus to Develop Digital Dementia Biomarkers Using Machine Learning

2022 IEEE Mexican International Conference on Computer Science (ENC). Pub Date: 2022-08-24. DOI: 10.1109/ENC56672.2022.9882903

L. Cabrera-Leyva, Jesús Favela Vara, Dagoberto Cruz-Sandoval, Diana Leticia Paniagua Santos, Maricruz Huerta Jauregui

Abstract

Dementia is one of the most prevalent diseases affecting older adults in Mexico. There has been increasing interest in developing digital biomarkers of dementia based on the analysis of speech. The availability of high-quality speech corpora is important to advance this line of research. However, there is no publicly available Spanish dataset for this purpose. We therefore describe a protocol to capture Spanish audio from older adults for dementia research, along with the lessons learned and the adjustments to the protocol that emerged from a pilot study.
