Feasibility Study of Parkinson's Speech Disorder Evaluation With Pre-Trained Deep Learning Model for Speech-to-Text Analysis.

Korean Journal of Neurotrauma (JCR Q3, Medicine) | Pub Date: 2024-09-23 | eCollection Date: 2024-09-01 | DOI: 10.13004/kjnt.2024.20.e30
Kwang Hyeon Kim, Byung-Jou Lee, Hae-Won Koo
Korean Journal of Neurotrauma, vol. 20, no. 3, pp. 168-179. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11450341/pdf/

Abstract

Objective: This study investigates the feasibility of using a pre-trained deep learning Wav2Vec model for speech-to-text analysis in individuals with speech disorders arising from Parkinson's disease (PD).

Methods: A publicly available dataset of speech recordings from healthy controls (HC) and individuals diagnosed with PD was used. The dataset also includes Hoehn and Yahr (H&Y) staging, Movement Disorder Society Unified Parkinson's Disease Rating Scale (UPDRS) Part I and Part II scores, and gender information. A speech-to-text analysis was performed on the PD patient data using the Wav2Vec model. The tasks included word letter classification, word match probability assessment, and analysis of the speech waveform characteristics provided by the model's output.
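
The "word letter classification" step can be illustrated with a minimal sketch of greedy CTC decoding, the decoding scheme used by commonly available wav2vec 2.0 speech recognition checkpoints. The abstract does not state which checkpoint or decoder was used, so everything here is an assumption for illustration: the toy vocabulary `VOCAB`, the function name `greedy_ctc_decode`, and the use of `|` as a word delimiter are hypothetical stand-ins, and the per-frame scores would in practice come from the model rather than being hand-written.

```python
from typing import List, Sequence

# Hypothetical toy vocabulary (an assumption, not the paper's):
# index 0 is the CTC blank symbol and "|" marks a word boundary.
VOCAB = ["<pad>", "|", "A", "C", "T"]


def greedy_ctc_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: List[str] = VOCAB,
    blank: int = 0,
) -> str:
    """Turn per-frame letter scores into text with greedy CTC decoding."""
    # Step 1: classify each audio frame as its highest-scoring symbol.
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]

    # Step 2: collapse consecutive repeats; step 3: drop blank symbols.
    chars = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            chars.append(vocab[idx])
        prev = idx

    # Step 4: map the word delimiter to a space.
    return "".join(chars).replace("|", " ").strip()
```

The blank symbol is what lets the decoder keep a genuinely doubled letter: two identical letters separated by a blank frame survive the collapse step, while two adjacent identical frames merge into one letter.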

Results: In the dataset of 20 cases, individuals with PD had a mean H&Y score of 2.50±0.67, a mean UPDRS II-part 5 score of 0.70±1.00, and a mean UPDRS III-part 18 score of 0.80±0.98. The number of words in the text decoded by speech recognition averaged 299.10±16.79 in the HC group and 259.80±93.39 in the PD group. Furthermore, the degree of agreement was calculated for all syllables based on the speech process. The accuracy for the reading sentences was observed to be 0.31 and 0.10, respectively.
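
The quantities reported above (word counts of the decoded text, a degree of agreement with the read passage, and mean±SD summaries) can be sketched as follows. The abstract does not define how agreement or accuracy was computed, so this is one plausible word-level formulation under stated assumptions, not the authors' metric: the function names `word_count`, `word_match_ratio`, and `summarize` are hypothetical, the alignment uses Python's `difflib.SequenceMatcher`, and the spread is the sample standard deviation.

```python
from difflib import SequenceMatcher
from statistics import mean, stdev
from typing import Sequence, Tuple


def word_count(transcript: str) -> int:
    """Number of whitespace-separated words in a decoded transcript."""
    return len(transcript.split())


def word_match_ratio(reference: str, decoded: str) -> float:
    """Fraction of reference words that align with the decoded text.

    Assumed metric: in-order word matches found by SequenceMatcher,
    divided by the number of words in the reference passage.
    """
    ref = reference.lower().split()
    hyp = decoded.lower().split()
    if not ref:
        return 0.0
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


def summarize(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation, as in a 'mean±SD' summary."""
    return mean(values), stdev(values)
```

For example, `word_match_ratio("the quick brown fox", "the quack brown")` aligns two of the four reference words. Applying `summarize` to the per-speaker word counts of each group would yield figures in the same mean±SD form as those reported, although whether the study used the sample or population standard deviation is not stated.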

Conclusion: This study aimed to demonstrate the effectiveness of Wav2Vec in enhancing speech-to-text analysis for patients with speech disorders. The findings could pave the way for clinical tools that improve diagnosis, evaluation, and communication support for this population.
