IDEA: An Italian Dysarthric Speech Database

Marco Marini, Mauro Viganò, M. Corbo, M. Zettin, Gloria Simoncini, B. Fattori, Clelia D'Anna, M. Donati, L. Fanucci
{"title":"IDEA: An Italian Dysarthric Speech Database","authors":"Marco Marini, Mauro Viganò, M. Corbo, M. Zettin, Gloria Simoncini, B. Fattori, Clelia D'Anna, M. Donati, L. Fanucci","doi":"10.1109/SLT48900.2021.9383467","DOIUrl":null,"url":null,"abstract":"This paper describes IDEA a database of Italian dysarthric speech produced by 45 speakers affected by 8 different pathologies. Neurologic diagnoses were collected from the subjects’ medical records, while dysarthria assessment was conducted by a speech language pathologist and neurologist. The total number of records is 16794. The speech material consists of 211 isolated common words recorded by a single condenser microphone. The words that refer to an ambient assisted living scenario, have been selected to cover as widely as possible all Italian phonemes.The recordings, supervised by a speech pathologist, were recorded through the RECORDIA software that was developed specifically for this task. It allows multiple recording procedures depending on the patient severity and it includes an electronic record for storing patients’ clinical data. All the recordings in IDEA are annotated with a TextGrid file which defines the boundaries of the speech within the wave file and other types of notes about the record.This paper also includes preliminary experiments on the recorded data to train an automatic speech recognition system from a baseline Kaldi recipe. We trained HMM and DNN models and the results shows 11.75% and 14.99% of WER respectively.","PeriodicalId":243211,"journal":{"name":"2021 IEEE Spoken Language Technology Workshop (SLT)","volume":"330 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT48900.2021.9383467","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

This paper describes IDEA a database of Italian dysarthric speech produced by 45 speakers affected by 8 different pathologies. Neurologic diagnoses were collected from the subjects’ medical records, while dysarthria assessment was conducted by a speech language pathologist and neurologist. The total number of records is 16794. The speech material consists of 211 isolated common words recorded by a single condenser microphone. The words that refer to an ambient assisted living scenario, have been selected to cover as widely as possible all Italian phonemes.The recordings, supervised by a speech pathologist, were recorded through the RECORDIA software that was developed specifically for this task. It allows multiple recording procedures depending on the patient severity and it includes an electronic record for storing patients’ clinical data. All the recordings in IDEA are annotated with a TextGrid file which defines the boundaries of the speech within the wave file and other types of notes about the record.This paper also includes preliminary experiments on the recorded data to train an automatic speech recognition system from a baseline Kaldi recipe. We trained HMM and DNN models and the results shows 11.75% and 14.99% of WER respectively.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
想法:一个意大利困境语言数据库
本文描述了IDEA,这是一个由45名受8种不同病理影响的说话者产生的意大利语发音困难的数据库。神经学诊断从受试者的医疗记录中收集,构音障碍评估由言语语言病理学家和神经学家进行。记录总数为16794条。该语音材料由211个孤立的常用词组成,由单个电容式麦克风记录。这些词指的是环境辅助生活场景,已经被选择尽可能广泛地覆盖所有意大利语音素。录音由语言病理学家监督,通过专门为这项任务开发的RECORDIA软件进行记录。它允许根据患者的严重程度进行多种记录程序,并包括用于存储患者临床数据的电子记录。IDEA中的所有录音都用TextGrid文件进行注释,该文件定义了波文件中语音的边界和关于记录的其他类型的注释。本文还包括对记录数据的初步实验,以训练一个基于基线卡尔迪配方的自动语音识别系统。我们训练HMM和DNN模型,结果分别达到了11.75%和14.99%的WER。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Through the Words of Viewers: Using Comment-Content Entangled Network for Humor Impression Recognition Analysis of Multimodal Features for Speaking Proficiency Scoring in an Interview Dialogue Convolution-Based Attention Model With Positional Encoding For Streaming Speech Recognition On Embedded Devices Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream end-to-end ASR Speaker-Independent Visual Speech Recognition with the Inception V3 Model
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1