Authors: Miklós Gábriel Tulics, K. Vicsi
Published in: 2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), 2017-09-01
DOI: 10.1109/COGINFOCOM.2017.8268210 (https://doi.org/10.1109/COGINFOCOM.2017.8268210)
Phonetic-class based correlation analysis for severity of dysphonia
The main purpose of this research is to model the cognitive processes that occur when a physician determines the severity of dysphonia, and to build an IT system that can substitute for the subjective severity assessment made by a clinician. This preliminary study investigates the relationship between acoustic parameters and the severity of the speech defect as determined by a clinician. Because the number of pathological speech samples is limited, choosing effective parameters is very important. After phoneme-level segmentation, acoustic parameters were measured at predetermined fixed points in continuous speech. The parameters were grouped by phonetic class (classes defined by manner of articulation), and the correlation of the grouped parameters with the severity of dysphonia on the RBH scale was examined, where R stands for roughness, B for breathiness, and H for overall hoarseness. The analysis was carried out on a database containing several types of pathology, the most frequent being recurrent paresis and functional dysphonia. It was found that, beyond the initial acoustic parameters measured on vowels, such as jitter (ddp), shimmer (dda), Harmonics-to-Noise Ratio (HNR), and mel-frequency cepstral coefficients (MFCC), it is worth measuring the Soft Phonation Index (SPI) and empirical mode decomposition (EMD) based frequency band ratios on different phonetic classes. These measures were found to correlate with the severity of dysphonia as determined by the clinician (RBH). They provide useful information and could help differentiate types of dysphonia such as functional dysphonia and recurrent paresis.
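The grouping-and-correlation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which correlation coefficient was used, so Spearman rank correlation is assumed here because clinician severity scores are ordinal. Averaging each parameter per speaker within a phonetic class is also an assumption, and the function names, class labels, and input layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation (assumed choice, not stated in the source)."""
    return pearson(average_ranks(x), average_ranks(y))


def correlate_by_class(measurements, severity, min_speakers=3):
    """Correlate each (phonetic class, parameter) pair with severity.

    measurements: iterable of (speaker_id, phonetic_class, parameter, value)
    severity: dict mapping speaker_id to a clinician score (e.g. H of RBH)
    Returns {(phonetic_class, parameter): rho}.
    """
    # Assumption: average repeated measurements per speaker within a class.
    sums = defaultdict(lambda: [0.0, 0])
    for speaker, pclass, param, value in measurements:
        cell = sums[(pclass, param, speaker)]
        cell[0] += value
        cell[1] += 1

    grouped = defaultdict(dict)
    for (pclass, param, speaker), (total, count) in sums.items():
        grouped[(pclass, param)][speaker] = total / count

    result = {}
    for key, per_speaker in grouped.items():
        speakers = sorted(s for s in per_speaker if s in severity)
        if len(speakers) < min_speakers:
            continue  # too few speakers for a meaningful coefficient
        x = [per_speaker[s] for s in speakers]
        y = [severity[s] for s in speakers]
        result[key] = spearman(x, y)
    return result
```

Calling `correlate_by_class` with per-phoneme measurements and a speaker-to-severity mapping yields one coefficient per class and parameter pair. This makes it easy to see, for example, whether a parameter measured on nasals tracks severity more closely than the same parameter measured on vowels.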