{"title":"MFCC和DTW在语音测试中的言语障碍分类","authors":"Jueting Liu, Marisha Speights, Dallin J Bailey, Sicheng Li, Huanyi Zhou, Yaoxuan Luan, Tianshi Xie, Cheryl D. Seals","doi":"10.1109/CIC52973.2021.00015","DOIUrl":null,"url":null,"abstract":"Recognizing disordered speech is a challenge to Automatic Speech Recognition (ASR) systems. This research focuses on classifying disordered speech vs. non-disordered speech through signal processing coupled with machine learning techniques. We have found little evidence of ASR that correctly classifies disordered vs. ordered speech at the level of expert-based classification. This research supports the Automated Phonetic Transcription - Grading Tool (APTgt). APTgt is an online E-Learning system that supports Communications Disorders (CMDS) faculty during linguistic courses and provides reinforcement activities for phonetic transcription with the potential to improve the quality of students' learning efficacy and teachers' pedagogical experience. In addition, APTgt generates interactive practice sessions and exams, automatic grading, and exam analysis. This paper will focus on the classification module to classify disordered speech and non-disordered speech supporting APTgt. We utilize Mel-frequency cepstral coefficients (MFCCs) and dynamic time warping (DTW) to preprocess the audio files and calculate the similarity, and the Support Vector Machine (SVM) algorithm for classification and regression.","PeriodicalId":170121,"journal":{"name":"2021 IEEE 7th International Conference on Collaboration and Internet Computing (CIC)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Speech Disorders Classification in Phonetic Exams with MFCC and DTW\",\"authors\":\"Jueting Liu, Marisha Speights, Dallin J Bailey, Sicheng Li, Huanyi Zhou, Yaoxuan Luan, Tianshi Xie, Cheryl D. Seals\",\"doi\":\"10.1109/CIC52973.2021.00015\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recognizing disordered speech is a challenge to Automatic Speech Recognition (ASR) systems. This research focuses on classifying disordered speech vs. non-disordered speech through signal processing coupled with machine learning techniques. We have found little evidence of ASR that correctly classifies disordered vs. ordered speech at the level of expert-based classification. This research supports the Automated Phonetic Transcription - Grading Tool (APTgt). APTgt is an online E-Learning system that supports Communications Disorders (CMDS) faculty during linguistic courses and provides reinforcement activities for phonetic transcription with the potential to improve the quality of students' learning efficacy and teachers' pedagogical experience. In addition, APTgt generates interactive practice sessions and exams, automatic grading, and exam analysis. This paper will focus on the classification module to classify disordered speech and non-disordered speech supporting APTgt. 
We utilize Mel-frequency cepstral coefficients (MFCCs) and dynamic time warping (DTW) to preprocess the audio files and calculate the similarity, and the Support Vector Machine (SVM) algorithm for classification and regression.\",\"PeriodicalId\":170121,\"journal\":{\"name\":\"2021 IEEE 7th International Conference on Collaboration and Internet Computing (CIC)\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE 7th International Conference on Collaboration and Internet Computing (CIC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CIC52973.2021.00015\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 7th International Conference on Collaboration and Internet Computing (CIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIC52973.2021.00015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Speech Disorders Classification in Phonetic Exams with MFCC and DTW
Recognizing disordered speech is a challenge for Automatic Speech Recognition (ASR) systems. This research focuses on classifying disordered vs. non-disordered speech through signal processing coupled with machine learning techniques. We have found little evidence of ASR systems that correctly classify disordered vs. non-disordered speech at the level of expert-based classification. This research supports the Automated Phonetic Transcription - Grading Tool (APTgt). APTgt is an online E-Learning system that supports Communication Disorders (CMDS) faculty in linguistics courses and provides reinforcement activities for phonetic transcription, with the potential to improve students' learning efficacy and teachers' pedagogical experience. In addition, APTgt generates interactive practice sessions and exams, automatic grading, and exam analysis. This paper focuses on the classification module that distinguishes disordered from non-disordered speech in support of APTgt. We use Mel-frequency cepstral coefficients (MFCCs) and dynamic time warping (DTW) to preprocess the audio files and compute their similarity, and a Support Vector Machine (SVM) for classification.
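
To make the pipeline in the abstract concrete, below is a minimal sketch of one way to combine MFCC extraction, DTW-based similarity, and an SVM classifier using librosa and scikit-learn. The file names, the use of DTW distances to a set of non-disordered reference recordings as the feature vector, and the parameter choices (n_mfcc=13, RBF kernel) are illustrative assumptions, not the authors' published configuration.

# Sketch only: assumes librosa and scikit-learn are installed and that the
# listed .wav files exist; feature construction and parameters are assumptions.
import numpy as np
import librosa
from sklearn.svm import SVC


def mfcc_features(path, sr=16000, n_mfcc=13):
    """Load an audio file and return its MFCC matrix (n_mfcc x frames)."""
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)


def dtw_distance(mfcc_a, mfcc_b):
    """Accumulated DTW alignment cost between two MFCC sequences (lower = more similar)."""
    D, _ = librosa.sequence.dtw(X=mfcc_a, Y=mfcc_b, metric="euclidean")
    return D[-1, -1]


def similarity_vector(sample_path, reference_paths):
    """Represent one recording by its DTW distances to reference (non-disordered) recordings."""
    sample = mfcc_features(sample_path)
    return np.array([dtw_distance(sample, mfcc_features(r)) for r in reference_paths])


# Hypothetical file lists: reference productions plus labeled training samples.
reference_paths = ["ref_typical_01.wav", "ref_typical_02.wav"]
train_paths = ["sample_01.wav", "sample_02.wav", "sample_03.wav", "sample_04.wav"]
train_labels = [0, 1, 0, 1]  # 0 = non-disordered, 1 = disordered

X = np.vstack([similarity_vector(p, reference_paths) for p in train_paths])

clf = SVC(kernel="rbf")  # SVM trained on the DTW-similarity features
clf.fit(X, train_labels)

# Classify a new recording by its DTW distances to the same references.
new_features = similarity_vector("new_sample.wav", reference_paths).reshape(1, -1)
print("predicted label:", clf.predict(new_features)[0])

The key design choice sketched here is that DTW converts variable-length MFCC sequences into fixed-length similarity features, which the SVM can then separate; other ways of turning the DTW scores into classifier inputs are equally consistent with the abstract.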