Authors: Tanmay Bhowmik, S. Mandal
Venue: 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA)
Published: 2017-11-01
DOI: https://doi.org/10.1109/ICSDA.2017.8384455
Inclusion of manner of articulation to achieve improved phoneme classification accuracy for Bengali continuous speech
In this experiment, a phoneme classification model is developed using a Deep Neural Network (DNN) based framework. The experiment is conducted in two phases. In the first phase, the phoneme classification task is performed, and the deep-structured model achieves an overall classification accuracy of 87.8%. Precision and recall values are reported for every phoneme, and a confusion matrix over all the Bengali phonemes is derived. Using this confusion matrix, the phonemes are classified into nine groups. These nine groups give a higher overall classification accuracy of 98.7%, and a new confusion matrix derived for the nine groups shows a lower confusion rate. In the second phase, the nine groups are reclassified into 15 groups using knowledge of the manner of articulation, and the deep-structured model is retrained. The system then achieves an overall classification accuracy of 98.9%, which is almost equal to the accuracy observed for the nine groups. However, because the nine groups are redivided into 15 groups, the phoneme confusion within any single group is reduced, which leads to a better phoneme classification model.
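The evaluation steps named in the abstract (a confusion matrix, per-phoneme precision and recall, and accuracy measured over phoneme groups) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the four phoneme labels, the two group names, the toy predictions, and the function names are hypothetical and are not the paper's data or code. It also scores groups by collapsing an existing phoneme-level confusion matrix, whereas the abstract states that the model is retrained for the 15-group setting.

```python
# Hypothetical toy example; labels, groups, and predictions are invented.

def confusion_matrix(true_labels, pred_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, pred_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

def precision_recall(matrix, classes):
    """Return {class: (precision, recall)} from a confusion matrix."""
    scores = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        scores[c] = (tp / predicted if predicted else 0.0,
                     tp / actual if actual else 0.0)
    return scores

def collapse_to_groups(matrix, classes, group_of):
    """Merge phoneme rows and columns that belong to the same group."""
    groups = sorted({group_of[c] for c in classes})
    gindex = {g: i for i, g in enumerate(groups)}
    grouped = [[0] * len(groups) for _ in groups]
    for i, ci in enumerate(classes):
        for j, cj in enumerate(classes):
            grouped[gindex[group_of[ci]]][gindex[group_of[cj]]] += matrix[i][j]
    return grouped, groups

# Assumed toy setup: four phonemes in two manner-of-articulation groups.
CLASSES = ["p", "b", "m", "n"]
GROUP_OF = {"p": "stop", "b": "stop", "m": "nasal", "n": "nasal"}
TRUE = ["p", "p", "b", "b", "m", "m", "n", "n"]
PRED = ["p", "b", "b", "p", "m", "n", "n", "b"]

phoneme_cm = confusion_matrix(TRUE, PRED, CLASSES)
group_cm, group_names = collapse_to_groups(phoneme_cm, CLASSES, GROUP_OF)

print("phoneme accuracy:", overall_accuracy(phoneme_cm))
print("group accuracy:", overall_accuracy(group_cm))
print("precision/recall:", precision_recall(phoneme_cm, CLASSES))
```

In this toy data most errors fall between phonemes of the same group, so the group-level accuracy is higher than the phoneme-level accuracy. That is the same qualitative effect the abstract reports when moving from individual phonemes to groups.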