Developing Bengali Speech Corpus for Phone Recognizer Using Optimum Text Selection Technique
S. Mandal, B. Das, Pabitra Mitra, A. Basu
2011 International Conference on Asian Language Processing, 2011-11-15
DOI: 10.1109/IALP.2011.16 (https://doi.org/10.1109/IALP.2011.16)
Citations: 27
Abstract
A speech corpus plays a key role in the construction of automatic speech recognition (ASR), text-to-speech (TTS) synthesis, and phone recognition (PR) systems. PR and ASR systems are quite similar in functionality. The difference is that a PR system converts the speech signal to phone text (a phone being the smallest discrete segment of sound in uttered speech), whereas an ASR system converts it to word text. A speech corpus for a PR system usually consists of a text corpus, recorded data corresponding to the text corpus, a phonetic representation of the text corpus, and a pronunciation dictionary. Selecting optimum text with a balanced phone distribution from the available text is an important task in developing a high-quality PR system. In this paper, we describe our text selection technique and discuss the performance of the phone recognition system.
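
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common way to pick text with a balanced phone distribution: a greedy pass that repeatedly chooses the sentence whose phones are still rarest in the selected set. It is not the authors' method. The function name `select_sentences`, the `budget` parameter, the scoring rule, and the toy phone strings are all assumptions made for illustration.

```python
from collections import Counter


def select_sentences(candidates, budget):
    """Greedily pick up to `budget` sentences, favoring phones that are still rare.

    `candidates` maps a sentence id to its list of phone symbols
    (hypothetical input format, assumed for this sketch).
    """
    counts = Counter()            # phone counts over the sentences selected so far
    remaining = dict(candidates)  # copy, so the caller's dict is not modified
    selected = []

    def gain(item):
        _, phones = item
        # A phone already seen c times contributes 1 / (1 + c), so unseen
        # and rare phones are worth more than common ones. Dividing by the
        # sentence length avoids always preferring long sentences.
        return sum(1.0 / (1 + counts[p]) for p in phones) / max(len(phones), 1)

    while remaining and len(selected) < budget:
        best_id, best_phones = max(remaining.items(), key=gain)
        selected.append(best_id)
        counts.update(best_phones)
        del remaining[best_id]
    return selected, counts


# Toy example with made-up phone sequences (not data from the paper).
candidates = {
    "s1": ["k", "a", "l"],
    "s2": ["k", "a", "k", "a"],
    "s3": ["b", "o", "n"],
    "s4": ["k", "a", "l", "a"],
}
selected, counts = select_sentences(candidates, budget=2)
print(selected)
print(dict(counts))
```

In this sketch, a sentence that only repeats phones already covered scores lower than one that introduces new phones, which pushes the selected set toward a flatter phone distribution. A real corpus would also need a grapheme-to-phone step to produce the phone lists, which is outside the scope of this example.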