D. Magdum, Manisha Shukla Dubey, T. Patil, Ronak Shah, S. Belhe, Mahesh Kulkarni
Title: Methodology for designing and creating Hindi speech corpus
Published in: 2015 International Conference on Signal Processing and Communication Engineering Systems
Publication date: 2015-03-12
DOI: 10.1109/SPACES.2015.7058279 (https://doi.org/10.1109/SPACES.2015.7058279)
Citation count: 9
Abstract
This paper describes the methodology we used for data collection and recording for our Hindi text-to-speech (TTS) system. The design of the speech corpus plays a very important role in the overall quality of a text-to-speech system. A large text corpus of one million words was created for the existing text-to-speech system. We crawled text from many domains, such as finance, government, and current news, and also drew on pre-built dictionaries. For the first time, we also generated and incorporated text from Hindi Short Messaging Service (SMS) messages. The aim was to build a generic speech corpus for Hindi. The crawled text was first filtered for correctness, e.g. spelling mistakes, validity as Hindi, and word length. The filtered words were then carefully analyzed to ensure that the prepared text is phonetically balanced. This curated text was then recorded by a professional recordist in a studio environment. The recorded speech data was then processed and annotated to generate the final speech corpus. The paper explains the speech corpus creation process through its text crawling, filtering, recording, and annotation phases. The final speech corpus is used in the Hindi text-to-speech system, which achieves a mean opinion score (MOS) of 2.8.
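The filtering and text-selection steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the length limits, and the use of the Unicode Devanagari block as a test for "validity as Hindi" are all assumptions made for illustration. The selection step uses distinct Devanagari characters as a crude stand-in for phonetic units and greedily maximizes coverage, which is only one simple way to approximate a phonetically balanced prompt set.

```python
import unicodedata

# Assumption: a word counts as "valid Hindi" if every character lies in the
# Unicode Devanagari block (U+0900 to U+097F). The paper does not state its rule.
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F


def is_devanagari_word(word):
    """Return True if the word is non-empty and purely Devanagari."""
    return bool(word) and all(
        DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END for ch in word
    )


def filter_words(words, min_len=1, max_len=15):
    """Drop duplicates, non-Devanagari tokens, and words outside the length
    limits (limits are hypothetical), preserving first-seen order."""
    seen = set()
    kept = []
    for raw in words:
        word = unicodedata.normalize("NFC", raw.strip())
        if word in seen or not is_devanagari_word(word):
            continue
        if not (min_len <= len(word) <= max_len):
            continue
        seen.add(word)
        kept.append(word)
    return kept


def units(sentence):
    """Crude proxy for phonetic units: the set of Devanagari characters."""
    return {ch for ch in sentence if DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END}


def greedy_select(sentences, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered units."""
    covered = set()
    selected = []
    remaining = list(sentences)
    while remaining and len(selected) < max_sentences:
        best = max(remaining, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:  # nothing new to cover, so stop early
            break
        selected.append(best)
        covered |= gain
        remaining.remove(best)
    return selected, covered
```

A real pipeline would replace `units` with a grapheme-to-phoneme conversion and would typically match a target phone frequency distribution rather than only maximizing coverage.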