Development of multilingual phonetic engine for four Indian languages
Lincy Babykutty, A. George, L. Mary
2016 International Conference on Next Generation Intelligent Systems (ICNGIS), 2016-09-01
DOI: https://doi.org/10.1109/ICNGIS.2016.7854044
Citations: 5
Abstract
A Phonetic Engine (PE) is a system that determines the sequence of phones in a spoken utterance. The speech database is transcribed using the International Phonetic Alphabet (IPA). This work develops a multilingual PE for four Indian languages: Bengali, Hindi, Urdu and Telugu. The approach can be extended to further languages. A read-speech corpus is used to develop the PE. Forty phonemes are identified for modeling by analysing the phoneme sets of these languages. The system is based on Hidden Markov Models (HMMs), with Mel-frequency Cepstral Coefficients (MFCCs) as features. The forty trained HMMs are used to derive a sequence of phonetic units for test utterances.
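The decoding step described in the abstract, deriving a phone sequence from trained HMMs, can be illustrated with a minimal Viterbi sketch. This is not the authors' implementation: it assumes a toy model with one state per phone and discrete observation symbols standing in for quantised MFCC frames, whereas a real PE would use multi-state HMMs over continuous features. The phone labels, symbols and probabilities below are all hypothetical values chosen for illustration.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state (phone) label for each observation frame."""
    if not observations:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at frame t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: best predecessor of state s at frame t + 1
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def collapse(path):
    """Merge runs of identical frame labels into a phone sequence."""
    phones = []
    for label in path:
        if not phones or phones[-1] != label:
            phones.append(label)
    return phones


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy model: two phones, two quantised feature symbols.
states = ["k", "a"]
log_start = {"k": math.log(0.9), "a": math.log(0.1)}
log_trans = to_log({"k": {"k": 0.6, "a": 0.4}, "a": {"k": 0.1, "a": 0.9}})
log_emit = to_log({"k": {"x": 0.8, "y": 0.2}, "a": {"x": 0.1, "y": 0.9}})

frames = ["x", "x", "y", "y", "y"]
path = viterbi(frames, states, log_start, log_trans, log_emit)
print(collapse(path))  # ['k', 'a']
```

Working in log-probabilities avoids numerical underflow on long utterances, and collapsing repeated frame labels turns the frame-level path into the phone-level transcription that a PE outputs.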