A hybrid neural network/rule based system for bilingual text-to-phoneme mapping
Authors: E. B. Bilcu, J. Astola, J. Saarinen
DOI: 10.1109/MLSP.2004.1422992 (https://doi.org/10.1109/MLSP.2004.1422992)
Journal: 信号处理 (volume 109 1, pages 345-354)
Publication date: 2004-09-29
Citations: 8
Abstract
Text-to-phoneme (TTP) mapping is a preliminary step in text-to-speech synthesis, and it affects the naturalness and understandability of synthetic speech. In this paper, we propose a hybrid neural network/rule-based system for bilingual text-to-phoneme mapping. Our system uses three neural networks and a simple rule to perform the phoneme transcription. The first network is trained to convert the letters of the first language into their corresponding phonemes, and the second network does the same for the second language. The third neural network, together with a simple rule, is responsible for language recognition. The proposed approach can be easily extended to multilingual applications by introducing more neural networks. Simulations performed on a bilingual dictionary (English+French) show that our method improves phoneme accuracy over the approach that uses a single neural network for multilingual TTP.
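The routing idea in the abstract (recognize the language first, then hand the word to that language's letter-to-phoneme network) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say what the "simple rule" is, so a majority vote over per-letter language predictions is assumed here, and the three "networks" are hypothetical toy functions standing in for trained models. All names and the placeholder phoneme symbols are invented for the example.

```python
from collections import Counter
from typing import Callable, Dict, List

# A "network" here is any callable (word, letter_index) -> label.
# In a real system each would be a trained model; these are stand-ins.
LetterNet = Callable[[str, int], str]


def recognize_language(word: str, language_net: LetterNet) -> str:
    """Assumed rule: majority vote over the per-letter language predictions."""
    if not word:
        raise ValueError("word must be non-empty")
    votes = Counter(language_net(word, i) for i in range(len(word)))
    return votes.most_common(1)[0][0]


def transcribe(word: str, language_net: LetterNet,
               ttp_nets: Dict[str, LetterNet]) -> List[str]:
    """Route the word to the TTP network of the recognized language."""
    language = recognize_language(word, language_net)
    ttp_net = ttp_nets[language]
    phonemes = [ttp_net(word, i) for i in range(len(word))]
    # An empty string means the letter produces no phoneme (silent letter).
    return [p for p in phonemes if p]


# Toy stand-ins (hypothetical, not trained networks).
def toy_language_net(word: str, i: int) -> str:
    return "fr" if word[i] in "aeu" else "en"


def toy_english_net(word: str, i: int) -> str:
    return {"d": "d", "o": "Q", "g": "g"}.get(word[i], "")


def toy_french_net(word: str, i: int) -> str:
    return {"u": "o"}.get(word[i], "")  # in "eau" only the last letter emits


TTP_NETS = {"en": toy_english_net, "fr": toy_french_net}

print(transcribe("dog", toy_language_net, TTP_NETS))
print(transcribe("eau", toy_language_net, TTP_NETS))
```

Under this design, extending to more languages amounts to adding an entry to `TTP_NETS` and letting the language network emit the extra label, which mirrors the abstract's remark about adding more networks for multilingual use.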
About the journal:
Journal of Signal Processing is an academic journal supervised by the China Association for Science and Technology and sponsored by the China Institute of Electronics. It reflects the latest research results and technological progress in signal processing and related disciplines, publishing academic papers and review articles on new theories, ideas, and technologies in the field. The journal aims to provide a platform for academic exchange among researchers and engineers engaged in basic and applied research in signal processing, thereby promoting the development of information science and technology. At present, the journal is included in the three major domestic core journal databases, "China Science Citation Database (CSCD), China Science and Technology Core Journals (CSTPCD), Chinese Core Journals Overview", and in Coaj. It is also included in many foreign databases such as Scopus, CSA, EBSCO host, INSPEC, and JST.