Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds

Proceedings: ICSLP. International Conference on Spoken Language Processing. Pub Date: 1996-10-03. DOI: 10.21437/ICSLP.1996-556

J. Köhler

Pages: 2195-2198

Citations: 101

Abstract

The aim of this work is to exploit the acoustic-phonetic similarities between several languages. In recent work, cross-language HMM-based phoneme models have been used only for bootstrapping language-dependent models, and the multi-lingual approach has been investigated only on very small speech corpora. The author introduces a statistical distance measure to determine the similarities of sounds. Further, he presents a new technique to model multi-lingual phonemes. The experiments are conducted with the OGI Multi-Language Telephone Speech Corpus for American English, German and Spanish. In the first experiment, phoneme recognition rates between 39.0% and 53.9% are achieved using language-dependent models. Using cross-language models yields improvement for some phonemes, but on average a degradation of recognition performance is observed. However, cross-language models speed up cross-language transfer and reduce the size of the phoneme inventory of multi-lingual speech recognition systems. Finally, a new method of modelling multi-lingual phonemes, which can be used for a variety of languages, is presented. This technique reduces the number of phoneme-based units in a multi-lingual speech recognition system.
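The abstract does not say which statistical distance measure is used or how similar sounds are merged. As a rough illustration only, the sketch below assumes each language-dependent phoneme is summarised by a single diagonal-covariance Gaussian, uses a symmetrised Kullback-Leibler divergence as a stand-in distance, and groups phonemes from different languages whose distance falls below a threshold. The function names, the toy feature values and the threshold are all hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_distance(model_a, model_b):
    """Symmetrised KL divergence (assumed stand-in for the paper's measure)."""
    mean_a, var_a = model_a
    mean_b, var_b = model_b
    return 0.5 * (kl_diag_gauss(mean_a, var_a, mean_b, var_b)
                  + kl_diag_gauss(mean_b, var_b, mean_a, var_a))


def group_units(models, threshold):
    """Group phonemes of different languages that lie closer than threshold.

    models maps (language, phoneme) to (mean vector, variance vector).
    Grouping is single-link, implemented with a small union-find.
    """
    parent = {key: key for key in models}

    def find(key):
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    for a, b in combinations(models, 2):
        different_language = a[0] != b[0]
        if different_language and symmetric_distance(models[a], models[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for key in models:
        groups.setdefault(find(key), []).append(key)
    return sorted(groups.values())


# Toy two-dimensional "acoustic" statistics; the values are invented.
models = {
    ("EN", "s"): ([1.0, 2.0], [1.0, 1.0]),
    ("DE", "s"): ([1.1, 2.1], [1.0, 1.0]),
    ("SP", "s"): ([0.9, 1.9], [1.0, 1.0]),
    ("EN", "ae"): ([5.0, -3.0], [1.0, 1.0]),
}
groups = group_units(models, threshold=0.1)
print(len(models), "language-dependent units ->", len(groups), "shared units")
```

In this toy setup the three /s/ models collapse into one shared unit while the English vowel stays separate, which mirrors the inventory reduction the abstract describes. A real system would compare HMM state distributions estimated from speech data rather than hand-written vectors.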


