粤英混码语音的自动识别

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date : 2009-09-01 DOI:10.30019/IJCLCLP.200909.0003

Joyce Y. C. Chan, Houwei Cao, P. Ching, Tan Lee

{"title":"粤英混码语音的自动识别","authors":"Joyce Y. C. Chan, Houwei Cao, P. Ching, Tan Lee","doi":"10.30019/IJCLCLP.200909.0003","DOIUrl":null,"url":null,"abstract":"Code-mixing is a common phenomenon in bilingual societies. It refers to the intra-sentential switching of two different languages in a spoken utterance. This paper presents the first study on automatic recognition of Cantonese-English code-mixing speech, which is common in Hong Kong. This study starts with the design and compilation of code-mixing speech and text corpora. The problems of acoustic modeling, language modeling, and language boundary detection are investigated. Subsequently, a large-vocabulary code-mixing speech recognition system is developed based on a two-pass decoding algorithm. For acoustic modeling, it is shown that cross-lingual acoustic models are more appropriate than language-dependent models. The language models being used are character tri-grams, in which the embedded English words are grouped into a small number of classes. Language boundary detection is done either by exploiting the phonological and lexical differences between the two languages or is done based on the result of cross-lingual speech recognition. The language boundary information is used to re-score the hypothesized syllables or words in the decoding process. The proposed code-mixing speech recognition system attains the accuracies of 56.4% and 53.0% for the Cantonese syllables and English words in code-mixing utterances.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"63 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"37","resultStr":"{\"title\":\"Automatic Recognition of Cantonese-English Code-Mixing Speech\",\"authors\":\"Joyce Y. C. Chan, Houwei Cao, P. Ching, Tan Lee\",\"doi\":\"10.30019/IJCLCLP.200909.0003\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Code-mixing is a common phenomenon in bilingual societies. It refers to the intra-sentential switching of two different languages in a spoken utterance. This paper presents the first study on automatic recognition of Cantonese-English code-mixing speech, which is common in Hong Kong. This study starts with the design and compilation of code-mixing speech and text corpora. The problems of acoustic modeling, language modeling, and language boundary detection are investigated. Subsequently, a large-vocabulary code-mixing speech recognition system is developed based on a two-pass decoding algorithm. For acoustic modeling, it is shown that cross-lingual acoustic models are more appropriate than language-dependent models. The language models being used are character tri-grams, in which the embedded English words are grouped into a small number of classes. Language boundary detection is done either by exploiting the phonological and lexical differences between the two languages or is done based on the result of cross-lingual speech recognition. The language boundary information is used to re-score the hypothesized syllables or words in the decoding process. The proposed code-mixing speech recognition system attains the accuracies of 56.4% and 53.0% for the Cantonese syllables and English words in code-mixing utterances.\",\"PeriodicalId\":436300,\"journal\":{\"name\":\"Int. J. Comput. Linguistics Chin. Lang. Process.\",\"volume\":\"63 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"37\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Int. J. Comput. Linguistics Chin. Lang. Process.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.30019/IJCLCLP.200909.0003\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Comput. Linguistics Chin. Lang. Process.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.30019/IJCLCLP.200909.0003","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 37

摘要

语码混淆是双语社会中普遍存在的现象。它指的是在一个口语话语中两种不同语言的句内转换。本文首次对香港常见的粤英混码语音进行了自动识别研究。本文从语码混合语音和文本语料库的设计和编写入手。研究了声学建模、语言建模和语言边界检测等问题。在此基础上，提出了一种基于两路译码算法的大词汇混码语音识别系统。对于声学建模，跨语言声学模型比依赖语言的模型更合适。使用的语言模型是字符三格，其中嵌入的英语单词被分组为少数类。语言边界检测要么利用两种语言之间的语音和词汇差异来完成，要么基于跨语言语音识别的结果来完成。在解码过程中，利用语言边界信息对假设的音节或单词进行重新评分。本文提出的混码语音识别系统对混码语音中的粤语音节和英语单词的识别准确率分别达到56.4%和53.0%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Automatic Recognition of Cantonese-English Code-Mixing Speech

Code-mixing is a common phenomenon in bilingual societies. It refers to the intra-sentential switching of two different languages in a spoken utterance. This paper presents the first study on automatic recognition of Cantonese-English code-mixing speech, which is common in Hong Kong. This study starts with the design and compilation of code-mixing speech and text corpora. The problems of acoustic modeling, language modeling, and language boundary detection are investigated. Subsequently, a large-vocabulary code-mixing speech recognition system is developed based on a two-pass decoding algorithm. For acoustic modeling, it is shown that cross-lingual acoustic models are more appropriate than language-dependent models. The language models being used are character tri-grams, in which the embedded English words are grouped into a small number of classes. Language boundary detection is done either by exploiting the phonological and lexical differences between the two languages or is done based on the result of cross-lingual speech recognition. The language boundary information is used to re-score the hypothesized syllables or words in the decoding process. The proposed code-mixing speech recognition system attains the accuracies of 56.4% and 53.0% for the Cantonese syllables and English words in code-mixing utterances.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Int. J. Comput. Linguistics Chin. Lang. Process.

自引率

0.00%

发文量