人工重音语音的音素相似性研究

Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI:10.18653/v1/2023.sigmorphon-1.6

Margot Masson, Julie Carson-Berndsen

{"title":"人工重音语音的音素相似性研究","authors":"Margot Masson, Julie Carson-Berndsen","doi":"10.18653/v1/2023.sigmorphon-1.6","DOIUrl":null,"url":null,"abstract":"While the deep learning revolution has led to significant performance improvements in speech recognition, accented speech remains a challenge. Current approaches to this challenge typically do not seek to understand and provide explanations for the variations of accented speech, whether they stem from native regional variation or non-native error patterns. This paper seeks to address non-native speaker variations from both a knowledge-based and a data-driven perspective. We propose to approximate non-native accented-speech pronunciation patterns by the means of two approaches: based on phonetic and phonological knowledge on the one hand and inferred from a text-to-speech system on the other. Artificial speech is then generated with a range of variants which have been captured in confusion matrices representing phoneme similarities. We then show that non-native accent confusions actually propagate to the transcription from the ASR, thus suggesting that the inference of accent specific phoneme confusions is achievable from artificial speech.","PeriodicalId":186158,"journal":{"name":"Special Interest Group on Computational Morphology and Phonology Workshop","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Investigating Phoneme Similarity with Artificially Accented Speech\",\"authors\":\"Margot Masson, Julie Carson-Berndsen\",\"doi\":\"10.18653/v1/2023.sigmorphon-1.6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"While the deep learning revolution has led to significant performance improvements in speech recognition, accented speech remains a challenge. Current approaches to this challenge typically do not seek to understand and provide explanations for the variations of accented speech, whether they stem from native regional variation or non-native error patterns. This paper seeks to address non-native speaker variations from both a knowledge-based and a data-driven perspective. We propose to approximate non-native accented-speech pronunciation patterns by the means of two approaches: based on phonetic and phonological knowledge on the one hand and inferred from a text-to-speech system on the other. Artificial speech is then generated with a range of variants which have been captured in confusion matrices representing phoneme similarities. We then show that non-native accent confusions actually propagate to the transcription from the ASR, thus suggesting that the inference of accent specific phoneme confusions is achievable from artificial speech.\",\"PeriodicalId\":186158,\"journal\":{\"name\":\"Special Interest Group on Computational Morphology and Phonology Workshop\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Special Interest Group on Computational Morphology and Phonology Workshop\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/2023.sigmorphon-1.6\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Special Interest Group on Computational Morphology and Phonology Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2023.sigmorphon-1.6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

虽然深度学习革命在语音识别方面带来了显著的性能改进，但语音重音仍然是一个挑战。当前应对这一挑战的方法通常不寻求理解和解释口音语音的变化，无论它们是源于本地区域变化还是非本地错误模式。本文试图从基于知识和数据驱动的角度来解决非母语人士的差异。我们建议通过两种方法来近似非母语重音语音模式:一方面基于语音和语音知识，另一方面从文本到语音系统推断。然后，人工语音由一系列变体生成，这些变体在表示音素相似性的混淆矩阵中被捕获。然后我们表明，非母语口音混淆实际上从ASR传播到转录，从而表明口音特定音位混淆的推断是可以从人工语音中实现的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Investigating Phoneme Similarity with Artificially Accented Speech

While the deep learning revolution has led to significant performance improvements in speech recognition, accented speech remains a challenge. Current approaches to this challenge typically do not seek to understand and provide explanations for the variations of accented speech, whether they stem from native regional variation or non-native error patterns. This paper seeks to address non-native speaker variations from both a knowledge-based and a data-driven perspective. We propose to approximate non-native accented-speech pronunciation patterns by the means of two approaches: based on phonetic and phonological knowledge on the one hand and inferred from a text-to-speech system on the other. Artificial speech is then generated with a range of variants which have been captured in confusion matrices representing phoneme similarities. We then show that non-native accent confusions actually propagate to the transcription from the ASR, thus suggesting that the inference of accent specific phoneme confusions is achievable from artificial speech.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Special Interest Group on Computational Morphology and Phonology Workshop

自引率

0.00%

发文量