Improved Language Models for ASR using Written Language Text

Kaustuv Mukherji, Meghna Pandharipande, Sunil Kumar Kopparapu
DOI: 10.1109/NCC55593.2022.9806803
Published in: 2022 National Conference on Communications (NCC), 2022-05-24
Citations: 1

Abstract

The performance of an Automatic Speech Recognition (ASR) engine depends primarily on (a) the acoustic model (AM), (b) the language model (LM), and (c) the lexicon (Lx). While the contribution of each block to the overall performance of an ASR cannot be measured separately, a good LM improves the performance of a domain-specific ASR at a smaller cost. Building an LM is generally greener than building an AM, and is much easier for a domain-specific ASR because it requires only domain-specific text corpora. Traditionally, because of their ready availability, written language text (WLT) corpora have been used to build LMs, even though it is widely agreed that there is a significant difference between WLT and spoken language text (SLT). In this paper, we explore methods and techniques for converting WLT into a form that yields a better LM and thereby supports ASR performance.
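The abstract does not specify which WLT-to-SLT conversion steps the paper uses; as a minimal illustrative sketch (all function names and rules below are assumptions, not the authors' method), such a conversion typically lowercases text, spells out numerals as a speaker would read them, and removes punctuation absent from ASR transcripts:

```python
import re

# Hypothetical WLT -> SLT normalization; the rules here (digit spelling,
# punctuation removal) are illustrative assumptions, not the paper's method.
_DIGIT_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]

def _digits_to_words(match):
    # Read each digit aloud, e.g. "2022" -> "two zero two two".
    return " ".join(_DIGIT_WORDS[int(d)] for d in match.group())

def normalize_wlt(text):
    text = text.lower()                             # ASR output is typically uncased
    text = re.sub(r"\d+", _digits_to_words, text)   # numerals -> spoken words
    text = re.sub(r"[^\w\s']", " ", text)           # strip punctuation
    return re.sub(r"\s+", " ", text).strip()        # collapse whitespace

print(normalize_wlt("The NCC 2022 meeting drew 108 attendees."))
# -> the ncc two zero two two meeting drew one zero eight attendees
```

Text normalized this way can then be fed to a standard n-gram LM toolkit so that the LM's vocabulary and token statistics better match what the ASR decoder actually emits.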