Discriminative training of n-gram language models for speech recognition via linear programming
Vladimir Magdin, Hui Jiang
2009 IEEE Workshop on Automatic Speech Recognition & Understanding, December 2009
DOI: 10.1109/ASRU.2009.5373248
This paper presents a novel discriminative training algorithm for n-gram language models for use in large-vocabulary continuous speech recognition. The algorithm uses Maximum Mutual Information Estimation (MMIE) to build an objective function involving a metric computed between correct transcriptions and their competing hypotheses, which are encoded as word graphs generated by the Viterbi decoding process. The nonlinear MMIE objective function is approximated by a linear one using an EM-style auxiliary function, converting the discriminative training of n-gram language models into a linear programming problem that can be solved efficiently by many convex optimization tools. Experimental results on the SPINE1 speech recognition corpus show that the proposed discriminative training method outperforms conventional discounting-based maximum likelihood estimation methods, achieving a relative reduction in word error rate of close to 3% on the SPINE1 speech recognition task.
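To illustrate the last step of the pipeline the abstract describes, here is a minimal toy sketch, not the paper's actual algorithm or data: once a nonlinear objective over n-gram log-probabilities has been replaced by a linear surrogate, the parameter update becomes a generic linear program that any off-the-shelf solver handles. All counts, probabilities, and the trust-region width below are made-up illustrative values; the normalization constraint is linearized around the ML estimate, a common device under such approximations.

```python
# Toy LP sketch (hypothetical numbers, not the paper's method): after
# linearization, updating n-gram log-probabilities is a standard LP.
import numpy as np
from scipy.optimize import linprog

# Variables: log-probabilities x_i of 3 n-grams sharing one history.
# Linearized objective: maximize sum_i (c_ref_i - c_comp_i) * x_i, where
# c_ref / c_comp play the role of expected n-gram counts from the
# reference transcription vs. the competing word-graph hypotheses.
c_ref = np.array([5.0, 2.0, 1.0])    # hypothetical reference counts
c_comp = np.array([3.0, 3.0, 2.0])   # hypothetical competitor counts
c = -(c_ref - c_comp)                # linprog minimizes, so negate

# Trust region around the ML log-probabilities keeps the linear
# approximation plausible; log-probs must also stay <= 0.
x_ml = np.log(np.array([0.5, 0.3, 0.2]))
delta = 1.0
bounds = [(x - delta, min(x + delta, 0.0)) for x in x_ml]

# Normalization sum_i exp(x_i) <= 1 is nonlinear; linearize exp around
# x_ml:  exp(x) ~= exp(x_ml) * (1 + x - x_ml), giving one linear row.
p_ml = np.exp(x_ml)
A_ub = p_ml.reshape(1, -1)                  # coefficients of x
b_ub = np.array([1.0 - p_ml @ (1.0 - x_ml)])  # constants moved right

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.status, np.round(res.x, 3))
```

The solution pushes up the log-probability of the n-gram favored by the reference counts and pushes down those favored by the competitors, subject to the (linearized) sum-to-one budget; this mirrors, in miniature, why convex-optimization toolkits apply directly once the MMIE objective is made linear.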