Improving Arabic broadcast transcription using automatic topic clustering
Authors: Stephen M. Chu, L. Mangu
Venue: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4449-4452
Published: 2012-03-25
DOI: https://doi.org/10.1109/ICASSP.2012.6288907
Latent Dirichlet Allocation (LDA) has been shown to be an effective model for augmenting n-gram language models in speech recognition applications. In this work, we take advantage of the framework's strong unsupervised learning ability and use it to uncover the topic structure embedded in the corpora in an entirely data-driven fashion. In addition, we describe a bi-level inference and classification method that allows topic clustering at the utterance level while preserving the document-level topic structure. We demonstrate the effectiveness of the proposed topic clustering pipeline in a state-of-the-art Arabic broadcast transcription system. Experiments show that optimizing the language model (LM) in the LDA topic space leads to a 5% reduction in language model perplexity. It is further shown that topic clustering and adaptation attain a 0.4% absolute reduction in word error rate on the GALE Arabic task.
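The abstract does not spell out how the bi-level inference works, so the following is only a minimal sketch of one plausible reading, not the authors' method: the topic mixture inferred for a whole document acts as a prior, and each utterance is then assigned to the topic that best explains its own words. The function names, the `floor` smoothing constant, and all probabilities below are hypothetical and chosen purely for illustration.

```python
import math


def utterance_topic_posterior(words, doc_topic_prior, topic_word_probs, floor=1e-6):
    """Return p(topic | utterance), using the document-level mixture as a prior.

    Assumption: words are scored independently given the topic (a bag-of-words
    simplification); unseen words receive the small probability `floor`.
    """
    log_scores = []
    for k, prior in enumerate(doc_topic_prior):
        score = math.log(max(prior, floor))
        for w in words:
            score += math.log(topic_word_probs[k].get(w, floor))
        log_scores.append(score)
    # Normalize in log space for numerical stability.
    top = max(log_scores)
    exps = [math.exp(s - top) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def assign_topic(words, doc_topic_prior, topic_word_probs):
    """Pick the single most probable topic index for one utterance."""
    posterior = utterance_topic_posterior(words, doc_topic_prior, topic_word_probs)
    return max(range(len(posterior)), key=posterior.__getitem__)


# Toy, made-up topic-word probabilities (hypothetical; not from the paper).
TOPIC_WORD_PROBS = [
    {"election": 0.4, "vote": 0.4, "goal": 0.01},  # topic 0: politics-like
    {"goal": 0.5, "match": 0.4, "vote": 0.01},     # topic 1: sports-like
]

if __name__ == "__main__":
    utterance = ["vote", "goal"]  # ambiguous on its own
    # A document that leans toward topic 0 pulls the utterance along with it.
    print(assign_topic(utterance, [0.7, 0.3], TOPIC_WORD_PROBS))
    # With a flat document prior, the utterance's own words decide.
    print(assign_topic(utterance, [0.5, 0.5], TOPIC_WORD_PROBS))
```

In this toy setup the same ambiguous utterance lands in different clusters depending on the document it came from, which is the sense in which utterance-level clustering can still respect document-level topic structure. In a real system the resulting clusters would then be used to select or adapt a topic-specific language model.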