{"title":"基于特征变换的无监督任务自适应与个性化方法","authors":"Jian Xu, Zhijie Yan, Qiang Huo","doi":"10.1109/ISCSLP.2012.6423513","DOIUrl":null,"url":null,"abstract":"This paper presents a feature-transform based approach to unsupervised task adaptation and personalization for speech recognition. Given task-specific speech data collected from a deployed service, an “acoustic sniffing” module is built first by using a so-called i-vector technique with a number of acoustic conditions identified via i-vector clustering. Unsupervised maximum likelihood training is then performed to estimate a task-dependent feature transform for each acoustic condition, while pre-trained HMM parameters of acoustic models are kept unchanged. Given an unknown utterance, an appropriate feature transform is selected via “acoustic sniffing”, which is used to transform the feature vectors of the unknown utterance for decoding. The effectiveness of the proposed method is confirmed in a task adaptation scenario from a conversational telephone speech transcription task to a short message dictation task. The same method is expected to work for personalization as well.","PeriodicalId":186099,"journal":{"name":"2012 8th International Symposium on Chinese Spoken Language Processing","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A feature-transform based approach to unsupervised task adaptation and personalization\",\"authors\":\"Jian Xu, Zhijie Yan, Qiang Huo\",\"doi\":\"10.1109/ISCSLP.2012.6423513\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents a feature-transform based approach to unsupervised task adaptation and personalization for speech recognition. Given task-specific speech data collected from a deployed service, an “acoustic sniffing” module is built first by using a so-called i-vector technique with a number of acoustic conditions identified via i-vector clustering. Unsupervised maximum likelihood training is then performed to estimate a task-dependent feature transform for each acoustic condition, while pre-trained HMM parameters of acoustic models are kept unchanged. Given an unknown utterance, an appropriate feature transform is selected via “acoustic sniffing”, which is used to transform the feature vectors of the unknown utterance for decoding. The effectiveness of the proposed method is confirmed in a task adaptation scenario from a conversational telephone speech transcription task to a short message dictation task. The same method is expected to work for personalization as well.\",\"PeriodicalId\":186099,\"journal\":{\"name\":\"2012 8th International Symposium on Chinese Spoken Language Processing\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 8th International Symposium on Chinese Spoken Language Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISCSLP.2012.6423513\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 8th International Symposium on Chinese Spoken Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCSLP.2012.6423513","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A feature-transform based approach to unsupervised task adaptation and personalization
This paper presents a feature-transform based approach to unsupervised task adaptation and personalization for speech recognition. Given task-specific speech data collected from a deployed service, an “acoustic sniffing” module is built first by using a so-called i-vector technique with a number of acoustic conditions identified via i-vector clustering. Unsupervised maximum likelihood training is then performed to estimate a task-dependent feature transform for each acoustic condition, while pre-trained HMM parameters of acoustic models are kept unchanged. Given an unknown utterance, an appropriate feature transform is selected via “acoustic sniffing”, which is used to transform the feature vectors of the unknown utterance for decoding. The effectiveness of the proposed method is confirmed in a task adaptation scenario from a conversational telephone speech transcription task to a short message dictation task. The same method is expected to work for personalization as well.