Structured modeling based on generalized variable parameter HMMs and speaker adaptation
Yang Li, Xunying Liu, Lan Wang
2012 8th International Symposium on Chinese Spoken Language Processing, published 2012-12-01
DOI: 10.1109/ISCSLP.2012.6423526 (https://doi.org/10.1109/ISCSLP.2012.6423526)
Handling varying ambient acoustic factors is a challenging task for automatic speech recognition (ASR) systems. Varying ambient noise and acoustic differences among speakers are two key issues for the recognition task. To address them, we present a new framework for robust speech recognition based on structured modeling, which uses generalized variable parameter HMMs (GVP-HMMs) and unsupervised speaker adaptation (SA) to compensate for the mismatch caused by environment and speaker variability. GVP-HMMs explicitly approximate the continuous trajectories of Gaussian component means, variances, and linear transformation parameters with polynomial functions of the varying noise level. In the recognition stage, an MLLR transform captures the general relationship between the original model set and the current speaker, which can help remove the effects of unwanted speaker factors. The effectiveness of the proposed approach is confirmed by an evaluation experiment on a medium-vocabulary Mandarin recognition task.
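The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the polynomial coefficients, and the MLLR matrix and bias below are all hypothetical values chosen for illustration, and the abstract does not state how the two steps are combined in the actual system. The sketch only evaluates a per-dimension polynomial to obtain a Gaussian mean at a given noise level, then applies the standard MLLR mean transform (adapted mean = A * mean + b).

```python
def poly_eval(coeffs, v):
    """Evaluate c0 + c1*v + c2*v**2 + ... with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * v + c
    return result


def gvp_mean(mean_coeffs, noise_level):
    """Mean vector of one Gaussian component at the given noise level.

    mean_coeffs[d] holds the polynomial coefficients (lowest order first)
    for feature dimension d. Hypothetical layout, assumed for this sketch.
    """
    return [poly_eval(coeffs, noise_level) for coeffs in mean_coeffs]


def mllr_adapt_mean(A, b, mean):
    """Standard MLLR mean adaptation: adapted = A @ mean + b."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


# Hypothetical second-order polynomial coefficients for a 2-dimensional mean.
mean_coeffs = [
    [1.0, 0.5, 0.0],     # dimension 0: 1.0 + 0.5*v
    [2.0, -0.25, 0.125], # dimension 1: 2.0 - 0.25*v + 0.125*v**2
]

# Hypothetical noise-level value (the unit is an assumption, e.g. SNR in dB).
noise_level = 4.0
mu = gvp_mean(mean_coeffs, noise_level)

# Hypothetical speaker transform; in practice A and b would be estimated
# from the current speaker's data.
A = [[1.0, 0.0],
     [0.5, 1.0]]
b = [0.5, -1.0]
adapted = mllr_adapt_mean(A, b, mu)

print(mu)       # [3.0, 3.0]
print(adapted)  # [3.5, 3.5]
```

The same polynomial evaluation would apply to variances and transform parameters, which the abstract says are also modeled as functions of the noise level; only the mean is shown here to keep the sketch short.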