M. N. Huda, Md. Shahadat Hossain, Foyzul Hassan, Mohammad Mahedi Hasan, N. J. Lisa, G. Muhammad
2010 13th International Conference on Computer and Information Technology (ICCIT), published 2010-12-01. DOI: 10.1109/ICCITECHN.2010.5723899 (https://doi.org/10.1109/ICCITECHN.2010.5723899)
An Inhibition/Enhancement network for noise robust ASR
This paper evaluates an Inhibition/Enhancement (In/En) network for noise-robust automatic speech recognition (ASR). In articulatory-feature-based speech recognition using neural networks, the In/En network is needed to discriminate whether the dynamic patterns of articulatory feature (AF) trajectories are convex or concave. The network achieves categorical AF movement by enhancing AF peak patterns (convex patterns) and inhibiting AF dip patterns (concave patterns). We analyze the effectiveness of the In/En algorithm by incorporating it into a system that consists of three stages: a) Multilayer Neural Networks (MLNs), b) an In/En network, and c) the Gram-Schmidt (GS) algorithm for orthogonalization. Experiments on the Japanese Newspaper Article Sentences (JNAS) database in clean and noisy acoustic environments show that the In/En network plays a significant role in improving phoneme recognition performance. Moreover, the In/En network reduces the number of mixture components needed in the Hidden Markov Models (HMMs).
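The abstract does not give the In/En network's formulation, so the following is only an illustrative sketch of the two ideas it names, not the authors' method. It assumes that peak ("convex" in the abstract's usage) and dip ("concave") patterns can be told apart by the sign of a discrete second difference, and it uses the modified form of Gram-Schmidt for the orthogonalization stage. The function names `inhibit_enhance` and `gram_schmidt`, and the `enhance`, `inhibit`, and `eps` parameters and their default values, are hypothetical.

```python
import math
from typing import List, Sequence


def inhibit_enhance(traj: Sequence[float],
                    enhance: float = 1.5,
                    inhibit: float = 0.5) -> List[float]:
    """Scale one AF trajectory frame by frame (hypothetical rule).

    Assumption: a negative second difference marks a peak-like frame
    (enhanced), a positive one marks a dip-like frame (inhibited).
    The first and last frames are left unchanged.
    """
    out = list(traj)
    for t in range(1, len(traj) - 1):
        curvature = traj[t - 1] - 2.0 * traj[t] + traj[t + 1]
        if curvature < 0:        # peak-like: enhance
            out[t] = traj[t] * enhance
        elif curvature > 0:      # dip-like: inhibit
            out[t] = traj[t] * inhibit
    return out


def gram_schmidt(vectors: Sequence[Sequence[float]],
                 eps: float = 1e-12) -> List[List[float]]:
    """Orthonormalize vectors with modified Gram-Schmidt.

    Vectors that are (numerically) linearly dependent on earlier
    ones are dropped rather than normalized.
    """
    basis: List[List[float]] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove the component of w along the basis vector b.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > eps:
            basis.append([x / norm for x in w])
    return basis
```

In a pipeline shaped like the one described, each AF trajectory output by the MLN stage would pass through something like `inhibit_enhance`, and the resulting feature vectors would then be orthogonalized before being fed to the HMM-based classifier; the exact data layout at each stage is not stated in the abstract.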