Development of integral model of speech recognition system for Uzbek language
M. Musaev, Ilyos Khujayorov, M. Ochilov
2020 IEEE 14th International Conference on Application of Information and Communication Technologies (AICT), 2020-10-07
https://doi.org/10.1109/AICT50176.2020.9368719
This paper considers an approach to recognizing Uzbek words with end-to-end models. It also presents some theoretical background on the architecture of the neural networks used in the integrated model, along with the results of preliminary experiments based on them. Deep recurrent neural networks combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long-range context that empowers RNNs. When trained end-to-end with suitable regularization, deep bidirectional RNNs (BRNNs) achieve a test-set character error rate (CER) of 49.1% on our dataset.
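For readers unfamiliar with the metric, the sketch below shows how a character error rate is commonly computed: the total character-level edit distance between reference and recognized transcripts, divided by the total number of reference characters. This is the standard definition and is assumed here, since the abstract does not spell out the authors' exact scoring procedure. The function names `edit_distance` and `cer` and the sample words are illustrative and are not taken from the paper or its dataset.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


# Illustrative word pairs only (assumed examples, not the paper's data).
refs = ["salom", "kitob"]
hyps = ["salim", "kitb"]
print(f"CER = {cer(refs, hyps):.1%}")
```

Under this definition, a CER of 49.1% means that roughly one edit is needed for every two reference characters.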