Development of automatic speech recognition model for energy facilities
Authors: V. A. Nechaev, S. Kosyakov
Journal: Vestnik IGEU (published 2023-08-31)
DOI: 10.17588/2072-2672.2023.4.094-100 (https://doi.org/10.17588/2072-2672.2023.4.094-100)
Automatic speech recognition models for specialized subject areas, energy facilities in particular, are currently built on deep neural network architectures that require large amounts of training data. Such models are often poorly suited to specific information systems because they recognize highly specialized vocabulary badly. Further training to improve quality in a specific recognition context is hindered by the difficulty of obtaining enough data and by the labor of annotating it. An urgent task, therefore, is to create methods that reduce the effort of developing applied speech recognition models and improve their quality in subject areas such as energy. Methods of thematic text modeling based on language models are applied to adapt open data. A deep neural network serves as the pretrained speech recognition model, and open-source datasets are used for training. A method for developing automatic speech recognition models for specialized subject areas has been developed. It includes a stage of intermediate training on subject-area vocabulary, using open-source data selected by thematic sampling. Using this method, the authors developed and studied an automatic speech recognition model for energy facilities. It showed better recognition results than models obtained by traditional methods. Testing of the proposed method confirmed its effectiveness. The applied neural network model built with the method has been shown to work in the information systems of energy facilities in Russian and English without additional training on proprietary data.
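
The abstract does not say how the thematic sampling step is implemented. Purely as an illustration of the general idea, the sketch below ranks open-corpus transcripts by how much more likely they are under a small domain language model than under a background model. This is a swapped-in technique, not the authors' method: it uses add-one smoothed unigram models and a per-token log-likelihood ratio. The function names, the whitespace tokenizer, and the `top_k` parameter are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems need more)."""
    return text.lower().split()


def smoothed_logprob(counts, total, vocab_size, token):
    """Add-one smoothed unigram log-probability of a single token."""
    return math.log((counts.get(token, 0) + 1) / (total + vocab_size))


def thematic_sample(domain_texts, open_transcripts, top_k):
    """Return the top_k open transcripts that look most domain-like.

    Each transcript is scored by the average per-token difference between
    its log-probability under the domain model and under the background
    model (built here from the open transcripts themselves).
    """
    domain_counts = Counter(t for d in domain_texts for t in tokenize(d))
    background_counts = Counter(t for s in open_transcripts for t in tokenize(s))

    # Shared vocabulary size, plus one slot for unseen tokens.
    vocab_size = len(set(domain_counts) | set(background_counts)) + 1
    domain_total = sum(domain_counts.values())
    background_total = sum(background_counts.values())

    def score(text):
        tokens = tokenize(text)
        if not tokens:
            return float("-inf")
        diff = sum(
            smoothed_logprob(domain_counts, domain_total, vocab_size, t)
            - smoothed_logprob(background_counts, background_total, vocab_size, t)
            for t in tokens
        )
        return diff / len(tokens)

    return sorted(open_transcripts, key=score, reverse=True)[:top_k]
```

In a pipeline of the kind the abstract describes, the audio belonging to the selected transcripts would then feed the intermediate training stage. A neural language model or topic model could replace the unigram scorer without changing the overall shape of the selection step.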