使用nlp方法和机器学习将乌克兰语音频转换为文本表示的信息系统

Yurii Tyshchuk, V. Vysotska, Olha Vlasenko
{"title":"使用nlp方法和机器学习将乌克兰语音频转换为文本表示的信息系统","authors":"Yurii Tyshchuk, V. Vysotska, Olha Vlasenko","doi":"10.23939/sisn2022.12.023","DOIUrl":null,"url":null,"abstract":"Speech recognition involves various models, methods and algorithms for analysing and processing the user’s recorded voice. This allows people to control different systems that support one type of speech recognition. A speech-to-text conversion system is a type of speech recognition that uses spoken data for further processing. It also provides several stages for processing an audio file, which uses electroacoustic means, filtering algorithms in the audio file to isolate relevant sounds, electronic data arrays for the selected language, as well as mathematical models that make up the most likely words from phonemes. Thanks to the conversion of speech to text, people whose professions are closely related to typing a large amount of text on the keyboard, significantly speed up and facilitate the work process, as well as reduce the amount of stress. In addition, such systems help businesses, because the concept of remote work is becoming more and more popular, and therefore companies need tools to record and systematize meetings in the form of written text. The object of the research is the process of converting the Ukrainian-language text into a written one based on NLP and machine learning methods. The subject of the research is file processing algorithms for extracting relevant sounds and recognizing phonemes, as well as mathematical models for recognizing an array of phonemes as specific words. The purpose of the work is to design and develop an information system for converting audio Ukrainian-language text into written text based on the Ukrainian Speech-to-text Web application, which is a technology for accurate and easy analysis of Ukrainian-language audio files and their subsequent transcription into text. The application supports downloading files from the file system and recording using the microphone, as well as saving the analysed data. The article also describes the stages of design and the general typical architecture of the corresponding system for converting audio Ukrainian-language text into written text. According to the results of the experimental testing of the developed system, it was found that the number of words does not affect the accuracy of the conversion algorithm, and the decrease in percentage is not large and occurred due to the complexity of the words and the low quality of the microphone, and therefore the recorded file.","PeriodicalId":444399,"journal":{"name":"Vìsnik Nacìonalʹnogo unìversitetu \"Lʹvìvsʹka polìtehnìka\". Serìâ Ìnformacìjnì sistemi ta merežì","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Information system for converting audio in Ukrainian language into its textual representation using nlp methods and machine learning\",\"authors\":\"Yurii Tyshchuk, V. Vysotska, Olha Vlasenko\",\"doi\":\"10.23939/sisn2022.12.023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Speech recognition involves various models, methods and algorithms for analysing and processing the user’s recorded voice. This allows people to control different systems that support one type of speech recognition. A speech-to-text conversion system is a type of speech recognition that uses spoken data for further processing. It also provides several stages for processing an audio file, which uses electroacoustic means, filtering algorithms in the audio file to isolate relevant sounds, electronic data arrays for the selected language, as well as mathematical models that make up the most likely words from phonemes. Thanks to the conversion of speech to text, people whose professions are closely related to typing a large amount of text on the keyboard, significantly speed up and facilitate the work process, as well as reduce the amount of stress. In addition, such systems help businesses, because the concept of remote work is becoming more and more popular, and therefore companies need tools to record and systematize meetings in the form of written text. The object of the research is the process of converting the Ukrainian-language text into a written one based on NLP and machine learning methods. The subject of the research is file processing algorithms for extracting relevant sounds and recognizing phonemes, as well as mathematical models for recognizing an array of phonemes as specific words. The purpose of the work is to design and develop an information system for converting audio Ukrainian-language text into written text based on the Ukrainian Speech-to-text Web application, which is a technology for accurate and easy analysis of Ukrainian-language audio files and their subsequent transcription into text. The application supports downloading files from the file system and recording using the microphone, as well as saving the analysed data. The article also describes the stages of design and the general typical architecture of the corresponding system for converting audio Ukrainian-language text into written text. According to the results of the experimental testing of the developed system, it was found that the number of words does not affect the accuracy of the conversion algorithm, and the decrease in percentage is not large and occurred due to the complexity of the words and the low quality of the microphone, and therefore the recorded file.\",\"PeriodicalId\":444399,\"journal\":{\"name\":\"Vìsnik Nacìonalʹnogo unìversitetu \\\"Lʹvìvsʹka polìtehnìka\\\". Serìâ Ìnformacìjnì sistemi ta merežì\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Vìsnik Nacìonalʹnogo unìversitetu \\\"Lʹvìvsʹka polìtehnìka\\\". Serìâ Ìnformacìjnì sistemi ta merežì\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23939/sisn2022.12.023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Vìsnik Nacìonalʹnogo unìversitetu \"Lʹvìvsʹka polìtehnìka\". Serìâ Ìnformacìjnì sistemi ta merežì","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23939/sisn2022.12.023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

语音识别涉及用于分析和处理用户录制的声音的各种模型、方法和算法。这允许人们控制不同的系统,支持一种类型的语音识别。语音到文本转换系统是一种使用语音数据进行进一步处理的语音识别。它还提供了处理音频文件的几个阶段,其中使用电声手段,音频文件中的过滤算法来隔离相关声音,所选语言的电子数据阵列,以及从音素中组成最可能的单词的数学模型。由于语音到文本的转换,那些与在键盘上输入大量文本密切相关的职业的人,大大加快和方便了工作过程,减少了压力。此外,这样的系统对企业也有帮助,因为远程工作的概念正变得越来越流行,因此公司需要以书面文本的形式记录和系统化会议的工具。研究对象是基于NLP和机器学习方法将乌克兰语文本转换为书面文本的过程。该研究的主题是提取相关声音和识别音素的文件处理算法,以及将一系列音素识别为特定单词的数学模型。这项工作的目的是设计和开发一个信息系统,以乌克兰语语音到文本Web应用程序为基础,将音频乌克兰语文本转换为书面文本,这是一种准确而容易地分析乌克兰语音频文件并随后将其转录为文本的技术。该应用程序支持从文件系统下载文件,并使用麦克风录音,以及保存分析的数据。文章还描述了将乌克兰语音频文本转换为书面文本的相应系统的设计阶段和一般典型架构。根据所开发系统的实验测试结果,发现单词的数量不会影响转换算法的准确性,并且百分比的下降并不大,并且由于单词的复杂性和麦克风的低质量而发生,因此录制文件。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Information system for converting audio in Ukrainian language into its textual representation using nlp methods and machine learning
Speech recognition involves various models, methods and algorithms for analysing and processing the user’s recorded voice. This allows people to control different systems that support one type of speech recognition. A speech-to-text conversion system is a type of speech recognition that uses spoken data for further processing. It also provides several stages for processing an audio file, which uses electroacoustic means, filtering algorithms in the audio file to isolate relevant sounds, electronic data arrays for the selected language, as well as mathematical models that make up the most likely words from phonemes. Thanks to the conversion of speech to text, people whose professions are closely related to typing a large amount of text on the keyboard, significantly speed up and facilitate the work process, as well as reduce the amount of stress. In addition, such systems help businesses, because the concept of remote work is becoming more and more popular, and therefore companies need tools to record and systematize meetings in the form of written text. The object of the research is the process of converting the Ukrainian-language text into a written one based on NLP and machine learning methods. The subject of the research is file processing algorithms for extracting relevant sounds and recognizing phonemes, as well as mathematical models for recognizing an array of phonemes as specific words. The purpose of the work is to design and develop an information system for converting audio Ukrainian-language text into written text based on the Ukrainian Speech-to-text Web application, which is a technology for accurate and easy analysis of Ukrainian-language audio files and their subsequent transcription into text. The application supports downloading files from the file system and recording using the microphone, as well as saving the analysed data. The article also describes the stages of design and the general typical architecture of the corresponding system for converting audio Ukrainian-language text into written text. According to the results of the experimental testing of the developed system, it was found that the number of words does not affect the accuracy of the conversion algorithm, and the decrease in percentage is not large and occurred due to the complexity of the words and the low quality of the microphone, and therefore the recorded file.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Project of the information system of sales on the charity auction platform Intelligent system for analyzing battery charge consumption processes Information system of feedback monitoring in social networks for the formation of recommendations for the purchase of goods Software for the implementation of an intelligent system to solve the problem of “cold start” Analysis of multiplication algorithms in Galuis fields for the cryptographic protection of information
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1