Intelligence System for Multi-Language Recognition

F. Ramo, Mohammed Kannah
DOI: 10.33899/edusj.2022.132223.1200 (https://doi.org/10.33899/edusj.2022.132223.1200)
Journal: mjl@ ltrby@ wl`lm
Publication date: 2022-03-01
Publication type: Journal Article
Citation count: 0

Abstract

Language classification systems are used to classify spoken language from a particular phoneme sample and are usually the first step of many spoken language processing tasks, such as automatic speech recognition (ASR). Without automatic language detection, speech cannot be properly analyzed and grammar rules cannot be applied, causing subsequent speech recognition steps to fail. We propose a language classification system that solves the problem in the image domain rather than the audio domain. This research identified and implemented several low-level features using Mel Frequency Cepstral Coefficients (MFCCs), extracted from speech files in four languages (Arabic, English, French, Kurdish) taken from the M2L_Dataset database, the data source for this research. A Convolutional Neural Network (CNN) operates on spectrogram images of the available audio snippets. In extensive experiments, we showed that our model is applicable to a range of noisy scenarios and can easily be extended to previously unknown languages while maintaining classification accuracy. We released our code and an extensive training package for language classification systems to the community. The CNN classified the samples with high accuracy: between two languages, accuracy reached 97% with one-second samples and 98% with two-second samples. Among three languages, accuracy reached 95% with one-second samples and 96% with two-second samples.
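To make the "image domain" idea concrete, the following is a minimal sketch of how an audio signal might be cut into one- or two-second samples and turned into a spectrogram-like 2D array that a CNN could consume. It is not the authors' pipeline: it computes a plain log-magnitude spectrogram rather than MFCCs, uses only the Python standard library, and the function names (`split_into_segments`, `log_spectrogram`) and parameters (`frame_len`, `hop`) are illustrative assumptions.

```python
import cmath
import math


def split_into_segments(samples, sample_rate, seconds):
    """Cut a mono signal into non-overlapping segments of `seconds` length.

    Any trailing remainder shorter than one segment is dropped.
    """
    size = int(sample_rate * seconds)
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def log_spectrogram(segment, frame_len=64, hop=32):
    """Return a list of frames; each frame holds log-magnitudes for
    frequency bins 0 .. frame_len // 2 (assumed, illustrative settings)."""
    # Symmetric Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    bins = frame_len // 2 + 1
    # Precomputed DFT factors: a naive O(N^2) transform, fine for a sketch.
    twiddle = [[cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)]
               for k in range(bins)]
    image = []
    for start in range(0, len(segment) - frame_len + 1, hop):
        frame = [segment[start + n] * window[n] for n in range(frame_len)]
        row = [math.log(abs(sum(x * w for x, w in zip(frame, twiddle[k])))
                        + 1e-10)
               for k in range(bins)]
        image.append(row)
    return image


# Synthetic stand-in for speech: 3.5 s of a 125 Hz tone sampled at 1000 Hz.
SAMPLE_RATE = 1000
signal = [math.sin(2 * math.pi * 125 * n / SAMPLE_RATE) for n in range(3500)]

one_second = split_into_segments(signal, SAMPLE_RATE, 1)
two_second = split_into_segments(signal, SAMPLE_RATE, 2)
spectrogram = log_spectrogram(one_second[0])
```

Each row of `spectrogram` is one time frame and each column one frequency bin, so the array can be treated as a single-channel image. A real system would more likely use an FFT-based library and a mel filter bank before the cepstral step; those are omitted here to keep the sketch self-contained.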