Persian Language Phone Recognition Based on Robust Extraction of Acoustic Landmarks

Shaghayegh Reza, S. Seyyedsalehi, Seyyedeh Zohreh Seyyedsalehi
2020 27th National and 5th International Iranian Conference on Biomedical Engineering (ICBME). Published 2020-11-26. DOI: 10.1109/ICBME51989.2020.9319436 (https://doi.org/10.1109/ICBME51989.2020.9319436)
Citations: 1

Abstract

Acoustic landmarks are defined as the more informative parts of the speech signal and have been shown to be beneficial in designing more robust speech recognition systems. This work presents a Persian phone recognition system based on acoustic landmarks. To this end, appropriate acoustic landmarks for the Persian language were selected, and an artificial neural network was trained to recognize them. Then, to boost the performance of the landmark recognition system, the model's structure and the training method were modified. The goal of these modifications is to filter out variations of acoustic landmarks as much as possible. For this, we used neural network structures to map landmarks nonlinearly to their corresponding gold ones. These gold landmarks are the ones that our landmark recognition system recognizes without any error. The experiments were carried out on a Persian database named Farsdat. The best landmark recognition model is a feedforward neural network with five hidden layers, which reaches a phone error rate (PER) of 21.74%. We also attained a 0.56 percent improvement in PER using our best variation filtering method.
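
The abstract reports results as phone error rate but does not say how it was scored. A minimal sketch of the conventional definition, assuming PER is the Levenshtein (edit) distance between the reference and recognized phone sequences divided by the reference length; the function name `phone_error_rate` and the example phone labels are hypothetical and are not taken from the paper or from Farsdat:

```python
def phone_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference).

    Assumption: this is the conventional PER definition; the paper's own
    scoring procedure is not described in the abstract.
    """
    if not reference:
        raise ValueError("reference must contain at least one phone")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_phone in enumerate(reference, start=1):
        curr = [i]  # aligning i reference phones with nothing costs i deletions
        for j, hyp_phone in enumerate(hypothesis, start=1):
            cost = 0 if ref_phone == hyp_phone else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phone
                curr[j - 1] + 1,     # insertion of a hypothesis phone
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical phone labels, for illustration only.
reference = ["s", "a", "l", "a", "m"]
hypothesis = ["s", "a", "l", "m"]  # one phone deleted
per = phone_error_rate(reference, hypothesis)
```

Here one of the five reference phones is missing from the hypothesis, so `per` is 0.2. Because insertions are counted against the reference length, this measure can exceed 1.0 for very poor hypotheses.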