Shaghayegh Reza, S. Seyyedsalehi, Seyyedeh Zohreh Seyyedsalehi
Persian Language Phone Recognition Based on Robust Extraction of Acoustic Landmarks
2020 27th National and 5th International Iranian Conference on Biomedical Engineering (ICBME), published 2020-11-26
DOI: 10.1109/ICBME51989.2020.9319436
Citations: 1
Abstract
Acoustic landmarks are defined as the more informative parts of the speech signal and have been shown to be beneficial in designing more robust speech recognition systems. This work presents a Persian phone recognition system based on acoustic landmarks. To this end, appropriate acoustic landmarks for the Persian language were selected, and an artificial neural network was trained to recognize them. Then, to boost the performance of the landmark recognition system, the model's structure and the training method were modified. The goal of these modifications is to filter out variations of acoustic landmarks as much as possible. For this purpose, we used neural network structures to nonlinearly map landmarks to their corresponding gold ones. These gold landmarks are the ones that our landmark recognition system could recognize without any error. The experiments were carried out on a Persian database named Farsdat. The best landmark recognition model is a feedforward neural network with five hidden layers, which achieves a phone error rate (PER) of 21.74 percent. We also attained a 0.56 percent PER improvement using our best variation filtering method.
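The abstract reports its results as a phone error rate without saying how it was scored. As a minimal sketch, assuming the conventional definition (substitutions, deletions, and insertions from a Levenshtein alignment, divided by the number of reference phones), PER could be computed as below. The function names and the toy phone sequences are illustrative assumptions and are not taken from the paper or from Farsdat.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of labels)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phone
                curr[j - 1] + 1,     # insertion of a hypothesis phone
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phone_error_rate(references, hypotheses):
    """PER in percent over a set of utterances (assumed conventional definition)."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_phones = sum(len(r) for r in references)
    if total_phones == 0:
        raise ValueError("reference transcriptions contain no phones")
    return 100.0 * total_errors / total_phones


# Hypothetical toy transcriptions, for illustration only.
references = [["s", "a", "l", "a", "m"], ["k", "e", "t", "a", "b"]]
hypotheses = [["s", "a", "l", "o", "m"], ["k", "t", "a", "b"]]
per = phone_error_rate(references, hypotheses)
```

In this toy case the first utterance has one substitution and the second has one deletion, so two errors are counted against ten reference phones. Whether the reported 0.56 percent improvement is absolute or relative is not stated in the abstract.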