Comparison of several acoustic features for the vowel sequence reproduction of a talking robot

2016 IEEE International Conference on Mechatronics and Automation | Pub Date: 2016-09-01 | DOI: 10.1109/ICMA.2016.7558722

Vo Nhu Thanh, H. Sawada


Citations: 1

Abstract

This study compares several acoustic features for an automatic vowel-sequence reproduction system for a talking robot, a mechanical vocalization system that models the human articulatory system. A MATLAB-based control system analyzes a recorded sound and drives the articulatory motors of the talking robot. A novel method based on short-time energy analysis extracts human speech from the recording and translates it into a sequence of sound elements for vowel-sequence reproduction. Several phoneme detection methods are then applied to each sound element to determine the correct command for the robot to repeat the sounds in sequence: direct cross-correlation analysis, linear predictive coding (LPC) association, partial correlation (PARCOR) coefficient analysis, and formant frequency comparison. Finally, experiments compare these techniques and verify the working behavior of the robot. The test results indicate that the robot can repeat a sequence of vowels spoken by a human with a success rate of more than 70% for both the PARCOR analysis technique and the formant frequency comparison technique. The formant comparison method gives the highest accuracy in repeating the sequence, while the direct cross-correlation method gives the lowest.
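The abstract does not spell out the short-time energy method, so the following is only a minimal Python sketch of generic energy-based segmentation, not the authors' MATLAB implementation. The function names, the frame length, the hop size, and the relative threshold are all illustrative assumptions.

```python
import numpy as np


def short_time_energy(signal, frame_len, hop):
    """Sum of squared samples in each (possibly overlapping) frame."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.array([
        np.sum(signal[i * hop: i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])


def segment_by_energy(signal, frame_len=256, hop=128, rel_threshold=0.1):
    """Return (start, end) sample ranges of high-energy runs.

    A frame counts as active when its energy exceeds
    rel_threshold * (maximum frame energy). The threshold rule is an
    assumption made for this sketch, not a value from the paper.
    """
    energy = short_time_energy(np.asarray(signal, dtype=float), frame_len, hop)
    active = energy > rel_threshold * energy.max()

    segments = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i  # first active frame of a new sound element
        elif not is_active and start is not None:
            # last active frame was i - 1; convert frames to samples
            segments.append((start * hop, (i - 1) * hop + frame_len))
            start = None
    if start is not None:  # signal ended while still active
        segments.append((start * hop, (len(active) - 1) * hop + frame_len))
    return segments
```

Each returned range marks one candidate sound element, which could then be passed to a per-element classifier such as a PARCOR or formant comparison against stored vowel templates. Segment boundaries are only accurate to about one frame length, since a frame is marked active as soon as it overlaps enough of the sound.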


