Comparison of several acoustic features for the vowel sequence reproduction of a talking robot

Vo Nhu Thanh, H. Sawada
2016 IEEE International Conference on Mechatronics and Automation, published 2016-09-01
DOI: 10.1109/ICMA.2016.7558722 (https://doi.org/10.1109/ICMA.2016.7558722)
Citations: 1

Abstract

This study compares several acoustic features for developing an automatic vowel sequence reproduction system for a talking robot, a mechanical vocalization system that models the human articulatory system. A Matlab-based control system analyzes a recorded sound and drives the articulatory motors of the talking robot. A novel method based on short-time energy analysis extracts human speech and translates it into a sequence of sound elements for vowel sequence reproduction. Several phoneme detection methods, including direct cross-correlation analysis, linear predictive coding (LPC) association, partial correlation (PARCOR) coefficient analysis, and formant frequency comparison, are then applied to each sound element to produce the correct command for the talking robot to repeat the sounds in sequence. Finally, experiments are performed to compare these techniques and verify the working behavior of the robot. The test results indicate that the robot can repeat a sequence of vowels spoken by a human with a success rate of more than 70% for the PARCOR analysis technique and the formant frequency comparison technique. The formant comparison method gives the greatest accuracy for repeating the sequence, while the direct cross-correlation method gives the least.
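
The abstract names the processing steps without giving their parameters, so the following is only a minimal sketch of two of them: splitting a recording into sound elements by short-time energy, and deriving LPC and reflection coefficients (PARCOR coefficients up to a sign convention) with the standard Levinson-Durbin recursion. It is written in Python rather than the Matlab used in the study, and the function names, frame length, hop size, energy threshold, and test signal are all illustrative assumptions, not values from the paper.

```python
import math

def short_time_energy(signal, frame_len, hop):
    """Sum of squared samples in each frame (frame_len and hop are assumed values)."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]

def segment_by_energy(energy, threshold):
    """Return (first_frame, last_frame) pairs for runs of frames above the threshold."""
    segments, start = [], None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i                      # a sound element begins
        elif e <= threshold and start is not None:
            segments.append((start, i - 1))  # the sound element ended on the previous frame
            start = None
    if start is not None:                  # signal ended while still above the threshold
        segments.append((start, len(energy) - 1))
    return segments

def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame (no normalization)."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]

def levinson_durbin(r, order):
    """Solve for the prediction polynomial A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    Returns (a, reflection), where reflection[m - 1] is the coefficient found at
    step m. PARCOR coefficients equal these values up to a sign convention.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    reflection = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err
        reflection.append(k_m)
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m             # prediction error shrinks at each order
    return a, reflection

# Synthetic input: two tone bursts separated by silence (a stand-in for two vowels).
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
silence = [0.0] * 200
signal = silence + tone + silence + tone + silence

energy = short_time_energy(signal, frame_len=100, hop=100)
print(segment_by_energy(energy, threshold=1.0))  # → [(2, 5), (8, 11)]

# LPC and reflection coefficients for the first detected burst.
r = autocorrelation(signal[200:600], max_lag=2)
lpc, reflection = levinson_durbin(r, order=2)
```

In a pipeline like the one the abstract describes, each detected segment would be compared against stored vowel templates using one of the listed features. How the study performs that comparison is not stated in the abstract, so it is left out of this sketch.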