Prediction of Voice Fundamental Frequency and Intensity from Surface Electromyographic Signals of the Face and Neck.

Vibration (IF 1.9; JCR Q3, Engineering, Mechanical). Pub Date: 2022-12-01. Epub Date: 2022-10-13. DOI: 10.3390/vibration5040041
Jennifer M Vojtech, Claire L Mitchell, Laura Raiff, Joshua C Kline, Gianluca De Luca
Vibration, vol. 5, no. 4, pp. 692-710. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9592063/pdf/

Abstract

Silent speech interfaces (SSIs) enable speech recognition and synthesis in the absence of an acoustic signal. Yet the archetypal SSI fails to convey the expressive attributes of prosody, such as pitch and loudness, leading to lexical ambiguities. The aim of this study was to determine the efficacy of using surface electromyography (sEMG) as an approach for predicting continuous acoustic estimates of prosody. Ten participants performed a series of vocal tasks, including sustained vowels, phrases, and monologues, while acoustic data were recorded simultaneously with sEMG activity from muscles of the face and neck. A battery of time-, frequency-, and cepstral-domain features extracted from the sEMG signals was used to train deep regression neural networks to predict fundamental frequency and intensity contours from the acoustic signals. We achieved an average accuracy of 0.01 semitones (ST) and precision of 0.56 ST for the estimation of fundamental frequency, and an average accuracy of 0.21 dB SPL and precision of 3.25 dB SPL for the estimation of intensity. This work highlights the importance of using sEMG as an alternative means of detecting prosody and shows promise for improving SSIs in future development.
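The fundamental-frequency results above are reported in semitones (ST), a logarithmic unit in which one octave spans 12 ST. As a minimal sketch of how such error figures could be computed from a predicted and a reference contour, the Python below converts each frame's frequency ratio to semitones and summarizes the errors. The abstract does not define "accuracy" or "precision", so this sketch assumes they are the mean and the standard deviation of the per-frame signed error; the function names `hz_to_semitones` and `error_stats_st` are hypothetical and are not taken from the paper.

```python
import math
import statistics


def hz_to_semitones(f_hz, ref_hz):
    """Express f_hz in semitones relative to ref_hz (12 ST per octave)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def error_stats_st(predicted_hz, reference_hz):
    """Return (mean, standard deviation) of the per-frame error in semitones.

    Assumption (not stated in the abstract): "accuracy" is read as the
    mean signed error and "precision" as the standard deviation of it.
    """
    if len(predicted_hz) != len(reference_hz):
        raise ValueError("contours must have the same number of frames")
    if not predicted_hz:
        raise ValueError("contours must not be empty")
    # Signed error of each predicted frame relative to its reference frame.
    errors = [hz_to_semitones(p, r) for p, r in zip(predicted_hz, reference_hz)]
    return statistics.fmean(errors), statistics.pstdev(errors)


# Toy contours: one frame a semitone sharp, one exact, one a semitone flat.
semitone = 2.0 ** (1.0 / 12.0)
reference = [100.0, 200.0, 400.0]
predicted = [100.0 * semitone, 200.0, 400.0 / semitone]
mean_err, sd_err = error_stats_st(predicted, reference)
```

In this toy example the sharp and flat frames cancel, so the mean error is close to zero while the spread is not, which illustrates why the two figures are reported separately. For intensity, the values are already logarithmic (dB SPL), so the per-frame error would simply be the difference between predicted and reference values.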
