Improved Epoch Extraction Using Variational Mode Decomposition Based Spectral Smoothing of Zero Frequency Filtered Emotive Speech Signals

D. Govind, D. Pravena, S. Ajay
Published in: 2018 Twenty Fourth National Conference on Communications (NCC)
Publication date: 2018-02-01
DOI: 10.1109/NCC.2018.8600091 (https://doi.org/10.1109/NCC.2018.8600091)
Citations: 2

Abstract

This work aims to improve epoch extraction from emotive speech by adding a post-processing stage, based on variational mode decomposition (VMD) spectral smoothing, to the conventional zero frequency filtering (ZFF) method. Because pitch varies rapidly and without control in emotive speech signals, reliable epoch estimation is challenging. In the proposed method, the spectra of short frames of the zero frequency filtered signal (ZFFS) are subjected to VMD to obtain component spectra in five modes. A smoothed short-time spectrum is then obtained by excluding the two higher VMD modes, which essentially contain the high spectral variations. The modified ZFFS is reconstructed by parameter-interpolation-based sinusoidal synthesis, using the sinusoidal parameters of the single dominant frequency present in each VMD-smoothed spectrum. The re-synthesized ZFFS has fewer spurious zero crossings than the ZFFS obtained from the conventional ZFF method for emotive speech signals. The effectiveness of the proposed VMD-based spectral post-processing is confirmed by improved epoch identification rate and epoch identification accuracy across all emotive utterances (seven emotions) in the German emotion speech database, which provides simultaneous speech and electroglottographic (EGG) recordings. In terms of epoch identification accuracy, measured against reference epochs estimated from the EGG signals, the proposed method performs better than or comparably to existing ZFF-based post-processing methods for emotive speech signals.
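
For orientation, the conventional ZFF baseline that the proposed post-processing starts from can be sketched as follows. This is a minimal illustration of the generally known ZFF procedure (differencing, a cascade of zero-frequency resonators, local-mean trend removal, zero-crossing picking), not the authors' implementation, and it does not include the VMD smoothing or sinusoidal re-synthesis steps. The function name `zff_epochs` and the parameters `window_ms` and `trend_passes` are hypothetical, and the 10 ms window is an assumed value on the order of one pitch period.

```python
import numpy as np

def zff_epochs(speech, fs, window_ms=10.0, trend_passes=3):
    """Baseline ZFF sketch (assumed parameters, not the paper's settings).

    Returns the zero frequency filtered signal (ZFFS) and the sample
    indices of its negative-to-positive zero crossings.
    """
    x = np.asarray(speech, dtype=np.float64)

    # Difference the signal to suppress any slowly varying offset.
    x = np.diff(x, prepend=x[0])

    # A cascade of two ideal zero-frequency resonators is equivalent to
    # four successive cumulative sums (four poles at z = 1).
    y = x
    for _ in range(4):
        y = np.cumsum(y)

    # The integrations leave a polynomial trend. Remove it by repeatedly
    # subtracting the local mean over a symmetric window whose length is
    # on the order of one pitch period (assumed: window_ms).
    half = max(1, int(round(0.5 * window_ms * 1e-3 * fs)))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(trend_passes):
        padded = np.pad(y, half, mode="edge")
        y = y - np.convolve(padded, kernel, mode="valid")

    # Epoch candidates: negative-to-positive zero crossings of the ZFFS.
    # Which crossing direction marks the true epoch depends on the
    # polarity of the recorded signal.
    epochs = np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0)) + 1
    return y, epochs
```

On a steady, clean voiced segment this yields one crossing per pitch period. With rapidly varying pitch, a single fixed trend-removal window no longer suits every region, which is one way spurious zero crossings of the kind described above can arise.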