Improved Silence-Unvoiced-Voiced (SUV) Segmentation for Dysarthric Speech Signals using Linear Prediction Error Variance

T. Ijitona, Hong Yue, J. Soraghan, A. Lowit
2020 5th International Conference on Computer and Communication Systems (ICCCS), published 2020-05-01
DOI: 10.1109/ICCCS49078.2020.9118462 (https://doi.org/10.1109/ICCCS49078.2020.9118462)
Citations: 4

Abstract

A novel algorithm for segmenting dysarthric speech into silence, unvoiced and voiced (SUV) segments is presented. The algorithm combines short-time energy (STE), zero-crossing rate (ZCR) and linear prediction error variance (LPEV) for the segmentation problem. Extending previous work in this field, the proposed method addresses the difficulty of distinguishing voiced from unvoiced segments in dysarthric speech. More precisely, the variance of the linear prediction error is used to design a three-fold decision matrix that can accommodate the high variability in loudness found in dysarthric speech. In addition, a moving average threshold approach is proposed to provide an "as-fit" segmentation technique that is fully automated and can handle severely dysarthric speech with varying loudness and ZCRs. The fully automated algorithm is validated using real speech samples from healthy speakers and from speakers with ataxic dysarthria. The results are compared with known methods using STE and ZCR. The proposed classification method not only improves segmentation performance but also provides consistent results in low signal energy situations.
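The abstract names three frame-level features (STE, ZCR, LPEV) and a moving average threshold but gives no formulas. The sketch below is one conventional way to compute them with NumPy, offered only as an illustration. Everything in it is an assumption rather than the authors' implementation: the function names, the 240-sample frame and 120-sample hop, the prediction order of 10, the Levinson-Durbin recursion on a biased autocorrelation estimate, and the edge-padded centered moving average. It does not reproduce the paper's three-fold decision matrix or its thresholding rule.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.where(frames >= 0, 1, -1)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


def lp_error_variance(frame, order=10):
    """Prediction error power from the Levinson-Durbin recursion."""
    n = len(frame)
    # Biased autocorrelation estimate for lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)]) / n
    if r[0] <= 0.0:
        return 0.0  # all-zero frame: nothing to predict
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
        if err <= 0.0:
            return 0.0
    return float(err)


def moving_average(values, width):
    """Centered moving average with edge padding; `width` must be odd."""
    padded = np.pad(values, width // 2, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")


# Hypothetical test signal: silence, a 200 Hz tone, then white noise.
fs = 8000
t = np.arange(fs // 2) / fs
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(fs // 2),
                    np.sin(2 * np.pi * 200 * t),
                    rng.standard_normal(fs // 2)])

frames = frame_signal(x, frame_len=240, hop=120)
ste = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
lpev = np.array([lp_error_variance(f) for f in frames])

# One possible adaptive threshold: a local average of each feature track.
ste_threshold = moving_average(ste, width=11)
```

On a signal like this, the tone frames should show a low ZCR and a small prediction error relative to their energy, while the noise frames show a high ZCR and a prediction error close to their energy. That contrast is the kind of cue a voiced/unvoiced decision can draw on, but how the paper combines the three features is not stated in the abstract.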