Method for asynchronous analysis of a glottal source based on a two-level autoregressive model of the speech signal

V. V. Savchenko, L. V. Savchenko
Journal: Izmeritel`naya Tekhnika, volume 195 4
Published: 2024-04-05 (Journal Article)
DOI: https://doi.org/10.32446/0368-1025it.2024-2-55-62
Citation count: 0

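Abstract

This paper considers the problem of analyzing the glottal source over a short observation interval. It points out that known methods for glottal source analysis suffer from an acute lack of performance regardless of how the data are prepared: synchronously with the pitch (fundamental tone) of the speech sounds, or asynchronously. A method for analyzing the glottal source based on a two-level autoregressive model of the speech signal is proposed, and its software implementation, built on the high-speed Burg-Levinson computational procedure, is described. The method does not require the observation sequence to be synchronized with the pitch of the speech signal, and its computational cost is relatively low. Using this implementation, a full-scale experiment was set up and conducted in which vowel sounds from a control speaker's speech served as the object of study. The results confirm the improved performance of the proposed method and establish its requirements on speech signal duration for real-time voice analysis: the optimal duration lies in the range from 32 to 128 ms. The results can be used in the development and study of digital speech communication systems, voice control, biometrics, biomedicine, and other speech systems in which the speaker's voice characteristics are of paramount importance.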
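The abstract names the Burg-Levinson procedure but gives no details of the two-level model or its implementation. As a rough illustration only, the sketch below shows the textbook single-level Burg recursion for fitting autoregressive coefficients to one frame of samples, with the Levinson step used to update the coefficients. It is not the authors' method. The function names `burg_ar` and `frame_length`, the model order, and the 8 kHz sampling rate in the usage note are all assumptions introduced here.

```python
import numpy as np


def frame_length(duration_ms, fs_hz):
    """Number of samples in a frame of the given duration (hypothetical helper)."""
    return int(round(duration_ms * fs_hz / 1000.0))


def burg_ar(x, order):
    """Textbook Burg estimate of AR coefficients for one frame.

    Returns (a, e) where a = [1, a_1, ..., a_p] defines the prediction-error
    filter A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and e is the estimated
    prediction-error power. Illustrative only, not the paper's procedure.
    """
    x = np.asarray(x, dtype=float)
    if not 0 < order < len(x):
        raise ValueError("order must be positive and smaller than the frame length")

    f = x[1:].copy()    # forward prediction errors f_0(n), n = 1..N-1
    b = x[:-1].copy()   # backward prediction errors b_0(n-1), n = 1..N-1
    a = np.array([1.0])
    e = np.dot(x, x) / len(x)  # zero-order error power (mean square of the frame)

    for _ in range(order):
        # Reflection coefficient minimising forward + backward error power.
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))

        # Levinson step: a_m(i) = a_{m-1}(i) + k * a_{m-1}(m - i).
        a_ext = np.concatenate([a, [0.0]])
        a = a_ext + k * a_ext[::-1]
        e *= 1.0 - k * k

        # Update the error sequences and shrink them by one sample.
        f_next = f + k * b
        b_next = b + k * f
        f = f_next[1:]
        b = b_next[:-1]

    return a, e
```

Under the assumed 8 kHz sampling rate, the 32 to 128 ms range quoted in the abstract corresponds to frames of `frame_length(32, 8000)` = 256 to `frame_length(128, 8000)` = 1024 samples; a frame `x` of that size could then be passed as `burg_ar(x, order)` with an order chosen by the user.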