Artificial Bandwidth Extension Using H∞ Optimization, Deep Neural Network, and Speech Production Model

2022 IEEE International Conference on Signal Processing and Communications (SPCOM). Pub Date: 2022-07-11. DOI: 10.1109/SPCOM55316.2022.9840805

Deepika Gupta, H. S. Shekhawat



Abstract

Artificial bandwidth extension (ABE) is applied to speech signals to improve their quality in narrowband telephone communication. It recovers the missing high-frequency components of the speech signal by extrapolation. In this context, we propose a new structure that applies gain adjustment and discrete Fourier transform (DFT) addition to combine the narrowband signal with the corresponding estimated high-band signal. The high-band signal is estimated using a synthesis filter, which is obtained from $H^{\infty}$ optimization and a speech production model. The non-stationary (time-varying) characteristics of speech cause the synthesis filters to vary considerably. We therefore use a feed-forward deep neural network (DNN) to estimate the synthesis filter information and the gain factor from a given narrowband feature of the signal. Objective analysis is carried out on the RSR15 and TIMIT datasets, and separately for voiced and unvoiced speech. Subjective evaluation is conducted on the RSR15 dataset.
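The abstract does not spell out how the gain adjustment and DFT addition are carried out. The following is a minimal sketch of one plausible reading, not the authors' implementation: the low DFT bins are taken from the (already upsampled) narrowband frame, the high bins from the gain-scaled high-band estimate, and the result is inverted back to the time domain. The function name `combine_bands_dft`, the 16 kHz sampling rate, and the 4 kHz cutoff are assumptions made for illustration only.

```python
import numpy as np

def combine_bands_dft(nb_frame, hb_frame, gain, fs=16000, cutoff_hz=4000.0):
    """Hypothetical helper: merge a narrowband frame and an estimated
    high-band frame in the DFT domain (illustrative assumption, not the
    paper's exact procedure).

    nb_frame : narrowband frame, already upsampled to the wideband rate fs
    hb_frame : estimated high-band frame at the same rate and length
    gain     : scalar gain applied to the high-band estimate
    """
    nb_frame = np.asarray(nb_frame, dtype=float)
    hb_frame = np.asarray(hb_frame, dtype=float)
    if nb_frame.shape != hb_frame.shape:
        raise ValueError("frames must have the same length")

    n = nb_frame.size
    nb_spec = np.fft.rfft(nb_frame)          # one-sided spectrum, low band
    hb_spec = np.fft.rfft(hb_frame)          # one-sided spectrum, high band
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)   # bin centre frequencies in Hz

    # Below the cutoff keep the narrowband bins; at or above it use the
    # gain-adjusted high-band bins.
    wb_spec = np.where(freqs < cutoff_hz, nb_spec, gain * hb_spec)
    return np.fft.irfft(wb_spec, n=n)

# Usage with synthetic tones standing in for speech (assumed test data).
fs, n = 16000, 320                            # 20 ms frame at 16 kHz
t = np.arange(n) / fs
nb = np.sin(2 * np.pi * 1000 * t)             # stands in for the narrowband part
hb = np.sin(2 * np.pi * 6000 * t)             # stands in for the high-band estimate
wb = combine_bands_dft(nb, hb, gain=0.5, fs=fs)
```

In the described system the gain and the synthesis filter would be predicted per frame by the DNN from narrowband features; here the gain is simply passed in as a scalar so the combination step can be examined in isolation.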


