Single-Channel Speech Quality Enhancement in Mobile Networks Based on Generative Adversarial Networks

Mobile Networks and Applications. Pub Date: 2024-04-02. DOI: 10.1007/s11036-024-02300-4

Guifen Wu, Norbert Herencsar



Abstract

Noise in mobile networks is abundant and randomly generated, so speech enhancement in this setting tends to lack a specific target and an adversarial (game-like) process, and enhancement that relies on acoustic features alone has major drawbacks. This paper proposes a single-channel speech quality enhancement method for mobile networks based on generative adversarial networks (GANs). It first explains how a GAN performs single-channel speech quality enhancement in mobile networks and identifies the shortcomings of that approach. An improved Mel-frequency cepstral coefficient (MFCC) extraction method is then designed to extract 12 base features that serve as the basis for enhancement. The traditional loss function is replaced with a relative average least squares loss to improve training efficiency, a hybrid penalty term is used to strengthen the generator's ability to produce single-channel speech, and the discriminator is optimized with multi-layer convolution and additional fully connected layers. Together these changes form a relative average generative adversarial network (RaGAN) with a hybrid penalty term for single-channel speech quality enhancement. In the experiments, speech quality enhancement in the mobile network is best when the discriminator uses a 3×3 convolutional kernel. The method enhances single-channel speech quality in mobile networks with a significant effect and effectively reduces the noise in the original single-channel speech.
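The abstract does not give the loss in closed form, so the following is only a minimal sketch of a relative average least squares loss as it is commonly defined in the relativistic GAN literature. It is not the authors' implementation. The function name `ralsgan_losses`, the helper `_mean`, and the ±1 targets are assumptions for illustration, and the hybrid penalty term is omitted because the abstract does not specify it.

```python
def _mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def ralsgan_losses(real_scores, fake_scores):
    """Return (discriminator_loss, generator_loss) for one batch.

    real_scores: raw discriminator outputs for clean speech (assumed input).
    fake_scores: raw discriminator outputs for generator-enhanced speech.
    Hypothetical helper: the paper's exact formulation may differ.
    """
    mean_real = _mean(real_scores)
    mean_fake = _mean(fake_scores)

    # Relative average scores: each sample is judged against the mean
    # score of the opposite class rather than in isolation.
    real_rel = [r - mean_fake for r in real_scores]
    fake_rel = [f - mean_real for f in fake_scores]

    # Discriminator pushes real_rel toward +1 and fake_rel toward -1.
    d_loss = (_mean([(x - 1.0) ** 2 for x in real_rel])
              + _mean([(x + 1.0) ** 2 for x in fake_rel]))

    # Generator uses the same least squares form with swapped targets.
    g_loss = (_mean([(x - 1.0) ** 2 for x in fake_rel])
              + _mean([(x + 1.0) ** 2 for x in real_rel]))
    return d_loss, g_loss


if __name__ == "__main__":
    d, g = ralsgan_losses([0.5, 0.5], [-0.5, -0.5])
    print(d, g)
```

Under these assumptions, both networks receive gradients from real and generated samples in every step, which is the usual argument for why a relative average loss can train more efficiently than a standard GAN loss.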


