Weighted Energy Reallocation Approach for Near-end Speech Enhancement

S. Steniffer Jebaruby, N. Nirmal Singh, M. Jeeva
Published in: 2019 Fifth International Conference on Science Technology Engineering and Mathematics (ICONSTEM), 2019-03-01
DOI: 10.1109/ICONSTEM.2019.8918713 (https://doi.org/10.1109/ICONSTEM.2019.8918713)
Citations: 1

Abstract

In any speech communication system, background noise degrades the quality or intelligibility of speech. Speech quality refers to naturalness and cleanliness, that is, how good the signal is perceived to be. Speech intelligibility refers to how well the listener understands the speaker's message. Speech corrupted by noise (far-end speech) and speech rendered in a noisy environment (near-end speech) lack intelligibility and are uncomfortable for human listening. The current work aims to improve speech intelligibility for a near-end listener. State-of-the-art near-end speech enhancement algorithms improve intelligibility by reallocating speech energy over time and frequency based on a perceptual distortion measure. This method automatically redistributes speech energy from vowels to transients with reduced delay. However, it treats different classes of consonant sounds (e.g. fricatives, stops, liquids, nasals) as a single group during energy reallocation. The effect of noise is not uniform: it varies among the different consonant classes. Therefore, an analysis is carried out to find the energy relation among the classes of sound units, and weightings are given to the low-energy classes. The weighted energy reallocation method is then evaluated for speech quality and intelligibility using PESQ and STOI. In addition, an analysis is carried out to find the optimum segment size for energy reallocation.
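
The idea of weighting low-energy sound classes during reallocation can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract gives no formulas, and the published method relies on a perceptual distortion measure that is omitted here. The class weights, the segment representation as (label, samples) pairs, and the function names `energy` and `reallocate` are all assumptions made for illustration. The sketch only shows one simple way to shift energy toward weighted classes while keeping the total energy of the utterance unchanged.

```python
from math import sqrt

# Hypothetical per-class weights (illustrative values, not taken from the paper).
# Values above 1 favor low-energy consonant classes relative to vowels.
CLASS_WEIGHTS = {
    "vowel": 1.0,
    "nasal": 1.5,
    "liquid": 1.5,
    "stop": 2.0,
    "fricative": 2.0,
}


def energy(samples):
    """Return the sum of squared sample values."""
    return sum(x * x for x in samples)


def reallocate(segments, weights=CLASS_WEIGHTS):
    """Scale each (label, samples) segment by its class weight, then
    renormalize so the total energy equals the original total."""
    total_before = sum(energy(s) for _, s in segments)
    # Total energy the segments would have after weighting, before renormalizing.
    total_weighted = sum(weights[label] * energy(s) for label, s in segments)
    if total_weighted == 0.0:
        # Silent input: nothing to redistribute.
        return [(label, list(s)) for label, s in segments]
    norm = total_before / total_weighted
    result = []
    for label, s in segments:
        # Amplitude gain is the square root of the energy gain.
        gain = sqrt(weights[label] * norm)
        result.append((label, [gain * x for x in s]))
    return result


if __name__ == "__main__":
    segments = [
        ("vowel", [0.8, -0.8, 0.8, -0.8]),
        ("fricative", [0.1, -0.1, 0.1, -0.1]),
    ]
    for (label, before), (_, after) in zip(segments, reallocate(segments)):
        print(label, round(energy(before), 4), "->", round(energy(after), 4))
```

In this toy setup the vowel segment loses a small amount of energy and the fricative segment gains it, so the overall level stays constant. A real system would derive the gains from a perceptual model and operate per time-frequency band rather than on whole segments.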