New Speech Compression Technique Based on Filter Bank Design and Psychoacoustic Model

IF 0.8; Zone 4 (Engineering and Technology); Q4 ACOUSTICS; International Journal of Acoustics and Vibration; Pub Date: 2019-12-31; DOI: 10.20855/ijav.2019.24.41455
M. Talbi, M. Bouhlel
International Journal of Acoustics and Vibration, vol. 24, pp. 728-735. Publication type: Journal Article.
Citations: 0

Abstract

New Speech Compression Technique based on Filter Bank Design and Psychoacoustic Model
This paper proposes a new speech compression technique that combines a psychoacoustic model with a general, optimization-based approach to filter bank design. The technique is evaluated against a reference compression technique that uses a 32-filter MDCT (Modified Discrete Cosine Transform) filter bank together with a psychoacoustic model. The comparison is based on the number of bits before and after compression, PSNR (Peak Signal-to-Noise Ratio), NRMSE (Normalized Root Mean Square Error), SNR (Signal-to-Noise Ratio), and PESQ (Perceptual Evaluation of Speech Quality). Both techniques are applied to a number of speech signals sampled at 8 kHz. The results show that the proposed technique outperforms the MDCT-based technique in terms of bits after compression and compression ratio; that is, it yields higher compression ratios. Moreover, the speech signals reconstructed by the proposed technique have acceptable perceptual quality, as supported by the SNR, PSNR, NRMSE, and PESQ values.
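The abstract names its evaluation metrics but does not give their formulas. The sketch below assumes common textbook definitions of SNR, PSNR, NRMSE, and compression ratio, which may differ from the exact normalizations used in the paper. All function names and the toy signal are illustrative and not taken from the article. PESQ is left out because it is a standardized algorithm (ITU-T Rec. P.862) that cannot be reduced to a short formula.

```python
import math


def _error_energy(original, reconstructed):
    # Sum of squared sample differences between the two signals.
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed))


def snr_db(original, reconstructed):
    """SNR in dB: signal energy divided by reconstruction-error energy."""
    noise = _error_energy(original, reconstructed)
    if noise == 0:
        return math.inf  # perfect reconstruction
    signal = sum(x * x for x in original)
    return 10.0 * math.log10(signal / noise)


def psnr_db(original, reconstructed):
    """PSNR in dB: squared peak amplitude divided by mean squared error."""
    noise = _error_energy(original, reconstructed)
    if noise == 0:
        return math.inf
    mse = noise / len(original)
    peak = max(abs(x) for x in original)
    return 10.0 * math.log10(peak * peak / mse)


def nrmse(original, reconstructed):
    """RMS error normalized by the spread of the original about its mean."""
    mean = sum(original) / len(original)
    spread = sum((x - mean) ** 2 for x in original)
    return math.sqrt(_error_energy(original, reconstructed) / spread)


def compression_ratio(bits_before, bits_after):
    """Ratio of the bit count before compression to the bit count after."""
    return bits_before / bits_after


if __name__ == "__main__":
    # Toy data (assumption): a square-like signal reconstructed with 10% error.
    x = [1.0, -1.0, 1.0, -1.0]
    y = [0.9, -0.9, 0.9, -0.9]
    print(snr_db(x, y), psnr_db(x, y), nrmse(x, y))
    # Hypothetical bit counts, chosen only to show the call.
    print(compression_ratio(128000, 32000))
```

Under these definitions a higher SNR, PSNR, and compression ratio are better, while a lower NRMSE is better, which matches how the abstract uses the metrics to argue for acceptable reconstruction quality at a higher compression ratio.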
Source journal
International Journal of Acoustics and Vibration (categories: ACOUSTICS; ENGINEERING, MECHANICAL)
CiteScore: 1.60
Self-citation rate: 10.00%
Articles published: 0
Review time: 12 months
About the journal: The International Journal of Acoustics and Vibration (IJAV) is the refereed open-access journal of the International Institute of Acoustics and Vibration (IIAV). The IIAV is a non-profit international scientific society founded in 1995. The primary objective of the Institute is to advance the science of acoustics and vibration by creating an international organization that is responsive to the needs of scientists and engineers concerned with acoustics and vibration problems all around the world.
Manuscripts of articles, technical notes, and letters to the editor should be submitted to the Editor-in-Chief via the on-line submission system. Authors wishing to submit an article need to log in on the IJAV website first. Users logged into the website are able to submit new articles, track the status of articles already submitted, upload revised articles, responses and/or rebuttals to reviewers, figures, biographies, photographs, and copyright transfer agreements, and send comments to the editor. Each time the status of a submitted article changes, the author is also notified automatically by email.
IIAV members (in good standing for at least six months) can publish in IJAV free of charge, and their papers will be displayed on-line immediately after they have been edited and laid out. Non-IIAV members will be required to pay a mandatory Article Processing Charge (APC) of $200 USD if the manuscript is accepted for publication after review. The APC allows IIAV to make your research freely available to all readers using the Open Access model. In addition, non-IIAV members who pay an extra voluntary publication fee (EVPF) of $500 USD will be granted expedited publication in IJAV, and their papers can be displayed on the Internet after acceptance. If the $200 USD APC is not paid, papers will not be published. Authors who do not pay the voluntary fixed fee of $500 USD will have their papers published, but there may be a considerable delay.
The English text of the papers must be of high quality. If the text submitted is of low quality, the manuscript will more than likely be rejected. For authors whose first language is not English, we recommend having their manuscripts reviewed and edited prior to submission by a native English speaker with scientific expertise. Many commercial editing services can provide this service at a cost to the authors.
Latest articles from this journal
Surge Motion Passive Control of TLP with Double Horizontal Tuned Mass Dampers
Numerical and Experimental Evaluation of Hydrodynamic Bearings Applied to a Jeffcott Test Bench
Experimental and Numerical Investigation on the Flow-Induced Interior Noise Based on Pellicular Analysis
Application of Statistical Energy Analysis (SEA) in Estimating Acoustic Response of Panels With Non-Uniform Mass Distribution
Railways: An Acoustical Point of View