New Speech Compression Technique based on Filter Bank Design and Psychoacoustic Model
Authors: M. Talbi, M. Bouhlel
Journal: International Journal of Acoustics and Vibration, volume 24 1, pages 728-735
Publication date: 2019-12-31
Publication type: Journal Article
Impact factor: 0.8; JCR: Q4 (Acoustics)
DOI: https://doi.org/10.20855/ijav.2019.24.41455
This paper proposes a new speech compression technique that combines a psychoacoustic model with a general, optimization-based approach to filter bank design. It is evaluated against a second technique that uses a 32-filter MDCT (Modified Discrete Cosine Transform) filter bank together with a psychoacoustic model. The comparison is based on the number of bits before and after compression, PSNR (Peak Signal-to-Noise Ratio), NRMSE (Normalized Root Mean Square Error), SNR (Signal-to-Noise Ratio), and PESQ (Perceptual Evaluation of Speech Quality). Both techniques are applied to a number of speech signals sampled at 8 kHz. The results show that the proposed technique outperforms the MDCT-based technique in terms of bits after compression and compression ratio: it yields higher compression ratios. Moreover, the reconstructed speech signals have acceptable perceptual quality, as indicated by the SNR, PSNR, NRMSE, and PESQ values.
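The abstract names the objective metrics but does not give their formulas. The following is a minimal Python sketch of commonly used definitions of SNR, PSNR, NRMSE, and compression ratio. The function names, the choice of peak value (maximum absolute sample of the original), and the NRMSE normalization (energy of the mean-removed original) are assumptions made for illustration and may differ from the definitions used in the paper. PESQ is omitted because it requires a full perceptual model rather than a closed-form expression.

```python
import math


def _error_energy(original, reconstructed):
    # Sum of squared sample differences between the two signals.
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed))


def snr_db(original, reconstructed):
    """SNR in dB: energy of the original over energy of the error."""
    noise = _error_energy(original, reconstructed)
    if noise == 0:
        return math.inf  # identical signals
    signal = sum(x * x for x in original)
    return 10.0 * math.log10(signal / noise)


def psnr_db(original, reconstructed):
    """PSNR in dB. Assumption: peak = max |x| over the original signal."""
    mse = _error_energy(original, reconstructed) / len(original)
    if mse == 0:
        return math.inf
    peak = max(abs(x) for x in original)
    return 10.0 * math.log10(peak * peak / mse)


def nrmse(original, reconstructed):
    """Assumption: error energy normalized by the mean-removed original."""
    mean = sum(original) / len(original)
    denom = sum((x - mean) ** 2 for x in original)
    return math.sqrt(_error_energy(original, reconstructed) / denom)


def compression_ratio(bits_before, bits_after):
    """Ratio of bits before compression to bits after compression."""
    return bits_before / bits_after


if __name__ == "__main__":
    # Toy signals, not data from the paper.
    x = [1.0, -1.0, 1.0, -1.0]
    y = [0.9, -0.9, 0.9, -0.9]
    print(snr_db(x, y), psnr_db(x, y), nrmse(x, y))
    print(compression_ratio(64000, 16000))
```

Under these definitions, a higher SNR or PSNR and a lower NRMSE indicate a reconstruction closer to the original, while a higher compression ratio means fewer bits after compression.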
About the journal:
The International Journal of Acoustics and Vibration (IJAV) is the refereed open-access journal of the International Institute of Acoustics and Vibration (IIAV). The IIAV is a non-profit international scientific society founded in 1995. The primary objective of the Institute is to advance the science of acoustics and vibration by creating an international organization that is responsive to the needs of scientists and engineers concerned with acoustics and vibration problems all around the world.
Manuscripts of articles, technical notes and letters to the editor should be submitted to the Editor-in-Chief via the online submission system. Authors wishing to submit an article must first log in on the IJAV website. Logged-in users can submit new articles, track the status of articles already submitted, upload revised articles, responses and/or rebuttals to reviewers, figures, biographies, photographs and copyright transfer agreements, and send comments to the editor. Each time the status of a submitted article changes, the author is notified automatically by email.
IIAV members (in good standing for at least six months) can publish in IJAV free of charge, and their papers will be displayed online immediately after they have been edited and laid out.
Non-IIAV members are required to pay a mandatory Article Processing Charge (APC) of $200 USD if the manuscript is accepted for publication after review. The APC allows IIAV to make your research freely available to all readers under the Open Access model.
In addition, non-IIAV members who pay an extra voluntary publication fee (EVPF) of $500 USD will be granted expedited publication in IJAV, and their papers can be displayed on the Internet after acceptance. If the $200 USD APC is not paid, the paper will not be published. Authors who do not pay the voluntary fee of $500 USD will still have their papers published, but there may be a considerable delay.
The English text of the papers must be of high quality. If the submitted text is of low quality, the manuscript will more than likely be rejected. Authors whose first language is not English are advised to have their manuscripts reviewed and edited, prior to submission, by a native English speaker with scientific expertise. Many commercial editing services provide this at a cost to the authors.