High Synthetic Audio Compression Model Based on Fractal Audio Coding and Error-Compensation

A. Ali, Loay E. George
DOI: 10.33166/aetic.2022.02.001 (https://doi.org/10.33166/aetic.2022.02.001)
Published: 2022-04-01, Annals of Emerging Technologies in Computing (journal article)
Citations: 1

Abstract

This study presents a model for improving the quality of audio compressed with fractal coding, specifically when a high compression ratio is required. The proposed high synthetic audio compression model (HSACM) is based on conventional fractal coding and the lifting wavelet transform. Various lifting wavelet transform families and decomposition levels are used, and their effects on the reconstructed audio files are discussed. Audio files from the GTZAN dataset and standard data compression measures are used to evaluate the proposed model. The results show that with a block length of 50 samples, which is the worst case, the average PSNR increases from 34.1 to 44.8 dB and from 34.1 to 40.5 dB using the lifting wavelet transform with 3 and 2 levels, respectively. Thus, the PSNR improves by 10 and 5 dB while the compression ratio is slightly reduced, by 6.2% and 12.5%, respectively. Moreover, lifting wavelet transforms with the bases Haar, db1, db4, db5, cdf1.1 and cdf2.2 provide higher audio quality, while db6, db8, sym7 and sym8 give the worst audio quality. Finally, the performance of HSACM is compared with that of existing work.
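The abstract names two building blocks, the lifting wavelet transform and PSNR, without giving their details. The following is a minimal sketch of one level of the Haar transform in lifting form (split, predict, update) and of a PSNR calculation. It is not the authors' HSACM implementation: the function names, the even-length input, and the peak value of 255 are assumptions made only for illustration.

```python
import math

def haar_lift_forward(x):
    """One level of the Haar wavelet in lifting form; len(x) must be even."""
    even, odd = x[0::2], x[1::2]                         # split
    detail = [o - e for e, o in zip(even, odd)]          # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update: pairwise mean
    return approx, detail

def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order."""
    even = [a - d / 2 for a, d in zip(approx, detail)]   # undo update
    odd = [d + e for e, d in zip(even, detail)]          # undo predict
    out = []
    for e, o in zip(even, odd):                          # merge even/odd samples
        out.extend((e, o))
    return out

def psnr(original, reconstructed, peak):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum amplitude."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical sample block, not data from the paper.
samples = [10, 12, 14, 20, 8, 6, 4, 2]
approx, detail = haar_lift_forward(samples)
exact = haar_lift_inverse(approx, detail)                # keeps every coefficient
lossy = haar_lift_inverse(approx, [0] * len(detail))     # discards the detail band
print(psnr(samples, exact, peak=255), psnr(samples, lossy, peak=255))
```

Keeping all coefficients reconstructs the block exactly, while discarding the detail band lowers the PSNR. This is the kind of quality versus size trade-off the reported PSNR and compression ratio figures describe, although the abstract does not say which coefficients HSACM retains.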
Source journal: Annals of Emerging Technologies in Computing (Computer Science: Computer Science (all))
CiteScore: 3.50
Self-citation rate: 0.00%
Articles published: 26