Hardware design for the 32×32 IDCT of the HEVC video coding standard

R. Conceição, J. Cláudio, Souza Jr, R. Jeske, M. Porto, J. Mattos, L. Agostini
{"title":"Hardware design for the 32×32 IDCT of the HEVC video coding standard","authors":"R. Conceição, J. Cláudio, Souza Jr, R. Jeske, M. Porto, J. Mattos, L. Agostini","doi":"10.1109/SBCCI.2013.6644881","DOIUrl":null,"url":null,"abstract":"This paper is focused in the inverse transforms defined in the video coding standard HEVC - High Efficiency Video Coding. The transforms stage is one of the innovations proposed by HEVC since it allows the use of the biggest number of transforms sizes (four) and also the biggest transform sizes (till 32×32) when compared with previous standards. The inverse DCT is performed by the video encoder and decoder as well. This paper presents an efficient hardware design for the 32×32 HEVC IDCT based on the separability principle. The hardware design was planned to reach real time processing (at least 30 frames per second) for high resolution videos, exploiting a high parallelism level (32 samples consumed per clock cycle). The architecture was also planned to reach a low latency and a low cost, then it was designed in a purely combinational way and using a multiplierless approach. The synthesis process was targeted to an Altera Stratix IV FPGA. The synthesis results show that the designed architecture is capable to process more than 30 QFHD frames (3840×2160 pixels) per second, with a latency of 33 clock cycles.","PeriodicalId":203604,"journal":{"name":"2013 26th Symposium on Integrated Circuits and Systems Design (SBCCI)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 26th Symposium on Integrated Circuits and Systems Design (SBCCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SBCCI.2013.6644881","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

This paper is focused in the inverse transforms defined in the video coding standard HEVC - High Efficiency Video Coding. The transforms stage is one of the innovations proposed by HEVC since it allows the use of the biggest number of transforms sizes (four) and also the biggest transform sizes (till 32×32) when compared with previous standards. The inverse DCT is performed by the video encoder and decoder as well. This paper presents an efficient hardware design for the 32×32 HEVC IDCT based on the separability principle. The hardware design was planned to reach real time processing (at least 30 frames per second) for high resolution videos, exploiting a high parallelism level (32 samples consumed per clock cycle). The architecture was also planned to reach a low latency and a low cost, then it was designed in a purely combinational way and using a multiplierless approach. The synthesis process was targeted to an Altera Stratix IV FPGA. The synthesis results show that the designed architecture is capable to process more than 30 QFHD frames (3840×2160 pixels) per second, with a latency of 33 clock cycles.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
硬件设计为32×32 IDCT的HEVC视频编码标准
本文主要研究视频编码标准HEVC (High Efficiency video coding)中定义的逆变换。转换阶段是HEVC提出的创新之一,因为与以前的标准相比,它允许使用最大数量的转换尺寸(四个)和最大的转换尺寸(直到32×32)。反向DCT由视频编码器和解码器完成。本文提出了一种基于可分性原理的32×32 HEVC IDCT的高效硬件设计方法。硬件设计计划达到高分辨率视频的实时处理(至少每秒30帧),利用高并行性水平(每个时钟周期消耗32个样本)。该架构还计划达到低延迟和低成本,然后以纯粹的组合方式设计,并使用无乘法器方法。合成过程针对Altera Stratix IV FPGA。综合结果表明,所设计的架构能够每秒处理超过30个QFHD帧(3840×2160像素),延迟为33个时钟周期。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Temporal noise analysis and measurements of CMOS active pixel sensor operating in time domain A new code compression algorithm and its decompressor in FPGA-based hardware Implementation of split-radix FFT pruning for the reduction of computational complexity in OFDM based cognitive radio system Security-enhanced 3D communication structure for dynamic 3D-MPSoCs protection Synthesis of a narrow-band Low Noise Amplifier in a 180 nm CMOS technology using Simulated Annealing with crossover operator
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1