
2018 Picture Coding Symposium (PCS): Latest Publications

PCS 2018 Commentary
Pub Date : 2018-06-01 DOI: 10.1109/pcs.2018.8456270
{"title":"PCS 2018 Commentary","authors":"","doi":"10.1109/pcs.2018.8456270","DOIUrl":"https://doi.org/10.1109/pcs.2018.8456270","url":null,"abstract":"","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"278 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127383308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Hybrid Weighted Compound Motion Compensated Prediction for Video Compression
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456241
Cheng Chen, Jingning Han, Yaowu Xu
Compound motion compensated prediction that combines reconstructed reference blocks to exploit the temporal correlation is a major component in the hierarchical coding scheme. A uniform combination that applies equal weights to reference blocks regardless of their distances to the current frame is widely employed in mainstream codecs. Linear distance weighted combination, while reflecting the temporal correlation, is likely to ignore the quantization noise factor and hence degrade the prediction quality. This work builds on the premise that the compound prediction mode effectively embeds two functionalities - exploiting temporal correlation in the video signal and canceling the quantization noise from reference blocks. A modified distance weighting scheme is introduced to optimize the trade-off between these two factors. It quantizes the weights to limit the minimum contribution from both reference blocks for noise cancellation. We further introduce a hybrid scheme allowing the codec to switch between the proposed distance weighted compound mode and the averaging mode to provide more flexibility for the trade-off between temporal correlation and noise cancellation. The scheme is implemented in the AV1 codec as part of the syntax definition. It is experimentally demonstrated to provide on average 1.5% compression gains across a wide range of test sets.
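To make the weighting idea concrete, below is a minimal Python sketch of a distance-weighted compound predictor with quantized weights, next to the plain averaging mode it can switch with. The weight table, the 1/64 weight precision, and the distance-to-index mapping are illustrative assumptions, not the actual AV1 syntax or values.

```python
# Illustrative sketch only: distance-weighted compound prediction with
# quantized weights, so that each reference block keeps a minimum
# contribution for quantization-noise cancellation.
import numpy as np

# Hypothetical quantized weight table (in 1/64 units) for the nearer reference;
# the farther reference receives 64 - w. Values are assumptions, not AV1's.
WEIGHT_TABLE = [36, 40, 44, 48]

def compound_predict(ref_near, ref_far, d_near, d_far):
    """Blend two reconstructed reference blocks given their temporal
    distances (d_near, d_far) to the current frame."""
    # Map the distance ratio to an index into the quantized weight table.
    ratio = d_far / max(d_near + d_far, 1)
    idx = min(int(ratio * len(WEIGHT_TABLE)), len(WEIGHT_TABLE) - 1)
    w_near = WEIGHT_TABLE[idx]
    w_far = 64 - w_near
    pred = (w_near * ref_near.astype(np.int32) +
            w_far * ref_far.astype(np.int32) + 32) >> 6
    return pred.astype(ref_near.dtype)

def average_predict(ref_near, ref_far):
    """Equal-weight (averaging) compound mode, the other branch of the
    hybrid scheme."""
    avg = (ref_near.astype(np.int32) + ref_far.astype(np.int32) + 1) >> 1
    return avg.astype(ref_near.dtype)
```

Because the table never reaches 64, the farther reference always contributes something, which is what lets its quantization noise partially cancel against that of the nearer reference.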
Citations: 1
PCS 2018 Author Index
Pub Date : 2018-06-01 DOI: 10.1109/pcs.2018.8456286
{"title":"PCS 2018 Author Index","authors":"","doi":"10.1109/pcs.2018.8456286","DOIUrl":"https://doi.org/10.1109/pcs.2018.8456286","url":null,"abstract":"","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114818632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Simple Prediction Fusion Improves Data-driven Full-Reference Video Quality Assessment Models
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456293
C. Bampis, A. Bovik, Zhi Li
When developing data-driven video quality assessment algorithms, the size of the available ground truth subjective data may hamper the generalization capabilities of the trained models. Nevertheless, if the application context is known a priori, leveraging data-driven approaches for video quality prediction can deliver promising results. Towards achieving high-performing video quality prediction for compression and scaling artifacts, Netflix developed the Video Multi-method Assessment Fusion (VMAF) Framework, a full-reference prediction system which uses a regression scheme to integrate multiple perception-motivated features to predict video quality. However, the current version of VMAF does not fully capture temporal video features relevant to temporal video distortions. To achieve this goal, we developed Ensemble VMAF (E-VMAF): a video quality predictor that combines two models: VMAF and predictions based on entropic differencing features calculated on video frames and frame differences. We demonstrate the improved performance of E-VMAF on various subjective video databases. The proposed model will become available as part of the open source package at https://github.com/Netflix/vmaf.
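As a rough illustration of the ensemble idea, the sketch below fuses a per-video VMAF prediction with a prediction built on an entropic-differencing-style feature of frame differences. The toy entropy feature and the simple weighted-average fusion are assumptions for illustration; the actual E-VMAF features and fusion are more elaborate.

```python
# Illustrative sketch: fuse a VMAF score with a score derived from an
# entropy-of-frame-difference feature. Both the feature and the fixed-weight
# fusion are simplifying assumptions.
import numpy as np

def difference_entropy(frame_a, frame_b):
    """Toy stand-in for an entropic differencing feature: Shannon entropy of
    the quantized difference between two frames."""
    diff = (frame_a.astype(np.int32) - frame_b.astype(np.int32)).ravel()
    _, counts = np.unique(diff, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def ensemble_score(vmaf_score, entropy_based_score, w=0.5):
    """Weighted fusion of the two per-video quality predictions
    (w is an assumed weight, not a trained one)."""
    return w * vmaf_score + (1.0 - w) * entropy_based_score
```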
Citations: 12
2D Video Coding of Volumetric Video Data
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456265
S. Schwarz, M. Hannuksela, Vida Fakour Sevom, Nahid Sheikhi-Pour
Due to the increased popularity of augmented and virtual reality experiences, the interest in representing the real world in an immersive fashion has never been higher. Distributing such representations enables users all over the world to freely navigate in never-before-seen media experiences. Unfortunately, such representations require a large amount of data, not feasible for transmission on today’s networks. Thus, efficient compression technologies are in high demand. This paper proposes an approach to compress 3D video data utilizing 2D video coding technology. The proposed solution was developed to address the needs of ‘tele-immersive’ applications, such as virtual (VR), augmented (AR) or mixed (MR) reality with Six Degrees of Freedom (6DoF) capabilities. Volumetric video data is projected on 2D image planes and compressed using standard 2D video coding solutions. A key benefit of this approach is its compatibility with readily available 2D video coding infrastructure. Furthermore, objective and subjective evaluation shows significant improvement in coding efficiency over reference technology.
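A minimal sketch of the projection step is given below: a coloured point cloud is orthographically projected onto a 2D geometry (depth) map and a 2D texture map, which could then be fed to any standard 2D video encoder. The single fixed projection plane, the map resolution, and the z-buffer handling are simplifying assumptions rather than the actual pipeline used in the paper.

```python
# Illustrative sketch: project a coloured point cloud onto 2D geometry and
# texture maps suitable for a standard 2D video encoder. The fixed projection
# axis and map size are assumptions.
import numpy as np

def project_to_plane(points, colors, width=1024, height=1024):
    """points: (N, 3) float array; colors: (N, 3) uint8 array.
    Projects along the z axis: x, y map to pixel coordinates, z is depth;
    nearer points overwrite farther ones (a simple z-buffer)."""
    geometry = np.full((height, width), np.inf, dtype=np.float32)  # depth map
    texture = np.zeros((height, width, 3), dtype=np.uint8)         # colour map

    # Normalise x, y into the image plane.
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scale = np.array([width - 1, height - 1]) / np.maximum(maxs[:2] - mins[:2], 1e-6)
    uv = ((points[:, :2] - mins[:2]) * scale).astype(int)

    for (u, v), z, c in zip(uv, points[:, 2], colors):
        if z < geometry[v, u]:  # keep the nearest point for this pixel
            geometry[v, u] = z
            texture[v, u] = c
    return geometry, texture
```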
Citations: 7
Machine learning as applied intrinsically to individual dimensions of HDR Display Quality
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456284
A. Choudhury, S. Daly
This study builds on previous work exploring machine learning and perceptual transforms in predicting overall display quality as a function of image quality dimensions that correspond to physical display parameters. Previously, we found that the use of perceptually transformed parameters or machine learning exceeded the performance of predictors using just physical parameters and linear regression. Further, the combination of perceptually transformed parameters with machine learning allowed for robustness to parameters outside of the data set, both for cases of interpolation and extrapolation. Here we apply machine learning at a more intrinsic level. We first evaluate how well machine learning can develop predictors of the individual dimensions of overall quality, and then how well those individual predictions can be consolidated to predict the overall display quality. Having predictions of individual dimensions of quality that are closely related to specific hardware design choices enables more nimble cost trade-off design options.
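The sketch below illustrates the two-stage idea in Python: one regressor per individual quality dimension, followed by a second regressor that consolidates the per-dimension predictions into an overall quality score. The dimension names and the choice of scikit-learn SVR models are assumptions for illustration, not the models used in the study.

```python
# Illustrative two-stage predictor: per-dimension models first, then a
# consolidation model for overall display quality. Dimension names and the
# SVR choice are assumptions.
import numpy as np
from sklearn.svm import SVR

DIMENSIONS = ["contrast", "brightness", "color_volume"]  # assumed dimensions

def train_two_stage(display_params, dim_scores, overall_scores):
    """display_params: (N, P) physical parameters per display configuration;
    dim_scores: dict of dimension name -> (N,) subjective scores;
    overall_scores: (N,) overall quality scores."""
    # Stage 1: one predictor per individual quality dimension.
    dim_models = {d: SVR().fit(display_params, dim_scores[d]) for d in DIMENSIONS}
    # Stage 2: consolidate the per-dimension predictions into overall quality.
    stage1 = np.column_stack([dim_models[d].predict(display_params) for d in DIMENSIONS])
    overall_model = SVR().fit(stage1, overall_scores)
    return dim_models, overall_model

def predict_overall(dim_models, overall_model, display_params):
    stage1 = np.column_stack([dim_models[d].predict(display_params) for d in DIMENSIONS])
    return overall_model.predict(stage1)
```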
Citations: 1
Perceptually-Aligned Frame Rate Selection Using Spatio-Temporal Features
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456274
Angeliki V. Katsenou, Di Ma, D. Bull
During recent years, the standardisation committees on video compression and broadcast formats have worked on extending practical video frame rates up to 120 frames per second. Generally, increased video frame rates have been shown to improve immersion, but at the cost of higher bit rates. Taking into consideration that the benefits of high frame rates are content dependent, a decision mechanism that recommends the appropriate frame rate for the specific content would provide benefits prior to compression and transmission. Furthermore, this decision mechanism must take account of the perceived video quality. The proposed method extracts and selects suitable spatio-temporal features and uses a supervised machine learning technique to build a model that is able to predict, with high accuracy, the lowest frame rate for which the perceived video quality is indistinguishable from that of video at the acquisition frame rate. The results show that it is a promising tool for processing videos prior to compression and delivery, such as content-aware frame rate adaptation.
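A minimal sketch of such a decision mechanism is shown below: two crude spatio-temporal features are extracted from a clip and a supervised classifier picks the lowest acceptable frame rate. The feature definitions, the candidate frame-rate set, and the random-forest learner are assumptions for illustration, not the features or model selected in the paper.

```python
# Illustrative sketch: simple spatio-temporal features plus a supervised
# classifier that recommends the lowest frame rate with no perceptible loss.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FRAME_RATES = [30, 60, 120]  # assumed candidate frame rates

def spatio_temporal_features(frames):
    """frames: (T, H, W) grayscale clip. Returns crude spatial/temporal stats."""
    x = frames.astype(np.float64)
    grad_y, grad_x = np.gradient(x, axis=(1, 2))
    spatial_activity = np.mean(np.hypot(grad_x, grad_y))     # spatial detail
    temporal_activity = np.mean(np.abs(np.diff(x, axis=0)))  # motion activity
    return np.array([spatial_activity, temporal_activity])

def train_selector(feature_matrix, lowest_ok_rates):
    """feature_matrix: (N, F); lowest_ok_rates: (N,) labels drawn from
    FRAME_RATES, obtained from subjective tests."""
    return RandomForestClassifier(n_estimators=100).fit(feature_matrix, lowest_ok_rates)

def recommend_rate(model, frames):
    return int(model.predict(spatio_temporal_features(frames).reshape(1, -1))[0])
```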
Citations: 7
Quality Assessment of Thumbnail and Billboard Images on Mobile Devices
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456285
Zeina Sinno, Anush K. Moorthy, J. D. Cock, Zhi Li, A. Bovik
Objective image quality assessment (IQA) research entails developing algorithms that predict human judgments of picture quality. Validating performance entails evaluating algorithms under conditions similar to those in which they are deployed. Hence, creating image quality databases representative of target use cases is an important endeavor. Here we present a database that relates to quality assessment of billboard images commonly displayed on mobile devices. Billboard images are a subset of thumbnail images that extend across a display screen, representing things like album covers, banners, frames, or artwork. We conducted a subjective study of the quality of billboard images distorted by processes like compression, scaling and chroma-subsampling, and compared high-performance quality prediction models on the images and subjective data.
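For context on the model comparison, the snippet below shows the standard way objective predictions are scored against subjective opinion scores using SROCC and PLCC; the exact evaluation protocol of the paper is assumed here.

```python
# Illustrative evaluation: rank-order and linear correlation between model
# predictions and subjective scores (MOS).
from scipy.stats import pearsonr, spearmanr

def evaluate_model(predicted_scores, mos):
    """predicted_scores, mos: sequences of per-image values."""
    srocc = spearmanr(predicted_scores, mos).correlation  # monotonic agreement
    plcc = pearsonr(predicted_scores, mos)[0]              # linear agreement
    return {"SROCC": srocc, "PLCC": plcc}
```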
Citations: 1
Wavelet Decomposition Pre-processing for Spatial Scalability Video Compression Scheme
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456307
Glenn Herrou, W. Hamidouche, L. Morin
Scalable video coding enables a video to be compressed at different formats within a single layered bitstream. SHVC, the scalable extension of the High Efficiency Video Coding (HEVC) standard, enables 2x spatial scalability, among other additional features. The closed-loop architecture of the SHVC codec is based on the use of multiple instances of the HEVC codec to encode the video layers, which considerably increases the encoding complexity. With the arrival of new immersive video formats, like 4K, 8K, High Frame Rate (HFR) and 360° videos, the quantity of data to compress is exploding, making the use of high-complexity coding algorithms unsuitable. In this paper, we propose a low-complexity scalable coding scheme based on the use of a single HEVC codec instance and a wavelet-based decomposition as pre-processing. The pre-encoding image decomposition relies on well-known simple Discrete Wavelet Transform (DWT) kernels, such as Haar or Le Gall 5/3. Compared to SHVC, the proposed architecture achieves a similar rate-distortion performance with a coding complexity reduction of 50%.
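The following is a minimal sketch of the pre-processing step, assuming a single-level separable Haar decomposition (the paper also considers the Le Gall 5/3 kernel). The LL band would play the role of the half-resolution base layer, with the detail bands carrying the enhancement information.

```python
# Illustrative single-level 2D Haar decomposition used as a pre-processing
# step before 2D encoding. Normalisation by 2 (averaging) is an assumption.
import numpy as np

def haar_dwt2(image):
    """image: even-sized grayscale array. Returns the LL band and the
    (LH, HL, HH) detail bands."""
    x = image.astype(np.float64)
    # Horizontal pass: pairwise averages (low) and differences (high).
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0
    # Vertical pass on both intermediate results.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0   # half-resolution base layer
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, (lh, hl, hh)
```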
Citations: 1
Perceptual Quality Driven Adaptive Video Coding Using JND Estimation
Pub Date : 2018-06-01 DOI: 10.1109/PCS.2018.8456297
Masaru Takeuchi, Shintaro Saika, Yusuke Sakamoto, Tatsuya Nagashima, Zhengxue Cheng, Kenji Kanai, J. Katto, Kaijin Wei, Ju Zengwei, Xu Wei
We introduce a perceptual video quality driven video encoding solution for optimized adaptive streaming. By using multiple bitrate/resolution encoding like MPEG-DASH, video streaming services can deliver the best video stream to a client, under the conditions of the client's available bandwidth and viewing device capability. However, conventional fixed encoding recipes (i.e., resolution-bitrate pairs) suffer from many problems, such as improper resolution selection and stream redundancy. To avoid these problems, we propose a novel video coding method, which generates multiple representations with a constant Just-Noticeable Difference (JND) interval. For this purpose, we developed a JND scale estimator using Support Vector Regression (SVR), and designed a pre-encoder that outputs an encoding recipe with a constant JND interval, adapted to the input video.
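A rough sketch of the two components is given below: an SVR that maps content features to a JND scale, and a selection step that keeps only representations spaced by a constant JND step. The feature interface, the JND-per-bitrate callable, and the fixed step size are assumptions for illustration.

```python
# Illustrative sketch: SVR-based JND scale estimation and a constant-JND
# representation ladder. All interfaces here are assumed.
from sklearn.svm import SVR

def train_jnd_estimator(content_features, jnd_scale_values):
    """content_features: (N, F) per-clip features; jnd_scale_values: (N,)
    measured JND scale values from subjective tests."""
    return SVR().fit(content_features, jnd_scale_values)

def constant_jnd_ladder(jnd_of_bitrate, candidate_bitrates, jnd_step=1.0):
    """jnd_of_bitrate: callable mapping a bitrate to its predicted JND scale
    value (e.g. built from the trained estimator). Keeps only bitrates whose
    predicted JND values are at least jnd_step apart."""
    ladder, last_jnd = [], None
    for rate in sorted(candidate_bitrates):
        jnd = jnd_of_bitrate(rate)
        if last_jnd is None or jnd - last_jnd >= jnd_step:
            ladder.append(rate)
            last_jnd = jnd
    return ladder
```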
Citations: 15