Quantifying the Influence of Devices on Quality of Experience for Video Streaming
Pub Date: 2018-06-01 | DOI: 10.1109/PCS.2018.8456304
Jing Li, Lukáš Krasula, P. Callet, Zhi Li, Yoann Baveye
Internet streaming is changing the way people watch videos. Traditional quality assessment for cable/satellite broadcasting systems focused mainly on perceptual quality. This concept has since been extended to Quality of Experience (QoE), which also considers contextual factors such as the viewing environment and the display device. In this study, we focus on the influence of devices on QoE. A subjective experiment was conducted using our proposed AccAnn methodology, in which observers evaluated the QoE of video sequences in terms of their Acceptance and Annoyance. Two devices were used in this study: a TV and a tablet. The experimental results showed that the device is a significant influence factor on QoE, and that this influence varies with the QoE of the video sequences. To quantify this influence, the Elimination-By-Aspects model was used. The results could be used to train a device-neutral objective QoE metric. For video streaming providers, the quantified device influence could be used to optimize the selection of streaming content: on one hand, satisfying observers' QoE expectations on the device they use; on the other, helping to save bitrate.
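The Elimination-By-Aspects model generalises the Bradley-Terry-Luce (BTL) family of paired-comparison models. As a minimal illustration of the underlying scaling idea (a BTL stand-in, not the paper's actual EBA fit), the sketch below estimates a per-condition "worth" scale from paired-comparison counts by maximum likelihood; the `wins` matrix is made up for the example.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical paired-comparison counts: wins[i, j] = how often
# condition i was preferred over condition j (made-up numbers).
wins = np.array([[ 0, 12, 18],
                 [ 8,  0, 15],
                 [ 2,  5,  0]], dtype=float)

def neg_log_likelihood(theta):
    v = np.exp(theta)  # positive "worth" of each condition
    nll = 0.0
    for i in range(len(v)):
        for j in range(len(v)):
            if i != j:
                # BTL choice probability: P(i beats j) = v_i / (v_i + v_j)
                nll -= wins[i, j] * np.log(v[i] / (v[i] + v[j]))
    return nll

result = minimize(neg_log_likelihood, np.zeros(3))
worths = np.exp(result.x)
print(worths / worths.sum())  # normalised psychometric scale values
```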
{"title":"Quantifying the Influence of Devices on Quality of Experience for Video Streaming","authors":"Jing Li, Lukáš Krasula, P. Callet, Zhi Li, Yoann Baveye","doi":"10.1109/PCS.2018.8456304","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456304","url":null,"abstract":"The Internet streaming is changing the way of watching videos for people. Traditional quality assessment on the cable/satellite broadcasting system mainly focused on the perceptual quality. Nowadays, this concept has been extended to Quality of Experience (QoE) which considers also the contextual factors, such as the environment, the display devices, etc. In this study, we focus on the influence of devices on QoE. A subjective experiment was conducted by using our proposed AccAnn methodology. The observers evaluated the QoE of the video sequences by considering their Acceptance and Annoyance. Two devices were used in this study, TV and Tablet. The experimental results showed that the device was a significant influence factor on QoE. In addition, we found that this influence varied with the QoE of the video sequences. To quantify this influence, the Eliminated-By-Aspects model was used. The results could be used for the training of a device-neutral objective QoE metric. For video streaming providers, the quantification results of the influence from devices could be used to optimize the selection of streaming content. On one hand it could satisfy the QoE expectations of the observers according to the used devices, on the other hand it could help to save the bitrates.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"418 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116682674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Next Generation Video Coding for Spherical Content
Pub Date: 2018-06-01 | DOI: 10.1109/PCS.2018.8456281
Adeel Abbas, David Newman, Srilakshmi Akula, Akhil Konda
Recently, the Joint Video Exploration Team (JVET) issued a Call for Proposals (CFP) for video compression technology that is expected to be the successor to HEVC. In this paper, we present some of the technology from our joint response in the 360° video category of the CFP. The goal was to keep the design as simple as possible, with picture-level preprocessing and without 360°-specific coding tools. The response is based on a relatively new projection called the Rotated Sphere Projection (RSP). RSP splits and surrounds the sphere using two faces cropped from the Equirectangular Projection (ERP), in the same way that two flat pieces of rubber are stitched to form a tennis ball. This approach allows RSP to get closer to the sphere than the Cube Map, achieving more continuity while preserving a 3:2 aspect ratio. Our results show an average BD-rate luma coding gain of 10.5% compared to ERP using HEVC.
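The BD-rate figures quoted in this and several other abstracts come from the standard Bjøntegaard metric: fit each rate-distortion curve in the log-rate domain and compare average rates over the overlapping PSNR range. A minimal sketch, with the customary cubic fit and four made-up RD points as assumptions:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (%) of test vs. anchor over the
    overlapping PSNR range (negative = bitrate savings)."""
    fit_a = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    fit_t = np.polyfit(psnr_test, np.log(rate_test), 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a, int_t = np.polyint(fit_a), np.polyint(fit_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)
    return (np.exp(avg_t - avg_a) - 1) * 100

# made-up RD points (kbps, dB) for illustration only
erp = ([2000, 4000, 8000, 16000], [34.1, 36.8, 39.2, 41.5])
rsp = ([1800, 3600, 7100, 14200], [34.2, 36.9, 39.3, 41.6])
print(f"BD-rate: {bd_rate(*erp, *rsp):.1f}%")
```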
{"title":"Next Generation Video Coding for Spherical Content","authors":"Adeel Abbas, David Newman, Srilakshmi Akula, Akhil Konda","doi":"10.1109/PCS.2018.8456281","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456281","url":null,"abstract":"Recently, the Joint Video Exploration Team (JVET) issued a Call for Proposals (CFP) for video compression technology that is expected to be successor to HEVC. In this paper, we present some of the technology from our joint response in the 360° video category of CFP. Goal was to keep design as simple as possible, with picture level preprocessing and without 360 specific coding tools. The response is based on a relatively new projection called Rotated Sphere Projection (RSP). RSP splits and surrounds the sphere using two faces that are cropped from Equirectangular Projection (ERP), in the same way as two flat pieces of rubber are stitched to form a tennis ball. This approach allows RSP to get closer to the sphere than Cube Map, achieving more continuity while preserving 3:2 aspect ratio. Our results show an average BDrate Luma coding gain of 10.5% compared to ERP using HEVC.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121750493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Study on the Required Video Bit-rate for 8K 120-Hz HEVC/H.265 Temporal Scalable Coding
Pub Date: 2018-06-01 | DOI: 10.1109/PCS.2018.8456288
Yasuko Sugito, Shinya Iwasaki, Kazuhiro Chida, Kazuhisa Iguchi, Kikufumi Kanda, Xuying Lei, H. Miyoshi, Kimihiko Kazui
This paper studies the video bit-rate required for 8K 119.88-Hz (120-Hz) High Efficiency Video Coding (HEVC)/H.265 temporal scalable coding, which allows 59.94-Hz (60-Hz) video frames to be partially decoded from compressed 120-Hz bit-streams. We compress 8K 120-Hz test sequences using software that emulates our HEVC/H.265 encoder under development and conduct two types of subjective evaluation experiments to investigate the appropriate bit-rates for both 8K 120-Hz and 60-Hz videos for broadcasting purposes. From the results of the experiments, we conclude that the required video bit-rate for 8K 120-Hz temporal scalable coding is between 85 and 110 Mbps, which is equivalent to the practical bit-rate for 8K 60-Hz videos, and that the appropriate bit-rate allocation for the 8K 60-Hz video within 8K 120-Hz temporal scalable coding at 85 Mbps is approximately 80 Mbps.
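The allocation implied by that conclusion is worth making concrete: if the full 120-Hz scalable stream runs at 85 Mbps and its decodable 60-Hz sub-stream takes roughly 80 Mbps, the temporal enhancement layer needs only about 5 Mbps. A trivial check, using only the numbers from the abstract:

```python
# Numbers from the abstract: 85 Mbps for the full 120-Hz scalable
# stream, ~80 Mbps for its decodable 60-Hz sub-stream.
total_120hz = 85.0   # Mbps
base_60hz = 80.0     # Mbps
enhancement = total_120hz - base_60hz
print(f"temporal enhancement layer: {enhancement:.0f} Mbps "
      f"({100 * enhancement / total_120hz:.0f}% of the total)")
# -> doubling the frame rate costs only ~6% extra bitrate here
```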
{"title":"A Study on the Required Video Bit-rate for 8K 120-Hz HEVC/H.265 Temporal Scalable Coding","authors":"Yasuko Sugito, Shinya Iwasaki, Kazuhiro Chida, Kazuhisa Iguchi, Kikufumi Kanda, Xuying Lei, H. Miyoshi, Kimihiko Kazui","doi":"10.1109/PCS.2018.8456288","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456288","url":null,"abstract":"This paper studies the video bit-rate required for 8K 119.88 Hz (120 Hz) high efficiency video coding (HEVC)/H.265 temporal scalable coding that can partially decode 59.94 Hz (60 Hz) video frames from compressed 120 Hz bit-streams. We compress 8K 120 Hz test sequences using software that emulates our developing HEVC/H.265 encoder and conduct two types of subjective evaluation experiments to investigate the appropriate bit-rate for both 8K 120 and 60 Hz videos for broadcasting purpose. From the results of the experiments, we conclude that the required video bit-rate for 8K 120 Hz temporal scalable coding is estimated to be between 85 and 110 Mbps, which is equivalent to the practical bit-rate for 8K 60 Hz videos, and the appropriate bitrate allocation for the 8K 60 Hz video in 8K 120 Hz temporal scalable video coding at 85 Mbps is presumed to be ∼80 Mbps.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116108396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Compound Split Tree for Video Coding
Pub Date: 2018-06-01 | DOI: 10.1109/PCS.2018.8456309
Weijia Zhu, A. Segall
During the exploration of video coding technology for potential next-generation standards, the Joint Video Exploration Team (JVET) has been studying the quad-tree plus binary-tree (QTBT) partition structure within its Joint Exploration Model (JEM). The QTBT partition structure provides more flexibility than the quad-tree-only partitioning in HEVC. Here, we extend the QTBT structure further to allow quad-tree partitioning to be performed both before and after a binary-tree partition. We refer to this structure as a compound split tree (CST). To show the efficacy of the approach, we implemented the method in JEM7. Under the random-access configuration, the method achieved average BD-rate savings of 1.25%, 2.11%, and 1.87% for the Y, U, and V components, respectively.
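To see what the extra flexibility buys, the toy sketch below counts the distinct split trees of a block under a simplified QTBT rule (no quad split once a binary split has occurred) versus the CST rule (quad splits also allowed below binary splits). The 8-pixel minimum block side and the absence of all other normative constraints are assumptions, not JEM7 behaviour.

```python
from functools import lru_cache

MIN_SIZE = 8  # assumed minimum block side, in pixels

@lru_cache(maxsize=None)
def count_split_trees(w, h, after_bt, qt_after_bt):
    total = 1  # option 1: stop splitting, this block is a leaf
    if w >= 2 * MIN_SIZE:  # vertical binary split
        n = count_split_trees(w // 2, h, True, qt_after_bt)
        total += n * n
    if h >= 2 * MIN_SIZE:  # horizontal binary split
        n = count_split_trees(w, h // 2, True, qt_after_bt)
        total += n * n
    # quad split: QTBT forbids it once a binary split has occurred,
    # CST additionally allows it below binary splits
    if w == h and w >= 2 * MIN_SIZE and (not after_bt or qt_after_bt):
        n = count_split_trees(w // 2, h // 2, after_bt, qt_after_bt)
        total += n ** 4
    return total

print("QTBT split trees, 32x32 block:", count_split_trees(32, 32, False, False))
print("CST  split trees, 32x32 block:", count_split_trees(32, 32, False, True))
```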
{"title":"Compound Split Tree for Video Coding","authors":"Weijia Zhu, A. Segall","doi":"10.1109/PCS.2018.8456309","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456309","url":null,"abstract":"During the exploration of video coding technology for potential next generation standards, the Joint Video Exploration Team (JVET) has been studying quad-tree plus binary-tree (QTBT) partition structures within its Joint Exploration Model (JEM). This QTBT partition structure provides more flexibility compared with the quad-tree only partition structure in HEVC. Here, we further consider the QTBT structure and extended it to allow quad-tree partitioning to be performed both before and after a binary-tree partition. We refer to this structure as a compound split tree (CST). To show the efficacy of the approach, we implemented the method into JEM7. The method achieved 1.25%, 2.11% and 1.87% BD-bitrate savings for Y, U and V components on average under the random-access configuration, respectively.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134058435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SRQM: A Video Quality Metric for Spatial Resolution Adaptation
Pub Date: 2018-06-01 | DOI: 10.1109/PCS.2018.8456246
Alex Mackin, Mariana Afonso, Fan Zhang, D. Bull
This paper presents a full-reference objective video quality metric (SRQM), which characterises the relationship between variations in spatial resolution and visual quality in the context of adaptive video formats. SRQM uses wavelet decomposition, subband combination with perceptually inspired weights, and spatial pooling to estimate the relative quality between the frames of a high-resolution reference video and one that has been spatially adapted through a combination of down- and upsampling. The BVI-SR video database is used to benchmark SRQM against five commonly used quality metrics. The database contains 24 diverse video sequences that span a range of spatial resolutions up to UHD-1 $(3840\times 2160)$. An in-depth analysis demonstrates that SRQM is statistically superior to the other quality metrics for all tested adaptation filters, while maintaining relatively low computational complexity.
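As a flavour of that recipe (not the published SRQM), here is a one-level sketch: a Haar wavelet decomposition of the reference and adapted frames, absolute subband differences combined with hand-picked weights, and mean spatial pooling. The weights, wavelet choice, pooling, and stand-in frame size are all placeholder assumptions.

```python
import numpy as np
import pywt  # PyWavelets

def srqm_like_score(ref, adapted, weights=(0.6, 0.2, 0.2)):
    """Toy full-reference score: one-level Haar decomposition,
    weighted absolute subband differences, mean spatial pooling.
    Lower means closer to the reference."""
    _, (rh, rv, rd) = pywt.dwt2(ref.astype(np.float64), 'haar')
    _, (ah, av, ad) = pywt.dwt2(adapted.astype(np.float64), 'haar')
    score = 0.0
    for w, r, a in zip(weights, (rh, rv, rd), (ah, av, ad)):
        score += w * np.mean(np.abs(r - a))  # pooling: plain mean
    return score

# stand-in frames (kept small, instead of real 3840x2160 content)
ref = np.random.rand(270, 480)
adapted = ref + 0.01 * np.random.randn(270, 480)
print(srqm_like_score(ref, adapted))
```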
{"title":"SRQM: A Video Quality Metric for Spatial Resolution Adaptation","authors":"Alex Mackin, Mariana Afonso, Fan Zhang, D. Bull","doi":"10.1109/PCS.2018.8456246","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456246","url":null,"abstract":"This paper presents a full reference objective video quality metric (SRQM), which characterises the relationship between variations in spatial resolution and visual quality in the context of adaptive video formats. SRQM uses wavelet decomposition, subband combination with perceptually inspired weights, and spatial pooling, to estimate the relative quality between the frames of a high resolution reference video, and one that has been spatially adapted through a combination of down and upsampling. The uVI-SR video database is used to benchmark SRQM against five commonly-used quality metrics. The database contains 24 diverse video sequences that span a range of spatial resolutions up to UHD-I $(3840times 2160)$. An in- depth analysis demonstrates that SRQM is statistically superior to the other quality metrics for all tested adaptation filters, and all with relatively low computational complexity.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132766263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High Dynamic Range Image Compression Based on Visual Saliency
Pub Date: 2018-06-01 | DOI: 10.1017/ATSIP.2020.15
Shenda Li, Jin Wang, Qing Zhu
High dynamic range (HDR) images have a larger luminance range than conventional low dynamic range (LDR) images, which is more consistent with the human visual system (HVS). Recently, the JPEG committee released a new HDR image compression standard, JPEG XT, which decomposes the input HDR image into a base layer and an extension layer. However, this method does not make full use of the HVS, wasting bits on regions that are imperceptible to human eyes. In this paper, a visual-saliency-based HDR image compression scheme is proposed. The saliency map of the tone-mapped HDR image is first extracted and then used to guide extension-layer encoding, so the compression quality adapts to the saliency of each coding region. Extensive experimental results show that our method outperforms JPEG XT profiles A, B, and C while offering JPEG compatibility. Moreover, our method can provide progressive coding of the extension layer.
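A minimal sketch of the saliency-to-quality mapping idea: the actual scheme drives JPEG XT extension-layer encoding, whereas here a [0, 1] saliency map is simply turned into one quality factor per block, with the block size and quality range as assumptions.

```python
import numpy as np

def block_quality(saliency, block=16, q_min=30, q_max=90):
    """Turn a [0, 1] saliency map into one quality factor per
    block: salient blocks get more bits, flat blocks fewer."""
    h, w = saliency.shape
    qmap = np.empty((h // block, w // block), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            s = saliency[by * block:(by + 1) * block,
                         bx * block:(bx + 1) * block].mean()
            qmap[by, bx] = round(q_min + s * (q_max - q_min))
    return qmap

saliency = np.random.rand(64, 64)  # stand-in saliency map
print(block_quality(saliency))
```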
{"title":"High Dynamic Range Image Compression Based on Visual Saliency","authors":"Shenda Li, Jin Wang, Qing Zhu","doi":"10.1017/ATSIP.2020.15","DOIUrl":"https://doi.org/10.1017/ATSIP.2020.15","url":null,"abstract":"High dynamic range (HDR) image has larger luminance range than conventional low dynamic range (LDR) image, which is more consistent with human visual system (HVS). Recently, JPEG committee releases a new HDR image compression standard JPEG XT. It decomposes input HDR image into base layer and extension layer. However, this method doesn’t make full use of HVS, causing waste of bits on imperceptible regions to human eyes. In this paper, a visual saliency based HDR image compression scheme is proposed. The saliency map of tone mapped HDR image is first extracted, then is used to guide extension layer encoding. The compression quality is adaptive to the saliency of the coding region of the image. Extensive experimental results show that our method outperforms JPEG XT profile A, B, C, and offers the JPEG compatibility at the same time. Moreover, our method can provide progressive coding of extension layer.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123164943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning Based HEVC In-Loop Filtering for Decoder Quality Enhancement
Pub Date: 2018-06-01 | DOI: 10.1109/PCS.2018.8456278
Shiba Kuanar, C. Conly, K. Rao
High Efficiency Video Coding (HEVC), currently the latest video coding standard, achieves up to 50% bit-rate reduction compared to the previous H.264/AVC standard. In block-based video coding, these lossy compression techniques produce various artifacts such as blurring, distortion, ringing, and contouring effects in output frames, especially at low bit rates. To reduce these compression artifacts, HEVC adopted two in-loop filtering techniques on the decoder side: the de-blocking filter (DBF) and sample adaptive offset (SAO). While the DBF applies to samples located at block boundaries, the nonlinear SAO operation applies adaptively to samples satisfying gradient-based conditions through a lookup table. The SAO filter corrects quantization errors by sending edge offset values to the decoder; this consumes extra signaling bits and adds network overhead. In this paper, we propose a Convolutional Neural Network (CNN) based architecture for the SAO in-loop filtering operation that requires no changes to the encoding process. Our experimental results show that the proposed model outperforms previous state-of-the-art models in terms of BD-PSNR (0.408 dB) and BD-BR (3.44%), measured on widely available standard video sequences.
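A minimal PyTorch sketch of the idea: a small residual CNN that takes the de-blocked luma plane and predicts the SAO-style correction directly, so no offset values need to be signalled. The layer count, channel width, and luma-only input are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SAOReplacementCNN(nn.Module):
    """Small residual CNN standing in for SAO: it takes the
    de-blocked luma plane and predicts the correction directly,
    so no offset values need to be signalled."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # residual learning: output = input + correction

# Training would minimise e.g. MSE against the uncompressed frame.
model = SAOReplacementCNN()
decoded_luma = torch.rand(1, 1, 64, 64)  # stand-in de-blocked patch
print(model(decoded_luma).shape)  # torch.Size([1, 1, 64, 64])
```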
{"title":"Deep Learning Based HEVC In-Loop Filtering for Decoder Quality Enhancement","authors":"Shiba Kuanar, C. Conly, K. Rao","doi":"10.1109/PCS.2018.8456278","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456278","url":null,"abstract":"High Efficiency Video Coding (HEVC), which is the latest video coding standard currently, achieves up to 50% bit rate reduction compared to previous H.264/AVC standard. While performing the block based video coding, these lossy compression techniques produce various artifacts like blurring, distortion, ringing, and contouring effects on output frames, especially at low bit rates. To reduce those compression artifacts HEVC adopted two post processing filtering technique namely de-blocking filter (DBF) and sample adaptive offset (SAO) on the decoder side. While DBF applies to samples located at block boundaries, SAO nonlinear operation applies adaptively to samples satisfying the gradient based conditions through a lookup table. Again SAO filter corrects the quantization errors by sending edge offset values to decoders. This operation consumes extra signaling bit and becomes an overhead to network. In this paper, we proposed a Convolutional Neural Network (CNN) based architecture for SAO in-loop filtering operation without modifying anything on encoding process. Our experimental results show that our proposed model outperformed previous state-of-the-art models in terms of BD-PSNR (0.408 dB) and BD-BR (3.44%), measured on a widely available standard video sequences.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124867899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Overview of Core Coding Tools in the AV1 Video Codec
Pub Date: 2018-06-01 | DOI: 10.1109/PCS.2018.8456249
Yue Chen, D. Mukherjee, Jingning Han, Adrian Grange, Yaowu Xu, Zoe Liu, Sarah Parker, Cheng Chen, Hui Su, Urvang Joshi, Ching-Han Chiang, Yunqing Wang, Paul Wilkins, Jim Bankoski, Luc N. Trudeau, N. Egge, J. Valin, T. Davies, Steinar Midtskogen, A. Norkin, Peter De Rivaz
AV1 is an emerging open-source and royalty-free video compression format that was jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gains over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of the key coding techniques in AV1, along with a preliminary compression performance comparison against VP9 and HEVC.
{"title":"An Overview of Core Coding Tools in the AV1 Video Codec","authors":"Yue Chen, D. Mukherjee, Jingning Han, Adrian Grange, Yaowu Xu, Zoe Liu, Sarah Parker, Cheng Chen, Hui Su, Urvang Joshi, Ching-Han Chiang, Yunqing Wang, Paul Wilkins, Jim Bankoski, Luc N. Trudeau, N. Egge, J. Valin, T. Davies, Steinar Midtskogen, A. Norkin, Peter De Rivaz","doi":"10.1109/PCS.2018.8456249","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456249","url":null,"abstract":"AV1 is an emerging open-source and royalty-free video compression format, which is jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gain over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of key coding techniques in AV1 along with preliminary compression performance comparison against VP9 and HEVC.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125791572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}