Low-Rate LDPC Code Design for DTMB-A
Pub Date: 2024-01-22 | DOI: 10.1109/TBC.2024.3349790
Zhitong He;Kewu Peng;Chao Zhang;Jian Song
Digital terrestrial television multimedia broadcasting-advanced (DTMB-A), proposed by China, serves as a second-generation digital terrestrial television broadcasting (DTTB) standard with advanced forward error correction coding schemes. However, to adapt to low signal-to-noise ratio (SNR) scenarios such as cloud transmission systems, DTMB-A requires LDPC codes with low rates. In this paper, a new design of low-rate DTMB-A LDPC codes is presented systematically. Specifically, a rate-compatible Raptor-like structure for low-rate DTMB-A LDPC codes is presented, which supports multiple low code rates with a constant code length. A new construction method is then proposed for low-rate DTMB-A LDPC codes, in which progressive block extension is employed and the minimum distance is the primary optimization target, so that the minimum distance increases after each block extension. Finally, the performance of the constructed DTMB-A LDPC codes at two low code rates, 1/3 and 1/4, is simulated and compared with that of ATSC 3.0 LDPC codes, demonstrating the effectiveness of our design.
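To make the rate-compatible, Raptor-like structure concrete, the following is a minimal sketch (toy dimensions, not the actual DTMB-A base matrix) of progressive block extension: each step appends new check rows plus an identity block for the new parity bits, lowering the code rate. The paper's code family keeps the code length constant across rates, which this toy, block-length-growing sketch does not capture; it only illustrates the extension structure.

```python
# Minimal sketch of raptor-like progressive block extension (toy values only).
import numpy as np

def extend_raptor_like(H, new_rows):
    """Append `new_rows` extension checks and the matching identity block.

    H        : (m, n) binary parity-check matrix of the current code
    new_rows : (r, n) binary block connecting the new checks to existing bits
    returns  : (m + r, n + r) matrix of the extended, lower-rate code
    """
    m, n = H.shape
    r = new_rows.shape[0]
    top = np.hstack([H, np.zeros((m, r), dtype=np.uint8)])     # old checks, untouched
    bottom = np.hstack([new_rows, np.eye(r, dtype=np.uint8)])  # new checks + new parity bits
    return np.vstack([top, bottom])

rng = np.random.default_rng(0)
H0 = (rng.random((4, 12)) < 0.3).astype(np.uint8)              # toy base code, rate 8/12
H1 = extend_raptor_like(H0, (rng.random((4, 12)) < 0.2).astype(np.uint8))
print(H0.shape, "->", H1.shape)                                # (4, 12) -> (8, 16), rate drops to 1/2
```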
{"title":"Low-Rate LDPC Code Design for DTMB-A","authors":"Zhitong He;Kewu Peng;Chao Zhang;Jian Song","doi":"10.1109/TBC.2024.3349790","DOIUrl":"10.1109/TBC.2024.3349790","url":null,"abstract":"Digital terrestrial television multimedia broadcasting-advanced (DTMB-A) proposed by China is served as a 2nd generation digital terrestrial television broadcasting (DTTB) standard with advanced forward error correction coding schemes. Nevertheless, to adapt low signal-to-noise ratio (SNR) scenarios such as in cloud transmission systems, LDPC codes with low rates are required for DTMB-A. In this paper, the new design of low-rate DTMB-A LDPC codes is presented systematically. Specifically, a rate-compatible Raptor-Like structure of low-rate DTMB-A LDPC codes is presented, which supports multiple low code rates with constant code length. Then a new construction method is proposed for low-rate DTMB-A LDPC codes, where progressive block extension is employed and the minimum distance is majorly optimized such that the minimum distance increases after each block extension. Finally, the performance of the constructed DTMB-A LDPC codes with two low code rates of 1/3 and 1/4 are simulated and compared with ATSC 3.0 LDPC codes, which demonstrates the effectiveness of our design.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"739-746"},"PeriodicalIF":4.5,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139947616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent advancements in SDRTV-to-HDRTV conversion have yielded impressive results in reconstructing high dynamic range television (HDRTV) videos from standard dynamic range television (SDRTV) videos. However, the practical application of these techniques in ultra-high-definition (UHD) video systems is limited by their high computational and memory costs. In this paper, we propose EffiHDR, an efficient framework that operates primarily in the downsampled space, effectively reducing computational and memory demands. Our framework comprises a real-time SDRTV-to-HDRTV Reconstruction model and a plug-and-play HDRTV Enhancement model. The SDRTV-to-HDRTV Reconstruction model learns affine transformation coefficients instead of directly predicting output pixels, preserving high-frequency information and mitigating the information loss caused by downsampling. It decomposes the SDRTV-to-HDRTV mapping into pixel-intensity-dependent and local-dependent affine transformations. The pixel-intensity-dependent transformation leverages global contexts and pixel intensity conditions to transform SDRTV pixels to the HDRTV domain. The local-dependent transformation predicts affine coefficients based on local contexts, further enhancing dynamic range, local contrast, and color tone. Additionally, we introduce a plug-and-play HDRTV Enhancement model based on an efficient Transformer-based U-Net, which enhances luminance and color details in challenging recovery scenarios. Experimental results demonstrate that our SDRTV-to-HDRTV Reconstruction model achieves real-time 4K conversion with impressive performance. When combined with the HDRTV Enhancement model, our approach outperforms state-of-the-art methods in both performance and efficiency.
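As an illustration of the core idea of predicting affine coefficients in the downsampled space and applying them at full resolution, here is a minimal sketch; `predict_coeffs` is a hypothetical stand-in for the learned model, and the downsampling factor and single-channel input are toy assumptions, not details from the paper.

```python
# Minimal sketch: predict per-pixel affine coefficients at low resolution,
# then apply them to the full-resolution SDR frame (toy, single channel).
import numpy as np

def predict_coeffs(sdr_small):
    """Hypothetical stand-in for the learned coefficient predictor.
    Returns per-pixel gain `a` and offset `b` at the downsampled resolution."""
    a = 1.0 + 0.5 * sdr_small              # toy rule: brighter pixels get more expansion
    b = 0.02 * np.ones_like(sdr_small)
    return a, b

def sdr_to_hdr(sdr, scale=4):
    """Apply the affine mapping predicted at 1/scale resolution to the full-res frame."""
    small = sdr[::scale, ::scale]                              # cheap downsample
    a, b = predict_coeffs(small)
    a_full = a.repeat(scale, axis=0).repeat(scale, axis=1)     # nearest-neighbor upsample
    b_full = b.repeat(scale, axis=0).repeat(scale, axis=1)
    return np.clip(a_full * sdr + b_full, 0.0, 1.0)            # affine transform at full resolution

sdr = np.random.rand(2160, 3840)           # toy single-channel 4K frame in [0, 1]
print(sdr_to_hdr(sdr).shape)
```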
{"title":"EffiHDR: An Efficient Framework for HDRTV Reconstruction and Enhancement in UHD Systems","authors":"Hengsheng Zhang;Xueyi Zou;Guo Lu;Li Chen;Li Song;Wenjun Zhang","doi":"10.1109/TBC.2023.3345657","DOIUrl":"10.1109/TBC.2023.3345657","url":null,"abstract":"Recent advancements in SDRTV-to-HDRTV conversion have yielded impressive results in reconstructing high dynamic range television (HDRTV) videos from standard dynamic range television (SDRTV) videos. However, the practical applications of these techniques are limited for ultra-high definition (UHD) video systems due to their high computational and memory costs. In this paper, we propose EffiHDR, an efficient framework primarily operating in the downsampled space, effectively reducing the computational and memory demands. Our framework comprises a real-time SDRTV-to-HDRTV Reconstruction model and a plug-and-play HDRTV Enhancement model. The SDRTV-to-HDRTV Reconstruction model learns affine transformation coefficients instead of directly predicting output pixels to preserve high-frequency information and mitigate information loss caused by downsampling. It decomposes SDRTV-to-HDR mapping into pixel intensity-dependent and local-dependent affine transformations. The pixel intensity-dependent transformation leverages global contexts and pixel intensity conditions to transform SDRTV pixels to the HDRTV domain. The local-dependent transformation predicts affine coefficients based on local contexts, further enhancing dynamic range, local contrast, and color tone. Additionally, we introduce a plug-and-play HDRTV Enhancement model based on an efficient Transformer-based U-net, which enhances luminance and color details in challenging recovery scenarios. Experimental results demonstrate that our SDRTV-to-HDRTV Reconstruction model achieves real-time 4K conversion with impressive performance. When combined with the HDRTV Enhancement model, our approach outperforms state-of-the-art methods in performance and efficiency.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"620-636"},"PeriodicalIF":4.5,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139947566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Retina-U: A Two-Level Real-Time Analytics Framework for UHD Live Video Streaming
Pub Date: 2024-01-10 | DOI: 10.1109/TBC.2023.3345646
Wei Zhang;Yunpeng Jing;Yuan Zhang;Tao Lin;Jinyao Yan
UHD live video streaming, with its high video resolution, offers a wealth of fine-grained scene details, presenting opportunities for intricate video analytics. However, current real-time video streaming analytics solutions are inadequate for analyzing these detailed features, often leading to low accuracy on small objects with fine details. Furthermore, due to the high bitrate and precision of UHD streaming, existing real-time inference frameworks typically suffer from a low analyzed frame rate caused by the significant computational cost involved. To meet the accuracy requirement and improve the analyzed frame rate, we introduce Retina-U, a real-time analytics framework for UHD video streaming. Specifically, at the DNN-model level, we first present SECT, a real-time inference model that enhances inference accuracy in dynamic UHD streams containing an abundance of small objects. SECT uses a slicing-based enhanced inference (SEI) method and Cascade Sparse Queries (CSQ)-based fine-tuning to improve accuracy, and leverages a lightweight tracker to achieve a high analyzed frame rate. At the system level, to further improve inference accuracy and bolster the analyzed frame rate, we propose a deep reinforcement learning-based resource management algorithm for real-time joint network adaptation, resource allocation, and server selection. By simultaneously considering network and computational resources, we can maximize the overall analytic performance in a dynamic and complex environment. Experimental results demonstrate the effectiveness of Retina-U, showing accuracy improvements of up to 38.01% and inference speed acceleration of up to 24.33%.
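A minimal sketch of slicing-based inference on a UHD frame is shown below; the `detector` callable, tile size, and overlap are hypothetical placeholders, and the actual SECT method (including CSQ-based fine-tuning, cross-tile merging, and the tracker) is not reproduced here.

```python
# Minimal sketch: run a detector on overlapping tiles of a UHD frame and map
# the boxes back to frame coordinates (toy values, hypothetical detector).
import numpy as np

def sliced_inference(frame, detector, tile=1280, overlap=128):
    """Run `detector(tile_image) -> [(x1, y1, x2, y2, score), ...]` on overlapping tiles."""
    h, w = frame.shape[:2]
    step = tile - overlap
    detections = []
    for y0 in range(0, h, step):
        for x0 in range(0, w, step):
            patch = frame[y0:y0 + tile, x0:x0 + tile]
            for (x1, y1, x2, y2, score) in detector(patch):
                detections.append((x1 + x0, y1 + y0, x2 + x0, y2 + y0, score))
    return detections  # a real system would apply NMS across tiles here

# toy usage with a dummy detector that "finds" one box per tile
frame = np.zeros((2160, 3840, 3), dtype=np.uint8)
dummy = lambda patch: [(10, 10, 50, 50, 0.9)]
print(len(sliced_inference(frame, dummy)))
```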
{"title":"Retina-U: A Two-Level Real-Time Analytics Framework for UHD Live Video Streaming","authors":"Wei Zhang;Yunpeng Jing;Yuan Zhang;Tao Lin;Jinyao Yan","doi":"10.1109/TBC.2023.3345646","DOIUrl":"10.1109/TBC.2023.3345646","url":null,"abstract":"UHD live video streaming, with its high video resolution, offers a wealth of fine-grained scene details, presenting opportunities for intricate video analytics. However, current real-time video streaming analytics solutions are inadequate in analyzing these detailed features, often leading to low accuracy in the analysis of small objects with fine details. Furthermore, due to the high bitrate and precision of UHD streaming, existing real-time inference frameworks typically suffer from low analyzed frame rate caused by the significant computational cost involved. To meet the accuracy requirement and improve the analyzed frame rate, we introduce Retina-U, a real-time analytics framework for UHD video streaming. Specifically, we first present SECT, a real-time DNN model level inference model to enhance inference accuracy in dynamic UHD streaming with an abundance of small objects. SECT uses a slicing-based enhanced inference (SEI) method and Cascade Sparse Queries (CSQ) based-fine tuning to improve the accuracy, and leverages a lightweight tracker to achieve high analyzed frame rate. At the system level, to further improve the inference accuracy and bolster the analyzed frame rate, we propose a deep reinforcement learning-based resource management algorithm for real-time joint network adaptation, resource allocation, and server selection. By simultaneously considering the network and computational resources, we can maximize the comprehensive analytic performance in a dynamic and complex environment. Experimental results demonstrate the effectiveness of Retina-U, showcasing improvements in accuracy of up to 38.01% and inference speed acceleration of up to 24.33%.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"429-440"},"PeriodicalIF":4.5,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139947554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

GCOTSC: Green Coding Techniques for Online Teaching Screen Content Implemented in AVS3
Pub Date: 2024-01-10 | DOI: 10.1109/TBC.2023.3340042
Liping Zhao;Zhuge Yan;Zehao Wang;Xu Wang;Keli Hu;Huawen Liu;Tao Lin
During and following the global COVID-19 pandemic, the use of screen content coding applications such as large-scale cloud office, online teaching, and teleconferencing has surged. The vast amount of online data generated by these applications, especially online teaching, has become a major source of Internet video traffic. Consequently, there is an urgent need for low-complexity online teaching screen content (OTSC) coding techniques. Energy-efficient, low-complexity green coding techniques for OTSC, named GCOTSC, are proposed based on the unique characteristics of OTSC. In the inter-frame prediction mode, the input frames are first divided into visually constant frames (VCFs) and non-VCFs using a VCF identifier, and a new VCF mode is proposed to code VCFs efficiently. In the intra-frame prediction mode, a heuristic multi-type least-probable-option skip mode based on static and dynamic historical information is proposed. Compared with the AVS3 screen content coding algorithm on typical online teaching screen content under the AVS3 SCC common test conditions, the experimental results show that GCOTSC achieves an average 59.06% reduction in encoding complexity in the low-delay configuration, with almost no impact on coding efficiency.
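The VCF identification step can be pictured with a minimal sketch such as the one below, assuming a simple mean-absolute-difference test against a hypothetical threshold; the actual GCOTSC identifier and the signaling of the VCF mode in AVS3 are not specified in the abstract.

```python
# Minimal sketch of a visually-constant-frame (VCF) identifier and routing
# (toy threshold; not the actual GCOTSC decision rule).
import numpy as np

def is_vcf(curr, prev, threshold=0.5):
    """Flag a frame as visually constant if it barely differs from the previous frame."""
    mad = np.mean(np.abs(curr.astype(np.int16) - prev.astype(np.int16)))
    return mad < threshold

def encode_frame(curr, prev):
    """Route the frame: VCFs take a cheap skip/copy path, others take full inter coding."""
    if prev is not None and is_vcf(curr, prev):
        return "vcf_skip_mode"        # e.g., signal 'repeat previous frame'
    return "full_inter_coding"        # regular prediction + transform + entropy coding

prev = np.zeros((1080, 1920), dtype=np.uint8)
curr = prev.copy()
print(encode_frame(curr, prev))       # -> vcf_skip_mode
```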
{"title":"GCOTSC: Green Coding Techniques for Online Teaching Screen Content Implemented in AVS3","authors":"Liping Zhao;Zhuge Yan;Zehao Wang;Xu Wang;Keli Hu;Huawen Liu;Tao Lin","doi":"10.1109/TBC.2023.3340042","DOIUrl":"10.1109/TBC.2023.3340042","url":null,"abstract":"During and following the global COVID-19 pandemic, the use of screen content coding applications such as large-scale cloud office, online teaching, and teleconferencing has surged. The vast amount of online data generated by these applications, especially online teaching, has become a vital source of Internet video traffic. Consequently, there is an urgent need for low-complexity online teaching screen content (OTSC) coding techniques. Energy-efficient low-complexity green coding techniques for OTSC, named GCOTSC, are proposed based on the unique characteristics of OTSC. In the inter-frame prediction mode, the input frames are first divided into visually constant frames (VCFs) and non-VCFs using a VCF identifier. A new VCF mode has been proposed to code VCFs efficiently. In the intra-frame prediction mode, a heuristic multi-type least probable option skip mode based on static and dynamic historical information is proposed. Compared with the AVS3 screen content coding algorithm, using the typical online teaching screen content and AVS3 SCC common test condition, the experimental results show that the GOTSC achieves an average 59.06% reduction of encoding complexity in low delay configuration, with almost no impact on coding efficiency.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 1","pages":"174-182"},"PeriodicalIF":4.5,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139947761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Fast Decoding of Polar Codes for Digital Broadcasting Services in 5G
Pub Date: 2024-01-05 | DOI: 10.1109/TBC.2023.3345642
He Sun;Emanuele Viterbo;Bin Dai;Rongke Liu
The rapid evolution of mobile communication technology provides a great avenue for efficient information transmission to facilitate digital multimedia services. In current 5G systems, broadcasting technology is used to improve the efficiency of information transmission, and polar codes are adopted to improve data transmission reliability. Reducing the decoding latency of polar codes is of great importance for ultra-low-latency, reliable data transmission in 5G broadcasting, and it remains a challenge in digital broadcasting services. In this paper, we propose an aggregation method that constructs constituent codes to reduce the decoding latency of polar codes. The aggregation method jointly exploits the structure and reliability of constituent codes to increase the lengths of the constituent codes that can be decoded in parallel, thus significantly reducing the decoding latency. Furthermore, an efficient parallel decoding algorithm is integrated with the proposed aggregation method to decode the reliable constituent codes efficiently without sacrificing error-correction performance. Simulation results show that the proposed method significantly reduces the decoding latency compared to existing state-of-the-art schemes.
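For intuition, the sketch below labels constituent codes from the frozen-bit pattern, in the spirit of fast simplified successive-cancellation decoding; the block length and frozen pattern are toy values, and the paper's aggregation method additionally exploits bit-channel reliabilities, which this sketch does not model.

```python
# Minimal sketch: spot constituent codes that can be decoded in parallel
# from the frozen-bit pattern of a polar code (toy pattern).

def constituent_types(frozen_mask, block_len=8):
    """Label each length-`block_len` constituent block of a polar code.

    frozen_mask : list of bools, True where the bit position is frozen
    returns     : list of 'rate-0' / 'rate-1' / 'other' labels; rate-0 and
                  rate-1 blocks can be decoded in one parallel step.
    """
    labels = []
    for i in range(0, len(frozen_mask), block_len):
        block = frozen_mask[i:i + block_len]
        if all(block):
            labels.append("rate-0")        # all frozen: output known a priori
        elif not any(block):
            labels.append("rate-1")        # all information: hard decisions suffice
        else:
            labels.append("other")         # fall back to bit-by-bit SC decoding
    return labels

# toy example: a length-32 code with a hypothetical frozen pattern
mask = [True] * 8 + [True] * 4 + [False] * 4 + [False] * 16
print(constituent_types(mask))             # ['rate-0', 'other', 'rate-1', 'rate-1']
```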
{"title":"Fast Decoding of Polar Codes for Digital Broadcasting Services in 5G","authors":"He Sun;Emanuele Viterbo;Bin Dai;Rongke Liu","doi":"10.1109/TBC.2023.3345642","DOIUrl":"10.1109/TBC.2023.3345642","url":null,"abstract":"The rapid revolution of mobile communication technology provides a great avenue for efficient information transmission to facilitate digital multimedia services. In current 5G systems, broadcasting technology is used to improve the efficiency of information transmission, and polar codes are adopted to improve data transmission reliability. Reducing the decoding latency of polar codes is of great importance for ultra-low-latency and reliable data transmission for 5G broadcasting, which still remains a challenge in digital broadcasting services. In this paper, we propose an aggregation method to construct constituent codes for reducing the decoding latency of polar codes. The aggregation method jointly exploits the structure and reliability of constituent codes to increase the lengths of constituent codes that can be decoded in parallel, thus significantly reducing the decoding latency. Furthermore, an efficient parallel decoding algorithm is integrated with the proposed aggregation method to efficiently decode the reliable constituent codes without sacrificing error-correction performance. Simulation results show that the proposed method significantly reduces the decoding latency as compared to the existing state-of-the-art schemes.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"731-738"},"PeriodicalIF":4.5,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139947430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As wireless technology continues its rapid evolution, sixth-generation (6G) networks can offer exceptionally high data transmission rates as well as low latency, which promises to meet the demanding requirements of digital twins (DTs). In this setting, quality-of-experience (QoE), which refers to users' overall satisfaction with and perception of the provided DT service in 6G networks, is essential for optimizing the service and improving the user experience. Despite progress in developing theories and systems for digital twin transmission over 6G networks, the assessment of QoE for users lags behind. To address this gap, our paper introduces the first QoE evaluation database for human digital twins (HDTs) in 6G network environments, aiming to systematically analyze and quantify the related quality factors. We utilize a mmWave network model for channel capacity simulation and employ high-quality digital humans as source models, which are further animated, encoded, and distorted for final QoE evaluation. Subjective quality ratings are collected in a well-controlled subjective experiment for the 400 generated HDT sequences. Additionally, we propose a novel QoE evaluation metric that considers both quality-of-service (QoS) and content-quality features. Experimental results indicate that our model outperforms existing state-of-the-art QoE evaluation models and other competitive quality assessment models, making a significant contribution to the domain of 6G network applications for HDTs.
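As a rough illustration of a QoE metric that fuses quality-of-service and content-quality features, here is a minimal sketch; the features, normalization constants, and weights are hypothetical, whereas the proposed metric learns its fusion from the subjective ratings.

```python
# Minimal sketch of a toy QoE score fusing QoS features with a content-quality
# feature (hypothetical weights and normalization, not the proposed metric).

def qoe_score(throughput_mbps, latency_ms, stall_ratio, content_quality):
    """Toy linear fusion: reward throughput and content quality, penalize latency/stalls.

    content_quality : perceptual quality of the rendered digital twin in [0, 1]
    returns         : a score clipped to [0, 1]
    """
    qos = (
        0.4 * min(throughput_mbps / 1000.0, 1.0)   # normalize against a 1 Gbps target
        - 0.2 * min(latency_ms / 100.0, 1.0)
        - 0.2 * min(stall_ratio, 1.0)
    )
    return max(0.0, min(1.0, 0.6 * content_quality + qos))

print(qoe_score(throughput_mbps=800, latency_ms=15, stall_ratio=0.0, content_quality=0.85))
```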
{"title":"Quality-of-Experience Evaluation for Digital Twins in 6G Network Environments","authors":"Zicheng Zhang;Yingjie Zhou;Long Teng;Wei Sun;Chunyi Li;Xiongkuo Min;Xiao-Ping Zhang;Guangtao Zhai","doi":"10.1109/TBC.2023.3345656","DOIUrl":"10.1109/TBC.2023.3345656","url":null,"abstract":"As wireless technology continues its rapid evolution, the sixth-generation (6G) networks are capable of offering exceptionally high data transmission rates as well as low latency, which is promisingly able to meet the high-demand needs for digital twins (DTs). Quality-of-experience (QoE) in this situation, which refers to the users’ overall satisfaction and perception of the provided DT service in 6G networks, is significant to optimize the service and help improve the users’ experience. Despite progress in developing theories and systems for digital twin transmission under 6G networks, the assessment of QoE for users falls behind. To address this gap, our paper introduces the first QoE evaluation database for human digital twins (HDTs) in 6G network environments, aiming to systematically analyze and quantify the related quality factors. We utilize a mmWave network model for channel capacity simulation and employ high-quality digital humans as source models, which are further animated, encoded, and distorted for final QoE evaluation. Subjective quality ratings are collected from a well-controlled subjective experiment for the 400 generated HDT sequences. Additionally, we propose a novel QoE evaluation metric that considers both quality-of-service (QoS) and content-quality features. Experimental results indicate that our model outperforms existing state-of-the-art QoE evaluation models and other competitive quality assessment models, thus making significant contributions to the domain of 6G network applications for HDTs.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 3","pages":"995-1007"},"PeriodicalIF":3.2,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139947547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Pub Date: 2024-01-03 | DOI: 10.1109/TBC.2023.3342707
Zongyao Hu;Lixiong Liu;Qingbing Sang
The spherical signals of omnidirectional videos need to be projected onto a 2D plane for transmission or storage. This projection produces geometrical deformation that affects the feature representations of Convolutional Neural Networks (CNNs) used to perceive omnidirectional videos. Existing omnidirectional video quality assessment (OVQA) methods leverage viewport images or spherical CNNs to circumvent the geometrical deformation. However, viewport-based methods neglect the interaction between viewport images, while there are insufficient pre-training samples for using a spherical CNN as an efficient backbone in an OVQA model. In this paper, we alleviate the influence of geometrical deformation from a causal perspective. A structural causal model is adopted to analyze the implicit reason why geometrical deformation disturbs the quality representation, and we find that the latitude factor confounds the feature representation and the distorted contents. Based on this evidence, we propose a Causal Intervention-based Quality prediction Network (CIQNet) to alleviate the causal effect of the confounder. The resulting framework first segments the video content into sub-areas and trains feature encoders to obtain latitude-invariant representations, removing the relationship between latitude and feature representation. The features of each sub-area are then aggregated with estimated weights in a backdoor adjustment module to remove the relationship between latitude and video content. Finally, the temporal dependencies of the aggregated features are modeled to perform quality prediction. We evaluate the performance of CIQNet on three publicly available OVQA databases. The experimental results show that CIQNet achieves competitive performance against state-of-the-art methods. The source code of CIQNet is available at: https://github.com/Aca4peop/CIQNet
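The backdoor-adjustment aggregation can be sketched as a weighted sum of per-latitude-band features, as below; the cos-latitude prior is a hypothetical stand-in for the weights that CIQNet estimates, and the band layout and feature dimension are toy values.

```python
# Minimal sketch: aggregate per-latitude-band quality features with weights over
# the confounder (latitude), instead of letting high-latitude deformation dominate.
import numpy as np

def backdoor_aggregate(band_features, band_centers_deg, band_weights=None):
    """Weighted aggregation of per-band features.

    band_features    : (B, D) array, one feature vector per latitude band
    band_centers_deg : (B,) latitude centers of the bands, in degrees
    band_weights     : optional (B,) weights; defaults to a spherical-area prior
    """
    if band_weights is None:
        band_weights = np.cos(np.deg2rad(band_centers_deg))   # area shrinks toward the poles
    band_weights = band_weights / band_weights.sum()
    return band_weights @ band_features                        # (D,) aggregated representation

feats = np.random.rand(6, 128)                    # toy features from 6 latitude bands
centers = np.array([-75, -45, -15, 15, 45, 75])   # band centers in degrees
print(backdoor_aggregate(feats, centers).shape)   # (128,)
```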