Pub Date: 2023-12-19 DOI: 10.1109/TBC.2023.3333750
Motong Xu;Byeungwoo Jeon
Rate-distortion optimized quantization (RDOQ) in HEVC substantially improves coding efficiency over conventional uniform scalar quantization (SQ), but it is computationally complex. In this paper, we investigate a way of performing RDOQ more efficiently in HEVC. Based on our statistical observation that RDOQ leaves the SQ quantization results unchanged for a non-trivial percentage of transform blocks (TBs), we design a learning-based quantizer selection scheme that predicts in advance whether RDOQ is expected to modify the quantization levels calculated by SQ. Only those TBs likely to be changed by RDOQ undergo the actual RDOQ process. For the remaining TBs, we design an improved SQ that adapts the dead-zone interval size and rounding offset based on coefficient group and entropy coding features. The proposed improved SQ has much lower computational complexity than RDOQ while achieving better coding efficiency than conventional SQ. Experimental results show that our quantization scheme reduces encoding time by 9% and quantization time by 34% by selectively performing RDOQ for only 21% of TBs. The average BDBR values for the Y, Cb, and Cr channels are -0.03%, 0.48%, and 0.45%, respectively.
Title: Learning-Based Efficient Quantizer Selection for Fast HEVC Encoder. IEEE Transactions on Broadcasting, vol. 70, no. 1, pp. 161-173.
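The dead-zone and rounding-offset idea mentioned in the abstract above can be illustrated with a minimal sketch of a generic dead-zone scalar quantizer. This is not the authors' improved SQ: the abstract does not give the adaptation rule, so the function names (`deadzone_quantize`, `deadzone_threshold`) and the offset values used here are illustrative assumptions only.

```python
import math


def deadzone_quantize(coeff: float, step: float, offset: float) -> int:
    """Quantize one transform coefficient with a uniform step and a rounding offset.

    offset = 0.5 is plain rounding; a smaller offset widens the dead zone,
    so more small coefficients are mapped to level 0.
    """
    if step <= 0 or not 0.0 <= offset <= 0.5:
        raise ValueError("step must be positive and offset must lie in [0, 0.5]")
    level = math.floor(abs(coeff) / step + offset)
    return level if coeff >= 0 else -level


def deadzone_threshold(step: float, offset: float) -> float:
    """Smallest coefficient magnitude that still yields a nonzero level."""
    return (1.0 - offset) * step


# Hypothetical coefficients and step size, chosen only for illustration.
block = [7.0, -25.0, 8.0, 8.5]
plain_rounding = [deadzone_quantize(c, 10.0, 0.5) for c in block]
wider_deadzone = [deadzone_quantize(c, 10.0, 1.0 / 6.0) for c in block]
```

With the smaller offset, the coefficients 7.0 and 8.0 fall inside the dead zone and are zeroed, which costs no RDOQ search. An adaptive scheme like the one the abstract describes would choose the offset per coefficient group rather than fix it globally.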
Pub Date: 2023-12-15 DOI: 10.1109/TBC.2023.3334137
Yuhong Xie;Yuan Zhang;Tao Lin
Deep reinforcement learning (DRL) has demonstrated remarkable potential within the domain of video adaptive bitrate (ABR) optimization. However, training a well-performing DRL agent in the two-tier 360° video streaming system is non-trivial. The conventional DRL training approach fails to enable the model to start learning from simpler environments and then progressively explore more challenging ones, leading to suboptimal asymptotic performance and poor long-tail performance. In this paper, we propose a novel approach called DCRL360, which seamlessly integrates automatic curriculum learning (ACL) with DRL techniques to enable adaptive decision-making for 360° video bitrate selection and chunk scheduling. To tackle the training issue, we introduce a structured two-stage training framework. The first stage focuses on the selection of tasks conducive to learning, guided by a newly introduced training metric called Pscore, to enhance asymptotic performance. The newly introduced metric takes into consideration multiple facets, including performance improvement potential, the risk of being forgotten, and the uncertainty of a decision, to encourage the agent to train in rewarding environments. The second stage utilizes existing rule-based techniques to identify challenging tasks for fine-tuning the model, thereby alleviating the long-tail effect. Our experimental results demonstrate that DCRL360 outperforms state-of-the-art algorithms under various network conditions - including 5G/LTE/Broadband - with a remarkable improvement of 6.51-20.86% in quality of experience (QoE), as well as a reduction in bandwidth wastage by 10.60-31.50%.
Title: Deep Curriculum Reinforcement Learning for Adaptive 360° Video Streaming With Two-Stage Training. IEEE Transactions on Broadcasting, vol. 70, no. 2, pp. 441-452.
5G multicast/broadcast services can provide transformative new opportunities as mobile devices proliferate. However, realizing the full potential of these services requires real-time pedestrian localization. We propose a federated multitask learning (FML) approach on smartphones to enable pedestrian location-aware 5G multicast/broadcast services. Our lightweight FML architecture provides accurate real-time localization while preserving privacy. The pedestrian location data enables adaptive 5G network planning, contextual location-based services, quality of service improvements, and load balancing. Simulations demonstrate the effectiveness of our FML scheme for accurate pedestrian localization. They also highlight significant enhancements to 5G multicast/broadcast services enabled by real-time pedestrian positioning. In summary, our work facilitates enhanced 5G multicast/broadcast services through federated on-device learning for real-time pedestrian localization.
Title: Federated Multitask Learning for Pedestrian Location-Aware 5G Multicast/Broadcast Services. Authors: Zexuan Jing; Junsheng Mu; Jian Jin; Zhenzhen Jiao; Peng Yu. IEEE Transactions on Broadcasting, vol. 70, no. 1, pp. 66-77. Pub Date: 2023-12-15 DOI: 10.1109/TBC.2023.3332012
Pub Date: 2023-12-11 DOI: 10.1109/TBC.2023.3335815
Zhiqiang Li;Shuai Han;Liang Xiao;Mugen Peng
The integrated satellite-terrestrial network (ISTN) is gaining traction for providing seamless communication and various services, i.e., broadcast and unicast information services. However, supporting massive terminal access and diverse information services is challenging because of limited spectrum resources and complex multiple access interference in ISTN. Recently, rate-splitting multiple access (RSMA) has emerged as a promising solution offering non-orthogonal transmission and robust interference management. Inspired by this, we design a non-orthogonal broadcast and unicast (NOBU) transmission model that uses the common and private data streams of RSMA. Considering different levels of cooperation between the satellite and the base station (BS), we propose two cooperative NOBU transmission schemes. In the first, only broadcast messages are shared. In the second, the broadcast message and the sub-common message split by terminals are shared and jointly encoded into a super-common stream. Building upon this, we formulate joint max-min rate optimization problems subject to the broadcast information rate requirement in ISTN. To address these non-convex problems, we introduce an improved alternating optimization algorithm based on weighted minimum mean square error. Simulation results validate the significant gains of the cooperative NOBU schemes compared to various baseline schemes.
Title: Cooperative Non-Orthogonal Broadcast and Unicast Transmission for Integrated Satellite–Terrestrial Network. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 1052-1064.
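To make the roles of the common and private RSMA streams in the abstract above concrete, here is a small sketch of the textbook one-layer rate-splitting rate expressions. It assumes scalar channel power gains and fixed power allocations, which is far simpler than the cooperative satellite-BS schemes the paper proposes. The function name `rsma_rates` and all numeric values are hypothetical.

```python
import math


def rsma_rates(gains, p_common, p_private, noise=1.0):
    """Return (common_rate, private_rates) in bit/s/Hz for one-layer rate splitting.

    gains[k]     : scalar channel power gain of user k (simplifying assumption)
    p_common     : power of the common stream (carries the broadcast message here)
    p_private[k] : power of user k's private (unicast) stream
    """
    if len(gains) != len(p_private):
        raise ValueError("gains and p_private must have the same length")
    total_private = sum(p_private)
    # Each user decodes the common stream first, treating all private streams as noise.
    common_per_user = [
        math.log2(1.0 + g * p_common / (g * total_private + noise)) for g in gains
    ]
    # The common stream must be decodable by every user, so the weakest user sets its rate.
    common_rate = min(common_per_user)
    # After removing the common stream, each user decodes its own private stream,
    # treating the other users' private streams as noise.
    private_rates = [
        math.log2(1.0 + g * p / (g * (total_private - p) + noise))
        for g, p in zip(gains, p_private)
    ]
    return common_rate, private_rates


common, private = rsma_rates([1.0, 0.1], p_common=2.0, p_private=[1.0, 1.0])
worst_private = min(private)  # the quantity a max-min design would try to raise
```

A max-min design such as the one in the abstract would optimize the power split (and, in the multi-antenna case, the precoders) so that `worst_private` is as large as possible while `common` stays above the broadcast rate requirement.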
Pub Date: 2023-12-08 DOI: 10.1109/TBC.2023.3336210
The 2023 Scott Helt Memorial Award was awarded to Hequn Zhang, Yue Zhang, John Cosmas, Nawar Jawad, Wei Li, Robert Muller, and Tao Jiang for their paper, “mmWave Indoor Channel Measurement Campaign for 5G New Radio Indoor Broadcasting”. The paper appeared in the IEEE Transactions on Broadcasting, vol. 68, no. 2, pp. 331–344, June 2022. The purpose of the IEEE Scott Helt Memorial Award is to recognize exceptional publications in the field and to stimulate interest in and encourage contributions to the fields of interest of the Society.
Title: 2023 Scott Helt Memorial Award for the Best Paper Published in the IEEE Transactions on Broadcasting. IEEE Transactions on Broadcasting, vol. 69, no. 4, pp. 979-980. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10352330
Pub Date: 2023-12-08 DOI: 10.1109/TBC.2023.3336429
Title: IEEE Transactions on Broadcasting Information for Authors. IEEE Transactions on Broadcasting, vol. 69, no. 4, pp. C3-C4. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10352326
Pub Date: 2023-12-05 DOI: 10.1109/TBC.2023.3334141
Boyan Li;Xin Hu;Naixin Kan;Weidong Wang;Fadhel M. Ghannouchi
With the advent of fifth-generation (5G) New Radio (NR), the Non-Terrestrial Network (NTN) stands out as a solution for extending the coverage of broadcast satellites. NTN systems require higher data rates and bandwidth. Digital predistortion (DPD) is commonly adopted as an effective method to enhance the power efficiency of broadcast satellites’ NTN systems. As signal bandwidth continues to increase, the bandwidth of the feedback loop and the sampling rate of the analog-to-digital converters (ADCs) need to be reduced to lower the system cost. However, the computational complexity and overfitting of the existing band-limited DPD (BLDPD) method increase as the feedback bandwidth decreases. To address this issue, a deep neural network (DNN)-assisted band-limited polynomial digital predistortion (DNN-BLP DPD) method is proposed in this paper. This method reduces the computational complexity and the overfitting of the band-limited basis functions by grouping a small number of band-limited basis functions for online parameter identification while embedding the DNN in the parameter identification module. Experimental results show that, compared with conventional BLDPD, the proposed method achieves a low sampling rate and low computational complexity while maintaining modeling accuracy.
Title: Computationally Stable Low Sampling Rate Digital Predistortion for Non-Terrestrial Networks. IEEE Transactions on Broadcasting, vol. 70, no. 1, pp. 325-333.
360-degree video applications are attracting more attention in broadcasting systems. When bandwidth is limited, bit fluctuation affects the perceived quality of transmitted 360-degree videos. To further optimize the bit allocation and reduce fluctuation, this paper proposes a virtual-competitors-based Rate Control (RC) algorithm for 360-degree video coding. The concept of virtual competitors is first proposed to balance the volatility of the bits between adjacent GOPs. In addition, by introducing game theory, a virtual-competitors-based frame-level bit allocation model is constructed. Furthermore, we propose a GOP-level bit allocation scheme based on the average encoded bits of the historical GOPs, the remaining bits of the encoded video sequence, and the number of unencoded GOPs. Based on the designed frame-level and GOP-level bit allocation schemes, an overall bit allocation method is proposed to achieve lower GOP-level bit fluctuation. Experimental results indicate that the proposed method achieves better RC Error, Bjøntegaard Delta Peak-Signal-to-Noise-Ratio, Bjøntegaard Delta Bit Rate, and bit fluctuation than the benchmarks, which validates the efficiency of the proposed method.
Title: Virtual-Competitors-Based Rate Control for 360-Degree Video Coding. Authors: Jielian Lin; Hongbin Lin; Yiwen Xu; Yuanxun Kang; Tiesong Zhao. IEEE Transactions on Broadcasting, vol. 70, no. 1, pp. 357-365. Pub Date: 2023-12-04 DOI: 10.1109/TBC.2023.3332019
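The abstract above names three inputs to the GOP-level bit allocation: the average encoded bits of past GOPs, the remaining bit budget, and the number of unencoded GOPs. The abstract does not give the formula, so the following is only a sketch of one plausible way to combine them. The linear blend, the `blend` weight, and the function name `gop_target_bits` are assumptions, not the paper's model.

```python
def gop_target_bits(history, remaining_bits, unencoded_gops, blend=0.5):
    """Suggest a bit budget for the next GOP.

    history        : bits actually spent on each already-encoded GOP
    remaining_bits : bits left in the sequence-level budget
    unencoded_gops : number of GOPs still to encode (including the next one)
    blend          : hypothetical weight on the historical average (0..1)
    """
    if unencoded_gops <= 0:
        raise ValueError("unencoded_gops must be positive")
    if not 0.0 <= blend <= 1.0:
        raise ValueError("blend must lie in [0, 1]")
    # Even split of what is left over the GOPs that remain.
    even_share = remaining_bits / unencoded_gops
    if not history:
        return even_share
    # Pulling the target toward the historical average damps GOP-to-GOP swings.
    avg_history = sum(history) / len(history)
    target = blend * avg_history + (1.0 - blend) * even_share
    # Never promise more than the budget that is actually left.
    return min(target, remaining_bits)


first_gop = gop_target_bits([], remaining_bits=1000, unencoded_gops=4)
later_gop = gop_target_bits([100, 100], remaining_bits=900, unencoded_gops=3)
```

A larger `blend` keeps consecutive GOP budgets closer together at the cost of slower correction when the encoder has overspent or underspent. That is the trade-off between fluctuation and RC error that the abstract describes.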
Raw depth images generally contain a large number of erroneous pixels near object boundaries due to the limitations of depth sensors. This induces misalignment of object boundaries between RGB and depth pairs. Most existing methods do not explicitly study this RGB-Depth misalignment problem, so depth boundaries cannot be accurately recovered. In this paper, a simple yet effective model is developed based on the guided filter (GF) to identify misaligned object boundaries of a raw depth image. When GF is used to filter a raw depth image with the guidance of a reference RGB image, the structure of the RGB image is progressively transferred to the filtered depth images as the window size of GF increases. Therefore, misaligned object boundaries in the raw depth image can be identified from the residuals between the depth images filtered by large-size and small-size GFs. The model is embedded into a Markov random field to correct misaligned object boundaries. The correction is restricted to fixed-width regions around depth boundaries to avoid texture-copy artifacts. The optimization problem is solved efficiently in an iterative way. Quantitative and visual results on three RGB-Depth datasets verify that the proposed method achieves the best results compared with recent optimization-based or learning-based baselines. In addition, the proposed method is effectively applied in no-reference depth quality assessment, depth super-resolution, and depth estimation enhancement.
Title: Misaligned RGB-Depth Boundary Identification and Correction for Depth Image Recovery. Authors: Meng Yang; Lulu Zhang; Delong Suzhang; Ce Zhu; Nanning Zheng. IEEE Transactions on Broadcasting, vol. 70, no. 1, pp. 183-196. Pub Date: 2023-11-29 DOI: 10.1109/TBC.2023.3332014
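The abstract above describes comparing guided-filter outputs at a small and a large window size to find misaligned boundaries. Here is a one-dimensional sketch using the standard guided filter equations. It covers only the identification step, not the Markov random field correction, and the 1D signals, window radii, `eps` value, and function names are illustrative assumptions rather than the paper's settings.

```python
def box_mean(x, r):
    """Mean over a window of radius r, clamped at the signal borders."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def guided_filter_1d(guide, src, r, eps=1e-4):
    """Standard guided filter: local linear model q = a * guide + b per window."""
    mean_i = box_mean(guide, r)
    mean_p = box_mean(src, r)
    corr_ip = box_mean([g * s for g, s in zip(guide, src)], r)
    corr_ii = box_mean([g * g for g in guide], r)
    # a = cov(guide, src) / (var(guide) + eps), b = mean(src) - a * mean(guide)
    a = [(cip - mi * mp) / (cii - mi * mi + eps)
         for cip, cii, mi, mp in zip(corr_ip, corr_ii, mean_i, mean_p)]
    b = [mp - ak * mi for mp, ak, mi in zip(mean_p, a, mean_i)]
    mean_a = box_mean(a, r)
    mean_b = box_mean(b, r)
    return [ma * g + mb for ma, mb, g in zip(mean_a, mean_b, guide)]


def gf_residual(guide, depth, r_small, r_large, eps=1e-4):
    """Absolute difference between large-window and small-window filtered depth."""
    small = guided_filter_1d(guide, depth, r_small, eps)
    large = guided_filter_1d(guide, depth, r_large, eps)
    return [abs(l - s) for l, s in zip(large, small)]


# Toy scanline: the guide (RGB intensity) edge is at index 6, the depth edge at index 8.
guide = [0.0] * 6 + [1.0] * 6
depth = [0.0] * 8 + [1.0] * 4
residual = gf_residual(guide, depth, r_small=1, r_large=3)
```

In this toy case the residual is zero far from both edges and becomes clearly nonzero around the two-sample gap between them. Thresholding such a residual is one simple way to flag candidate misaligned pixels.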