Pub Date : 2025-07-03 DOI: 10.1109/TBC.2025.3579225
Shiyu Feng;Yun Zhang;Linwei Zhu;Sam Kwong
A light field (LF) image is an emerging form of 4D light-ray data that can realistically present the spatial and angular information of a 3D scene. However, the large data volume of LF images is the most challenging issue in real-time processing, transmission, and storage. In this paper, we propose an end-to-end deep LF Image Compression method Using Disentangled Representation and Asymmetrical Strip Convolution (LFIC-DRASC) to improve coding efficiency. First, we formulate the LF image compression problem as learning a disentangled LF representation network and an image encoding-decoding network. Second, we propose two novel feature extractors that leverage the structural prior of LF data by integrating features across different dimensions, together with a disentangled LF representation network that enhances LF feature disentangling and decoupling. Third, we propose LFIC-DRASC for LF image compression, in which two Asymmetrical Strip Convolution (ASC) operators, one horizontal and one vertical, capture long-range correlation in the LF feature space. These two ASC operators can be combined with square convolution to further decouple LF features, which enhances the model’s ability to represent intricate spatial relationships. Experimental results demonstrate that the proposed LFIC-DRASC achieves an average bit rate reduction of 20.5% compared with state-of-the-art methods. Source code and pre-trained models of LFIC-DRASC are available at https://github.com/SYSU-Video/LFIC-DRASC.
Title: LFIC-DRASC: Deep Light Field Image Compression Using Disentangled Representation and Asymmetrical Strip Convolution. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 889-902.
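The horizontal and vertical strip operators mentioned in the LFIC-DRASC abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that): the helper names `conv2d_valid` and `strip_kernels`, the averaging weights, and the 5 x 5 toy input are all assumptions made for illustration. It only shows that a 1 x k kernel aggregates along a row and a k x 1 kernel aggregates along a column.

```python
def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with 'valid' padding on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def strip_kernels(k):
    """Return a horizontal (1 x k) and a vertical (k x 1) averaging kernel.

    Hypothetical fixed weights; a learned operator would train them.
    """
    horizontal = [[1.0 / k] * k]
    vertical = [[1.0 / k] for _ in range(k)]
    return horizontal, vertical


# Toy 5 x 5 feature map whose entry at (r, c) is r * 5 + c.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
h_kernel, v_kernel = strip_kernels(5)

h_out = conv2d_valid(image, h_kernel)  # 5 rows x 1 column: one value per row
v_out = conv2d_valid(image, v_kernel)  # 1 row x 5 columns: one value per column
```

A pair of strips of length k spans the same extent as a k x k square kernel along each axis while using 2k weights instead of k squared, which is one common motivation for strip-shaped kernels.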
Pub Date : 2025-06-27 DOI: 10.1109/TBC.2025.3579251
Xuehan Wang;Jintao Wang;Jinhong Yuan;Jian Song
Rate splitting multiple access (RSMA) has been regarded as one of the most promising technologies for next-generation broadcasting and mobile communication systems. Many prior designs for RSMA systems focused on capacity optimization from an information-theoretic perspective, while reliability in realistic deployment has received less attention. To this end, this paper designs the linear precoder of downlink multiple-input single-output (MISO)-RSMA systems by minimizing the mean square error (MSE) associated with the worst user equipment (UE). The optimization problem is first formulated by investigating the MSE for each UE from a signal-processing perspective, where zero-forcing (ZF) precoding is utilized for the private messages. The minimum MSE (MMSE) precoding is then obtained by applying semi-definite relaxation (SDR) to the common precoding vector and deriving the closed-form optimal power allocation coefficient between private and common messages. A heuristic closed-form solution is then developed to reduce the complexity caused by semi-definite programming (SDP). Simulation results demonstrate that the proposed schemes are more reliable than space-division multiple access (SDMA) and conventional RSMA approaches based on max-min fairness (MMF) rate optimization, even when the near-optimal weighted MMSE algorithm is deployed.
Title: MMSE Precoding for Reliability Enhancement in Downlink MISO-RSMA Systems. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 732-740.
Pub Date : 2025-06-24 DOI: 10.1109/TBC.2025.3575339
Ao-Xiang Zhang;Yuan-Gen Wang;Yu Ran;Weixuan Tang;Qingxiao Guan;Chunsheng Yang
The exponential surge in video traffic has intensified the need for Video Quality Assessment (VQA). Leveraging cutting-edge architectures, current VQA models have achieved human-comparable accuracy. However, recent studies have revealed the vulnerability of existing VQA models to adversarial attacks. To establish a reliable and practical assessment system, a secure VQA model capable of resisting such malicious attacks is urgently needed. Unfortunately, no attempt has been made to explore this issue. This paper makes a first attempt to investigate general adversarial defense principles, aiming to endow existing VQA models with security. Specifically, we first introduce random spatial grid sampling on the video frame for intra-frame defense. Then, we design pixel-wise randomization through a guardian map, globally neutralizing adversarial perturbations. Meanwhile, we extract temporal information from the video sequence as compensation for inter-frame defense. Building upon these principles, we present a novel VQA framework from a security-oriented perspective, termed SecureVQA. Extensive experiments indicate that SecureVQA sets a new benchmark in security while achieving competitive VQA performance compared with state-of-the-art models. Ablation studies further analyze the principles of SecureVQA, demonstrating their generalization and contributions to the security of leading VQA models. The code is available at https://github.com/GZHU-DVL/SecureVQA.
Title: Secure Video Quality Assessment Resisting Adversarial Attacks. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 821-832.
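The random spatial grid sampling step named in the SecureVQA abstract can be pictured with the following minimal sketch. The abstract does not give the actual procedure, so this is one plausible reading rather than the authors' method: the function name `random_grid_sample`, the `grid` and `patch` parameters, and the 8 x 8 toy frame are assumptions. Each grid cell contributes one randomly positioned crop, and the crops are tiled into a smaller frame.

```python
import random


def random_grid_sample(frame, grid, patch, rng):
    """Take one random patch x patch crop from each cell of a grid x grid
    partition of the frame and tile the crops in grid order.

    Hypothetical helper; the crop position inside each cell is drawn
    uniformly from the positions where the crop fits entirely in the cell.
    """
    height, width = len(frame), len(frame[0])
    cell_h, cell_w = height // grid, width // grid
    if patch > cell_h or patch > cell_w:
        raise ValueError("patch must fit inside a grid cell")

    out = [[0] * (grid * patch) for _ in range(grid * patch)]
    for gr in range(grid):
        for gc in range(grid):
            # random.Random.randint includes both endpoints.
            top = gr * cell_h + rng.randint(0, cell_h - patch)
            left = gc * cell_w + rng.randint(0, cell_w - patch)
            for i in range(patch):
                for j in range(patch):
                    out[gr * patch + i][gc * patch + j] = frame[top + i][left + j]
    return out


# Toy 8 x 8 frame whose pixel at (r, c) stores r * 8 + c, so positions are easy to read back.
frame = [[r * 8 + c for c in range(8)] for r in range(8)]
sampled = random_grid_sample(frame, grid=2, patch=2, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the sketch reproducible; in a defense setting the seed would presumably not be fixed, since the point of the randomization is that the sampled positions vary between evaluations.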
Pub Date : 2025-06-16 DOI: 10.1109/TBC.2025.3570838
Yuanlong Cao;Haopeng Zhang;Ming Jiang;Yirui Jiang;Jinquan Nie
Multipath Quick UDP Internet Connections (MPQUIC) integrated with network coding offers a promising approach to improving the Quality of Experience (QoE) for video services over heterogeneous wireless networks. However, a significant challenge arises when encoding nodes transmit potentially redundant packets while awaiting decoding acknowledgments (ACKs) from endpoints. This behavior can limit effective transmission rates, thereby degrading real-time streaming performance and user QoE. In this paper, we propose MP2-QUIC, which addresses these challenges through a novel adaptive Model Predictive Control (MPC) framework for MPQUIC that optimizes both congestion window and encoding redundancy parameters via a discrete state transition model. By incorporating operating point linearization and leveraging the Central Limit Theorem, MP2-QUIC effectively enhances the control performance and effective throughput of the model in heterogeneous wireless network environments. MP2-QUIC further employs Band-Sparse Network Coding (Band-SNC) to minimize computational complexity at endpoints, while utilizing queuing theory principles to determine optimal encoded packet quantities. This integrated approach significantly enhances end-user QoE, and the experimental results demonstrate MP2-QUIC’s superior performance compared to existing MPQUIC encoding solutions, yielding a 68.85% reduction in peak decoding overhead and marked improvements in Peak Signal-to-Noise Ratio (PSNR).
Title: When Multipath QUIC Meets Model Predictive Control and Band Sparse Network Coding: A Novel Multipathing Solution for Video Streaming Over Heterogeneous Wireless Networks. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 756-773.
Pub Date : 2025-06-12 DOI: 10.1109/TBC.2025.3575341
Hyejin Ro;Junghyun Kim;Hosung Park;Sang-Hyo Kim;Seok-Ki Ahn;Sung-Ik Park
The 5G multicast and broadcast service (MBS) has been discussed since 3GPP Release 17, emphasizing resource-efficient transmission for multiple users. A primary focus of 5G MBS is enhancing reliability, even for the broadcast mode without retransmissions. In discussions of 6G, hyper-reliable communication is also an important use case. In this context, the design of channel codes with low error floors is crucial to ensure robust communication in such demanding scenarios. Protograph-based raptor-like (PBRL) low-density parity-check (LDPC) codes have good error-correcting performance and rate compatibility, but their construction has focused on the waterfall region rather than the error floor. In this paper, we propose an add-on structure that gives PBRL LDPC codes low error floors; it consists of edges added to the protographs of the original PBRL LDPC codes. The added edges boost the reliability of weak variable nodes in the original PBRL LDPC codes. We propose two construction algorithms, one for use at a fixed rate and the other for rate-compatible use. Simulations show that the proposed codes have lower error floors than the original PBRL LDPC codes at various rates. Since the edge addition does not change the existing edge connections in the protograph, adaptive use with or without the add-on structure effectively implements two PBRL LDPC codes, for high-speed and for reliable communications, in an efficient way while keeping the system backward-compatible with the original PBRL LDPC code.
Title: Protograph-Based Raptor-Like LDPC Codes With an Add-On Structure for Reliable Communications. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 717-731.
Pub Date : 2025-06-11 DOI: 10.1109/TBC.2025.3573144
Zhuo Zhang;Shuai Xiao;Guipeng Lan;Meng Xi;Jiabao Wen;Jiachen Yang
In image semantic broadcasting communication systems, channel resources are limited, which restricts the transmission and broadcasting of large-scale image data. This paper proposes a deep-learning-assisted image semantic broadcasting scheme to improve source efficiency and alleviate communication resource pressure at the transmission terminal. We adopt an image informativeness evaluation method to screen highly informative image data and implement this data-driven source optimization scheme. Specifically, we propose a Multi-Attribute Informativeness Proxy (MAIP) method that integrates fine-grained information attributes such as uncertainty, novelty, and diversity to evaluate and screen image semantic broadcast data, which supports the formation of optimal image data broadcast transmission strategies. To demonstrate the effectiveness of the proposed MAIP, we compare it with state-of-the-art methods on three benchmarks, CIFAR-10, mini-ImageNet, and Fashion-MNIST, using active learning as the validation experiment.
Title: MAIP: A Multi-Attribute Informativeness Proxy for Image Semantic Broadcasting Communication. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 903-913.
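The idea of screening images by uncertainty, novelty, and diversity, as described in the MAIP abstract, can be sketched as follows. The abstract does not state how MAIP combines these attributes, so everything here is an assumption chosen for illustration: entropy stands in for uncertainty, distance to the nearest known feature stands in for novelty, greedy re-scoring against already selected items stands in for diversity, and the names `select_informative`, `w_unc`, and `w_nov` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def min_distance(feature, references):
    """Euclidean distance from a feature vector to its nearest reference."""
    return min(math.dist(feature, ref) for ref in references)


def select_informative(candidates, known, budget, w_unc=1.0, w_nov=1.0):
    """Greedily pick `budget` items by entropy plus distance to the nearest
    known or already selected feature.

    candidates maps a name to (class_probabilities, feature_vector).
    known must contain at least one feature vector.
    """
    references = list(known)
    remaining = dict(candidates)
    chosen = []
    for _ in range(min(budget, len(remaining))):
        best = max(
            remaining,
            key=lambda name: w_unc * entropy(remaining[name][0])
            + w_nov * min_distance(remaining[name][1], references),
        )
        chosen.append(best)
        # Adding the pick to the references penalizes near-duplicates next round.
        references.append(remaining.pop(best)[1])
    return chosen


# Hypothetical toy data: "a" and "b" are near-duplicates, "c" lies elsewhere.
candidates = {
    "a": ([0.5, 0.5], (3.0, 0.0)),
    "b": ([0.6, 0.4], (3.0, 0.1)),
    "c": ([0.9, 0.1], (0.0, 2.0)),
}
chosen = select_informative(candidates, known=[(0.0, 0.0)], budget=2)
```

In this toy run the second pick is "c" rather than "b": once "a" is selected, "b" sits only 0.1 away from a selected feature, so its score drops even though its prediction is more uncertain than that of "c".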
Pub Date : 2025-06-09 DOI: 10.1109/TBC.2025.3569995
Title: IEEE Transactions on Broadcasting Information for Authors. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 2, pp. C3-C4.
Pub Date : 2025-06-04 DOI: 10.1109/TBC.2025.3570860
Yulong Hao;Jiaxuan Weng;Jian Wang;Zhongle Wu;Cheng Yang
To support the planning and development of broadcasting, we develop a novel fusion prediction model that introduces the coefficient of variation method (CVM) into radio wave propagation prediction, enhancing the accuracy of the broadcast propagation model and reducing the complexity of the fusion modeling method. The main contributions of this paper are as follows: (1) The CVM is introduced into the field of channel modeling for the first time, and a fusion modeling approach with high accuracy and low complexity based on this method is proposed. (2) A systematic analysis of the CVM and the fusion modeling approach is conducted, establishing a fusion channel model based on an improved CVM. Experimental results indicate that, compared with the ITU-R P.1546, ITU-R P.2001, and ITM models, the proposed model improves prediction accuracy by 50.39%, 60.47%, and 55.98%, respectively.
Title: Fusion Prediction Model of Broadcast Radio Signal Propagation Based on the Coefficient of Variation Method. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 774-783.
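For readers unfamiliar with the coefficient of variation method, the sketch below shows its classical form: each component gets a weight proportional to its coefficient of variation (standard deviation divided by mean), and the weights are normalized to sum to one. How the paper applies or improves the CVM is not stated in the abstract, so using the weights to fuse the outputs of two propagation models is an assumption, and the function names and field-strength values are hypothetical.

```python
import statistics


def cvm_weights(columns):
    """Classical coefficient-of-variation weights: w_j = CV_j / sum of all CV.

    Each column holds one component's values; the mean must be non-zero.
    """
    cvs = [statistics.pstdev(col) / abs(statistics.fmean(col)) for col in columns]
    total = sum(cvs)
    return [cv / total for cv in cvs]


def fuse(predictions, weights):
    """Weighted sum of per-model predictions for a single location."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical predicted field strengths at three sites from two models.
model_a = [60.0, 50.0, 40.0]
model_b = [55.0, 50.0, 45.0]

weights = cvm_weights([model_a, model_b])
fused_site0 = fuse([model_a[0], model_b[0]], weights)
```

Here `model_a` varies twice as much relative to its mean as `model_b`, so it receives two thirds of the weight. Whether larger variation should earn a larger weight when fusing propagation models is a design decision the abstract does not settle.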
Pub Date : 2025-06-02 DOI: 10.1109/TBC.2025.3570869
Mario Montagud Climent;Marc Martos;Álvaro Egea;Sergi Fernández Langa
Social Virtual Reality (VR) enables shared media experiences between remote people inside immersive and realistic 3D spaces, providing richer and more natural interactions than classical 2D social conferencing tools. The benefits and engagement can be magnified further by integrating realistic, volumetric user representations (i.e., 3D holograms) into these virtual environments rather than synthetic avatars. This paper presents the design and evaluation of an interactive Social VR scenario for the joint and collaborative exploration of a broadcaster's catalogue of professional video clips. On the one hand, the scenario includes a control panel to select the desired year and clip. After the year is selected, a time-travel effect modeled on a lift teleports users through a multi-level semi-open building in which each level or floor represents one year and has a look-and-feel customized to resemble that year. On the other hand, the scenario allows the integration of up to four users represented as 3D holograms (full-body, full-volume point clouds), each with his or her own screen for video consumption, arranged in a 360° cross shape to allow natural visual interaction among them. The evaluation results with N=48 professionals from the broadcast sector not only provide relevant insights into the technical requirements and obtained performance, but also confirm the satisfactory user experience (in terms of presence, togetherness, and quality of interaction) provided by the presented technology and VR scenario and, most importantly, help identify the potential and opportunities of Social VR in the broadcast and video consumption landscape.
Title: Social VR With Holographic Comms: Enablers for New Engaging Experiences Within the TV / Video Consumption Landscape. Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 793-807.