Pub Date: 2024-07-09 DOI: 10.1109/TBC.2024.3414656
Rafael Martínez;Álvaro Llorente;Alberto del Rio;Javier Serrano;David Jimenez
The evolution of telecommunication networks unlocks new possibilities for multimedia services, including enriched and personalized experiences. However, ensuring high Quality of Service and Quality of Experience requires intelligent solutions at the edge. This study investigates the real-time detection of race bib numbers using YOLOv8, a state-of-the-art object detection framework, within the context of 5G/6G edge computing. We train various YOLOv8 models (nano to extreme) on the BDBD and SVHN datasets and analyze them across two diverse racing datasets (TGCRBNW and RBNR) that cover varied environmental conditions (daytime and nighttime). Our assessment focuses on key performance metrics, including processing time, efficiency, and accuracy. For instance, on the TGCRBNW dataset, the prediction time of the extreme-sized model drops from 1,161 to 54 seconds on a desktop computer when the more powerful GPU is used. Similarly, on the RBNR dataset, the prediction time of the extreme-sized model drops from 373 to 15 seconds with the more powerful GPU. Accuracy varies across scenarios and datasets. For example, results are unsatisfactory in most scenarios on the TGCRBNW dataset (below 50% in all sets and models), while YOLOv8m achieves high accuracy in several scenarios on the RBNR dataset (almost 80% in the best set). Prediction times also vary between computer architectures, highlighting the importance of selecting appropriate hardware for specific tasks. These results emphasize the importance of aligning computational resources with the demands of real-world tasks to achieve timely and accurate predictions.
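The timing figures quoted above can be expressed as speedup factors. The following is a minimal sketch that uses only the four numbers stated in the abstract; the function name `speedup` and the dictionary layout are illustrative assumptions, and the abstract does not name the GPUs involved.

```python
# Prediction times in seconds for the extreme-sized model, as quoted in the
# abstract: (less powerful GPU, more powerful GPU).
reported_times = {
    "TGCRBNW": (1161, 54),
    "RBNR": (373, 15),
}


def speedup(slow_seconds, fast_seconds):
    """Return how many times shorter the second measurement is."""
    return slow_seconds / fast_seconds


for dataset, (slow, fast) in reported_times.items():
    print(f"{dataset}: {speedup(slow, fast):.1f}x shorter prediction time")
```

Both datasets show a reduction of roughly a factor of 20 to 25, which is consistent with the abstract's point that hardware choice dominates prediction time for the largest model.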
Title: Performance Evaluation of YOLOv8-Based Bib Number Detection in Media Streaming Race. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 1126-1138 (Journal Article, published 2024-07-09). DOI: 10.1109/TBC.2024.3414656. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10591494
Pub Date: 2024-07-09 DOI: 10.1109/TBC.2024.3405313
Fei Qi;Lei Liu;Weiliang Xie
This paper studies the realization of wireless video transmission by leveraging 5G mixed mode with multimedia broadcast multicast services (MBMS). In particular, it investigates a number of key elements, such as physical layer modeling and precoding strategies, for MBMS implementation with large-scale multi-input multi-output (MIMO). A novel hybrid 5G mixed mode system is proposed to seamlessly integrate unicast and multicast transmissions, and its system architecture, user grouping strategies, interference mitigation techniques, and optimized multicast beamforming approach are described in detail. The performance of the proposed system is assessed through comprehensive simulations and analysis. The results indicate significant improvements in coding and spectral efficiencies when MIMO is combined with layer division multiplexing (LDM).
Title: Hybrid Unicast/Multicast Massive MIMO Precoding for 5G Mixed Mode. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 1044-1051 (Journal Article, published 2024-07-09). DOI: 10.1109/TBC.2024.3405313
HTTP live streaming dynamically delivers video content at varying bitrates to accommodate real-time bandwidth fluctuations while considering diverse user preferences and device capabilities. Existing flow control solutions do not support new features such as multi-source content transmission. In this paper, we propose a distributed multi-source rate control optimization algorithm (DMRCA) that maximizes the overall network bandwidth utility and improves viewer Quality of Experience (QoE). First, we model the rate control problem as a dual-optimized multi-source and multi-rate problem. Then, we decompose the problem into two sub-problems, source rate selection and user rate adaptation, and prove that solving the original problem is equivalent to solving these two sub-problems. Furthermore, we propose DMRCA as a fully distributed algorithm that solves these sub-problems and derives an optimal solution, and we discuss DMRCA's complexity and convergence. Finally, through a series of simulation tests, we demonstrate the superiority of our proposed algorithm compared to alternative state-of-the-art solutions.
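The idea of decomposing a utility-maximization problem so that each participant solves a small local sub-problem can be illustrated with a textbook dual decomposition. This is a generic sketch and not DMRCA itself: the single shared link, the logarithmic utility, the step size, and the function name `dual_decomposition` are all assumptions made for illustration.

```python
def dual_decomposition(num_users, capacity, step=0.05, iterations=200):
    """Price-based rate allocation on one shared link with log utilities.

    Hypothetical illustration: every user maximizes log(x) - price * x on its
    own, and the link adjusts a single price from the aggregate demand.
    """
    price = 1.0
    rates = [0.0] * num_users
    for _ in range(iterations):
        # Local sub-problem: d/dx [log(x) - price * x] = 0 gives x = 1 / price.
        rates = [1.0 / price for _ in range(num_users)]
        # Price update: raise the price when demand exceeds capacity,
        # lower it otherwise, and keep it strictly positive.
        price = max(1e-6, price + step * (sum(rates) - capacity))
    return rates, price


rates, price = dual_decomposition(num_users=4, capacity=8.0)
print(rates, price)
```

With identical utilities the rates settle at an equal share of the link capacity. The fixed step size used here is chosen for this small example and would need tuning for other parameter values.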
Title: A Novel Distributed Multi-Source Optimal Rate Control Solution for HTTP Live Video Streaming. Authors: Shujie Yang;Chuxing Fang;Lujie Zhong;Mu Wang;Zan Zhou;Han Xiao;Hao Hao;Changqiao Xu;Gabriel-Miro Muntean. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 792-807 (Journal Article, published 2024-07-08). DOI: 10.1109/TBC.2024.3391051. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10589341
Linear discriminant analysis (LDA) is a well-known feature-extraction technique for data analytics and pattern classification. As the dimensionality of multimedia data has increased in the big-data era, data are often characterized by tensors. Over the past two decades, researchers have therefore explored extending LDA to the general tensor space, mainly in two ways: LDA of tensors using tensor decomposition methods (by converting tensors to matrices) and LDA of tensors built upon the T-product. However, both approaches have restrictions. A critical problem, how to carry out LDA of arbitrary scatter tensors based on the Einstein product, remains unsolved by existing methods. Therefore, we propose a novel tensor LDA (TLDA) approach, which can carry out the LDA of arbitrary-dimensional scatter tensors without any need for tensor decomposition. To reduce the computation time, we also design a parallel paradigm for executing the proposed TLDA. Numerical experiments conducted on real multimedia data demonstrate the efficacy of the proposed TLDA in terms of classification accuracy. Moreover, we compare the classification accuracy, computational complexity, and memory complexity of the proposed TLDA scheme with those of other existing tensor-based LDA methods. By leveraging TLDA for high-dimensional feature extraction, segmentation, and user-item interaction data processing, future multimedia recommendation systems can provide a more accurate, engaging, and satisfactory user experience over the Internet.
Title: Multimedia Classification via Tensor Linear Discriminant Analysis. Authors: Shih-Yu Chang;Hsiao-Chun Wu;Kun Yan;Scott Chih-Hao Huang;Yiyan Wu. IEEE Transactions on Broadcasting, vol. 70, no. 4, pp. 1139-1152 (Journal Article, published 2024-07-08). DOI: 10.1109/TBC.2024.3417342
With the escalating prevalence of datacasting, live streaming and high-quality video consumption on mobile devices, there arises an increasing demand for a cost-effective and reliable approach to transmit large volumes of such content to extensive audiences. While broadband mobile networks can increase capacity through denser base stations and higher frequencies, the linear pace of facility development makes it difficult to match the non-linear growth of the service throughput. Terrestrial Broadcast has proven itself to be significantly more efficient in transmitting popular video streams to mobile devices over a large area. However, due to its downlink-only nature, it falls short of delivering consistently reliable services. Hence, the convergence of terrestrial broadcast and broadband mobile networks has resurfaced as a pertinent topic for consideration. In this paper, terrestrial broadcast is adopted as the main pipe to transmit streaming services to mobile phones, with a 5th generation mobile communications (5G) new radio (NR) mobile carrier employed to provide complementary packet loss retransmission service, ensuring a seamless service experience. First, a cross-standard packet retransmission (CPR) scheme is proposed based on 5G broadcast and 5G NR Systems. Corresponding protocols and schemes are introduced, and a prototype system is realized. CPR is able to support delay-insensitive datacasting services very well, yet its higher layer convergence poses challenges for supporting delay-sensitive real-time services. To address this, a MAC-layer homogeneous packet retransmission (HPR) scheme is proposed. The basic principle is to utilize the carrier aggregation mechanism of 5G, modifying the protocols to enable one carrier to simulate broadcast while maintaining unicast in another carrier. In HPR, packet retransmission can be done at the MAC layer, reducing the retransmission delay to within 5 microseconds. 
Simulation and trial results are presented based on the proposed schemes.
Title: Packet Retransmission Schemes and Trials for Broadcast Services in Mobile Scenarios. Authors: Yin Xu;Hao Ju;Zigang Fu;Xin Lin;Tianyao Ma;Dazhi He;Yang Chen;Dajun Zhang;Ke Wang;Wenjun Zhang;Yiyan Wu. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 1113-1125 (Journal Article, published 2024-07-02). DOI: 10.1109/TBC.2024.3410706
Pub Date: 2024-06-28 DOI: 10.1109/TBC.2024.3407482
Haojiang Li;Wenjun Zhang;Yin Xu;Dazhi He;Haoyang Li
With the arrival of the 6G era, wireless communication networks will face increased pressure from diversified service traffic with ultra-large bandwidth, ultra-low latency, and massive connections, making it difficult to guarantee quality of service. Broadcasting, however, can achieve wide-area coverage while occupying fewer physical transmission resources. Therefore, the convergence of broadcasting and 6G networks can promote the evolution of traditional broadcasting services towards flexibility, dynamics, and personalization and, at the same time, effectively alleviate data congestion in mobile communication networks. In this paper, we first introduce three typical future application scenarios of broadcasting and 6G convergence, and summarize the vital technologies and challenges in constructing the converged network. On this basis, we propose a broadcasting and 6G converged network architecture and a next-generation 6G broadcasting core network architecture, and finally introduce the typical collaboration modes of the converged network.
Title: Broadcasting and 6G Converged Network Architecture. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 971-979 (Journal Article, published 2024-06-28). DOI: 10.1109/TBC.2024.3407482
With the emergence of transformer-based feature extractors, the effect of image quality assessment (IQA) has improved, but its interpretability is limited. In addition, images repaired by generative adversarial networks (GANs) produce realistic textures and spatial misalignments with high-quality images. In this paper, we develop a content-aware full-reference IQA method without changing the original convolutional neural network feature extractor. First, image signal-to-noise (SNR) mapping is performed experimentally to verify its superior content-aware ability, and based on the SNR mapping of the reference image, we fuse multiscale distortion and normal image features according to a fusion strategy that enhances the informative area. Second, judging the quality of GAN-generated images from the perspective of focusing on content may ignore the alignment between pixels; therefore, we add a Gram-matrix-based texture enhancement module to boost the texture information between distorted and normal difference features. Finally, experiments on numerous public datasets prove the superior performance of the proposed method in predicting image quality.
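A Gram matrix summarizes texture by recording how strongly each pair of feature channels co-activates, independent of where in the image the activation occurs. The sketch below shows the standard computation on a tiny hand-written feature map; the normalization by the number of spatial positions and the function name `gram_matrix` are assumptions, and the paper's texture enhancement module may differ.

```python
def gram_matrix(features):
    """Compute channel-by-channel inner products of a feature map.

    features: list of C channels, each a flat list of N spatial activations.
    Returns a C x C nested list, normalized by N (an assumed convention).
    """
    num_positions = len(features[0])
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / num_positions for ch_j in features]
        for ch_i in features
    ]


# Two channels over four spatial positions (illustrative values only).
feature_map = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 1.0, 0.0, 1.0],
]
gram = gram_matrix(feature_map)
print(gram)
```

Because the sum runs over all positions, shuffling the spatial positions identically in every channel leaves the result unchanged, which is why Gram-based features tolerate the spatial misalignment that GAN-restored images exhibit.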
Title: A Content-Aware Full-Reference Image Quality Assessment Method Using a Gram Matrix and Signal-to-Noise. Authors: Shuqi Han;Yueting Huang;Mingliang Zhou;Xuekai Wei;Fan Jia;Xu Zhuang;Fei Cheng;Tao Xiang;Yong Feng;Huayan Pu;Jun Luo. IEEE Transactions on Broadcasting, vol. 70, no. 4, pp. 1279-1291 (Journal Article, published 2024-06-28). DOI: 10.1109/TBC.2024.3410707
Pub Date: 2024-06-27 DOI: 10.1109/TBC.2024.3407596
Yang Liu;Jie Wang;Ruohan Cao;Yueming Lu;Yaojun Qiao;Yuanqing Xia;Daoqi Han
This paper presents a comprehensive investigation into security within 5G broadcasting environments, with a particular focus on content production centers. It examines the unique challenges and vulnerabilities associated with 5G technology in the context of broadcasting media. The study provides an up-to-date survey of the current landscape in 5G network security, emphasizing the requirements and risks specific to broadcasting. In response to these challenges, we propose a set of robust security strategies and technologies tailored for these environments. Through simulations and case studies, we demonstrate the efficacy of these strategies within a 5G broadcasting context. Ultimately, this paper aims to offer insights for broadcasters, policymakers, and technologists, enabling them to enhance the security and integrity of 5G broadcasting networks through informed decision-making and implementation of best practices.
Title: Securing Content Production Centers in 5G Broadcasting: Strategies and Technologies for Mitigating Cybersecurity Risks. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 1008-1017 (Journal Article, published 2024-06-27). DOI: 10.1109/TBC.2024.3407596
Pub Date: 2024-06-25 DOI: 10.1109/TBC.2024.3408643
Jingbo He;Xiaohai He;Shuhua Xiong;Honggang Chen
Single image super-resolution (SISR) is the task of reconstructing high-resolution (HR) images from low-resolution (LR) images obtained through some degradation process. Deep neural networks (DNNs) have greatly advanced the frontier of image super-resolution research and replaced traditional methods as the de facto standard approach. The attention mechanism has enabled SR algorithms to achieve one performance breakthrough after another. However, limited research has been conducted on the interaction and integration of attention mechanisms across different dimensions. To tackle this issue, we propose a cross-dimensional attention fusion network (CAFN) that effectively achieves cross-dimensional interaction with long-range dependencies. Specifically, the proposed approach uses a cross-dimensional aggregation module (CAM) to capture contextual information by integrating both spatial and channel importance maps. The information fusion module (IFM) in CAM serves as a bridge for parallel dual-attention information fusion. In addition, a novel memory-adaptive multi-stage (MAMS) training method is proposed. We perform warm-start retraining with the same settings as the previous stage, without increasing memory consumption. If memory is sufficient, we fine-tune the model with a larger patch size after the warm start. The experimental results demonstrate the superior performance of our cross-dimensional attention fusion network and training strategy compared to state-of-the-art (SOTA) methods, as evidenced by both quantitative and qualitative metrics.
Title: Cross-Dimensional Attention Fusion Network for Simulated Single Image Super-Resolution. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 909-923 (Journal Article, published 2024-06-25). DOI: 10.1109/TBC.2024.3408643
Pub Date : 2024-06-19 DOI: 10.1109/TBC.2024.3399479
Axel De Decker;Jan De Cock;Peter Lambert;Glenn Van Wallendael
As the demand for high-quality video content continues to rise, accurately assessing the visual quality of digital videos has become more crucial than ever before. However, evaluating the perceptual quality of an impaired video in the absence of the original reference signal remains a significant challenge. To address this problem, we propose a novel No-Reference (NR) video quality metric called NR-VMAF. Our method is designed to replicate the popular Full-Reference (FR) metric VMAF in scenarios where the reference signal is unavailable or impractical to obtain. Like its FR counterpart, NR-VMAF is tailored specifically for measuring video quality in the presence of compression and scaling artifacts. The proposed model utilizes a deep convolutional neural network to extract quality-aware features from the pixel information of the distorted video, thereby eliminating the need for manual feature engineering. By adopting a patch-based approach, we are able to process high-resolution video data without any information loss. While the current model is trained solely on H.265/HEVC videos, its performance is verified on subjective datasets containing mainly H.264/AVC content. We demonstrate that NR-VMAF outperforms current state-of-the-art NR metrics while achieving a prediction accuracy that is comparable to VMAF and other FR metrics. Based on this strong performance, we believe that NR-VMAF is a viable approach to efficient and reliable No-Reference video quality assessment.
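The patch-based approach mentioned in the abstract above can be pictured as tiling each frame at its native resolution instead of downscaling it. The sketch below is a minimal illustration under assumptions: the abstract does not state the patch size, whether patches overlap, or how edge regions are handled, so the non-overlapping tiling, the kept-as-is edge patches, and the name `split_into_patches` are hypothetical choices for this example.

```python
def split_into_patches(frame, patch_h, patch_w):
    """Tile a 2-D frame (a list of pixel rows) into non-overlapping patches.

    Returns a list of ((top, left), patch) pairs in row-major order.
    Edge patches keep their smaller native size rather than being resized
    or dropped, so every pixel appears in exactly one patch.
    (Non-overlapping tiling is an assumption, not taken from the paper.)
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    patches = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            # Slice the rows first, then the columns of each row.
            patch = [row[left:left + patch_w]
                     for row in frame[top:top + patch_h]]
            patches.append(((top, left), patch))
    return patches
```

Each patch could then be passed to a feature extractor on its own, with the stored `(top, left)` offsets available if per-patch outputs need to be mapped back to frame coordinates.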
{"title":"No-Reference VMAF: A Deep Neural Network-Based Approach to Blind Video Quality Assessment","authors":"Axel De Decker;Jan De Cock;Peter Lambert;Glenn Van Wallendael","doi":"10.1109/TBC.2024.3399479","DOIUrl":"10.1109/TBC.2024.3399479","url":null,"abstract":"As the demand for high-quality video content continues to rise, accurately assessing the visual quality of digital videos has become more crucial than ever before. However, evaluating the perceptual quality of an impaired video in the absence of the original reference signal remains a significant challenge. To address this problem, we propose a novel No-Reference (NR) video quality metric called NR-VMAF. Our method is designed to replicate the popular Full-Reference (FR) metric VMAF in scenarios where the reference signal is unavailable or impractical to obtain. Like its FR counterpart, NR-VMAF is tailored specifically for measuring video quality in the presence of compression and scaling artifacts. The proposed model utilizes a deep convolutional neural network to extract quality-aware features from the pixel information of the distorted video, thereby eliminating the need for manual feature engineering. By adopting a patch-based approach, we are able to process high-resolution video data without any information loss. While the current model is trained solely on H.265/HEVC videos, its performance is verified on subjective datasets containing mainly H.264/AVC content. We demonstrate that NR-VMAF outperforms current state-of-the-art NR metrics while achieving a prediction accuracy that is comparable to VMAF and other FR metrics. 
Based on this strong performance, we believe that NR-VMAF is a viable approach to efficient and reliable No-Reference video quality assessment.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 3","pages":"844-861"},"PeriodicalIF":3.2,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141940837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}