IEEE Transactions on Broadcasting最新文献_第9页

Fast Transform Kernel Selection Based on Frequency Matching and Probability Model for AV1 基于频率匹配和概率模型的 AV1 快速变换核选择

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-26 DOI: 10.1109/TBC.2024.3374078

Zhijian Hao;Heming Sun;Guohao Xu;Jiaming Liu;Xiankui Xiong;Xuanpeng Zhu;Xiaoyang Zeng;Yibo Fan

As a fundamental component of video coding, transform coding concentrates the energy scattered in the spatial domain onto the upper-left region of the frequency domain. This concentration contributes significantly to Rate-Distortion performance improvement when combined with quantization and entropy coding. To better adapt the dynamic characteristics of image content, Alliance for Open Media Video 1 (AV1) introduces multiple transform kernels, which brings substantial coding performance benefits, albeit at the cost of considerably computational complexity. In this paper, we propose a fast transform kernel selection algorithm for AV1 based on frequency matching and probability model to effectively accelerate the coding process with an acceptable level of performance loss. Firstly, the concept of Frequency Matching Factor (FMF) based on cosine similarity is defined for the first time to describe the similarity between the residual block and the primary frequency basis image of the transform kernel. Statistical results demonstrate a clear distribution relationship between FMFs and normalized Rate-Distortion optimization costs (nRDOC). Then, leveraging these distribution characteristics, we establish Gaussian normal probability model of nRDOC for each FMF by characterizing the parameters of the normal model as functions of FMFs, enhancing the normal model’s accuracy and coding performance. Finally, based on the derived normal models, we design a fast selection algorithm with scalability and hardware-friendliness to skip the non-promising transform kernels. Experimental results show that the performance loss of the proposed fast algorithm is 1.15% when 57.66% of the transform kernels are skipped, resulting in a saving of 20.09% encoding time, which is superior to other fast algorithms found in the literature and competitive with the pruning algorithm based on the neural network in the AV1 reference software.

作为视频编码的基本组成部分，变换编码将空间域中分散的能量集中到频域的左上方区域。当与量化和熵编码相结合时，这种集中能极大地改善速率-失真性能。为了更好地适应图像内容的动态特性，开放媒体视频联盟 1（AV1）引入了多重变换内核，这带来了巨大的编码性能优势，尽管代价是相当高的计算复杂度。本文提出了一种基于频率匹配和概率模型的 AV1 快速变换内核选择算法，以在可接受的性能损失水平上有效加速编码过程。首先，本文首次定义了基于余弦相似性的频率匹配系数（FMF）概念，用以描述残差块与变换核的主频基图像之间的相似性。统计结果表明，FMF 与归一化率失真优化成本 (nRDOC) 之间存在明显的分布关系。然后，我们利用这些分布特征，通过将正态模型的参数表征为 FMF 的函数，为每个 FMF 建立了 nRDOC 的高斯正态概率模型，从而提高了正态模型的准确性和编码性能。最后，基于推导出的正则模型，我们设计了一种具有可扩展性和硬件友好性的快速选择算法，以跳过不具潜力的变换内核。实验结果表明，当跳过 57.66% 的变换核时，所提出的快速算法的性能损失为 1.15%，从而节省了 20.09% 的编码时间，优于文献中发现的其他快速算法，与 AV1 参考软件中基于神经网络的剪枝算法相比也具有竞争力。

{"title":"Fast Transform Kernel Selection Based on Frequency Matching and Probability Model for AV1","authors":"Zhijian Hao;Heming Sun;Guohao Xu;Jiaming Liu;Xiankui Xiong;Xuanpeng Zhu;Xiaoyang Zeng;Yibo Fan","doi":"10.1109/TBC.2024.3374078","DOIUrl":"10.1109/TBC.2024.3374078","url":null,"abstract":"As a fundamental component of video coding, transform coding concentrates the energy scattered in the spatial domain onto the upper-left region of the frequency domain. This concentration contributes significantly to Rate-Distortion performance improvement when combined with quantization and entropy coding. To better adapt the dynamic characteristics of image content, Alliance for Open Media Video 1 (AV1) introduces multiple transform kernels, which brings substantial coding performance benefits, albeit at the cost of considerably computational complexity. In this paper, we propose a fast transform kernel selection algorithm for AV1 based on frequency matching and probability model to effectively accelerate the coding process with an acceptable level of performance loss. Firstly, the concept of Frequency Matching Factor (FMF) based on cosine similarity is defined for the first time to describe the similarity between the residual block and the primary frequency basis image of the transform kernel. Statistical results demonstrate a clear distribution relationship between FMFs and normalized Rate-Distortion optimization costs (nRDOC). Then, leveraging these distribution characteristics, we establish Gaussian normal probability model of nRDOC for each FMF by characterizing the parameters of the normal model as functions of FMFs, enhancing the normal model’s accuracy and coding performance. Finally, based on the derived normal models, we design a fast selection algorithm with scalability and hardware-friendliness to skip the non-promising transform kernels. Experimental results show that the performance loss of the proposed fast algorithm is 1.15% when 57.66% of the transform kernels are skipped, resulting in a saving of 20.09% encoding time, which is superior to other fast algorithms found in the literature and competitive with the pruning algorithm based on the neural network in the AV1 reference software.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"693-707"},"PeriodicalIF":4.5,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140316128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Diversity Receiver for ATSC 3.0-in-Vehicle: Design and Field Evaluation in Metropolitan SFN 车载 ATSC 3.0 分集接收器：大都市 SFN 中的设计和现场评估

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-26 DOI: 10.1109/TBC.2024.3374061

Sungjun Ahn;Bo-Mi Lim;Sunhyoung Kwon;Sungho Jeon;Xianbin Wang;Sung-Ik Park

This paper demonstrates the feasibility of multi-antenna reception to facilitate mobile broadcasting for vehicular receivers. Starting from a dimension analysis estimating the spatial capacity of automobiles, we confirm multi-antenna embedding as a viable solution for vehicular broadcast receivers. Accordingly, a rolling prototype of an ATSC 3.0 multi-antenna diversity receiver (DivRx) is implemented and repeatedly tested on public roads. The field verification tests in this paper aim to evaluate the performance of DivRx in real broadcast environments, represented by an urban single-frequency network (SFN) with high-power transmissions using ultra-high frequencies. To this end, extensive field trials are drawn in an operating ATSC 3.0 network located in the Seoul Metropolitan Area, South Korea. Public on-air services of 1080p and 4K videos are tested, targeting inter-city journeys and trips in urban centroids, respectively. The mobile reliability gain of DivRx is empirically evaluated in terms of coverage probability and the field strength required for 95% receivability. The results show that leveraging four antennas can achieve 99% coverage of intra-city 4K service in the current network status, deriving 65% more gain over single-antenna systems. It is also exhibited that the signal strength requirement can be reduced by 13 dB or more. In addition to the empirical evaluation, we provide theoretical proofs aligning with the observations.

本文论证了多天线接收的可行性，以促进车载接收器的移动广播。从估算汽车空间容量的维度分析出发，我们确认多天线嵌入是车载广播接收器的可行解决方案。因此，我们实施了 ATSC 3.0 多天线分集接收器（DivRx）的滚动原型，并在公共道路上进行了反复测试。本文中的现场验证测试旨在评估 DivRx 在真实广播环境中的性能，这种广播环境以使用超高频的大功率传输的城市单频网络 (SFN) 为代表。为此，我们在韩国首尔大都会区的一个正在运行的 ATSC 3.0 网络中进行了广泛的现场试验。测试了 1080p 和 4K 视频的公共空中服务，分别针对城际旅行和城市中心区旅行。根据覆盖概率和 95% 接收率所需的场强，对 DivRx 的移动可靠性增益进行了实证评估。结果表明，在当前网络状况下，利用四天线可实现 99% 的市内 4K 服务覆盖率，比单天线系统多 65% 的增益。结果还显示，信号强度要求可降低 13 dB 或更多。除了经验评估，我们还提供了与观测结果一致的理论证明。

{"title":"Diversity Receiver for ATSC 3.0-in-Vehicle: Design and Field Evaluation in Metropolitan SFN","authors":"Sungjun Ahn;Bo-Mi Lim;Sunhyoung Kwon;Sungho Jeon;Xianbin Wang;Sung-Ik Park","doi":"10.1109/TBC.2024.3374061","DOIUrl":"10.1109/TBC.2024.3374061","url":null,"abstract":"This paper demonstrates the feasibility of multi-antenna reception to facilitate mobile broadcasting for vehicular receivers. Starting from a dimension analysis estimating the spatial capacity of automobiles, we confirm multi-antenna embedding as a viable solution for vehicular broadcast receivers. Accordingly, a rolling prototype of an ATSC 3.0 multi-antenna diversity receiver (DivRx) is implemented and repeatedly tested on public roads. The field verification tests in this paper aim to evaluate the performance of DivRx in real broadcast environments, represented by an urban single-frequency network (SFN) with high-power transmissions using ultra-high frequencies. To this end, extensive field trials are drawn in an operating ATSC 3.0 network located in the Seoul Metropolitan Area, South Korea. Public on-air services of 1080p and 4K videos are tested, targeting inter-city journeys and trips in urban centroids, respectively. The mobile reliability gain of DivRx is empirically evaluated in terms of coverage probability and the field strength required for 95% receivability. The results show that leveraging four antennas can achieve 99% coverage of intra-city 4K service in the current network status, deriving 65% more gain over single-antenna systems. It is also exhibited that the signal strength requirement can be reduced by 13 dB or more. In addition to the empirical evaluation, we provide theoretical proofs aligning with the observations.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"367-381"},"PeriodicalIF":4.5,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140316343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Deep Learning Approach for No-Reference Screen Content Video Quality Assessment 用于无参照屏幕内容视频质量评估的深度学习方法

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-26 DOI: 10.1109/TBC.2024.3374042

Ngai-Wing Kwong;Yui-Lam Chan;Sik-Ho Tsang;Ziyin Huang;Kin-Man Lam

Screen content video (SCV) has drawn much more attention than ever during the COVID-19 period and has evolved from a niche to a mainstream due to the recent proliferation of remote offices, online meetings, shared-screen collaboration, and gaming live streaming. Therefore, quality assessments for screen content media are highly demanded to maintain service quality recently. Although many practical natural scene video quality assessment methods have been proposed and achieved promising results, these methods cannot be applied to the screen content video quality assessment (SCVQA) task directly since the content characteristics of SCV are substantially different from natural scene video. Besides, only one no-reference SCVQA (NR-SCVQA) method, which requires handcrafted features, has been proposed in the literature. Therefore, we propose the first deep learning approach explicitly designed for NR-SCVQA. First, a multi-channel convolutional neural network (CNN) model is used to extract spatial quality features of pictorial and textual regions separately. Since there is no human annotated quality for each screen content frame (SCF), the CNN model is pre-trained in a multi-task self-supervised fashion to extract spatial quality feature representation of SCF. Second, we propose a time-distributed CNN transformer model (TCNNT) to further process all SCF spatial quality feature representations of an SCV and learn spatial and temporal features simultaneously so that high-level spatiotemporal features of SCV can be extracted and used to assess the whole SCV quality. Experimental results demonstrate the robustness and validity of our model, which is closely related to human perception.

在 COVID-19 期间，屏幕内容视频（SCV）比以往任何时候都更受关注，由于近年来远程办公、在线会议、共享屏幕协作和游戏直播的普及，它已从一个小众领域发展成为主流领域。因此，屏幕内容媒体的质量评估是近期维持服务质量的高要求。尽管已经提出了许多实用的自然场景视频质量评估方法并取得了可喜的成果，但由于 SCV 的内容特征与自然场景视频有很大不同，因此这些方法无法直接应用于屏幕内容视频质量评估（SCVQA）任务。此外，文献中只提出了一种无参考 SCVQA（NR-SCVQA）方法，该方法需要手工制作特征。因此，我们提出了第一种专为 NR-SCVQA 设计的深度学习方法。首先，使用多通道卷积神经网络（CNN）模型分别提取图像和文本区域的空间质量特征。由于每个屏幕内容帧（SCF）都没有人工标注的质量，因此 CNN 模型采用多任务自监督方式进行预训练，以提取 SCF 的空间质量特征表示。其次，我们提出了一种时间分布式 CNN 变换器模型（TCNNT），以进一步处理 SCV 的所有 SCF 空间质量特征表示，并同时学习空间和时间特征，从而提取 SCV 的高级时空特征，用于评估整个 SCV 质量。实验结果证明了我们的模型与人类感知密切相关，具有鲁棒性和有效性。

{"title":"Deep Learning Approach for No-Reference Screen Content Video Quality Assessment","authors":"Ngai-Wing Kwong;Yui-Lam Chan;Sik-Ho Tsang;Ziyin Huang;Kin-Man Lam","doi":"10.1109/TBC.2024.3374042","DOIUrl":"10.1109/TBC.2024.3374042","url":null,"abstract":"Screen content video (SCV) has drawn much more attention than ever during the COVID-19 period and has evolved from a niche to a mainstream due to the recent proliferation of remote offices, online meetings, shared-screen collaboration, and gaming live streaming. Therefore, quality assessments for screen content media are highly demanded to maintain service quality recently. Although many practical natural scene video quality assessment methods have been proposed and achieved promising results, these methods cannot be applied to the screen content video quality assessment (SCVQA) task directly since the content characteristics of SCV are substantially different from natural scene video. Besides, only one no-reference SCVQA (NR-SCVQA) method, which requires handcrafted features, has been proposed in the literature. Therefore, we propose the first deep learning approach explicitly designed for NR-SCVQA. First, a multi-channel convolutional neural network (CNN) model is used to extract spatial quality features of pictorial and textual regions separately. Since there is no human annotated quality for each screen content frame (SCF), the CNN model is pre-trained in a multi-task self-supervised fashion to extract spatial quality feature representation of SCF. Second, we propose a time-distributed CNN transformer model (TCNNT) to further process all SCF spatial quality feature representations of an SCV and learn spatial and temporal features simultaneously so that high-level spatiotemporal features of SCV can be extracted and used to assess the whole SCV quality. Experimental results demonstrate the robustness and validity of our model, which is closely related to human perception.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"555-569"},"PeriodicalIF":4.5,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140316330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Enhancing Transportation Management in Marine Internet of Vessels: A 5G Broadcasting-Centric Framework Leveraging Federated Learning 加强海洋船舶互联网的运输管理：利用联盟学习的以 5G 广播为中心的框架

IF 3.2 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-22 DOI: 10.1109/TBC.2024.3394289

Desheng Chen;Jiabao Wen;Huiao Dai;Meng Xi;Shuai Xiao;Jiachen Yang

The Maritime Internet of Things (MIoT) consists of offshore equipment such as ships, consoles, and base stations, which are used for maritime information sharing to assist driving decision-making. However, with the increase in the number of MIoT access devices, the risks of information security and data reliability have also significantly increased. In this paper, we describe a maritime Dynamic Ship Federated Information Security Sharing Model (DSF-ISS) in Maritime Internet of Vessels (MIoV) based on maritime 5G broadcasting technology. The main object of this study is to solve the problem of maritime information isolated island under the condition of low communication between maritime ship nodes. In this model, maritime ship nodes cooperation is based on the Contract Network Protocol (CNP), which considers task types, spatial, and temporal distribution of different vessels. We then propose an improved federated learning approach for local dynamic nodes based on maritime 5G broadcasting technology. Moreover, this study designs a proof of membership (PoM) to share local task model information in global blockchain. The results showed that DSF-ISS has a positive effect in maritime transportation work. It effectively realizes the secure sharing of information and protects the privacy of node data.

海事物联网（MIoT）由船舶、控制台和基站等近海设备组成，用于海事信息共享，辅助驾驶决策。然而，随着 MIoT 接入设备数量的增加，信息安全和数据可靠性的风险也大大增加。本文介绍了基于海事 5G 广播技术的海事船舶互联网（MIoV）中的海事动态船舶联盟信息安全共享模型（DSF-ISS）。本研究的主要目标是解决海上船舶节点间通信不畅情况下的海上信息孤岛问题。在该模型中，海事船舶节点合作基于合约网络协议（CNP），该协议考虑了不同船舶的任务类型、空间和时间分布。然后，我们提出了一种基于海事 5G 广播技术的改进型本地动态节点联合学习方法。此外，本研究还设计了一种成员证明（PoM），用于在全球区块链中共享本地任务模型信息。研究结果表明，DSF-ISS 在海上运输工作中具有积极作用。它有效地实现了信息的安全共享，保护了节点数据的隐私。

{"title":"Enhancing Transportation Management in Marine Internet of Vessels: A 5G Broadcasting-Centric Framework Leveraging Federated Learning","authors":"Desheng Chen;Jiabao Wen;Huiao Dai;Meng Xi;Shuai Xiao;Jiachen Yang","doi":"10.1109/TBC.2024.3394289","DOIUrl":"10.1109/TBC.2024.3394289","url":null,"abstract":"The Maritime Internet of Things (MIoT) consists of offshore equipment such as ships, consoles, and base stations, which are used for maritime information sharing to assist driving decision-making. However, with the increase in the number of MIoT access devices, the risks of information security and data reliability have also significantly increased. In this paper, we describe a maritime Dynamic Ship Federated Information Security Sharing Model (DSF-ISS) in Maritime Internet of Vessels (MIoV) based on maritime 5G broadcasting technology. The main object of this study is to solve the problem of maritime information isolated island under the condition of low communication between maritime ship nodes. In this model, maritime ship nodes cooperation is based on the Contract Network Protocol (CNP), which considers task types, spatial, and temporal distribution of different vessels. We then propose an improved federated learning approach for local dynamic nodes based on maritime 5G broadcasting technology. Moreover, this study designs a proof of membership (PoM) to share local task model information in global blockchain. The results showed that DSF-ISS has a positive effect in maritime transportation work. It effectively realizes the secure sharing of information and protects the privacy of node data.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 3","pages":"1091-1103"},"PeriodicalIF":3.2,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

JUST360: Optimizing 360-Degree Video Streaming Systems With Joint Utility JUST360：利用联合实用程序优化 360 度视频流系统

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-21 DOI: 10.1109/TBC.2024.3374066

Zhijun Li;Yumei Wang;Yu Liu;Junjie Li;Ping Zhu

360-degree videos, as a type of media that offers highly immersive experiences, often result in significant bandwidth waste due to incomplete views by users. This places a heavy demand on streaming systems to support high-bandwidth requirements. Recently, tile-based streaming systems combined with viewport prediction have become popular to improve bandwidth efficiency. However, since the viewport prediction is only reliable in the short term, maintaining a long buffer to avoid rebuffering is challenging. We propose JUST360, a joint utility based two-tier 360-degree video streaming system in this paper. To better improve the accuracy of utility evaluation, a utility model incorporates image quality and prediction accuracy is proposed to evaluate the contribution of each tile so that longer buffer and bandwidth efficiency can coexist. The optimal bitrate allocation strategy is determined by using model predictive control (MPC) to dynamically select the tiles according to their characteristics. Experiments show that our method successfully achieves higher PSNR and less rebuffering. Compared with other state-of-the-art methods, our proposed method can outperform the other methods by 3%-20% in terms of QoE.

360 度视频作为一种提供高度沉浸式体验的媒体类型，往往会因用户观看不完整而造成大量带宽浪费。这就对支持高带宽要求的流媒体系统提出了更高的要求。最近，为提高带宽效率，基于磁贴的流媒体系统与视口预测相结合开始流行起来。然而，由于视口预测仅在短期内可靠，因此维持一个较长的缓冲区以避免回弹具有挑战性。我们在本文中提出了基于联合效用的双层 360 度视频流系统 JUST360。为了更好地提高效用评估的准确性，我们提出了一个包含图像质量和预测准确性的效用模型，用于评估每个磁贴的贡献，从而使较长的缓冲区和带宽效率得以共存。通过使用模型预测控制（MPC）来根据瓦片的特性动态选择瓦片，从而确定最佳比特率分配策略。实验表明，我们的方法成功实现了更高的 PSNR 和更少的回波。与其他最先进的方法相比，我们提出的方法在 QoE 方面比其他方法高出 3%-20%。

{"title":"JUST360: Optimizing 360-Degree Video Streaming Systems With Joint Utility","authors":"Zhijun Li;Yumei Wang;Yu Liu;Junjie Li;Ping Zhu","doi":"10.1109/TBC.2024.3374066","DOIUrl":"10.1109/TBC.2024.3374066","url":null,"abstract":"360-degree videos, as a type of media that offers highly immersive experiences, often result in significant bandwidth waste due to incomplete views by users. This places a heavy demand on streaming systems to support high-bandwidth requirements. Recently, tile-based streaming systems combined with viewport prediction have become popular to improve bandwidth efficiency. However, since the viewport prediction is only reliable in the short term, maintaining a long buffer to avoid rebuffering is challenging. We propose JUST360, a joint utility based two-tier 360-degree video streaming system in this paper. To better improve the accuracy of utility evaluation, a utility model incorporates image quality and prediction accuracy is proposed to evaluate the contribution of each tile so that longer buffer and bandwidth efficiency can coexist. The optimal bitrate allocation strategy is determined by using model predictive control (MPC) to dynamically select the tiles according to their characteristics. Experiments show that our method successfully achieves higher PSNR and less rebuffering. Compared with other state-of-the-art methods, our proposed method can outperform the other methods by 3%-20% in terms of QoE.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"468-481"},"PeriodicalIF":4.5,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Synergistic Temporal-Spatial User-Aware Viewport Prediction for Optimal Adaptive 360-Degree Video Streaming 用户感知时空协同视口预测，实现最佳自适应 360 度视频流

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-21 DOI: 10.1109/TBC.2024.3374119

Yumei Wang;Junjie Li;Zhijun Li;Simou Shang;Yu Liu

360-degree videos usually require extremely high bandwidth and low latency for wireless transmission, which hinders their popularity. A tile-based viewport adaptive streaming scheme, which involves accurate viewport prediction and optimal bitrate adaptation to maintain user Quality of Experience (QoE) under a bandwidth-constrained network, has been proposed by researchers. However, viewport prediction is error-prone in long-term prediction, and bitrate adaptation schemes may waste bandwidth resources due to failing to consider various aspects of QoE. In this paper, we propose a synergistic temporal-spatial user-aware viewport prediction scheme for optimal adaptive 360-Degree video streaming (SPA360) to tackle these challenges. We use a user-aware viewport prediction mode, which offers a white box solution for Field of View (FoV) prediction. Specially, we employ temporal-spatial fusion for enhanced viewport prediction to minimize prediction errors. Our proposed utility prediction model jointly considers viewport probability distribution and metrics that directly affecting QoE to enable more precise bitrate adaptation. To optimize bitrate adaptation for tiled-based 360-degree video streaming, the problem is formulated as a packet knapsack problem and solved efficiently with a dynamic programming-based algorithm to maximize utility. The SPA360 scheme demonstrates improved performance in terms of both viewport prediction accuracy and bandwidth utilization, and our approach enhances the overall quality and efficiency of adaptive 360-degree video streaming.

360 度视频通常需要极高的带宽和低延迟进行无线传输，这阻碍了它们的普及。研究人员提出了一种基于磁贴的视口自适应流媒体方案，包括精确的视口预测和最佳比特率适应，以在带宽受限的网络条件下保持用户体验质量（QoE）。然而，视口预测在长期预测中容易出错，而比特率适应方案由于没有考虑 QoE 的各个方面，可能会浪费带宽资源。在本文中，我们针对最佳自适应 360 度视频流（SPA360）提出了一种协同时空用户感知视口预测方案，以应对这些挑战。我们采用用户感知视口预测模式，为视场（FoV）预测提供了白盒解决方案。特别是，我们采用了时空融合技术来增强视口预测，以最大限度地减少预测误差。我们提出的实用性预测模型联合考虑了视口概率分布和直接影响 QoE 的指标，以实现更精确的比特率适应。为了优化基于平铺的 360 度视频流的比特率适应性，我们将该问题表述为数据包背包问题，并采用基于动态编程的算法有效地解决了该问题，以实现效用最大化。SPA360 方案在视口预测准确性和带宽利用率方面都表现出了更好的性能，我们的方法提高了自适应 360 度视频流的整体质量和效率。

{"title":"Synergistic Temporal-Spatial User-Aware Viewport Prediction for Optimal Adaptive 360-Degree Video Streaming","authors":"Yumei Wang;Junjie Li;Zhijun Li;Simou Shang;Yu Liu","doi":"10.1109/TBC.2024.3374119","DOIUrl":"10.1109/TBC.2024.3374119","url":null,"abstract":"360-degree videos usually require extremely high bandwidth and low latency for wireless transmission, which hinders their popularity. A tile-based viewport adaptive streaming scheme, which involves accurate viewport prediction and optimal bitrate adaptation to maintain user Quality of Experience (QoE) under a bandwidth-constrained network, has been proposed by researchers. However, viewport prediction is error-prone in long-term prediction, and bitrate adaptation schemes may waste bandwidth resources due to failing to consider various aspects of QoE. In this paper, we propose a synergistic temporal-spatial user-aware viewport prediction scheme for optimal adaptive 360-Degree video streaming (SPA360) to tackle these challenges. We use a user-aware viewport prediction mode, which offers a white box solution for Field of View (FoV) prediction. Specially, we employ temporal-spatial fusion for enhanced viewport prediction to minimize prediction errors. Our proposed utility prediction model jointly considers viewport probability distribution and metrics that directly affecting QoE to enable more precise bitrate adaptation. To optimize bitrate adaptation for tiled-based 360-degree video streaming, the problem is formulated as a packet knapsack problem and solved efficiently with a dynamic programming-based algorithm to maximize utility. The SPA360 scheme demonstrates improved performance in terms of both viewport prediction accuracy and bandwidth utilization, and our approach enhances the overall quality and efficiency of adaptive 360-degree video streaming.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"453-467"},"PeriodicalIF":4.5,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

No-Reference Multi-Level Video Quality Assessment Metric for 3D-Synthesized Videos 三维合成视频的无参考多级视频质量评估指标

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-21 DOI: 10.1109/TBC.2024.3396696

Guangcheng Wang;Baojin Huang;Ke Gu;Yuchen Liu;Hongyan Liu;Quan Shi;Guangtao Zhai;Wenjun Zhang

The visual quality of 3D-synthesized videos is closely related to the development and broadcasting of immersive media such as free-viewpoint videos and six degrees of freedom navigation. Therefore, studying the 3D-Synthesized video quality assessment is helpful to promote the popularity of immersive media applications. Inspired by the texture compression, depth compression and virtual view synthesis polluting the visual quality of 3D-synthesized videos at pixel-, structure- and content-levels, this paper proposes a Multi-Level 3D-Synthesized Video Quality Assessment algorithm, namely ML-SVQA, which consists of a quality feature perception module and a quality feature regression module. Specifically, the quality feature perception module firstly extracts motion vector fields of the 3D-synthesized video at pixel-, structure- and content-levels by combining the perception mechanism of human visual system. Then, the quality feature perception module measures the temporal flicker distortion intensity in the no-reference environment by calculating the self-similarity of adjacent motion vector fields. Finally, the quality feature regression module uses the machine learning algorithm to learn the mapping of the developed quality features to the quality score. Experiments constructed on the public IRCCyN/IVC and SIAT synthesized video datasets show that our ML-SVQA is more effective than state-of-the-art image/video quality assessment methods in evaluating the quality of 3D-Synthesized videos.

三维合成视频的视觉质量与自由视点视频和六自由度导航等沉浸式媒体的开发和播放密切相关。因此，研究三维合成视频质量评估有助于促进身临其境媒体应用的普及。受纹理压缩、深度压缩和虚拟视图合成在像素级、结构级和内容级污染三维合成视频视觉质量的启发，本文提出了一种多级三维合成视频质量评估算法，即 ML-SVQA，该算法由质量特征感知模块和质量特征回归模块组成。具体来说，质量特征感知模块首先结合人类视觉系统的感知机制，从像素、结构和内容三个层面提取三维合成视频的运动矢量场。然后，质量特征感知模块通过计算相邻运动矢量场的自相似性来测量无参照环境下的时间闪烁失真强度。最后，质量特征回归模块使用机器学习算法来学习所开发的质量特征与质量得分之间的映射关系。在公开的 IRCCyN/IVC 和 SIAT 合成视频数据集上构建的实验表明，在评估 3D 合成视频质量方面，我们的 ML-SVQA 比最先进的图像/视频质量评估方法更有效。

{"title":"No-Reference Multi-Level Video Quality Assessment Metric for 3D-Synthesized Videos","authors":"Guangcheng Wang;Baojin Huang;Ke Gu;Yuchen Liu;Hongyan Liu;Quan Shi;Guangtao Zhai;Wenjun Zhang","doi":"10.1109/TBC.2024.3396696","DOIUrl":"10.1109/TBC.2024.3396696","url":null,"abstract":"The visual quality of 3D-synthesized videos is closely related to the development and broadcasting of immersive media such as free-viewpoint videos and six degrees of freedom navigation. Therefore, studying the 3D-Synthesized video quality assessment is helpful to promote the popularity of immersive media applications. Inspired by the texture compression, depth compression and virtual view synthesis polluting the visual quality of 3D-synthesized videos at pixel-, structure- and content-levels, this paper proposes a Multi-Level 3D-Synthesized Video Quality Assessment algorithm, namely ML-SVQA, which consists of a quality feature perception module and a quality feature regression module. Specifically, the quality feature perception module firstly extracts motion vector fields of the 3D-synthesized video at pixel-, structure- and content-levels by combining the perception mechanism of human visual system. Then, the quality feature perception module measures the temporal flicker distortion intensity in the no-reference environment by calculating the self-similarity of adjacent motion vector fields. Finally, the quality feature regression module uses the machine learning algorithm to learn the mapping of the developed quality features to the quality score. Experiments constructed on the public IRCCyN/IVC and SIAT synthesized video datasets show that our ML-SVQA is more effective than state-of-the-art image/video quality assessment methods in evaluating the quality of 3D-Synthesized videos.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"584-596"},"PeriodicalIF":4.5,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Deep Compressed Video Super-Resolution With Guidance of Coding Priors 利用编码先验的指导实现深度压缩视频超分辨率

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-21 DOI: 10.1109/TBC.2024.3394291

Qiang Zhu;Feiyu Chen;Yu Liu;Shuyuan Zhu;Bing Zeng

Compressed video super-resolution (VSR) is employed to generate high-resolution (HR) videos from low-resolution (LR) compressed videos. Recently, some compressed VSR methods have adopted coding priors, such as partition maps, compressed residual frames, predictive pictures and motion vectors, to generate HR videos. However, these methods disregard the design of modules according to the specific characteristics of coding information, which limits the application efficiency of coding priors. In this paper, we propose a deep compressed VSR network that effectively introduces coding priors to construct high-quality HR videos. Specifically, we design a partition-guided feature extraction module to extract features from the LR video with the guidance of the partition average image. Moreover, we separate the video features into sparse features and dense features according to the energy distribution of the compressed residual frame to achieve feature enhancement. Additionally, we construct a temporal attention-based feature fusion module to use motion vectors and predictive pictures to eliminate motion errors between frames and temporally fuse features. Based on these modules, the coding priors are effectively employed in our model for constructing high-quality HR videos. The experimental results demonstrate that our method achieves better performance and lower complexity than the state-of-the-arts.

压缩视频超分辨率（VSR）用于从低分辨率（LR）压缩视频生成高分辨率（HR）视频。最近，一些压缩 VSR 方法采用了编码先验（如分区图、压缩残留帧、预测图片和运动向量）来生成高分辨率视频。然而，这些方法忽略了根据编码信息的具体特点设计模块，限制了编码前置的应用效率。在本文中，我们提出了一种深度压缩 VSR 网络，它能有效地引入编码先验来构建高质量的 HR 视频。具体来说，我们设计了一个分区引导的特征提取模块，在分区平均图像的引导下从 LR 视频中提取特征。此外，我们还根据压缩残留帧的能量分布，将视频特征分为稀疏特征和密集特征，以实现特征增强。此外，我们还构建了基于时间注意力的特征融合模块，利用运动向量和预测图片消除帧间的运动误差，并对特征进行时间融合。在这些模块的基础上，我们的模型有效地利用了编码先验来构建高质量的 HR 视频。实验结果表明，我们的方法比现有技术取得了更好的性能和更低的复杂度。

{"title":"Deep Compressed Video Super-Resolution With Guidance of Coding Priors","authors":"Qiang Zhu;Feiyu Chen;Yu Liu;Shuyuan Zhu;Bing Zeng","doi":"10.1109/TBC.2024.3394291","DOIUrl":"10.1109/TBC.2024.3394291","url":null,"abstract":"Compressed video super-resolution (VSR) is employed to generate high-resolution (HR) videos from low-resolution (LR) compressed videos. Recently, some compressed VSR methods have adopted coding priors, such as partition maps, compressed residual frames, predictive pictures and motion vectors, to generate HR videos. However, these methods disregard the design of modules according to the specific characteristics of coding information, which limits the application efficiency of coding priors. In this paper, we propose a deep compressed VSR network that effectively introduces coding priors to construct high-quality HR videos. Specifically, we design a partition-guided feature extraction module to extract features from the LR video with the guidance of the partition average image. Moreover, we separate the video features into sparse features and dense features according to the energy distribution of the compressed residual frame to achieve feature enhancement. Additionally, we construct a temporal attention-based feature fusion module to use motion vectors and predictive pictures to eliminate motion errors between frames and temporally fuse features. Based on these modules, the coding priors are effectively employed in our model for constructing high-quality HR videos. The experimental results demonstrate that our method achieves better performance and lower complexity than the state-of-the-arts.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"505-515"},"PeriodicalIF":4.5,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

ACDMSR: Accelerated Conditional Diffusion Models for Single Image Super-Resolution ACDMSR：用于单图像超级分辨率的加速条件扩散模型

IF 4.5 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-21 DOI: 10.1109/TBC.2024.3374122

Axi Niu;Trung X. Pham;Kang Zhang;Jinqiu Sun;Yu Zhu;Qingsen Yan;In So Kweon;Yanning Zhang

Diffusion models have gained significant popularity for image-to-image translation tasks. Previous efforts applying diffusion models to image super-resolution have demonstrated that iteratively refining pure Gaussian noise using a U-Net architecture trained on denoising at various noise levels can yield satisfactory high-resolution images from low-resolution inputs. However, this iterative refinement process comes with the drawback of low inference speed, which strongly limits its applications. To speed up inference and further enhance the performance, our research revisits diffusion models in image super-resolution and proposes a straightforward yet significant diffusion model-based super-resolution method called ACDMSR (accelerated conditional diffusion model for image super-resolution). Specifically, we adopt existing image super-resolution methods and finetune them to provide conditional images from given low-resolution images, which can help to achieve better high-resolution results than just taking low-resolution images as conditional images. Then we adapt the diffusion model to perform super-resolution through a deterministic iterative denoising process, which helps to strongly decline the inference time. We demonstrate that our method surpasses previous attempts in qualitative and quantitative results through extensive experiments conducted on benchmark datasets such as Set5, Set14, Urban100, BSD100, and Manga109. Moreover, our approach generates more visually realistic counterparts for low-resolution images, emphasizing its effectiveness in practical scenarios.

扩散模型在图像到图像的转换任务中大受欢迎。之前将扩散模型应用于图像超分辨率的研究表明，使用在不同噪声水平下进行去噪训练的 U-Net 架构对纯高斯噪声进行迭代细化，可以从低分辨率输入生成令人满意的高分辨率图像。然而，这种迭代细化过程存在推理速度低的缺点，这极大地限制了它的应用。为了加快推理速度并进一步提高性能，我们的研究重新审视了图像超分辨率中的扩散模型，并提出了一种简单但意义重大的基于扩散模型的超分辨率方法，即 ACDMSR（用于图像超分辨率的加速条件扩散模型）。具体来说，我们采用了现有的图像超分辨率方法，并对其进行了微调，以从给定的低分辨率图像中提供条件图像，这有助于获得比仅将低分辨率图像作为条件图像更好的高分辨率结果。然后，我们调整扩散模型，通过确定性迭代去噪过程来执行超分辨率，这有助于大大减少推理时间。我们在 Set5、Set14、Urban100、BSD100 和 Manga109 等基准数据集上进行了大量实验，证明我们的方法在定性和定量结果上都超越了之前的尝试。此外，我们的方法还能为低分辨率图像生成视觉上更逼真的对应图像，从而强调了它在实际应用场景中的有效性。

{"title":"ACDMSR: Accelerated Conditional Diffusion Models for Single Image Super-Resolution","authors":"Axi Niu;Trung X. Pham;Kang Zhang;Jinqiu Sun;Yu Zhu;Qingsen Yan;In So Kweon;Yanning Zhang","doi":"10.1109/TBC.2024.3374122","DOIUrl":"10.1109/TBC.2024.3374122","url":null,"abstract":"Diffusion models have gained significant popularity for image-to-image translation tasks. Previous efforts applying diffusion models to image super-resolution have demonstrated that iteratively refining pure Gaussian noise using a U-Net architecture trained on denoising at various noise levels can yield satisfactory high-resolution images from low-resolution inputs. However, this iterative refinement process comes with the drawback of low inference speed, which strongly limits its applications. To speed up inference and further enhance the performance, our research revisits diffusion models in image super-resolution and proposes a straightforward yet significant diffusion model-based super-resolution method called ACDMSR (accelerated conditional diffusion model for image super-resolution). Specifically, we adopt existing image super-resolution methods and finetune them to provide conditional images from given low-resolution images, which can help to achieve better high-resolution results than just taking low-resolution images as conditional images. Then we adapt the diffusion model to perform super-resolution through a deterministic iterative denoising process, which helps to strongly decline the inference time. We demonstrate that our method surpasses previous attempts in qualitative and quantitative results through extensive experiments conducted on benchmark datasets such as Set5, Set14, Urban100, BSD100, and Manga109. Moreover, our approach generates more visually realistic counterparts for low-resolution images, emphasizing its effectiveness in practical scenarios.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"492-504"},"PeriodicalIF":4.5,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140205610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Learning Accurate Network Dynamics for Enhanced Adaptive Video Streaming 学习准确的网络动态以增强自适应视频流

IF 3.2 1区计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-03-17 DOI: 10.1109/TBC.2024.3396698

Jiaoyang Yin;Hao Chen;Yiling Xu;Zhan Ma;Xiaozhong Xu

The adaptive bitrate (ABR) algorithm plays a crucial role in ensuring satisfactory quality of experience (QoE) in video streaming applications. Most existing approaches, either rule-based or learning-driven, tend to conduct ABR decisions based on limited network statistics, e.g., mean/standard deviation of recent throughput measurements. However, all of them lack a good understanding of network dynamics given the varying network conditions from time to time, leading to compromised performance, especially when the network condition changes significantly. In this paper, we propose a framework named ANT that aims to enhance adaptive video streaming by accurately learning network dynamics. ANT represents and detects specific network conditions by characterizing the entire spectrum of network fluctuations. It further trains multiple dedicated ABR models for each condition using deep reinforcement learning. During inference, a dynamic switching mechanism is devised to activate the appropriate ABR model based on real-time network condition sensing, enabling ANT to automatically adjust its control policies to different network conditions. Extensive experimental results demonstrate that our proposed ANT achieves a significant improvement in user QoE of 20.8%-41.2% in the video-on-demand scenario and 67.4%-134.5% in the live-streaming scenario compared to state-of-the-art methods, across a wide range of network conditions.

在视频流应用中，自适应比特率（ABR）算法对确保令人满意的体验质量（QoE）起着至关重要的作用。大多数现有方法，无论是基于规则的还是学习驱动的，都倾向于根据有限的网络统计数据（如最近吞吐量测量的平均值/标准偏差）做出 ABR 决定。然而，所有这些方法都缺乏对网络动态的充分了解，因为网络条件时常变化，导致性能受损，尤其是当网络条件发生重大变化时。在本文中，我们提出了一个名为 ANT 的框架，旨在通过准确学习网络动态来增强自适应视频流。ANT 通过描述整个网络波动频谱来表示和检测特定的网络条件。它还利用深度强化学习为每种情况训练多个专用 ABR 模型。在推理过程中，我们设计了一种动态切换机制，根据实时网络状况感知激活适当的 ABR 模型，使 ANT 能够根据不同的网络状况自动调整其控制策略。广泛的实验结果表明，与最先进的方法相比，在各种网络条件下，我们提出的 ANT 在视频点播场景中显著改善了用户 QoE，改善幅度为 20.8%-41.2%，在直播场景中改善幅度为 67.4%-134.5%。

{"title":"Learning Accurate Network Dynamics for Enhanced Adaptive Video Streaming","authors":"Jiaoyang Yin;Hao Chen;Yiling Xu;Zhan Ma;Xiaozhong Xu","doi":"10.1109/TBC.2024.3396698","DOIUrl":"10.1109/TBC.2024.3396698","url":null,"abstract":"The adaptive bitrate (ABR) algorithm plays a crucial role in ensuring satisfactory quality of experience (QoE) in video streaming applications. Most existing approaches, either rule-based or learning-driven, tend to conduct ABR decisions based on limited network statistics, e.g., mean/standard deviation of recent throughput measurements. However, all of them lack a good understanding of network dynamics given the varying network conditions from time to time, leading to compromised performance, especially when the network condition changes significantly. In this paper, we propose a framework named ANT that aims to enhance adaptive video streaming by accurately learning network dynamics. ANT represents and detects specific network conditions by characterizing the entire spectrum of network fluctuations. It further trains multiple dedicated ABR models for each condition using deep reinforcement learning. During inference, a dynamic switching mechanism is devised to activate the appropriate ABR model based on real-time network condition sensing, enabling ANT to automatically adjust its control policies to different network conditions. Extensive experimental results demonstrate that our proposed ANT achieves a significant improvement in user QoE of 20.8%-41.2% in the video-on-demand scenario and 67.4%-134.5% in the live-streaming scenario compared to state-of-the-art methods, across a wide range of network conditions.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 3","pages":"808-821"},"PeriodicalIF":3.2,"publicationDate":"2024-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141060322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0