IEEE Transactions on Broadcasting: Latest Articles

Next-Gen Satellite System: Integrative Non-Orthogonal Broadcast and Unicast Services Based on Innovative Frequency Reuse Patterns

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-16 DOI: 10.1109/TBC.2024.3434731

Shuai Han;Zhiqiang Li;Weixiao Meng;Cheng Li

The multibeam satellite system is crucial for providing seamless and diverse information services, such as broadcast and unicast messages. However, catering to the burgeoning number of users with limited spectrum resources presents formidable challenges. Therefore, we devise the non-orthogonal broadcast and unicast (NOBU) joint transmission framework using rate-splitting multiple access (RSMA), which leverages non-orthogonal transmission and precoding strategies. Furthermore, combining traditional precoding with frequency reuse techniques, we propose two novel strategies: distributed frequency reuse (DFR) and centralized frequency reuse (CFR). Taking satellite beam gain characteristics and the interference tolerance threshold into consideration, we further propose two extensions of the DFR and CFR strategies with inner and outer divisions. For NOBU joint transmission based on these four frequency reuse patterns, we maximize the weighted sum rate (WSR). Subsequently, we introduce an improved alternating optimization algorithm that converts the intricate non-convex problems into tractable convex counterparts. Simulation results demonstrate that the proposed schemes significantly improve WSR performance and are promising for various practical applications.
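To make the weighted-sum-rate objective concrete, here is a minimal sketch for plain linear precoding over a toy two-user channel. It is not the NOBU/RSMA formulation of the paper: the channel values, precoders, weights, and the helper names `inner`, `sinr`, and `weighted_sum_rate` are illustrative assumptions, and rate splitting, frequency reuse, and the alternating optimization are all omitted.

```python
import math

def inner(h, p):
    """Hermitian inner product h^H p for complex vectors given as lists."""
    return sum(hc.conjugate() * pc for hc, pc in zip(h, p))

def sinr(k, channels, precoders, noise_power):
    """SINR of user k when every user's stream is linearly precoded."""
    signal = abs(inner(channels[k], precoders[k])) ** 2
    interference = sum(
        abs(inner(channels[k], precoders[j])) ** 2
        for j in range(len(precoders)) if j != k
    )
    return signal / (interference + noise_power)

def weighted_sum_rate(weights, channels, precoders, noise_power):
    """WSR = sum_k w_k * log2(1 + SINR_k), in bit/s/Hz."""
    return sum(
        w * math.log2(1.0 + sinr(k, channels, precoders, noise_power))
        for k, w in enumerate(weights)
    )

# Hypothetical 2-antenna, 2-user example (all values are made up).
channels = [[1.0 + 0.0j, 0.2 + 0.1j], [0.1 - 0.2j, 0.9 + 0.0j]]
precoders = [[1.0 + 0.0j, 0.0j], [0.0j, 1.0 + 0.0j]]
wsr = weighted_sum_rate([0.5, 0.5], channels, precoders, noise_power=0.1)
print(f"WSR = {wsr:.3f} bit/s/Hz")
```

An optimizer would search over `precoders` under a power constraint to maximize this quantity; the problem is non-convex because each precoder appears in both the signal and interference terms.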

Volume 70, Issue 4, pp. 1153-1166

Citations: 0

IEEE Transactions on Broadcasting Publication Information

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-16 DOI: 10.1109/TBC.2024.3453629

Citations: 0

IEEE Transactions on Broadcasting Information for Authors

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-16 DOI: 10.1109/TBC.2024.3453611

Citations: 0

IEEE Transactions on Broadcasting Publication Information

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-16 DOI: 10.1109/TBC.2024.3453609

Citations: 0

Guest Editorial Special Issue on Intelligent Multicast/Broadcast Services Over 5G/6G

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-16 DOI: 10.1109/TBC.2024.3450134

Bo Rong;Eneko Iradier;Jordi Joan Gimenez;Sungjun Ahn;Cristiano Akamine;Jong-Soo Seo;Peng Yu;Yin Xu;Pablo Angueira;Yiyan Wu;Weiliang Xie

Citations: 0

SGIQA: Semantic-Guided No-Reference Image Quality Assessment

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-12 DOI: 10.1109/TBC.2024.3450320

Linpeng Pan;Xiaozhe Zhang;Fengying Xie;Haopeng Zhang;Yushan Zheng

Existing no-reference image quality assessment (NR-IQA) methods have not incorporated image semantics explicitly in the assessment process, thus overlooking the significant correlation between image content and its quality. To address this gap, we leverage image semantics as guiding information for quality assessment, integrating it explicitly into the NR-IQA process through a Semantic-Guided NR-IQA model (SGIQA), which is based on the Swin Transformer. Specifically, we introduce a Semantic Attention Module and a Perceptual Rule Learning Module. The Semantic Attention Module refines the features extracted by the deep network according to the image content, allowing the network to dynamically extract quality-perceptual features according to the semantic context of the image. The Perceptual Rule Learning Module generates parameters for the image quality regression module tailored to the image content, facilitating a dynamic assessment of image quality based on its semantic information. Employing the Swin Transformer and integrating these two modules, we have developed the final semantic-guided NR-IQA model. Extensive experiments on five widely used IQA datasets demonstrate that our method not only exhibits excellent generalization capabilities but also achieves state-of-the-art performance.

Volume 70, Issue 4, pp. 1292-1301

Citations: 0

Near-Optimal Piecewise Linear Companding Transform for PAPR Reduction of OFDM Systems

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-06 DOI: 10.1109/TBC.2024.3443466

Meixia Hu;Jingqing Wang;Wenchi Cheng;Hailin Zhang

Since the inherently high envelope fluctuations of OFDM signals present a significant challenge in reducing energy consumption, it is crucial to minimize the range of these fluctuations. As companding is a well-known technique for reducing the envelope fluctuations of OFDM signals, in this paper we explore the optimal companding transform by building a multi-objective optimization model with the goal of minimizing peak-to-average power ratio (PAPR), inner-band distortions, and out-of-band (OOB) radiations. The solution reveals that the optimal form of companding transform is a piecewise one and closely resembles a linear transform. Furthermore, we find that the average power of the optimal companded signal is never greater than that of the original signal, which contradicts the constraint of constant average signal power usually used in the design of companding transforms. Based on the characteristics of the optimal companding transform, we propose a near-optimal piecewise linear companding transform to obviate the extremely high computational complexity of the optimal companding transform. The proposed near-optimal piecewise linear companding transform is a promising solution for mitigating companding distortions while reducing PAPR. However, it should be noted that there may still be some unavoidable distortions after decompanding, which result in a degradation of the BER performance. Thus, we diminish the remaining distortions after decompanding by relaxing the constraint of the proposed near-optimal piecewise linear companding transform on the average power of the companded signals. Simulation results demonstrate that the relaxation can improve the BER performance while ensuring the PAPR performance with only a small sacrifice in OOB radiations.
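The idea of piecewise linear companding can be illustrated with a small, self-contained sketch. This is a generic two-segment compander, not the near-optimal transform derived in the paper: the knee position, the slope, the QPSK test signal, and the function names `ofdm_symbol`, `papr_db`, `compand`, and `decompand` are all assumptions chosen for illustration.

```python
import cmath
import math
import random

def ofdm_symbol(num_subcarriers, rng):
    """Time-domain OFDM symbol from random QPSK data via a naive inverse DFT."""
    n = num_subcarriers
    qpsk = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
            for _ in range(n)]
    return [
        sum(qpsk[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n))
        / math.sqrt(n)
        for t in range(n)
    ]

def papr_db(signal):
    """Peak-to-average power ratio in dB."""
    powers = [abs(s) ** 2 for s in signal]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))

def compand(signal, knee, slope):
    """Keep amplitudes up to `knee`; compress the excess by `slope` (0 < slope < 1)."""
    out = []
    for s in signal:
        r = abs(s)
        if r > knee:
            s = s * ((knee + slope * (r - knee)) / r)  # phase is preserved
        out.append(s)
    return out

def decompand(signal, knee, slope):
    """Inverse of `compand`: expand amplitudes above `knee`."""
    out = []
    for s in signal:
        r = abs(s)
        if r > knee:
            s = s * ((knee + (r - knee) / slope) / r)
        out.append(s)
    return out

rng = random.Random(0)
x = ofdm_symbol(64, rng)
rms = math.sqrt(sum(abs(s) ** 2 for s in x) / len(x))
y = compand(x, knee=rms, slope=0.3)        # knee and slope are arbitrary choices
x_hat = decompand(y, knee=rms, slope=0.3)  # noiseless round trip
print(f"PAPR before: {papr_db(x):.2f} dB, after: {papr_db(y):.2f} dB")
```

Because this compander never increases any amplitude, the companded signal's average power cannot exceed the original's, which is in line with the abstract's observation about the optimal transform. In a real link, channel noise added before `decompand` is amplified on the expanded segment, which is one source of the residual distortion the abstract mentions.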

Volume 71, Issue 1, pp. 350-359

Citations: 0

Scale-Adaptive Asymmetric Sparse Variational AutoEncoder for Point Cloud Compression

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-05 DOI: 10.1109/TBC.2024.3437161

Jian Chen;Yingtao Zhu;Wei Huang;Chengdong Lan;Tiesong Zhao

Learning-based point cloud compression has achieved great success in Rate-Distortion (RD) efficiency. Existing methods usually utilize a Variational AutoEncoder (VAE) network, which might lead to poor detail reconstruction and high computational complexity. To address these issues, we propose a Scale-adaptive Asymmetric Sparse Variational AutoEncoder (SAS-VAE) in this work. First, we develop an Asymmetric Multiscale Sparse Convolution (AMSC), which exploits multi-resolution branches to aggregate multiscale features at the encoder, and excludes symmetric feature fusion branches to control the model complexity at the decoder. Second, we design a Scale Adaptive Feature Refinement Structure (SAFRS) to adaptively adjust the number of Feature Refinement Modules (FRMs), thereby improving RD performance with an acceptable computational overhead. Third, we implement our framework with AMSC and SAFRS, and train it with an RD loss based on a Fine-grained Weighted Binary Cross-Entropy (FWBCE) function. Experimental results on the 8iVFB, Owlii, and MVUV datasets show that our method outperforms several popular methods, with a 90.0% time reduction and a 51.8% BD-BR saving compared with V-PCC. The code will be available soon at https://github.com/fancj2017/SAS-VAE.
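As a rough illustration of a weighted binary cross-entropy used inside a rate-distortion loss, the sketch below applies a single class weight to occupied voxels. The abstract does not specify how the fine-grained weights in FWBCE are chosen, so the per-class weighting, the `lam` trade-off parameter, and the function names `weighted_bce` and `rd_loss` are assumptions, not the authors' definition.

```python
import math

def weighted_bce(probs, labels, pos_weight=1.0, neg_weight=1.0, eps=1e-12):
    """Mean weighted binary cross-entropy over voxel occupancy predictions."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(pos_weight * y * math.log(p)
                   + neg_weight * (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def rd_loss(bits_per_point, probs, labels, lam=1.0, pos_weight=2.0):
    """Rate-distortion objective: rate + lambda * distortion (assumed form)."""
    return bits_per_point + lam * weighted_bce(probs, labels, pos_weight=pos_weight)

# Hypothetical occupancy predictions for four voxels (1 = occupied).
labels = [1, 0, 0, 1]
probs = [0.9, 0.2, 0.1, 0.6]
print(f"distortion = {weighted_bce(probs, labels, pos_weight=2.0):.4f}")
print(f"RD loss    = {rd_loss(0.35, probs, labels):.4f}")
```

Raising `pos_weight` penalizes missed occupied voxels more heavily, which matters because occupied voxels are typically a small minority in sparse point cloud grids.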

Volume 70, Issue 3, pp. 884-894

Citations: 0

Retouched Face Image Quality Assessment Based on Differential Perception and Textual Prompt

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-09-02 DOI: 10.1109/TBC.2024.3447454

Tianwei Zhou;Songbai Tan;Gang Li;Shishun Tian;Chang Tang;Zhihua Wang;Guanghui Yue

Face retouching involves using digital techniques to alter an individual’s appearance and is commonly used in social media. However, excessively retouched face (RF) images can lead to issues such as unrealistic beauty standards and psychological stress. Therefore, it is crucial to develop a reliable quality assessment method for RF images. In this paper, we propose a novel network named DIRF-IQA for RF image quality assessment (IQA). DIRF-IQA mainly includes a parameter-shared image encoder, a text encoder, and three key components, namely the Differential Feature Attention Module (DFAM), the Text-image Interaction Module (TIM), and the Multi-scale Feature Fusion Module (MFFM). Specifically, the DFAM captures both local and global differences between original and retouched images by processing multi-scale features and utilizing cross-attention and self-attention blocks for differential perception. In the TIM, textual prompts summarizing retouching operations are encoded by a text encoder and integrated with differential features extracted by the DFAM to enhance the understanding of distortions in RF images. The MFFM then fuses these text-enhanced features across different layers and combines them with the global differential feature to predict the quality of the retouched images. We conduct extensive experiments on two RF IQA databases, and the results demonstrate the superiority of DIRF-IQA compared to 12 state-of-the-art full-reference IQA methods in evaluating RF images.
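The cross-attention mentioned for the DFAM can be sketched in its most basic form as scaled dot-product attention, with queries taken from one image's features and keys and values from the other's. This is only the textbook operation without learned projections or multiple heads; how DIRF-IQA actually wires it is not described in the abstract, and the token values and names `softmax` and `cross_attention` are illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs

# Hypothetical 2-D feature tokens for a retouched image and its original.
retouched_tokens = [[1.0, 0.0], [0.5, 0.5]]
original_tokens = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
attended = cross_attention(retouched_tokens, original_tokens, original_tokens)
print(attended)
```

Each output row is a convex combination of the original image's tokens, weighted by similarity to the corresponding retouched token; self-attention is the special case where queries, keys, and values all come from the same set.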

Volume 71, Issue 1, pp. 240-251

Citations: 0

Long-Term and Short-Term Information Propagation and Fusion for Learned Video Compression

IF 3.2 Zone 1 (Computer Science) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting

Pub Date : 2024-08-30 DOI: 10.1109/TBC.2024.3434702

Shen Wang;Donghui Feng;Guo Lu;Zhengxue Cheng;Li Song;Wenjun Zhang

In recent years, numerous learned video compression (LVC) methods have emerged, demonstrating rapid development and satisfactory performance. However, most previous methods use only the single previous frame as a reference. Although some works use multiple previous frames, their exploitation of temporal information is not comprehensive. Our proposed method not only utilizes the short-term information from multiple neighboring frames but also introduces long-term feature information as the reference, which effectively enhances the quality of the context and improves the compression efficiency. In our scheme, we propose a long-term information exploitation mechanism to capture long-term temporal relevance. The update and propagation of long-term information establish an implicit connection between the latent representation of the current frame and distant reference frames, aiding in the generation of long-term context. Meanwhile, the short-term neighboring frames are also utilized to extract local information and generate short-term context. The fusion of long-term context and short-term context results in a more comprehensive and high-quality context to achieve sufficient temporal information mining. Besides, the multi-frame information also helps to improve the efficiency of motion compression: it is used to generate the predicted motion and remove spatio-temporal redundancies in motion information by second-order motion prediction and fusion. Experimental results demonstrate that our proposed efficient learned video compression scheme can achieve a 4.79% BD-rate saving when compared with H.266 (VTM).

Volume 70, Issue 4, pp. 1254-1265

Citations: 0
