Layered Division Multiplexing (LDM) is a Power-based Non-Orthogonal Multiplexing (P-NOM) technique that has been implemented in the Advanced Television Systems Committee (ATSC) 3.0 terrestrial TV physical layer to efficiently multiplex services with different robustness and data rate requirements. As communication systems evolve, the services to be delivered are becoming more diverse. To date, the LDM system adopted in terrestrial TV uses a uniform injection level for the lower-layer (Layer 2) signal. This paper investigates non-uniform injection level LDM (NULDM). The proposed technique exploits the Unequal Error Protection (UEP) property of Low-Density Parity-Check (LDPC) codes and the flexible power allocation of NULDM to improve system performance and spectrum efficiency. NULDM enables the seamless integration of broadcast/multicast and unicast services in one RF channel, where the unicast signal can be assigned different resources (power, frequency, and time) based on the user equipment (UE) distance and service requirements. Meanwhile, more power can be allocated to improve the upper-layer (Layer 1) broadcast and datacast services. To make better use of the UEP property of LDPC codes in NULDM, the extended Gaussian mixture approximation (EGMA) method is used to design bit interleaving patterns. Additionally, inspired by the channel order of polar codes, this paper proposes an LDPC sub-block interleaving order (SBIO) scheme that performs similarly to the EGMA interleaving model while better adapting to the diverse needs of the proposed mixed-service delivery scenarios for the convergence of broadband wireless communications and broadcasting systems.
LDPC-Coded LDM Systems Employing Non-Uniform Injection Level for Combining Broadcast and Multicast/Unicast Services. Hao Ju; Yin Xu; Ruiqi Liu; Dazhi He; Sungjun Ahn; Namho Hur; Sung-Ik Park; Wenjun Zhang; Yiyan Wu. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 1032-1043. Published 2024-03-16. DOI: 10.1109/TBC.2024.3394296
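The power-domain superposition that the LDM abstract above relies on can be illustrated with a short sketch. It assumes the usual LDM combining rule, in which the lower layer is attenuated by an injection level given in dB and the sum is rescaled to unit average power. The per-symbol injection profile, the QPSK layers, and all function names are hypothetical choices for illustration; they are not taken from the paper.

```python
import math
import random


def ldm_combine(upper, lower, injection_db):
    """Superimpose lower-layer symbols beneath upper-layer symbols.

    injection_db[i] is how far (in dB) lower[i] sits below upper[i].
    A constant list models uniform LDM; a varying list models a
    non-uniform injection profile (an illustrative assumption).
    """
    combined = []
    for u, l, g in zip(upper, lower, injection_db):
        alpha = 10 ** (-g / 20)               # amplitude scaling of the lower layer
        beta = 1 / math.sqrt(1 + alpha ** 2)  # unit average power for uncorrelated unit-power layers
        combined.append(beta * (u + alpha * l))
    return combined


def upper_power_share(injection_db):
    """Fraction of the total transmit power carried by the upper layer."""
    alpha_sq = 10 ** (-injection_db / 10)
    return 1 / (1 + alpha_sq)


def qpsk(n, rng):
    """Draw n unit-power QPSK symbols."""
    s = 1 / math.sqrt(2)
    return [complex(rng.choice((-s, s)), rng.choice((-s, s))) for _ in range(n)]


rng = random.Random(0)
n = 8
upper = qpsk(n, rng)
lower = qpsk(n, rng)

uniform = ldm_combine(upper, lower, [5.0] * n)
# Hypothetical non-uniform profile: 3 dB for the first half, 9 dB for the second.
non_uniform = ldm_combine(upper, lower, [3.0] * (n // 2) + [9.0] * (n // 2))

for level in (3.0, 5.0, 9.0):
    print(f"injection {level} dB -> upper-layer power share {upper_power_share(level):.3f}")
```

A larger injection level leaves more of the power budget to the upper layer, which is the trade-off the abstract describes when it mentions allocating more power to the broadcast layer.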
Pub Date: 2024-03-13; DOI: 10.1109/TBC.2024.3363455
Mohammed Amine Togou;Anderson Augusto Simiscuka;Rohit Verma;Noel E. O’Connor;Iñigo Tamayo;Stefano Masneri;Mikel Zorrilla;Gabriel-Miro Muntean
Due to the COVID-19 pandemic, most arts and cultural activities have moved online. This has contributed to a surge in the development of artistic tools that enable professional artists to produce engaging and immersive shows remotely. This article introduces the TRACTION Co-Creation Stage (TCS), a novel Web-based solution, designed and developed in the context of the EU Horizon 2020 TRACTION project, which allows for remote creation and delivery of artistic shows. TCS supports multiple artists performing simultaneously, either live or pre-recorded, on multiple stages at different geographical locations. It employs a client-server approach. The client has two major components: Control and Display. The former is used by the production teams to create shows by specifying the layouts, scenes, and media sources to be included. The latter is used by viewers to watch the various shows. To ensure a good quality of experience (QoE) for viewers, TCS employs adaptive streaming using a novel Prioritised Adaptation solution, based on the DASH standard, for pre-recorded content delivery (PADA), which is introduced in this paper. User tests and experiments are carried out to evaluate the performance of TCS’ Control and Display applications and that of the PADA algorithm when creating and distributing opera shows.
An Innovative Adaptive Web-Based Solution for Improved Remote Co-Creation and Delivery of Artistic Performances. IEEE Transactions on Broadcasting, vol. 70, no. 2, pp. 719-730. Published 2024-03-13. Open access: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10472407
Pub Date: 2024-03-10; DOI: 10.1109/TBC.2024.3391056
John A. Snoap;Dimitrie C. Popescu;Chad M. Spooner
The paper presents a novel deep-learning (DL) based classifier for digitally modulated signals that uses a capsule network (CAP) with custom-designed feature extraction layers. The classifier takes the in-phase/quadrature (I/Q) components of the digitally modulated signal as input. The feature extraction layers are inspired by cyclostationary signal processing (CSP) techniques, which extract the cyclic cumulant (CC) features employed by conventional CSP-based approaches to blind modulation classification and signal identification. Specifically, the feature extraction layers implement a proxy of the mathematical functions used in the calculation of the CC features. They include a squaring layer, a raise-to-the-power-of-three layer, and a fast-Fourier-transform (FFT) layer, along with additional normalization and warping layers that ensure the relative signal powers are retained and prevent the trainable neural network (NN) layers from diverging during training. The classification performance and generalization abilities of the proposed CAP are tested using two distinct datasets that contain similar classes of digitally modulated signals but have been generated independently. The numerical results reveal that the proposed CAP with novel feature extraction layers achieves high classification accuracy and outperforms alternative DL-based approaches for signal classification in terms of both classification accuracy and generalization abilities.
Deep-Learning-Based Classifier With Custom Feature-Extraction Layers for Digitally Modulated Signals. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 763-773. Published 2024-03-10.
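To see why a squaring layer followed by an FFT layer, as described in the abstract above, can expose cyclostationary structure, consider the following plain-Python illustration. It is not the authors' network: the BPSK test signal, the naive DFT, and the function names are assumptions made for this sketch. Squaring a BPSK signal removes the ±1 data modulation and leaves a pure tone at twice the carrier offset, which the transform then reveals as a single spectral line.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(N^2) discrete Fourier transform, adequate for a short demo."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def normalize(x):
    """Scale the signal to unit RMS so features do not depend on input gain."""
    rms = math.sqrt(sum(abs(v) ** 2 for v in x) / len(x))
    return [v / rms for v in x]


def power_layer(x, p):
    """Element-wise raise-to-the-power-p (p=2 squaring, p=3 cubing)."""
    return [v ** p for v in x]


rng = random.Random(1)
N, f0 = 64, 5  # number of samples and carrier offset in DFT bins (illustrative values)
bits = [rng.choice((-1, 1)) for _ in range(N)]
iq = [b * cmath.exp(2j * math.pi * f0 * t / N) for t, b in enumerate(bits)]

squared_spectrum = [abs(v) for v in dft(power_layer(normalize(iq), 2))]
peak_bin = max(range(N), key=lambda k: squared_spectrum[k])
print("strongest line of the squared signal is at bin", peak_bin)
```

The spectrum of the unsquared signal is spread by the random data, whereas the squared signal concentrates its energy at bin `2 * f0`; features of this kind are what a downstream trainable network could then classify.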
Pub Date: 2024-03-10; DOI: 10.1109/TBC.2024.3394297
Fei Zhou;Zikang Zheng;Guoping Qiu
Displaying standard dynamic range (SDR) videos on high dynamic range (HDR) devices requires inverse tone mapping (ITM). However, such mapping can introduce banding artifacts. This paper presents a banding removal method for inverse-tone-mapped HDR videos based on deep convolutional neural networks (DCNNs) and adaptive filtering. Three banding-relevant feature maps are first extracted and then fed to two DCNNs, a ShapeNet and a PositionNet. The PositionNet learns a soft mask indicating the locations where banding is likely to have occurred and filtering is required, while the ShapeNet predicts the filter shapes appropriate for different locations. An advantage of the method is that the adaptive filters can be jointly optimized with a learning-based ITM algorithm for creating high-quality HDR videos. Experimental results show that our method outperforms state-of-the-art algorithms both qualitatively and quantitatively.
Removing Banding Artifacts in HDR Videos Generated From Inverse Tone Mapping. IEEE Transactions on Broadcasting, vol. 70, no. 2, pp. 753-762. Published 2024-03-10.
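The interplay of a soft position mask and location-dependent filter shapes described in the abstract above can be sketched on a single row of pixels. This is only an illustration under stated assumptions: the mask and the per-pixel filter radii are hand-written here rather than predicted by networks, a box filter stands in for whatever filter shapes the ShapeNet produces, and all names are hypothetical.

```python
def box_filter_at(signal, i, radius):
    """Average of the samples within `radius` of position i (clipped at the edges)."""
    lo = max(0, i - radius)
    hi = min(len(signal), i + radius + 1)
    window = signal[lo:hi]
    return sum(window) / len(window)


def deband(signal, mask, radii):
    """Blend each pixel with its filtered value according to a soft mask.

    mask[i] in [0, 1] plays the role of the position mask (1 = banding likely);
    radii[i] plays the role of a per-location filter shape.
    """
    out = []
    for i, (value, m, r) in enumerate(zip(signal, mask, radii)):
        smoothed = box_filter_at(signal, i, r)
        out.append(m * smoothed + (1.0 - m) * value)
    return out


row = [0.10, 0.10, 0.10, 0.20, 0.20, 0.20]  # a visible step between two flat bands
mask = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]       # filter only around the step
radii = [0, 0, 1, 1, 0, 0]                  # wider support where the mask is active
result = deband(row, mask, radii)
print([round(v, 4) for v in result])
```

Pixels with a zero mask pass through unchanged, so detail away from the band boundary is preserved while the step itself is softened.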
Orthogonal frequency division multiplexing with index modulation (OFDM-IM), an emerging multi-carrier modulation technique, offers significant advantages over traditional OFDM. The OFDM-IM scheme exhibits superior bit error rate (BER) performance at low and medium data rates, while also enhancing resilience to inter-carrier interference in dynamically changing channels. However, the challenge of a high peak-to-average power ratio (PAPR) persists in OFDM-IM. In this study, we propose a novel approach to mitigate PAPR by introducing small dither signals on the idle subcarriers, leveraging the inherent characteristics of OFDM-IM. Subsequently, we address the nonconvex and non-smooth optimization problem of minimizing the maximum amplitude of the dither signals while maintaining a constant PAPR constraint. To tackle this challenging optimization task, we adopt the linearized alternating direction method of multipliers (LADMM), referred to as the LADMM-direct algorithm, which provides a simple closed-form solution for each subproblem encountered during the optimization process. To improve the convergence rate of the LADMM-direct algorithm, a LADMM-relax algorithm is also proposed to address the PAPR problem. Simulation results demonstrate that the proposed LADMM-direct and LADMM-relax algorithms significantly reduce computational complexity and achieve superior performance in terms of both PAPR and BER compared to state-of-the-art algorithms.
Optimal OFDM-IM Signals With Constant PAPR. Jiabo Hu; Yajun Wang; Zhuxian Lian; Yinjie Su; Zhibin Xie. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 945-954. Published 2024-03-10. DOI: 10.1109/TBC.2024.3394292
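The quantity targeted by the OFDM-IM abstract above, PAPR, is the ratio of peak to mean power of the time-domain block. The sketch below computes it and shows where a dither on idle subcarriers would enter. The dither values are arbitrary placeholders, not the output of the LADMM optimization, and the helper names are hypothetical.

```python
import cmath
import math


def idft(freq):
    """Naive inverse DFT: frequency-domain subcarriers to time-domain samples."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(freq):
    """Peak-to-average power ratio of the time-domain block, in dB."""
    powers = [abs(v) ** 2 for v in idft(freq)]
    mean_power = sum(powers) / len(powers)
    return 10 * math.log10(max(powers) / mean_power)


def add_dither(freq, idle, dither):
    """Return a copy of the block with dither values placed on idle subcarriers."""
    out = list(freq)
    for k, d in zip(idle, dither):
        out[k] += d
    return out


# Toy OFDM-IM block: subcarriers 0-3 active, 4-7 idle (illustrative layout).
block = [1, 1, 1, 1, 0, 0, 0, 0]
idle = [4, 5, 6, 7]
# Placeholder dither; a real scheme would optimize these values.
dithered = add_dither(block, idle, [0.1j, -0.1, 0.1, -0.1j])

print(f"PAPR without dither: {papr_db(block):.2f} dB")
print(f"PAPR with placeholder dither: {papr_db(dithered):.2f} dB")
```

Because the active subcarriers are left untouched, the data-bearing part of the block is unchanged; only the idle positions carry the extra degrees of freedom that an optimizer can use to shape the time-domain peaks.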
Screen content video applications are increasingly widespread in daily life. The latest Screen Content Coding (SCC) standard, known as Versatile Video Coding (VVC) SCC, employs a quad-tree plus multi-type tree (QTMT) coding structure for Coding Unit (CU) partitioning and screen content Coding Mode (CM) selection. While VVC SCC achieves high coding efficiency, its coding complexity poses a significant obstacle to the wider adoption of screen content video. Hence, it is crucial to increase the coding speed of VVC SCC. In this paper, we propose a fast mode and splitting decision method for intra prediction in VVC SCC. Specifically, we first exploit deep learning techniques to predict the content types of all CUs. Subsequently, we examine the CM distributions of different content types to predict candidate CMs for CUs. We then introduce early-skip and early-termination CM decisions for CUs of different content types to further eliminate unlikely CMs. Finally, we develop Block-based Differential Pulse-Code Modulation (BDPCM) early termination and CU splitting early termination to improve coding speed. Experimental results demonstrate that the proposed algorithm improves coding speed by 41.14% on average, with a BDBR increase of 1.17%.
Fast Mode and CU Splitting Decision for Intra Prediction in VVC SCC. Dayong Wang; Junyi Yu; Xin Lu; Frederic Dufaux; Bo Hang; Hui Guo; Ce Zhu. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 872-883. Published 2024-03-10. DOI: 10.1109/TBC.2024.3394288
With the increasing maturity of text-to-image and image-to-image generative models, AI-generated images (AGIs) have shown great application potential in advertisement, entertainment, education, social media, etc. Although remarkable advances have been achieved in generative models, little effort has been devoted to designing relevant quality assessment models. In this paper, we propose a novel blind image quality assessment (IQA) network, named AMFF-Net, for AGIs. AMFF-Net evaluates AGI quality along three dimensions, i.e., “visual quality”, “authenticity”, and “consistency”. Specifically, inspired by the characteristics of the human visual system and motivated by the observation that “visual quality” and “authenticity” are characterized by both local and global aspects, AMFF-Net scales the image up and down and takes the scaled images and the original-sized image as inputs to obtain multi-scale features. After that, an Adaptive Feature Fusion (AFF) block adaptively fuses the multi-scale features with learnable weights. In addition, considering the correlation between the image and the prompt, AMFF-Net compares the semantic features from the text encoder and the image encoder to evaluate the text-to-image alignment. We carry out extensive experiments on three AGI quality assessment databases, and the experimental results show that AMFF-Net outperforms nine state-of-the-art blind IQA methods. The results of ablation experiments further demonstrate the effectiveness of the proposed multi-scale input strategy and AFF block.
Adaptive Mixed-Scale Feature Fusion Network for Blind AI-Generated Image Quality Assessment. Tianwei Zhou; Songbai Tan; Wei Zhou; Yu Luo; Yuan-Gen Wang; Guanghui Yue. IEEE Transactions on Broadcasting, vol. 70, no. 3, pp. 833-843. Published 2024-03-06. DOI: 10.1109/TBC.2024.3391060
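Two ideas from the AMFF-Net abstract above, fusing multi-scale features with learnable weights and comparing text and image features for alignment, can be sketched in a few lines. The abstract does not specify the internals of the AFF block, so the softmax-weighted sum used here is an assumption, as is the use of cosine similarity for alignment; the feature vectors and function names are hypothetical toy values.

```python
import math


def softmax(logits):
    """Convert raw weights into positive weights that sum to one."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features, logits):
    """Weighted sum of same-length feature vectors, one vector per image scale.

    `logits` stand in for learnable fusion weights (assumed form of adaptive fusion).
    """
    weights = softmax(logits)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity, used here as a stand-in text-to-image alignment score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy features for the down-scaled, original, and up-scaled image.
scales = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused_equal = fuse(scales, [0.0, 0.0, 0.0])    # equal weights reduce to a plain mean
fused_biased = fuse(scales, [2.0, 0.0, 0.0])   # emphasis on the first scale

text_feature = [1.0, 2.0]
image_feature = [2.0, 4.0]
print(fused_equal, fused_biased, cosine(text_feature, image_feature))
```

In a trained model the logits would be parameters updated by gradient descent, which lets the network decide how much each scale should contribute to the quality score.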
Pub Date: 2024-03-05; DOI: 10.1109/TBC.2024.3364859
IEEE Transactions on Broadcasting Information for Authors. IEEE Transactions on Broadcasting, vol. 70, no. 1, pp. C3-C4. Published 2024-03-05. Open access: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10460268
Pub Date: 2024-02-29; DOI: 10.1109/TBC.2024.3349789
Pansoo Kim;Hyuncheol Park
Variable Coding and Modulation (VCM), one of the Orthogonal Multiple Access (OMA) schemes and also known as channel-adaptive Time Division Multiplexing (TDM) in satellite broadcasting and communication systems, has been widely used in the DVB-S2X (Digital Video Broadcasting - Satellite 2nd generation eXtension) standard to mitigate heavy rain fading in the Ku/Ka bands and improve link availability. For next-generation satellite broadcasting and communication, we exploit Layered Division Multiplexing (LDM) technology, which is also referred to as Non-Orthogonal Multiple Access (NOMA). We conduct a performance assessment of VCM in terms of the total achievable data rate under an Additive White Gaussian Noise (AWGN) channel and nonlinear satellite High Power Amplifier (HPA) impairments. In addition, we consider realistic Radio Frequency (RF) inaccuracies characterized by timing and carrier offsets. Having identified the performance impacts, we propose a robust carrier phase synchronization scheme to mitigate phase noise impairment. Numerical results demonstrate that the proposed scheme improves Packet Error Rate (PER) performance over the conventional one in a phase noise environment.
Performance Assessment for LDM Transmission Based on DVB Satellite Standard. IEEE Transactions on Broadcasting, vol. 70, no. 2, pp. 382-390. Published 2024-02-29.