
Latest Publications in IEEE Transactions on Broadcasting

A Radio Propagation Modeling for a Cost-Effective DAB+ Service Coverage in Tunnels
IF 3.2 | CAS Tier 1, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-28 | DOI: 10.1109/TBC.2024.3484268
Bruno Sacco;Assunta De Vita
Providing satisfactory coverage of the Digital Audio Broadcasting (DAB+) service inside tunnels, in the VHF band, is a very challenging task. The classic - but expensive - solution adopted so far is the use of radiating cables (“leaky feeders”) installed on the tunnel’s ceiling over its entire length. An alternative and cheaper solution, investigated in the present paper, is the so-called “direct RF radiation” approach, consisting of antennas placed inside the tunnel or just outside its entrance. A simulative analysis has been carried out to evaluate the impact of the design parameters and to serve as a tool for estimating the achievable service coverage. In addition, assuming that a gallery behaves like a lossy waveguide, a mode analysis has been performed on the tunnel cross section, providing a fairly good estimate of the wave propagation attenuation. Interesting outcomes have been obtained from this simulative study: for instance, the behavior of the electric field as a function of distance suggests that, in the absence of geometric perturbations, the slope in the far zone is in good agreement with the attenuation per unit distance of the main propagation mode. Curved sections cause further attenuation that depends on the radius of curvature, and the geometric dimensions of the tunnel cross section have a very strong impact on attenuation. Furthermore, the results show that, for arched tunnel sections, the fundamental propagation mode is horizontally polarized. As a result, the typical “whip” vehicular receiving antenna is not adequate: a horizontally polarized antenna would provide a much better service inside tunnels. The investigation of the above findings has led to the set-up of a tool applicable to every tunnel configuration for the verification and optimization of direct RF radiation installations for DAB/DAB+ services.
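The claimed agreement between the far-zone field slope and the modal attenuation per unit distance can be illustrated with a small fit. This is purely a sketch: the attenuation value, distance range, and fading ripple below are assumed for the demo, not taken from the paper.

```python
# Illustrative sketch (not the authors' simulator): in a lossy-waveguide view
# of a tunnel, the far-zone field in dB falls off linearly with distance, so a
# least-squares slope fit should recover the dominant mode's attenuation rate.
import numpy as np

rng = np.random.default_rng(0)
alpha_db_per_m = 0.02                    # assumed modal attenuation (dB/m)
d = np.linspace(200.0, 2000.0, 400)      # far-zone distances along the tunnel (m)
e_db = -alpha_db_per_m * d + rng.normal(0.0, 0.5, d.size)  # field + ripple

# Fit E[dB] = slope * d + intercept; the negated slope estimates alpha.
slope, intercept = np.polyfit(d, e_db, 1)
alpha_est = -slope
print(f"true alpha = {alpha_db_per_m:.4f} dB/m, fitted = {alpha_est:.4f} dB/m")
```

With a long far-zone window, the fitted slope matches the assumed modal attenuation closely, which is the behavior the abstract reports.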
IEEE Transactions on Broadcasting, vol. 71, no. 1, pp. 52-62.
Citations: 0
Outage Probability Analysis of Cooperative NOMA With Successive Refinement
IF 3.2 | CAS Tier 1, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-25 | DOI: 10.1109/TBC.2024.3477000
Meng Cheng;Yifan Zhou;Shuang Wei;Shen Qian
This paper proposes a broadcasting system with cooperative non-orthogonal multiple access (CO-NOMA) and successive refinement (SR) coding. Specifically, signals containing the basic description of the source and the refinement are superimposed at the transmitter and broadcast to user equipment (UE) with different quality-of-service (QoS) requirements. Although the far UEs may only be capable of decoding the basic description, which is allocated higher transmit power, some of them may still demand a high QoS like the near UE. To address this issue, this work utilizes the near UE to establish a relay transmission, so that the information recovered at the far UE can be refined. Considering three different relaying schemes, the outage probabilities of the proposed system are derived in closed form, assuming all channels suffer from block Rayleigh fading. Based on the optimal power allocations, the scheme yielding the lowest outage probabilities is identified, and its advantages over down-link NOMA with SR (DN-SR) and conventional CO-NOMA are also demonstrated.
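To make the closed-form outage idea concrete, here is a hedged Monte Carlo sketch for just the far user's basic-description stream in a plain two-user NOMA downlink over block Rayleigh fading. The power split, SNR, and target rate are assumed, and this simple superposition stand-in does not reproduce the paper's three relaying schemes.

```python
# Monte Carlo vs. textbook closed-form outage of the basic stream at the far
# UE, with the refinement stream treated as interference (assumed parameters).
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
snr = 10.0 ** (15.0 / 10.0)          # transmit SNR, 15 dB (assumed)
a_basic, a_refine = 0.8, 0.2         # power allocation (assumed)
r_target = 1.0                       # basic-stream target rate, bits/s/Hz

g = rng.exponential(1.0, n)          # |h|^2 under unit-mean Rayleigh fading
sinr = a_basic * snr * g / (a_refine * snr * g + 1.0)
p_out_sim = np.mean(np.log2(1.0 + sinr) < r_target)

# Closed form: outage iff g < t / (snr * (a_basic - a_refine * t)),
# with t = 2^R - 1, valid when a_basic > a_refine * t.
t = 2.0 ** r_target - 1.0
p_out_theory = 1.0 - np.exp(-t / (snr * (a_basic - a_refine * t)))
print(p_out_sim, p_out_theory)
```

The simulated and analytical values agree to within Monte Carlo error, which is the kind of validation a closed-form outage derivation enables.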
IEEE Transactions on Broadcasting, vol. 71, no. 1, pp. 42-51.
Citations: 0
Rate Control for Geometry-Based LiDAR Point Cloud Compression via Multi-Factor Modeling
IF 3.2 | CAS Tier 1, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-25 | DOI: 10.1109/TBC.2024.3475808
Lizhi Hou;Linyao Gao;Qian Zhang;Yiling Xu;Jenq-Neng Hwang;Dong Wang
The Geometry-based Point Cloud Compression (G-PCC) standard developed by the Moving Picture Experts Group has shown a promising prospect for compressing the extremely sparse point clouds captured by Light Detection And Ranging (LiDAR) equipment. However, as an essential functionality for low-delay and limited-bandwidth transmission, rate control for Geometry-based LiDAR Point Cloud Compression (G-LPCC) has not been fully studied. In this paper, we propose a rate control scheme for G-LPCC. We first adopt the best configuration of G-PCC for the LiDAR point cloud as the basis in terms of Rate-Distortion (R-D) performance, namely the predictive tree (PT) for geometry compression and the Region Adaptive Haar Transform (RAHT) for attribute compression. The common challenge in designing rate control algorithms for PT and RAHT is that their rates are determined by multiple factors. To address that, we propose an $l$-domain rate control algorithm for PT that unifies the various influential geometry factors in the expression of the minimum arc length $\mathrm{d}l$ to determine the final rate. A power-style geometry rate curve characterized by $\mathrm{d}l$ has been modeled. By analyzing the distortion behavior of different quantization parameters, an adaptive bitrate control method is proposed to improve the R-D performance. In addition, we borrow the $\rho$ factor from previous 2D video rate control and successfully apply it to RAHT rate control. A simple and clean linear attribute rate curve characterized by $\rho$ has been modeled, and a corresponding parameter estimation method based on the cumulative distribution function is proposed for bitrate control. The experimental results demonstrate that the proposed rate control algorithm can achieve accurate rate control with additional Bjontegaard-Delta-rate (BD-rate) gains.
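A "power-style" rate curve of the form $R(\mathrm{d}l) = a \cdot \mathrm{d}l^{\,b}$ can be fitted from a handful of (arc length, bitrate) samples via a linear fit in log-log space. The sample points and parameter values below are synthetic, for illustration only; they are not the paper's measured data.

```python
# Hedged sketch of fitting a power-style geometry rate model R = a * dl^b.
import numpy as np

a_true, b_true = 5.0e4, -1.3                   # assumed model parameters
dl = np.array([0.5, 1.0, 2.0, 4.0, 8.0])       # minimum arc length settings
rate = a_true * dl ** b_true                   # observed geometry bitrates

# log R = log a + b * log dl  ->  ordinary least squares on the logs
b_fit, log_a_fit = np.polyfit(np.log(dl), np.log(rate), 1)
a_fit = np.exp(log_a_fit)
print(a_fit, b_fit)
```

Once (a, b) are known, the rate controller can invert the curve to pick the $\mathrm{d}l$ that meets a target geometry bitrate.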
IEEE Transactions on Broadcasting, vol. 71, no. 1, pp. 167-179.
Citations: 0
Optimized Canceling Signals for PTS Schemes to Improve the PAPR of OFDM Systems Without Side Information
IF 3.2 | CAS Tier 1, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-16 | DOI: 10.1109/TBC.2024.3475748
The Khai Nguyen;Ebrahim Bedeer;Ha H. Nguyen;J. Eric Salt;Colin Howlett
This paper introduces a novel blind partial transmit sequence (PTS) scheme to lower the peak-to-average power ratio (PAPR) of orthogonal frequency division multiplexing (OFDM) systems. Unlike existing PTS schemes, in which the first sub-block (SB) is preserved as a phase reference for the other SBs, we propose to add an optimized canceling signal (CS) to the first SB to further reduce the PAPR. The CS is designed such that it can be reconstructed by the receiver and subtracted from the received signal before demodulation, without requiring side information (SI). Since errors in reproducing the CS at the receiver can degrade the error performance, we design a novel CS protection mechanism specifically to protect the reconstruction of the CS. The proposed method is shown to significantly reduce the PAPR and symbol error rate (SER) without sacrificing data rate to SI, as many other existing PTS schemes do.
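For readers unfamiliar with the baseline the paper builds on, here is a toy conventional PTS search (without the paper's canceling signal): split the subcarriers into sub-blocks, try candidate phase rotations on all but the first sub-block, and keep the combination with the lowest peak power. The subcarrier count, sub-block mapping, and phase set are assumed for the demo.

```python
# Toy conventional PTS sketch: phase-rotate sub-block IFFTs to minimize PAPR.
import numpy as np

rng = np.random.default_rng(2)
n_sc, n_sb = 64, 4                       # subcarriers and PTS sub-blocks
qpsk = (rng.choice([-1, 1], n_sc) + 1j * rng.choice([-1, 1], n_sc)) / np.sqrt(2)

def papr_db(x):
    p = np.abs(x) ** 2
    return 10.0 * np.log10(p.max() / p.mean())

# Partial transmit sequences: one IFFT per (disjoint) sub-block of carriers.
parts = []
for i in range(n_sb):
    masked = np.zeros(n_sc, complex)
    masked[i::n_sb] = qpsk[i::n_sb]      # interleaved sub-block mapping
    parts.append(np.fft.ifft(masked))

papr_plain = papr_db(sum(parts))         # PAPR without any rotation

# Exhaustive search over {1,-1,j,-j} rotations for sub-blocks 1..n_sb-1
# (sub-block 0 is kept as the phase reference, as in conventional PTS).
phases = np.array([1, -1, 1j, -1j])
best = np.inf
for idx in np.ndindex(*([4] * (n_sb - 1))):
    cand = parts[0] + sum(phases[k] * parts[m + 1] for m, k in enumerate(idx))
    best = min(best, papr_db(cand))

print(papr_plain, best)
```

Because the identity rotation is among the candidates, the searched PAPR can never exceed the unrotated one; conventional PTS must then signal the chosen rotation as side information, which is exactly the overhead the proposed blind scheme avoids.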
IEEE Transactions on Broadcasting, vol. 71, no. 1, pp. 360-370.
Citations: 0
An Efficient and Flexible Complexity Control Method for Versatile Video Coding
IF 3.2 | CAS Tier 1, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-15 | DOI: 10.1109/TBC.2024.3475811
Yan Zhao;Chen Zhu;Jun Xu;Guo Lu;Li Song;Siwei Ma
Recently, numerous complexity control approaches have been proposed to achieve a target encoding complexity. However, only a few of them were developed for VVC encoders. This paper fills this gap by proposing an efficient and flexible complexity control approach for VVC. The support for both Acceleration Ratio Control (ARC) and Encoding Time Control (ETC) makes our method highly versatile for various applications. First, we introduce a sequence-level complexity estimation model to merge the ARC and ETC tasks. Then, four key modules are involved in complexity control: complexity allocation, complexity estimation, encoding configuration decision, and feedback. Specifically, we hierarchically allocate the complexity budget to three coding levels: GOP, frame, and Basic Unit (BU). Each BU’s allocation weight is decided by its SSIM distortion, whereby the perceptual quality can be ensured. The multi-complexity configurations are established by altering the partition depth and the number of reference frames. By tuning each BU’s configuration according to its target acceleration ratio and adaptively updating the control strategies based on the feedback, our scheme can precisely realize any achievable acceleration target within one-pass encoding. Moreover, each BU’s un-accelerated reference encoding time, which is used to calculate its target acceleration ratio, is estimated by SVR models. Experiments prove that for both the ARC and ETC tasks, our scheme can precisely achieve a wide range of complexity targets (30% to 100%) with negligible RD loss in PSNR and SSIM, outperforming other state-of-the-art methods.
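The hierarchical budget idea (GOP to frame to Basic Unit, with BUs weighted by their SSIM distortion) reduces to proportional allocation at each level. The sketch below uses assumed budget and weight numbers purely to show the arithmetic; it is not the paper's allocation model.

```python
# Minimal sketch of hierarchical complexity-budget allocation by weights.
def allocate(budget, weights):
    """Split a budget proportionally to a list of non-negative weights."""
    total = sum(weights)
    return [budget * w / total for w in weights]

gop_budget = 1000.0                                           # ms per GOP (assumed)
frame_budgets = allocate(gop_budget, [1.0, 1.0, 2.0, 2.0])    # e.g. by frame type
bu_budgets = allocate(frame_budgets[2], [0.05, 0.20, 0.10])   # by SSIM distortion
print(frame_budgets, bu_budgets)
```

Each level conserves its parent's budget exactly, so a feedback module only needs to correct the residual between allocated and measured encoding time.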
IEEE Transactions on Broadcasting, vol. 71, no. 1, pp. 96-110.
Citations: 0
Rate-Splitting Multiple Access for Overloaded Multi-Group Multicast: A First Experimental Study
IF 3.2 | CAS Tier 1, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-15 | DOI: 10.1109/TBC.2024.3475743
Xinze Lyu;Sundar Aditya;Bruno Clerckx
Multi-group multicast (MGM) is an increasingly important form of multi-user wireless communications with several potential applications, such as video streaming, federated learning, and safety-critical vehicular communications. Rate-Splitting Multiple Access (RSMA) is a powerful interference management technique that can, in principle, achieve higher data rates and greater fairness for all types of multi-user wireless communications, including MGM. This paper presents the first experimental evaluation of RSMA-based MGM, as well as the first three-way comparison of RSMA-based, Space Division Multiple Access (SDMA)-based, and Non-Orthogonal Multiple Access (NOMA)-based MGM. Using a measurement setup involving a two-antenna transmitter and two groups of two single-antenna users each, we consider the problem of realizing throughput (max-min) fairness across groups for each of the three multiple access schemes, over nine experimental cases in a line-of-sight environment capturing varying levels of pathloss difference and channel correlation across the groups. Over these cases, we observe that RSMA-based MGM achieves fairness at a higher throughput for each group than SDMA- and NOMA-based MGM. These findings validate the gains of RSMA-based MGM promised by the theoretical literature.
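The max-min fairness objective itself can be illustrated with a toy grid search: pick the transmit power split between two multicast groups that maximizes the minimum group rate. The channel gains, SNR, and superposition-with-interference model below are assumptions for illustration; they do not reproduce the paper's measured RSMA precoding.

```python
# Toy max-min power-split search for two multicast groups (assumed gains).
import numpy as np

g1, g2 = 1.0, 0.25          # effective channel gains of the two groups (assumed)
snr = 100.0                 # total transmit SNR, linear (assumed)

best_rate, best_alpha = -1.0, None
for alpha in np.linspace(0.01, 0.99, 99):        # fraction of power to group 1
    # Each group treats the other group's superimposed signal as interference.
    r1 = np.log2(1.0 + alpha * snr * g1 / ((1.0 - alpha) * snr * g1 + 1.0))
    r2 = np.log2(1.0 + (1.0 - alpha) * snr * g2 / (alpha * snr * g2 + 1.0))
    if min(r1, r2) > best_rate:
        best_rate, best_alpha = min(r1, r2), alpha
print(best_alpha, best_rate)
```

As expected for max-min fairness, the weaker group (lower gain) ends up with the larger share of the power, so the optimal split lands below one half for group 1.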
IEEE Transactions on Broadcasting, vol. 71, no. 1, pp. 30-41.
Citations: 0
A Fast CU Partition Algorithm for AVS3 Based on Adaptive Tree Search and Pruning Optimization
IF 3.2 | CAS Tier 1, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-08 | DOI: 10.1109/TBC.2024.3465838
Jihang Yin;Honggang Qi;Liang Zhong;Zhiyuan Zhao;Qiang Wang;Jingran Wu;Xianguo Zhang
In the third generation of the Audio Video Coding Standard (AVS3), the size of Coding Tree Units (CTUs) has been expanded to four times that of the previous generation, and more Coding Unit (CU) partition modes have been introduced, enhancing adaptability and efficiency in video encoding. CU partitioning in AVS3 not only improves encoding performance but also significantly increases the computational complexity, posing substantial challenges to real-time encoding. We propose a fast algorithm for CU partitioning that features adaptive tree search and pruning optimization. First, it adjusts the tree search order based on neighbor CU and lookahead information. Specifically, the analysis order of sub-blocks and parent blocks is adaptively adjusted: the potentially optimal partition is prioritized, non-optimal partitions are deferred, and an optimized order of first-full-then-sub or first-sub-then-full is selected. Second, the pruning optimization algorithm utilizes the analyzed information to skip non-optimal partitions and reduce computational complexity. Thanks to the adjusted tree search order and the prioritization of potentially optimal partitions, more analyzed information is available when evaluating non-optimal partitions, thereby improving the recall and precision of non-optimal partition detection, saving more time, and introducing negligible loss in coding performance. The proposed algorithm has been implemented in the open-source encoder uavs3e. Experimental results indicate that under the three encoding configurations of AI, LD B, and RA, the algorithm achieves significant time savings of 51.41%, 40.57%, and 40.57%, with BDBR increases of 0.64%, 1.61%, and 1.04%, respectively. These results outperform state-of-the-art fast CU partition algorithms.
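The "adaptive tree search order" idea can be sketched with a cheap neighbor-depth prior: if already-coded neighbor CUs are deeper than the current block, analyzing sub-partitions first is more likely to reach the optimum early. The thresholding rule below is an assumption for illustration, not uavs3e's actual decision logic.

```python
# Hedged sketch: choose full-first vs. sub-first analysis order from the
# depths of neighboring CUs (rule assumed for illustration).
def search_order(cur_depth, neighbor_depths):
    """Return which analysis order to try first for the current CU."""
    if not neighbor_depths:
        return "first-full-then-sub"          # no context: default order
    avg = sum(neighbor_depths) / len(neighbor_depths)
    return "first-sub-then-full" if avg > cur_depth else "first-full-then-sub"

print(search_order(1, [2, 3, 2]))   # deep neighbors: try splits first
print(search_order(2, [1, 2]))      # shallow neighbors: keep full block first
```

Deferring the less likely order means its rate-distortion check runs with more analyzed information available, which is what makes the subsequent pruning both safer and more aggressive.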
IEEE Transactions on Broadcasting, vol. 71, no. 1, pp. 125-141.
Citations: 0
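The adaptive search order and pruning described in the abstract above can be sketched roughly as follows. This is a hedged illustration only: the partition-mode names, the cheap cost predictor, and the `margin` threshold are illustrative assumptions, not taken from the uavs3e implementation.

```python
# Sketch (illustrative, not the uavs3e code): reorder candidate CU partitions
# by a cheap cost estimate (standing in for neighbor/lookahead information),
# then prune candidates whose estimate already exceeds the best exact
# RD cost found so far by a margin.

def fast_cu_partition(candidates, estimate_cost, full_rd_cost, margin=1.10):
    """candidates: list of partition-mode names.
    estimate_cost: cheap predictor of RD cost (assumed neighbor/lookahead based).
    full_rd_cost: expensive exact RD evaluation.
    Returns (best_mode, best_cost, evaluated_modes)."""
    # Adaptive search order: the most promising partition is analyzed first.
    ordered = sorted(candidates, key=estimate_cost)
    best_mode, best_cost = None, float("inf")
    evaluated = []
    for mode in ordered:
        # Pruning: skip modes whose cheap estimate is already far worse
        # than the best exact cost seen so far.
        if estimate_cost(mode) > margin * best_cost:
            continue
        cost = full_rd_cost(mode)
        evaluated.append(mode)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost, evaluated

# Toy demo: exact RD costs, with a cheap estimate that ranks them correctly,
# so only the winning mode needs a full evaluation.
exact = {"NS": 100.0, "QT": 80.0, "BT_H": 95.0, "BT_V": 140.0}
est = {m: c * 1.05 for m, c in exact.items()}
best, cost, tried = fast_cu_partition(
    list(exact), est.__getitem__, exact.__getitem__)
```

With a well-ranked estimate, only one full RD evaluation is performed in this toy run, which is the source of the time savings the abstract reports; a poor estimate degrades gracefully to evaluating more modes.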
From Pixels to Rich-Nodes: A Cognition-Inspired Framework for Blind Image Quality Assessment 从像素到富节点:盲图像质量评估的认知启发框架
IF 3.2 1区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-10-07 DOI: 10.1109/TBC.2024.3464418
Tian He;Lin Shi;Wenjia Xu;Yu Wang;Weijie Qiu;Houbang Guo;Zhuqing Jiang
Blind image quality assessment (BIQA) is a subjective perception-driven task, which necessitates assessment results consistent with human cognition. The human cognitive system inherently involves both separation and integration mechanisms. Recent works have witnessed the success of deep learning methods in separating distortion features. Nonetheless, traditional deep-learning-based BIQA methods predominantly depend on fixed topology to mimic the information integration in the brain, which gives rise to scale sensitivity and low flexibility. To handle this challenge, we delve into the dynamic interactions among neurons and propose a cognition-inspired BIQA model. Drawing insights from the rich club structure in network neuroscience, a graph-inspired feature integrator is devised to reconstruct the network topology. Specifically, we argue that the activity of individual neurons (pixels) tends to exhibit a random fluctuation with ambiguous meaning, while clear and coherent cognition arises from neurons with high connectivity (rich-nodes). Therefore, a self-attention mechanism is employed to establish strong semantic associations between pixels and rich-nodes. Subsequently, we design intra- and inter-layer graph structures to promote the feature interaction across spatial and scale dimensions. Such dynamic circuits endow the BIQA method with efficient, flexible, and robust information processing capabilities, so as to achieve assessment results closer to human subjective judgment. Moreover, since the limited samples in existing IQA datasets are prone to model overfitting, we devise two prior hypotheses: frequency prior and ranking prior. The former stepwise augments high-frequency components that reflect the distortion degree during the multilevel feature extraction, while the latter seeks to motivate the model’s in-depth comprehension of differences in sample quality. Extensive experiments on five public datasets reveal that the proposed algorithm achieves competitive results.
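The pixel-to-rich-node attention the abstract describes can be sketched roughly as cross-attention where pixels are queries and rich-nodes supply keys and values. Everything below is an illustrative assumption (feature shapes, random features, plain dot-product attention), not the paper's trained network.

```python
import numpy as np

# Sketch (illustrative, not the paper's model): every pixel feature attends
# to a small set of high-connectivity "rich-node" features, producing a
# pixel representation expressed as a mixture of rich-node features.

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pixel_to_richnode_attention(pixels, rich_nodes):
    """pixels: (N, d) pixel features; rich_nodes: (K, d) rich-node features.
    Returns (N, d): each pixel re-expressed as an attention-weighted
    convex combination of rich-node features."""
    d = pixels.shape[1]
    scores = pixels @ rich_nodes.T / np.sqrt(d)   # (N, K) scaled similarities
    weights = softmax(scores, axis=1)             # each row sums to 1
    return weights @ rich_nodes                   # (N, d)

rng = np.random.default_rng(0)
px = rng.standard_normal((16, 8))     # 16 "pixels", 8-dim features
nodes = rng.standard_normal((4, 8))   # 4 "rich-nodes"
out = pixel_to_richnode_attention(px, nodes)
```

Because each output row is a convex combination of the rich-node features, every output coordinate stays within the per-dimension range spanned by the rich-nodes — the pixels are literally re-described in the rich-node vocabulary.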
{"title":"From Pixels to Rich-Nodes: A Cognition-Inspired Framework for Blind Image Quality Assessment","authors":"Tian He;Lin Shi;Wenjia Xu;Yu Wang;Weijie Qiu;Houbang Guo;Zhuqing Jiang","doi":"10.1109/TBC.2024.3464418","DOIUrl":"https://doi.org/10.1109/TBC.2024.3464418","url":null,"abstract":"Blind image quality assessment (BIQA) is a subjective perception-driven task, which necessitates assessment results consistent with human cognition. The human cognitive system inherently involves both separation and integration mechanisms. Recent works have witnessed the success of deep learning methods in separating distortion features. Nonetheless, traditional deep-learning-based BIQA methods predominantly depend on fixed topology to mimic the information integration in the brain, which gives rise to scale sensitivity and low flexibility. To handle this challenge, we delve into the dynamic interactions among neurons and propose a cognition-inspired BIQA model. Drawing insights from the rich club structure in network neuroscience, a graph-inspired feature integrator is devised to reconstruct the network topology. Specifically, we argue that the activity of individual neurons (pixels) tends to exhibit a random fluctuation with ambiguous meaning, while clear and coherent cognition arises from neurons with high connectivity (rich-nodes). Therefore, a self-attention mechanism is employed to establish strong semantic associations between pixels and rich-nodes. Subsequently, we design intra- and inter-layer graph structures to promote the feature interaction across spatial and scale dimensions. Such dynamic circuits endow the BIQA method with efficient, flexible, and robust information processing capabilities, so as to achieve more human-subjective assessment results. Moreover, since the limited samples in existing IQA datasets are prone to model overfitting, we devise two prior hypotheses: frequency prior and ranking prior. 
The former stepwise augments high-frequency components that reflect the distortion degree during the multilevel feature extraction, while the latter seeks to motivate the model’s in-depth comprehension of differences in sample quality. Extensive experiments on five publicly datasets reveal that the proposed algorithm achieves competitive results.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 1","pages":"229-239"},"PeriodicalIF":3.2,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10706639","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143553171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
JND-LIC: Learned Image Compression via Just Noticeable Difference for Human Visual Perception JND-LIC:基于人类视觉感知的可察觉差异的学习图像压缩
IF 3.2 1区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-09-27 DOI: 10.1109/TBC.2024.3464413
Zhaoqing Pan;Guoyu Zhang;Bo Peng;Jianjun Lei;Haoran Xie;Fu Lee Wang;Nam Ling
Existing human visual perception-oriented image compression methods maintain the perceptual quality of compressed images well, but they may introduce fake details into the compressed images, and cannot dynamically improve the perceptual rate-distortion performance at the pixel level. To address these issues, a just noticeable difference (JND)-based learned image compression (JND-LIC) method is proposed for human visual perception in this paper, in which a weight-shared model is used to extract image features and JND features, and the learned JND features are utilized as perceptual prior knowledge to assist the image coding process. In order to generate a highly compact image feature representation, a JND-based feature transform module is proposed to model the pixel-to-pixel masking correlation between the image features and the JND features. Furthermore, inspired by eye movement research showing that the human visual system perceives image degradation unevenly, a JND-guided quantization mechanism is proposed for the entropy coding, which adjusts the quantization step of each pixel to further eliminate perceptual redundancies. Extensive experimental results show that our proposed JND-LIC significantly improves the perceptual quality of compressed images with fewer coding bits compared to state-of-the-art learned image compression methods. Additionally, the proposed method can be flexibly integrated with various advanced learned image compression methods, and has robust generalization capabilities to improve the efficiency of perceptual coding.
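A minimal sketch of the JND-guided quantization idea: pixels with a larger just-noticeable-difference tolerate coarser quantization, so their step size is enlarged and they cost fewer symbols. The step-size rule `base_step * (1 + alpha * jnd)` below is an assumed stand-in for the paper's learned mechanism, chosen only to make the behavior concrete.

```python
import numpy as np

# Sketch (illustrative, not the paper's trained network): per-pixel
# quantization whose step size grows with a JND map, so perceptually
# insensitive pixels are quantized more coarsely.

def jnd_guided_quantize(latent, jnd, base_step=1.0, alpha=1.0):
    """latent, jnd: same-shape arrays, jnd >= 0.
    Assumed step per pixel: base_step * (1 + alpha * jnd).
    Returns (symbols, dequantized)."""
    step = base_step * (1.0 + alpha * jnd)
    q = np.round(latent / step)          # quantized symbols to entropy-code
    return q, q * step                   # symbols and their reconstruction

x = np.array([0.4, 0.4, 3.2, 3.2])
jnd = np.array([0.0, 3.0, 0.0, 3.0])    # 2nd/4th pixels: high JND (step 4.0)
sym, rec = jnd_guided_quantize(x, jnd)
```

Identical latent values end up with different symbols and reconstruction errors purely because of the JND map — the high-JND pixels absorb larger (but, by assumption, imperceptible) distortion in exchange for a smaller symbol range.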
{"title":"JND-LIC: Learned Image Compression via Just Noticeable Difference for Human Visual Perception","authors":"Zhaoqing Pan;Guoyu Zhang;Bo Peng;Jianjun Lei;Haoran Xie;Fu Lee Wang;Nam Ling","doi":"10.1109/TBC.2024.3464413","DOIUrl":"https://doi.org/10.1109/TBC.2024.3464413","url":null,"abstract":"Existing human visual perception-oriented image compression methods well maintain the perceptual quality of compressed images, but they may introduce fake details into the compressed images, and cannot dynamically improve the perceptual rate-distortion performance at the pixel level. To address these issues, a just noticeable difference (JND)-based learned image compression (JND-LIC) method is proposed for human visual perception in this paper, in which a weight-shared model is used to extract image features and JND features, and the learned JND features are utilized as perceptual prior knowledge to assist the image coding process. In order to generate a highly compact image feature representation, a JND-based feature transform module is proposed to model the pixel-to-pixel masking correlation between the image features and the JND features. Furthermore, inspired by eye movement research that the human visual system perceives image degradation unevenly, a JND-guided quantization mechanism is proposed for the entropy coding, which adjusts the quantization step of each pixel to further eliminate perceptual redundancies. Extensive experimental results show that our proposed JND-LIC significantly improves the perceptual quality of compressed images with fewer coding bits compared to state-of-the-art learned image compression methods. 
Additionally, the proposed method can be flexibly integrated with various advanced learned image compression methods, and has robust generalization capabilities to improve the efficiency of perceptual coding.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 1","pages":"217-228"},"PeriodicalIF":3.2,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143553372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TSC-PCAC: Voxel Transformer and Sparse Convolution-Based Point Cloud Attribute Compression for 3D Broadcasting 基于体素变换和稀疏卷积的3D广播点云属性压缩
IF 3.2 1区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-09-25 DOI: 10.1109/TBC.2024.3464417
Zixi Guo;Yun Zhang;Linwei Zhu;Hanli Wang;Gangyi Jiang
Point cloud has been the mainstream representation for advanced 3D applications, such as virtual reality and augmented reality. However, the massive data volume of point clouds is one of the most challenging issues for transmission and storage. In this paper, we propose an end-to-end voxel Transformer and Sparse Convolution based Point Cloud Attribute Compression (TSC-PCAC) for 3D broadcasting. Firstly, we present a framework of the TSC-PCAC, which includes a Transformer and Sparse Convolutional Module (TSCM) based variational autoencoder and a channel context module. Secondly, we propose a two-stage TSCM, where the first stage focuses on modeling local dependencies and feature representations of the point clouds, and the second stage captures global features through spatial and channel pooling encompassing larger receptive fields. This module effectively extracts global and local inter-point relevance to reduce informational redundancy. Thirdly, we design a TSCM based channel context module to exploit inter-channel correlations, which improves the predicted probability distribution of quantized latent representations and thus reduces the bitrate. Experimental results indicate that the proposed TSC-PCAC method achieves average bitrate reductions of 38.53%, 21.30%, and 11.19% on the 8iVFB, Owlii, 8iVSLF, Volograms, and MVUB datasets compared to the Sparse-PCAC, NF-PCAC, and G-PCC v23 methods, respectively. The encoding/decoding time costs are reduced by 97.68%/98.78% on average compared to the Sparse-PCAC. The source code and the trained TSC-PCAC models are available at https://github.com/igizuxo/TSC-PCAC.
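As background for the sparse-convolution pipeline the abstract builds on, here is a hedged sketch of voxelizing point-cloud attributes into the sparse (coordinates, features) form such networks consume. The mean-pooling rule and voxel size are illustrative assumptions, not details of TSC-PCAC itself.

```python
import numpy as np

# Sketch (illustrative preprocessing, not the TSC-PCAC network): map a point
# cloud's attributes onto a sparse voxel grid by averaging the attributes of
# all points that fall into the same voxel.

def voxelize_attributes(points, attrs, voxel_size):
    """points: (N, 3) float coordinates; attrs: (N, C) attributes (e.g. RGB).
    Returns (coords, feats): unique integer voxel coordinates and the mean
    attribute of the points inside each voxel."""
    vox = np.floor(points / voxel_size).astype(np.int64)          # (N, 3)
    coords, inverse = np.unique(vox, axis=0, return_inverse=True)
    inverse = np.asarray(inverse).reshape(-1)   # guard against NumPy-version shape quirks
    feats = np.zeros((coords.shape[0], attrs.shape[1]))
    counts = np.bincount(inverse, minlength=coords.shape[0])
    np.add.at(feats, inverse, attrs)            # unbuffered sum per voxel
    feats /= counts[:, None]                    # mean per voxel
    return coords, feats

# Toy demo: three points, the first two share a voxel at unit voxel size.
pts = np.array([[0.1, 0.2, 0.3], [0.4, 0.1, 0.2], [1.5, 0.0, 0.0]])
rgb = np.array([[10.0, 0.0, 0.0], [30.0, 0.0, 0.0], [50.0, 0.0, 0.0]])
coords, feats = voxelize_attributes(pts, rgb, voxel_size=1.0)
```

Only occupied voxels are stored, which is exactly why sparse convolutions scale to dense point clouds where a dense 3D grid would not.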
{"title":"TSC-PCAC: Voxel Transformer and Sparse Convolution-Based Point Cloud Attribute Compression for 3D Broadcasting","authors":"Zixi Guo;Yun Zhang;Linwei Zhu;Hanli Wang;Gangyi Jiang","doi":"10.1109/TBC.2024.3464417","DOIUrl":"https://doi.org/10.1109/TBC.2024.3464417","url":null,"abstract":"Point cloud has been the mainstream representation for advanced 3D applications, such as virtual reality and augmented reality. However, the massive data amounts of point clouds is one of the most challenging issues for transmission and storage. In this paper, we propose an end-to-end voxel Transformer and Sparse Convolution based Point Cloud Attribute Compression (TSC-PCAC) for 3D broadcasting. Firstly, we present a framework of the TSC-PCAC, which includes Transformer and Sparse Convolutional Module (TSCM) based variational autoencoder and channel context module. Secondly, we propose a two-stage TSCM, where the first stage focuses on modeling local dependencies and feature representations of the point clouds, and the second stage captures global features through spatial and channel pooling encompassing larger receptive fields. This module effectively extracts global and local inter-point relevance to reduce informational redundancy. Thirdly, we design a TSCM based channel context module to exploit inter-channel correlations, which improves the predicted probability distribution of quantized latent representations and thus reduces the bitrate. Experimental results indicate that the proposed TSC-PCAC method achieves an average of 38.53%, 21.30%, and 11.19% bitrate reductions on datasets 8iVFB, Owlii, 8iVSLF, Volograms, and MVUB compared to the Sparse-PCAC, NF-PCAC, and G-PCC v23 methods, respectively. The encoding/decoding time costs are reduced 97.68%/98.78% on average compared to the Sparse-PCAC. 
The source code and the trained TSC-PCAC models are available at <uri>https://github.com/igizuxo/TSC-PCAC</uri>.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 1","pages":"154-166"},"PeriodicalIF":3.2,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143553130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0