Authors: Xiaolong Mao, Hui Yuan, Tian Guo, Shiqi Jiang, Raouf Hamzaoui, Sam Kwong
arXiv: 2409.10293 (arXiv - EE - Image and Video Processing), published 2024-09-16
SPAC: Sampling-based Progressive Attribute Compression for Dense Point Clouds
We propose an end-to-end attribute compression method for dense point clouds.
The proposed method combines a frequency sampling module, an adaptive-scale
feature extraction module with geometry assistance, and a global hyperprior
entropy model. The frequency sampling module applies a Hamming window and the
Fast Fourier Transform (FFT) to extract the high-frequency components of the
point cloud. The
difference between the original point cloud and the sampled point cloud is
divided into multiple sub-point clouds. These sub-point clouds are then
partitioned using an octree, providing a structured input for feature
extraction. The feature extraction module integrates adaptive convolutional
layers and uses offset-attention to capture both local and global features.
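The frequency sampling idea can be illustrated with a minimal sketch, assuming the point attributes have been serialized into a 1D signal (for example, by Morton order). The windowing, spectral mask, and the `cutoff` fraction below are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def highfreq_component(attr: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Extract the high-frequency part of a serialized attribute signal.

    attr   : 1D array of attribute values (e.g., luma of points in Morton order).
    cutoff : fraction of the spectrum treated as low frequency (assumed value).
    """
    n = len(attr)
    windowed = attr * np.hamming(n)      # Hamming window to reduce leakage
    spec = np.fft.rfft(windowed)         # real FFT of the windowed signal
    k = int(len(spec) * cutoff)
    spec[:k] = 0.0                       # zero out the low-frequency bins
    return np.fft.irfft(spec, n)         # back to the sample domain

# Toy usage: a smooth ramp plus small noise; the ramp (low frequency)
# is suppressed, leaving a small high-frequency residual.
rng = np.random.default_rng(0)
signal = np.linspace(0.0, 1.0, 1024) + 0.01 * rng.standard_normal(1024)
hf = highfreq_component(signal)
```

The residual between the original and the sampled signal would then be split into sub-point clouds and octree-partitioned, as described above.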
The extracted attribute features are then refined by a geometry-assisted
refinement module. Finally, a global hyperprior model is
introduced for entropy encoding. This model propagates hyperprior parameters
from the deepest (base) layer to the other layers, further enhancing the
encoding efficiency. At the decoder, a mirrored network is used to
progressively restore features and reconstruct the color attribute through
transposed convolutional layers. The proposed method encodes base layer
information at a low bitrate and progressively adds enhancement layer
information to improve reconstruction accuracy. Compared to the latest G-PCC
test model (TMC13v23) under the MPEG common test conditions (CTCs), the
proposed method achieved an average Bjøntegaard delta bitrate reduction of
24.58% for the Y component (21.23% for YUV combined) on the MPEG Category Solid
dataset and 22.48% for the Y component (17.19% for YUV combined) on the MPEG
Category Dense dataset. This is the first instance of a learning-based codec
outperforming the G-PCC standard on these datasets under the MPEG CTCs.
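The Bjøntegaard delta bitrate figures above follow the standard procedure: fit a cubic polynomial to log-rate as a function of PSNR for each codec, integrate both fits over the overlapping PSNR range, and report the average rate difference in percent. A self-contained sketch of that computation (the rate/PSNR points are made-up illustration data, not the paper's results):

```python
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test) -> float:
    """Bjøntegaard delta bitrate (%): average rate change of 'test' vs 'ref'.

    Fits log-rate as a cubic in PSNR for each curve and integrates both
    fits over the common PSNR interval. Negative values mean bitrate savings.
    """
    lr_ref, lr_test = np.log(rates_ref), np.log(rates_test)
    p_ref = np.polyfit(psnr_ref, lr_ref, 3)
    p_test = np.polyfit(psnr_test, lr_test, 3)
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    # Average each fitted log-rate over [lo, hi] via the antiderivative.
    P_ref, P_test = np.polyint(p_ref), np.polyint(p_test)
    avg_ref = (np.polyval(P_ref, hi) - np.polyval(P_ref, lo)) / (hi - lo)
    avg_test = (np.polyval(P_test, hi) - np.polyval(P_test, lo)) / (hi - lo)
    return (np.exp(avg_test - avg_ref) - 1.0) * 100.0

# Made-up rate (kbps) / PSNR (dB) points for two hypothetical codecs:
# the test codec spends exactly 80% of the reference rate at each PSNR,
# so the BD-rate comes out to -20%.
ref_rates, ref_psnr = [100, 200, 400, 800], [30.0, 33.0, 36.0, 39.0]
test_rates, test_psnr = [80, 160, 320, 640], [30.0, 33.0, 36.0, 39.0]
delta = bd_rate(ref_rates, ref_psnr, test_rates, test_psnr)
print(round(delta, 1))  # -20.0
```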