{"title":"SPAC: Sampling-based Progressive Attribute Compression for Dense Point Clouds","authors":"Xiaolong Mao, Hui Yuan, Tian Guo, Shiqi Jiang, Raouf Hamzaoui, Sam Kwong","doi":"arxiv-2409.10293","DOIUrl":null,"url":null,"abstract":"We propose an end-to-end attribute compression method for dense point clouds.\nThe proposed method combines a frequency sampling module, an adaptive scale\nfeature extraction module with geometry assistance, and a global hyperprior\nentropy model. The frequency sampling module uses a Hamming window and the Fast\nFourier Transform to extract high-frequency components of the point cloud. The\ndifference between the original point cloud and the sampled point cloud is\ndivided into multiple sub-point clouds. These sub-point clouds are then\npartitioned using an octree, providing a structured input for feature\nextraction. The feature extraction module integrates adaptive convolutional\nlayers and uses offset-attention to capture both local and global features.\nThen, a geometry-assisted attribute feature refinement module is used to refine\nthe extracted attribute features. Finally, a global hyperprior model is\nintroduced for entropy encoding. This model propagates hyperprior parameters\nfrom the deepest (base) layer to the other layers, further enhancing the\nencoding efficiency. At the decoder, a mirrored network is used to\nprogressively restore features and reconstruct the color attribute through\ntransposed convolutional layers. The proposed method encodes base layer\ninformation at a low bitrate and progressively adds enhancement layer\ninformation to improve reconstruction accuracy. Compared to the latest G-PCC\ntest model (TMC13v23) under the MPEG common test conditions (CTCs), the\nproposed method achieved an average Bjontegaard delta bitrate reduction of\n24.58% for the Y component (21.23% for YUV combined) on the MPEG Category Solid\ndataset and 22.48% for the Y component (17.19% for YUV combined) on the MPEG\nCategory Dense dataset. This is the first instance of a learning-based codec\noutperforming the G-PCC standard on these datasets under the MPEG CTCs.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Image and Video Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.10293","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
We propose an end-to-end attribute compression method for dense point clouds.
The proposed method combines a frequency sampling module, an adaptive-scale
feature extraction module with geometry assistance, and a global hyperprior
entropy model. The frequency sampling module uses a Hamming window and the
fast Fourier transform (FFT) to extract the high-frequency components of the
point cloud.
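As a rough illustration of this idea (not the paper's implementation), the sketch below windows a serialized per-point attribute signal, takes its FFT, and keeps the points carrying the most high-frequency energy. The function name, the serialization order, and the `keep_ratio`/`cutoff` parameters are all assumptions.

```python
import numpy as np

def frequency_sample(attrs, keep_ratio=0.25, cutoff=0.5):
    # attrs: per-point attribute signal (e.g., luma) serialized in some fixed
    # point order such as a Morton scan; the paper's ordering is not assumed here.
    n = len(attrs)
    windowed = attrs * np.hamming(n)             # Hamming window to limit spectral leakage
    spectrum = np.fft.rfft(windowed)
    spectrum[: int(cutoff * len(spectrum))] = 0  # discard the low-frequency band
    highpass = np.fft.irfft(spectrum, n)         # per-point high-frequency component
    k = int(keep_ratio * n)                      # number of points to sample
    return np.sort(np.argsort(np.abs(highpass))[-k:])  # indices of sampled points
```

Under this reading, the complement of the returned index set corresponds to the "difference" point cloud that the abstract describes next.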
The difference between the original point cloud and the sampled point cloud is
divided into multiple sub-point clouds. These sub-point clouds are then
partitioned using an octree, providing a structured input for feature
extraction.
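For concreteness, a minimal octree partition could look as follows; the fixed depth, bounding-box handling, and traversal order are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def octree_partition(points, attrs, depth):
    # Recursively split a (residual) sub-point cloud into octants down to a
    # fixed depth, yielding leaf blocks in a deterministic order that a
    # feature extractor can consume as structured input.
    def split(idx, lo, size, level):
        if level == depth or len(idx) == 0:
            if len(idx):
                yield idx
            return
        half = size / 2.0
        mid = lo + half
        octant = ((points[idx, 0] >= mid[0]).astype(int) * 4
                  + (points[idx, 1] >= mid[1]).astype(int) * 2
                  + (points[idx, 2] >= mid[2]).astype(int))
        for o in range(8):
            child_lo = lo + half * np.array([o >> 2 & 1, o >> 1 & 1, o & 1])
            yield from split(idx[octant == o], child_lo, half, level + 1)

    lo = points.min(axis=0)
    size = (points.max(axis=0) - lo).max() + 1e-9   # cubic root cell
    for leaf in split(np.arange(len(points)), lo, np.full(3, size), 0):
        yield points[leaf], attrs[leaf]
```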
The feature extraction module integrates adaptive convolutional layers and
uses offset-attention to capture both local and global features.
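Offset-attention originates from the Point Cloud Transformer (PCT, Guo et al., 2021). A minimal PyTorch sketch of such a block, with illustrative layer sizes that need not match the paper's, is shown below.

```python
import torch
import torch.nn as nn

class OffsetAttention(nn.Module):
    # Sketch of an offset-attention block in the spirit of PCT;
    # the paper's exact design may differ.
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Conv1d(dim, dim // 4, 1, bias=False)
        self.k = nn.Conv1d(dim, dim // 4, 1, bias=False)
        self.v = nn.Conv1d(dim, dim, 1)
        self.lbr = nn.Sequential(nn.Conv1d(dim, dim, 1),
                                 nn.BatchNorm1d(dim), nn.ReLU())

    def forward(self, x):                       # x: (B, C, N) point features
        energy = torch.bmm(self.q(x).transpose(1, 2), self.k(x))  # (B, N, N)
        attn = torch.softmax(energy, dim=-1)
        attn = attn / (1e-9 + attn.sum(dim=1, keepdim=True))      # PCT's double normalization
        xa = torch.bmm(self.v(x), attn)         # attended features, (B, C, N)
        return x + self.lbr(x - xa)             # offset (difference) through LBR, residual add
```

Feeding the offset (input minus attended features) through the LBR, rather than the attended features themselves, is what distinguishes offset-attention from standard self-attention.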
Then, a geometry-assisted attribute feature refinement module refines the
extracted attribute features. Finally, a global hyperprior model is
introduced for entropy coding. This model propagates hyperprior parameters
from the deepest (base) layer to the other layers, further improving coding
efficiency.
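A hyperprior entropy model in the style of Ballé et al. (2018) conditions the coding distribution of the latents on transmitted side information. The sketch below extends that idea so that a single global hyper feature, derived from the base layer, parameterizes the entropy model of every layer; all names, shapes, and the omitted per-layer resampling are assumptions.

```python
import torch
import torch.nn as nn

class GlobalHyperprior(nn.Module):
    # Illustrative global hyperprior: one hyper feature from the base-layer
    # latents yields Gaussian entropy parameters for all layers.
    def __init__(self, c, n_layers):
        super().__init__()
        self.hyper_enc = nn.Sequential(nn.Conv1d(c, c, 1), nn.ReLU(),
                                       nn.Conv1d(c, c, 1))
        # One entropy-parameter head per layer, all fed by the same hyper feature.
        self.heads = nn.ModuleList(nn.Conv1d(c, 2 * c, 1) for _ in range(n_layers))

    def forward(self, y_base, y_layers):
        z = self.hyper_enc(y_base)               # global hyper feature
        z_hat = torch.round(z)                   # quantize (training would add uniform noise)
        params = []
        for head, y in zip(self.heads, y_layers):
            mu, sigma = head(z_hat).chunk(2, dim=1)   # per-layer Gaussian parameters
            # Resampling z_hat to each layer's resolution is omitted for brevity.
            params.append((mu, nn.functional.softplus(sigma)))
        return z_hat, params
```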
At the decoder, a mirrored network progressively restores the features and
reconstructs the color attributes through transposed convolutional layers.
The proposed method encodes the base-layer information at a low bitrate and
progressively adds enhancement-layer information to improve reconstruction
accuracy.
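The progressive decoding idea can be sketched as follows, with dense 1D operators standing in for whatever point-cloud convolutions the paper uses; layer choices and the assumption that each enhancement tensor matches the upsampled feature shape are illustrative.

```python
import torch.nn as nn

class ProgressiveDecoder(nn.Module):
    # Sketch of the mirrored decoder: decode the base layer first, then let
    # each enhancement layer refine the running reconstruction through a
    # transposed convolution.
    def __init__(self, c, n_layers):
        super().__init__()
        self.ups = nn.ModuleList(nn.ConvTranspose1d(c, c, kernel_size=2, stride=2)
                                 for _ in range(n_layers))
        self.to_color = nn.Conv1d(c, 3, 1)    # features -> 3-channel color attributes

    def forward(self, base, enhancements):
        x = base                               # decoded base-layer features, (B, C, N0)
        recons = [self.to_color(x)]            # coarse reconstruction at the lowest rate
        for up, enh in zip(self.ups, enhancements):
            x = up(x) + enh                    # upsample, then add decoded enhancement features
            recons.append(self.to_color(x))    # progressively refined attributes
        return recons
```

Truncating the bitstream after any layer still yields a usable, if coarser, reconstruction, which is what makes the scheme progressive.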
Compared to the latest G-PCC test model (TMC13v23) under the MPEG common test
conditions (CTCs), the proposed method achieved an average Bjøntegaard delta
bitrate reduction of 24.58% for the Y component (21.23% for YUV combined) on
the MPEG Category Solid dataset and 22.48% for the Y component (17.19% for
YUV combined) on the MPEG Category Dense dataset. This is the first instance
of a learning-based codec outperforming the G-PCC standard on these datasets
under the MPEG CTCs.
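For reference, the reported Bjøntegaard delta bitrate is conventionally computed by fitting log-rate as a cubic polynomial in PSNR for each codec and integrating over the overlapping quality range; a standard sketch (not tied to this paper's evaluation scripts):

```python
import numpy as np

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    # Bjøntegaard delta bitrate in percent; negative means the test codec
    # saves bitrate relative to the reference at equal PSNR.
    p_ref = np.polyfit(psnr_ref, np.log(rate_ref), 3)
    p_test = np.polyfit(psnr_test, np.log(rate_test), 3)
    lo = max(min(psnr_ref), min(psnr_test))   # overlapping PSNR range
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_diff) - 1) * 100
```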