Authors: Xiaolong Mao, Hui Yuan, Tian Guo, Shiqi Jiang, Raouf Hamzaoui, Sam Kwong
arXiv: 2409.10293 (arXiv - EE - Image and Video Processing), published 2024-09-16
SPAC: Sampling-based Progressive Attribute Compression for Dense Point Clouds
We propose an end-to-end attribute compression method for dense point clouds.
The proposed method combines a frequency sampling module, an adaptive-scale
feature extraction module with geometry assistance, and a global hyperprior
entropy model. The frequency sampling module applies a Hamming window and the
Fast Fourier Transform (FFT) to extract the high-frequency components of the
point cloud. The
difference between the original point cloud and the sampled point cloud is
divided into multiple sub-point clouds. These sub-point clouds are then
partitioned using an octree, providing a structured input for feature
extraction. The feature extraction module integrates adaptive convolutional
layers and uses offset-attention to capture both local and global features.
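The frequency sampling idea can be illustrated with a minimal sketch, assuming the point attributes have been serialized into a 1D signal (for example, by Morton order). The windowing, spectral mask, and the `cutoff` fraction below are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def highfreq_component(attr: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Extract the high-frequency part of a serialized attribute signal.

    attr   : 1D array of attribute values (e.g., luma of points in Morton order).
    cutoff : fraction of the spectrum treated as low frequency (assumed value).
    """
    n = len(attr)
    windowed = attr * np.hamming(n)      # Hamming window to reduce leakage
    spec = np.fft.rfft(windowed)         # real FFT of the windowed signal
    k = int(len(spec) * cutoff)
    spec[:k] = 0.0                       # zero out the low-frequency bins
    return np.fft.irfft(spec, n)         # back to the sample domain

# Toy usage: a smooth ramp plus small noise; the ramp (low frequency)
# is suppressed, leaving a small high-frequency residual.
rng = np.random.default_rng(0)
signal = np.linspace(0.0, 1.0, 1024) + 0.01 * rng.standard_normal(1024)
hf = highfreq_component(signal)
```

The residual between the original and the sampled signal would then be split into sub-point clouds and octree-partitioned, as described above.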
The extracted attribute features are then refined by a geometry-assisted
refinement module. Finally, a global hyperprior model is
introduced for entropy encoding. This model propagates hyperprior parameters
from the deepest (base) layer to the other layers, further enhancing the
encoding efficiency. At the decoder, a mirrored network is used to
progressively restore features and reconstruct the color attribute through
transposed convolutional layers. The proposed method encodes base layer
information at a low bitrate and progressively adds enhancement layer
information to improve reconstruction accuracy. Compared to the latest G-PCC
test model (TMC13v23) under the MPEG common test conditions (CTCs), the
proposed method achieved an average Bjøntegaard delta bitrate reduction of
24.58% for the Y component (21.23% for YUV combined) on the MPEG Category Solid
dataset and 22.48% for the Y component (17.19% for YUV combined) on the MPEG
Category Dense dataset. This is the first instance of a learning-based codec
outperforming the G-PCC standard on these datasets under the MPEG CTCs.
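The Bjøntegaard delta bitrate figures above follow the standard procedure: fit a cubic polynomial to log-rate as a function of PSNR for each codec, integrate both fits over the overlapping PSNR range, and report the average rate difference in percent. A self-contained sketch of that computation (the rate/PSNR points are made-up illustration data, not the paper's results):

```python
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test) -> float:
    """Bjøntegaard delta bitrate (%): average rate change of 'test' vs 'ref'.

    Fits log-rate as a cubic in PSNR for each curve and integrates both
    fits over the common PSNR interval. Negative values mean bitrate savings.
    """
    lr_ref, lr_test = np.log(rates_ref), np.log(rates_test)
    p_ref = np.polyfit(psnr_ref, lr_ref, 3)
    p_test = np.polyfit(psnr_test, lr_test, 3)
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    # Average each fitted log-rate over [lo, hi] via the antiderivative.
    P_ref, P_test = np.polyint(p_ref), np.polyint(p_test)
    avg_ref = (np.polyval(P_ref, hi) - np.polyval(P_ref, lo)) / (hi - lo)
    avg_test = (np.polyval(P_test, hi) - np.polyval(P_test, lo)) / (hi - lo)
    return (np.exp(avg_test - avg_ref) - 1.0) * 100.0

# Made-up rate (kbps) / PSNR (dB) points for two hypothetical codecs:
# the test codec spends exactly 80% of the reference rate at each PSNR,
# so the BD-rate comes out to -20%.
ref_rates, ref_psnr = [100, 200, 400, 800], [30.0, 33.0, 36.0, 39.0]
test_rates, test_psnr = [80, 160, 320, 640], [30.0, 33.0, 36.0, 39.0]
delta = bd_rate(ref_rates, ref_psnr, test_rates, test_psnr)
print(round(delta, 1))  # -20.0
```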