Efficient Encoding and Decoding of Voxelized Models for Machine Learning-Based Applications

IF 3.4 | CAS Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | IEEE Access | Pub Date: 2025-01-06 | DOI: 10.1109/ACCESS.2025.3526202

Damjan Strnad;Štefan Kohek;Borut Žalik;Libor Váša;Andrej Nerat

IEEE Access, vol. 13, pp. 5551-5561, 2025. https://ieeexplore.ieee.org/document/10829598/

Citations: 0

Abstract

Point clouds have become a popular source of training data for many practical applications of machine learning in the fields of environmental modeling and precision agriculture. To reduce their high space requirements and the effect of noise in the data, point clouds are often transformed into a structured representation such as a voxel grid. Storing, transmitting, and consuming voxelized geometry, however, remains a challenging problem for machine learning pipelines running on devices with a limited amount of low-latency on-chip memory. A viable solution is to store the data in a compact encoded format and decode it on the fly when it is needed for processing. Such on-demand expansion must be fast to avoid adding substantial delay to the pipeline. This can be achieved by parallel decoding, which is particularly well suited to the massively parallel architecture of GPUs, on which the majority of machine learning is currently executed. In this paper, we present such a method for efficient and parallelizable encoding and decoding of voxelized geometry. The method employs multi-level context-aware prediction of voxel occupancy based on an extracted binary feature prediction table, and encodes the residual grid with a pointerless sparse voxel octree (PSVO). We focused in particular on encoding datasets of voxelized trees, obtained from both synthetic tree models and LiDAR point clouds of real trees. The method achieved 15.6% and 12.8% reductions in storage size with respect to plain PSVO on the synthetic and real datasets, respectively. We also tested the method on a general set of diverse voxelized objects, where an average 11% reduction in storage space was achieved.
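The abstract names two building blocks, a residual grid and a pointerless sparse voxel octree, without giving details. The following is only a minimal illustrative sketch of how such a pair can fit together, not the authors' implementation. It assumes that the residual is the XOR of the true and predicted occupancy, that the octree is stored as a breadth-first stream of 8-bit child masks with no pointers, and that the prediction is supplied from outside (the hypothetical `predicted` set stands in for the paper's feature-table predictor, which is not reproduced here). All function names are invented for illustration.

```python
from typing import List, Set, Tuple

Voxel = Tuple[int, int, int]


def child_origin(origin: Voxel, index: int, half: int) -> Voxel:
    """Origin of child `index` (bit 0 = x, bit 1 = y, bit 2 = z)."""
    ox, oy, oz = origin
    return (ox + (index & 1) * half,
            oy + ((index >> 1) & 1) * half,
            oz + ((index >> 2) & 1) * half)


def psvo_encode(voxels: Set[Voxel], depth: int) -> List[int]:
    """Encode a grid of side 2**depth (depth >= 1) as breadth-first child masks."""
    masks: List[int] = []
    level = [((0, 0, 0), set(voxels))]  # (node origin, voxels inside the node)
    size = 1 << depth
    for _ in range(depth):
        half = size // 2
        next_level = []
        for origin, inside in level:
            ox, oy, oz = origin
            buckets: List[Set[Voxel]] = [set() for _ in range(8)]
            for x, y, z in inside:
                index = (int(x - ox >= half)
                         | (int(y - oy >= half) << 1)
                         | (int(z - oz >= half) << 2))
                buckets[index].add((x, y, z))
            mask = 0
            for index, bucket in enumerate(buckets):
                if bucket:  # only non-empty children are stored and descended
                    mask |= 1 << index
                    next_level.append((child_origin(origin, index, half), bucket))
            masks.append(mask)
        level = next_level
        size = half
    return masks


def psvo_decode(masks: List[int], depth: int) -> Set[Voxel]:
    """Inverse of psvo_encode: replay the masks level by level."""
    level: List[Voxel] = [(0, 0, 0)]
    size = 1 << depth
    pos = 0
    for _ in range(depth):
        half = size // 2
        next_level: List[Voxel] = []
        for origin in level:
            mask = masks[pos]
            pos += 1
            for index in range(8):
                if (mask >> index) & 1:
                    next_level.append(child_origin(origin, index, half))
        level = next_level
        size = half
    return set(level)  # nodes of side 1 are the occupied voxels


# Toy 8x8x8 example: a row of voxels and a hypothetical, nearly correct prediction.
occupancy = {(x, 0, 0) for x in range(8)}
predicted = {(x, 0, 0) for x in range(7)}    # assumption: produced by some predictor
residual = occupancy ^ predicted             # XOR residual (assumed definition)
stream = psvo_encode(residual, depth=3)
restored = psvo_decode(stream, depth=3) ^ predicted
```

In this toy case the residual contains a single voxel, so its mask stream is shorter than the stream for the raw occupancy grid. This illustrates the general idea that a better prediction leaves a sparser residual for the octree to encode. In a breadth-first layout like this one, all nodes of a level can be expanded once the previous level is known, which is one reason such a layout is amenable to parallel decoding.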


Source journal

IEEE Access (COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC)

CiteScore: 9.80

Self-citation rate: 7.70%

Articles published: 6673

Review time: 6 weeks

Journal description: IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.