{"title":"Point cloud upsampling via a coarse-to-fine network with transformer-encoder","authors":"Yixi Li, Yanzhe Liu, Rong Chen, Hui Li, Na Zhao","doi":"10.1007/s00371-024-03535-8","DOIUrl":null,"url":null,"abstract":"<p>Point clouds provide a common geometric representation for burgeoning 3D graphics and vision tasks. To deal with the sparse, noisy and non-uniform output of most 3D data acquisition devices, this paper presents a novel coarse-to-fine learning framework that incorporates the Transformer-encoder and positional feature fusion. Its long-range dependencies with sensitive positional information allow robust feature embedding and fusion of points, especially noising elements and non-regular outliers. The proposed network consists of a Coarse Points Generator and a Points Offsets Refiner. The generator embodies a multi-feature Transformer-encoder and an EdgeConv-based feature reshaping to infer the coarse but dense upsampling point sets, whereas the refiner further learns the positions of upsampled points based on multi-feature fusion strategy that can adaptively adjust the fused features’ weights of coarse points and points offsets. Extensive qualitative and quantitative results on both synthetic and real-scanned datasets demonstrate the superiority of our method over the state-of-the-arts. Our code is publicly available at https://github.com/Superlyxi/CFT-PU.</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"66 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03535-8","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Point clouds provide a common geometric representation for burgeoning 3D graphics and vision tasks. To deal with the sparse, noisy, and non-uniform output of most 3D data acquisition devices, this paper presents a novel coarse-to-fine learning framework that incorporates a Transformer-encoder and positional feature fusion. Its long-range dependencies, combined with position-sensitive information, enable robust feature embedding and fusion of points, especially for noisy elements and irregular outliers. The proposed network consists of a Coarse Points Generator and a Points Offsets Refiner. The generator combines a multi-feature Transformer-encoder with EdgeConv-based feature reshaping to infer coarse but dense upsampled point sets, whereas the refiner further learns the positions of the upsampled points through a multi-feature fusion strategy that adaptively adjusts the fused feature weights of the coarse points and the point offsets. Extensive qualitative and quantitative results on both synthetic and real-scanned datasets demonstrate the superiority of our method over state-of-the-art approaches. Our code is publicly available at https://github.com/Superlyxi/CFT-PU.
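The sketch below illustrates the coarse-to-fine pipeline described in the abstract as a minimal PyTorch module: a generator that encodes points with a Transformer-encoder, builds EdgeConv-style neighborhood features, and reshapes them into a denser coarse point set, followed by a refiner that predicts per-point offsets via a gated fusion of coarse-point and feature streams. Module names, layer sizes, the k-NN EdgeConv variant, and the sigmoid gate are illustrative assumptions, not the authors' released implementation; see https://github.com/Superlyxi/CFT-PU for the official code.

```python
# Minimal, hypothetical sketch of a coarse-to-fine point upsampling pipeline.
# All hyperparameters and module designs here are assumptions for illustration.
import torch
import torch.nn as nn


def knn_edge_features(x, k=16):
    """EdgeConv-style edge features [x_i, x_j - x_i] over k nearest neighbors.

    x: (B, N, C) per-point features. Returns (B, N, k, 2C).
    """
    dists = torch.cdist(x, x)                                  # (B, N, N)
    idx = dists.topk(k + 1, largest=False).indices[:, :, 1:]   # drop self, (B, N, k)
    batch = torch.arange(x.size(0), device=x.device).view(-1, 1, 1)
    neighbors = x[batch, idx]                                  # (B, N, k, C)
    center = x.unsqueeze(2).expand(-1, -1, k, -1)
    return torch.cat([center, neighbors - center], dim=-1)


class CoarseGenerator(nn.Module):
    """Transformer-encoder over point embeddings, then EdgeConv-style feature
    reshaping that expands N input points into r*N coarse points."""

    def __init__(self, ratio=4, dim=128, k=16):
        super().__init__()
        self.ratio, self.k = ratio, k
        self.embed = nn.Linear(3, dim)
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.edge_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.expand = nn.Linear(dim, ratio * 3)                # reshape into r points each

    def forward(self, xyz):                                    # xyz: (B, N, 3)
        feat = self.encoder(self.embed(xyz))                   # (B, N, dim), long-range context
        edge = knn_edge_features(feat, self.k).max(dim=2).values  # (B, N, 2*dim), max over k
        feat = self.edge_mlp(edge)                             # (B, N, dim)
        coarse = self.expand(feat).view(xyz.size(0), -1, 3)    # (B, r*N, 3) coarse points
        return coarse, feat


class OffsetRefiner(nn.Module):
    """Predicts per-point offsets for the coarse points; a learned gate
    adaptively weights coarse-point features against coordinate features."""

    def __init__(self, ratio=4, dim=128):
        super().__init__()
        self.ratio = ratio
        self.coord_mlp = nn.Sequential(nn.Linear(3, dim), nn.ReLU())
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.offset_head = nn.Linear(dim, 3)

    def forward(self, coarse, feat):                           # coarse: (B, rN, 3), feat: (B, N, dim)
        feat_up = feat.repeat_interleave(self.ratio, dim=1)    # (B, rN, dim)
        coord_feat = self.coord_mlp(coarse)                    # (B, rN, dim)
        w = self.gate(torch.cat([feat_up, coord_feat], dim=-1))
        fused = w * feat_up + (1 - w) * coord_feat             # adaptive feature fusion
        return coarse + self.offset_head(fused)                # refined point positions


if __name__ == "__main__":
    pts = torch.rand(2, 256, 3)                                # sparse input patch
    gen, ref = CoarseGenerator(), OffsetRefiner()
    coarse, feat = gen(pts)
    refined = ref(coarse, feat)
    print(coarse.shape, refined.shape)                         # both (2, 1024, 3) for ratio 4
```

The two-stage split mirrors the abstract's design intent: the generator only has to get a dense-but-rough point layout right, while the refiner corrects residual positions, which is typically easier to learn than regressing final coordinates in one shot.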