Point-MPP: Point Cloud Self-Supervised Learning From Masked Position Prediction

IF 8.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · IEEE Transactions on Neural Networks and Learning Systems · Pub Date: 2024-10-24 · DOI: 10.1109/TNNLS.2024.3479309
Songlin Fan;Wei Gao;Ge Li
Published in IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 7, pp. 12964-12976. Article page: https://ieeexplore.ieee.org/document/10734244/
Citations: 0

Abstract

Masked autoencoding has gained momentum for improving fine-tuning performance in many downstream tasks. However, it tends to focus on low-level reconstruction details, lacking high-level semantics and resulting in weak transfer capability. This article presents a novel jigsaw puzzle solver inspired by the idea that predicting the positions of disordered point cloud patches provides more semantic information, similar to how children learn by solving jigsaw puzzles. Our method adopts the mask-then-predict paradigm, erasing the positions of selected point patches rather than their contents. We first partition input point clouds into irregular patches and randomly erase the positions of some patches. Then, a Transformer-based model is used to learn high-level semantic features and regress the positions of the masked patches. This approach forces the model to focus on learning transfer-robust semantics while paying less attention to low-level details. To tie the predictions within the encoding space, we further introduce a consistency constraint on their latent representations to encourage the encoded features to contain more semantic cues. We demonstrate that a standard Transformer backbone with our pretraining scheme can capture discriminative point cloud semantic information. Furthermore, extensive experiments indicate that our method outperforms the previous best competitor across six popular downstream vision tasks, achieving new state-of-the-art performance. Codes will be available at https://git.openi.org.cn/OpenPointCloud/Point-MPP.
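The pretraining setup described above (partition the cloud into irregular patches, then erase and regress patch positions rather than patch contents) can be illustrated with a minimal data-preparation sketch. This is not the authors' implementation; it is a hypothetical NumPy illustration in which farthest point sampling picks patch centers, k-nearest-neighbor grouping forms patch contents, and a random subset of patch positions is withheld as regression targets. All function names and parameters here are assumptions for illustration only.

```python
import numpy as np

def farthest_point_sample(points, n_centers, seed=0):
    # Greedy farthest point sampling: pick patch centers spread across the cloud.
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    centers = [int(rng.integers(n))]
    dist = np.full(n, np.inf)
    for _ in range(n_centers - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[centers[-1]], axis=1))
        centers.append(int(dist.argmax()))
    return np.array(centers)

def make_patches(points, n_patches=8, k=16):
    # Group the k nearest neighbors around each center into an irregular patch.
    idx = farthest_point_sample(points, n_patches)
    centers = points[idx]
    d = np.linalg.norm(points[None] - centers[:, None], axis=-1)  # (P, N)
    knn = np.argsort(d, axis=1)[:, :k]
    # Patch contents are center-normalized, so they carry no positional cue.
    patches = points[knn] - centers[:, None]
    return centers, patches

def mask_positions(centers, mask_ratio=0.5, seed=0):
    # Erase patch *positions* (not contents): the masked centers become
    # the regression targets the model must predict from visible context.
    rng = np.random.default_rng(seed)
    n = centers.shape[0]
    n_mask = int(n * mask_ratio)
    masked = rng.choice(n, n_mask, replace=False)
    visible_pos = np.delete(centers, masked, axis=0)
    targets = centers[masked]
    return masked, visible_pos, targets

pts = np.random.default_rng(1).random((256, 3))
centers, patches = make_patches(pts)
masked, visible_pos, targets = mask_positions(centers)
```

In the full method, a Transformer encoder would consume all patch contents plus the visible positions and regress `targets`; the consistency constraint on latent representations described in the abstract is omitted from this sketch.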
Source Journal

IEEE Transactions on Neural Networks and Learning Systems
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
CiteScore: 23.80
Self-citation rate: 9.60%
Articles published: 2102
Review time: 3-8 weeks
Journal description: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.
Latest articles from this journal

NPSVC++: A Representation Learning Framework for Nonparallel Classifiers.
Heuristic Knowledge-Driven Spatio-Temporal Forecasting via Multigraph.
Robust Image-Based Visual Servoing Formation Control for Quadrotors Without Communication via Reinforcement Learning.
Virtual Domain-Guided Cross-Modal Distillation With Multiview Correlation Awareness for Domain-Specific Multimodal Neural Machine Translation.
Redundancy Removal and Knowledge Alignment-Based Personalized Federated Learning for Online Condition Monitoring.