Explore In-Context Learning for 3D Point Cloud Understanding

ArXiv Pub Date : 2023-06-14 DOI:10.48550/arXiv.2306.08659
Zhongbin Fang, Xiangtai Li, Xia Li, J. Buhmann, Chen Change Loy, Mengyuan Liu
Citations: 2

Abstract

With the rise of large-scale models trained on broad data, in-context learning has become a new learning paradigm that has demonstrated significant potential in natural language processing and computer vision tasks. Meanwhile, in-context learning is still largely unexplored in the 3D point cloud domain. Although masked modeling has been successfully applied for in-context learning in 2D vision, directly extending it to 3D point clouds remains a formidable challenge. In the case of point clouds, the tokens themselves are the point cloud positions (coordinates) that are masked during inference. Moreover, position embedding in previous works may inadvertently introduce information leakage. To address these challenges, we introduce a novel framework, named Point-In-Context, designed especially for in-context learning in 3D point clouds, where both inputs and outputs are modeled as coordinates for each task. Additionally, we propose the Joint Sampling module, carefully designed to work in tandem with the general point sampling operator, effectively resolving the aforementioned technical issues. We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks. Furthermore, with a more effective prompt selection strategy, our framework surpasses the results of individually trained models.
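The abstract does not describe the internals of the Joint Sampling module, so the following is only a minimal sketch under a stated assumption: that "joint" sampling means choosing center points on the input cloud with a standard sampling operator (farthest point sampling here) and reusing the same indices on the target cloud, so that input and target tokens stay aligned point for point. The function names `farthest_point_sampling` and `joint_sample`, the toy "shift along z" task, and the requirement that both clouds have equal length and matching point order are all hypothetical choices made for illustration, not the authors' implementation.

```python
import random

def farthest_point_sampling(points, k, start=0):
    """Return indices of k points picked by greedy farthest point sampling."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    chosen = [start]
    # Squared distance from every point to its nearest chosen center so far.
    min_d = [sq_dist(p, points[start]) for p in points]
    while len(chosen) < k:
        # Pick the point that is farthest from all centers chosen so far.
        nxt = max(range(len(points)), key=lambda i: min_d[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_d[i] = min(min_d[i], sq_dist(p, points[nxt]))
    return chosen

def joint_sample(input_pts, target_pts, k):
    """Sample centers on the input, then reuse the SAME indices on the target.

    Assumption (not from the source): both clouds have equal length and
    point i of the target corresponds to point i of the input.
    """
    assert len(input_pts) == len(target_pts)
    idx = farthest_point_sampling(input_pts, k)
    return idx, [input_pts[i] for i in idx], [target_pts[i] for i in idx]

# Toy data: the "task" maps each input point to a copy shifted along z.
rng = random.Random(0)
input_cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(64)]
target_cloud = [(x, y, z + 1.0) for x, y, z in input_cloud]

idx, in_centers, tgt_centers = joint_sample(input_cloud, target_cloud, 8)
```

Because the target centers are gathered by index rather than sampled independently, each target center is the task output for the matching input center, which is the alignment this sketch assumes the module is meant to preserve.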