Enhancing Few-Shot 3D Point Cloud Classification With Soft Interaction and Self-Attention

IF 9.7 1区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS IEEE Transactions on Multimedia Pub Date : 2024-12-23 DOI:10.1109/TMM.2024.3521849

Abdullah Aman Khan;Jie Shao;Sidra Shafiq;Shuyuan Zhu;Heng Tao Shen

{"title":"Enhancing Few-Shot 3D Point Cloud Classification With Soft Interaction and Self-Attention","authors":"Abdullah Aman Khan;Jie Shao;Sidra Shafiq;Shuyuan Zhu;Heng Tao Shen","doi":"10.1109/TMM.2024.3521849","DOIUrl":null,"url":null,"abstract":"Few-shot learning is a crucial aspect of modern machine learning that enables models to recognize and classify objects efficiently with limited training data. The shortage of labeled 3D point cloud data calls for innovative solutions, particularly when novel classes emerge more frequently. In this paper, we propose a novel few-shot learning method for recognizing 3D point clouds. More specifically, this paper addresses the challenges of applying few-shot learning to 3D point cloud data, which poses unique difficulties due to the unordered and irregular nature of these data. We propose two new modules for few-shot based 3D point cloud classification, i.e., the Soft Interaction Module (SIM) and Self-Attention Residual Feedforward (SARF) Module. These modules balance and enhance the feature representation by enabling more relevant feature interactions and capturing long-range dependencies between query and support features. To validate the effectiveness of the proposed method, extensive experiments are conducted on benchmark datasets, including ModelNet40, ShapeNetCore, and ScanObjectNN. Our approach demonstrates superior performance in handling abrupt feature changes occurring during the meta-learning process. The results of the experiments indicate the superiority of our proposed method by demonstrating its robust generalization ability and better classification performance for 3D point cloud data with limited training samples.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"1127-1141"},"PeriodicalIF":9.7000,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10812858/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Few-shot learning is a crucial aspect of modern machine learning that enables models to recognize and classify objects efficiently with limited training data. The shortage of labeled 3D point cloud data calls for innovative solutions, particularly when novel classes emerge more frequently. In this paper, we propose a novel few-shot learning method for recognizing 3D point clouds. More specifically, this paper addresses the challenges of applying few-shot learning to 3D point cloud data, which poses unique difficulties due to the unordered and irregular nature of these data. We propose two new modules for few-shot based 3D point cloud classification, i.e., the Soft Interaction Module (SIM) and Self-Attention Residual Feedforward (SARF) Module. These modules balance and enhance the feature representation by enabling more relevant feature interactions and capturing long-range dependencies between query and support features. To validate the effectiveness of the proposed method, extensive experiments are conducted on benchmark datasets, including ModelNet40, ShapeNetCore, and ScanObjectNN. Our approach demonstrates superior performance in handling abrupt feature changes occurring during the meta-learning process. The results of the experiments indicate the superiority of our proposed method by demonstrating its robust generalization ability and better classification performance for 3D point cloud data with limited training samples.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

利用软交互和自我关注增强少镜头三维点云分类功能

Few-shot学习是现代机器学习的一个重要方面，它使模型能够在有限的训练数据下有效地识别和分类对象。缺乏标记的3D点云数据需要创新的解决方案，特别是当新的类出现得更频繁时。在本文中，我们提出了一种新的用于三维点云识别的少镜头学习方法。更具体地说，本文解决了将少镜头学习应用于3D点云数据的挑战，由于这些数据的无序和不规则性质，这带来了独特的困难。我们提出了两个新的基于少镜头的三维点云分类模块，即软交互模块（SIM）和自注意残差前馈模块（SARF）。这些模块通过支持更相关的特性交互和捕获查询和支持特性之间的长期依赖关系来平衡和增强特性表示。为了验证所提方法的有效性，我们在包括ModelNet40、ShapeNetCore和ScanObjectNN在内的基准数据集上进行了大量的实验。我们的方法在处理元学习过程中突然发生的特征变化方面表现出卓越的性能。实验结果表明，该方法对训练样本有限的三维点云数据具有鲁棒的泛化能力和较好的分类性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

IEEE Transactions on Multimedia 工程技术-电信学

CiteScore

11.70

自引率

11.00%

发文量

576

审稿时长

5.5 months

期刊介绍： The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.

期刊最新文献

Screen Detection from Egocentric Image Streams Leveraging Multi-View Vision Language Model. HMS2Net: Heterogeneous Multimodal State Space Network via CLIP for Dynamic Scene Classification in Livestreaming 2025 Reviewers List Long-Tailed Continual Learning for Visual Food Recognition SSPD: Spatial-Spectral Prior Decoupling Model for Spectral Snapshot Compressive Imaging