PointAttention: Rethinking Feature Representation and Propagation in Point Cloud

IF 8.4 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | IEEE Transactions on Multimedia | Pub Date: 2024-12-24 | DOI: 10.1109/TMM.2024.3521745

Shichao Zhang;Yibo Ding;Tianxiang Huo;Shukai Duan;Lidan Wang

IEEE Transactions on Multimedia, vol. 27, pp. 327-339. Full text: https://ieeexplore.ieee.org/document/10814668/

Citations: 0

Abstract

Self-attention mechanisms have revolutionized natural language processing and computer vision. In point cloud analysis, however, most existing methods rely on point convolution operators for feature extraction and fail to model long-range and hierarchical dependencies. To address these issues, we present PointAttention, a novel network for point cloud feature representation and propagation. Specifically, the architecture uses a two-stage Learnable Self-attention to learn long-range attention weights, which is more effective than conventional triple attention. Furthermore, it employs a Hierarchical Learnable Attention Mechanism to build an important global prior representation and perform fine-grained context understanding, which lets the framework break through the limitation of the receptive field and reduce the loss of context. Interestingly, we show that the proposed Learnable Self-attention is equivalent to the coupling of two Softmax attention operations while having lower complexity. Extensive experiments demonstrate that our network achieves highly competitive performance on several challenging publicly available benchmarks, including point cloud classification on ScanObjectNN and ModelNet40, and part segmentation on ShapeNet-Part.
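The abstract does not give the formulation of Learnable Self-attention, so the following is only a minimal NumPy sketch of the general idea it describes: replacing one N x N softmax attention over points with two coupled softmax attentions routed through a small set of learnable tokens. Everything here is an assumption made for illustration and is not taken from the paper: the function names `full_self_attention` and `two_stage_attention`, the `tokens` parameter, the scaling, and the reading of "triple attention" as ordinary query/key/value attention.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def full_self_attention(feats, wq, wk, wv):
    """Ordinary query/key/value softmax attention over N points.

    Builds an (N, N) score matrix, so the cost grows as O(N^2 * d).
    """
    q, k, v = feats @ wq, feats @ wk, feats @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (N, N)
    return softmax(scores, axis=-1) @ v          # (N, d)


def two_stage_attention(feats, tokens, wv):
    """Hypothetical two-stage attention through M learnable tokens.

    Stage 1: each token attends over all points and gathers a summary.
    Stage 2: each point attends over the M summaries.
    No (N, N) matrix is formed, so the cost grows as O(N * M * d).
    """
    d = feats.shape[-1]
    v = feats @ wv                                        # (N, d)
    gather = softmax(tokens @ feats.T / np.sqrt(d))       # (M, N)
    summary = gather @ v                                  # (M, d)
    scatter = softmax(feats @ tokens.T / np.sqrt(d))      # (N, M)
    return scatter @ summary                              # (N, d)


# Toy inputs: N points with d-dimensional features and M << N tokens.
rng = np.random.default_rng(0)
N, M, d = 256, 16, 32
feats = rng.standard_normal((N, d))
tokens = rng.standard_normal((M, d))   # would be trained in a real model
wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

out_full = full_self_attention(feats, wq, wk, wv)
out_two = two_stage_attention(feats, tokens, wv)
print(out_full.shape, out_two.shape)
```

In this sketch the product `scatter @ gather` is an implicit (N, N) matrix whose rows each sum to 1, so the two stages together still act as attention over all points, but the matrix is never materialized. That is one way the "coupling of two Softmax attention operations" with lower complexity could look; the paper's actual construction may differ.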


Source journal: IEEE Transactions on Multimedia (Engineering & Technology: Telecommunications)

CiteScore: 11.70

Self-citation rate: 11.00%

Articles published: 576

Review time: 5.5 months

About the journal: The IEEE Transactions on Multimedia covers diverse aspects of multimedia technology and applications, including circuits, networking, signal processing, systems, software, and systems integration. Its scope aligns with the Fields of Interest of its sponsors.
