Attention-Guided Fusion Network of Point Cloud and Multiple Views for 3D Shape Recognition
Bo Peng, Zengrui Yu, Jianjun Lei, Jiahui Song
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), published 2020-12-01
DOI: 10.1109/VCIP49819.2020.9301813 (https://doi.org/10.1109/VCIP49819.2020.9301813)
With the dramatic growth of 3D shape data, 3D shape recognition has become a hot research topic in computer vision. Effectively exploiting the multimodal characteristics of 3D shapes is one of the key problems in improving recognition performance. In this paper, we propose a novel attention-guided fusion network of point cloud and multiple views for 3D shape recognition. Specifically, to obtain a more discriminative descriptor for 3D shape data, an inter-modality attention enhancement module and a view-context attention fusion module are proposed to gradually refine and fuse the features of the point cloud and the multiple views. The inter-modality attention enhancement module computes an inter-modality attention mask from the joint feature representation, so that the features of each modality are enhanced with correlated information from the other modality. The view-context attention fusion module then explores the context information across the multiple views and fuses the enhanced features into a more discriminative descriptor for 3D shape data. Experimental results on the ModelNet40 dataset demonstrate that the proposed method achieves promising performance compared with state-of-the-art methods.
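The two-stage flow described in the abstract can be illustrated with a minimal sketch. The abstract does not give the layer definitions, so everything below is an assumption made for illustration rather than the authors' architecture: the joint representation is taken to be the point-cloud feature concatenated with the mean view feature, the mask is a sigmoid over a hypothetical linear map `W_mask`, enhancement is a residual reweighting `x * (1 + mask)`, and view fusion uses softmax weights from scaled dot products with the enhanced point feature. The names `inter_modality_enhance` and `view_context_fuse` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def inter_modality_enhance(point_feat, view_feats, W_mask):
    """Assumed form of the enhancement step: one mask shared by both modalities."""
    d = len(point_feat)
    n = len(view_feats)
    # Joint representation (assumption): point feature + mean-pooled view feature.
    view_mean = [sum(v[j] for v in view_feats) / n for j in range(d)]
    joint = point_feat + view_mean  # list concatenation, length 2d
    # Attention mask in (0, 1), one value per feature channel.
    mask = [sigmoid(z) for z in matvec(W_mask, joint)]
    # Residual reweighting (assumption): keep the original feature, add the masked part.
    enhanced_point = [x * (1.0 + m) for x, m in zip(point_feat, mask)]
    enhanced_views = [[x * (1.0 + m) for x, m in zip(v, mask)] for v in view_feats]
    return enhanced_point, enhanced_views, mask


def view_context_fuse(enhanced_point, enhanced_views):
    """Assumed form of the fusion step: attention-weighted sum over views."""
    d = len(enhanced_point)
    # Score each view against the enhanced point feature (scaled dot product).
    scores = [
        sum(a * b for a, b in zip(v, enhanced_point)) / math.sqrt(d)
        for v in enhanced_views
    ]
    weights = softmax(scores)
    fused_view = [
        sum(w * v[j] for w, v in zip(weights, enhanced_views)) for j in range(d)
    ]
    # Final descriptor: enhanced point feature concatenated with the fused view feature.
    return enhanced_point + fused_view, weights


# Toy inputs: random features stand in for the outputs of real backbone networks.
rng = random.Random(0)
d, n_views = 4, 3
point_feat = [rng.uniform(-1, 1) for _ in range(d)]
view_feats = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_views)]
W_mask = [[rng.uniform(-0.5, 0.5) for _ in range(2 * d)] for _ in range(d)]

p_enh, v_enh, mask = inter_modality_enhance(point_feat, view_feats, W_mask)
descriptor, view_weights = view_context_fuse(p_enh, v_enh)
```

In this sketch the descriptor has length `2 * d` and the view weights sum to 1. In a trained network, `W_mask` would be learned and the descriptor would feed a classifier; both are outside the scope of this illustration.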