View sequence prediction GAN: unsupervised representation learning for 3D shapes by decomposing view content and viewpoint variance

IF 4.3 | Zone 3, Materials Science | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | ACS Applied Electronic Materials | Pub Date: 2024-08-01 | DOI: 10.1007/s00530-024-01431-8
Heyu Zhou, Jiayu Li, Xianzhu Liu, Yingda Lyu, Haipeng Chen, An-An Liu

Abstract

Unsupervised representation learning for 3D shapes has become a critical problem for large-scale 3D shape management. Recent model-based methods for this task require additional information for training, while popular view-based methods often overlook viewpoint variance in view prediction, which yields uninformative 3D features and limits their practical use. To address these issues, we propose View Sequence Prediction GAN (VSP-GAN), an unsupervised 3D shape representation learning method that decomposes view content and viewpoint variance. VSP-GAN takes several adjacent views of a 3D shape as input and outputs the subsequent views. The key idea is to split the multi-view sequence into two perceptible parts, view content and viewpoint variance, and to encode them independently with separate encoders. Using these two encodings, a decoder that mirrors the architecture of the content encoder predicts the view sequence over multiple steps. In addition, to improve the quality of the reconstructed views, we propose a novel hierarchical view prediction loss that enhances view realism, semantic consistency, and detail retention. We evaluate VSP-GAN on two popular 3D CAD datasets, ModelNet10 and ModelNet40, for 3D shape classification and retrieval. The experimental results demonstrate that VSP-GAN learns more discriminative features than state-of-the-art methods.
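The data flow described in the abstract (split a view sequence into content and viewpoint variance, encode each separately, then decode the following views step by step) can be illustrated with a toy sketch. This is not the authors' network: the function names are hypothetical, views are flattened lists of floats instead of rendered images, and simple averages stand in for the learned encoders, decoder, and GAN training. It only shows how the two codes are kept apart and then recombined for multi-step prediction.

```python
from typing import List

View = List[float]  # hypothetical stand-in for one flattened rendered view


def encode_content(views: List[View]) -> View:
    """Stand-in content encoder: element-wise mean of the input views."""
    n = len(views)
    return [sum(pixels) / n for pixels in zip(*views)]


def encode_viewpoint_variance(views: List[View]) -> View:
    """Stand-in variance encoder: mean element-wise change between adjacent views."""
    diffs = [
        [b - a for a, b in zip(prev, nxt)]
        for prev, nxt in zip(views, views[1:])
    ]
    return [sum(d) / len(diffs) for d in zip(*diffs)]


def predict_next_views(
    content: View, variance: View, num_inputs: int, num_steps: int
) -> List[View]:
    """Stand-in decoder: recombine both codes and roll forward one view per step."""
    # The mean sits at the middle of the input sequence, so shift it by half
    # the input span to estimate the last observed view.
    half_span = (num_inputs - 1) / 2
    current = [c + half_span * v for c, v in zip(content, variance)]
    predictions = []
    for _ in range(num_steps):
        # Each step applies the viewpoint change to the previous prediction.
        current = [x + v for x, v in zip(current, variance)]
        predictions.append(current)
    return predictions


if __name__ == "__main__":
    # Three adjacent "views" whose values change linearly with viewpoint.
    views = [[0.0, 10.0], [1.0, 12.0], [2.0, 14.0]]
    content = encode_content(views)
    variance = encode_viewpoint_variance(views)
    print(predict_next_views(content, variance, len(views), num_steps=2))
    # prints [[3.0, 16.0], [4.0, 18.0]]
```

In the toy, the prediction is exact only because the example views change linearly; the method in the abstract instead learns both encoders and the decoder, and trains them with the hierarchical view prediction loss.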
