VAT:用于细粒度服装人体重建的可视性感知变压器。

Xiaoyan Zhang;Zibin Zhu;Hong Xie;Sisi Ren;Jianmin Jiang
{"title":"VAT:用于细粒度服装人体重建的可视性感知变压器。","authors":"Xiaoyan Zhang;Zibin Zhu;Hong Xie;Sisi Ren;Jianmin Jiang","doi":"10.1109/TVCG.2025.3528021","DOIUrl":null,"url":null,"abstract":"In order to reconstruct 3D clothed human with accurate fine-grained details from sparse views, we propose a deep cooperating two-level global to fine-grained reconstruction framework that constructs robust global geometry to guide fine-grained geometry learning. The core of the framework is a novel visibility aware Transformer VAT, which bridges the two-level reconstruction architecture by connecting its global encoder and fine-grained decoder with two pixel-aligned implicit functions, respectively. The global encoder fuses semantic features of multiple views to integrate global geometric features. In the fine-grained decoder, visibility aware attention mechanism is designed to efficiently fuse multi-view and multi-scale features for mining fine-grained geometric features. The global encoder and fine-grained decoder are connected by a global embeding module to form a deep cooperation in the two-level framework, which provides global geometric embedding as a query guidance for calculating visibility aware attention in the fine-grained decoder. In addition, to extract highly aligned multi-scale features for the two-level reconstruction architecture, we design an image feature extractor MSUNet, which establishes strong semantic connections between different scales at minimal cost. Our proposed framework is end-to-end trainable, with all modules jointly optimized. We validate the effectiveness of our framework on public benchmarks, and experimental results demonstrate that our method has significant advantages over state-of-the-art methods in terms of both fine-grained performance and generalization.","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"31 10","pages":"6719-6736"},"PeriodicalIF":6.5000,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"VAT: Visibility Aware Transformer for Fine-Grained Clothed Human Reconstruction\",\"authors\":\"Xiaoyan Zhang;Zibin Zhu;Hong Xie;Sisi Ren;Jianmin Jiang\",\"doi\":\"10.1109/TVCG.2025.3528021\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In order to reconstruct 3D clothed human with accurate fine-grained details from sparse views, we propose a deep cooperating two-level global to fine-grained reconstruction framework that constructs robust global geometry to guide fine-grained geometry learning. The core of the framework is a novel visibility aware Transformer VAT, which bridges the two-level reconstruction architecture by connecting its global encoder and fine-grained decoder with two pixel-aligned implicit functions, respectively. The global encoder fuses semantic features of multiple views to integrate global geometric features. In the fine-grained decoder, visibility aware attention mechanism is designed to efficiently fuse multi-view and multi-scale features for mining fine-grained geometric features. The global encoder and fine-grained decoder are connected by a global embeding module to form a deep cooperation in the two-level framework, which provides global geometric embedding as a query guidance for calculating visibility aware attention in the fine-grained decoder. In addition, to extract highly aligned multi-scale features for the two-level reconstruction architecture, we design an image feature extractor MSUNet, which establishes strong semantic connections between different scales at minimal cost. Our proposed framework is end-to-end trainable, with all modules jointly optimized. We validate the effectiveness of our framework on public benchmarks, and experimental results demonstrate that our method has significant advantages over state-of-the-art methods in terms of both fine-grained performance and generalization.\",\"PeriodicalId\":94035,\"journal\":{\"name\":\"IEEE transactions on visualization and computer graphics\",\"volume\":\"31 10\",\"pages\":\"6719-6736\"},\"PeriodicalIF\":6.5000,\"publicationDate\":\"2025-01-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on visualization and computer graphics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10836818/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on visualization and computer graphics","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10836818/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

为了从稀疏视图中获得精确的细粒度细节,我们提出了一个深度合作的两级全局到细粒度重建框架,该框架构建了鲁棒的全局几何来指导细粒度几何学习。该框架的核心是一种新颖的可视性感知的Transformer VAT,它通过将其全局编码器和细粒度解码器分别与两个像素对齐的隐式函数连接在一起,架起了两级重构体系结构的桥梁。全局编码器通过融合多个视图的语义特征来整合全局几何特征。在细粒度解码器中,设计了可见感知注意机制,有效融合多视角、多尺度特征,挖掘细粒度几何特征。全局编码器和细粒度解码器通过全局嵌入模块连接,在两级框架中形成深度合作,为细粒度解码器中可见性感知注意力的计算提供全局几何嵌入作为查询指导。此外,为了为两级重建架构提取高度对齐的多尺度特征,我们设计了一种图像特征提取器MSUNet,以最小的成本在不同尺度之间建立强语义连接。我们提出的框架是端到端可训练的,所有模块都是联合优化的。我们在公共基准测试上验证了我们的框架的有效性,实验结果表明,我们的方法在细粒度性能和泛化方面都比最先进的方法具有显著的优势。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
VAT: Visibility Aware Transformer for Fine-Grained Clothed Human Reconstruction
In order to reconstruct 3D clothed human with accurate fine-grained details from sparse views, we propose a deep cooperating two-level global to fine-grained reconstruction framework that constructs robust global geometry to guide fine-grained geometry learning. The core of the framework is a novel visibility aware Transformer VAT, which bridges the two-level reconstruction architecture by connecting its global encoder and fine-grained decoder with two pixel-aligned implicit functions, respectively. The global encoder fuses semantic features of multiple views to integrate global geometric features. In the fine-grained decoder, visibility aware attention mechanism is designed to efficiently fuse multi-view and multi-scale features for mining fine-grained geometric features. The global encoder and fine-grained decoder are connected by a global embeding module to form a deep cooperation in the two-level framework, which provides global geometric embedding as a query guidance for calculating visibility aware attention in the fine-grained decoder. In addition, to extract highly aligned multi-scale features for the two-level reconstruction architecture, we design an image feature extractor MSUNet, which establishes strong semantic connections between different scales at minimal cost. Our proposed framework is end-to-end trainable, with all modules jointly optimized. We validate the effectiveness of our framework on public benchmarks, and experimental results demonstrate that our method has significant advantages over state-of-the-art methods in terms of both fine-grained performance and generalization.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Analyzing Notebook Histories to Understand Data Visualization Workflows. Chat Modeling: Interaction-Enhanced Agent Framework for Visualizing Literature-Grounded Biological Structures. Illuminating LLM Coding Agents: Visual Analytics for Deeper Understanding and Enhancement. Enhancing Line Density Plots with Outlier Control and Bin-based Illumination. APN-Net: An Adaptive Perception Network For Point Cloud Normal Estimation.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1