Instant Gaussian Splatting Generation for High-Quality and Real-Time Facial Asset Rendering.

Dafei Qin, Hongyang Lin, Qixuan Zhang, Kaichun Qiao, Longwen Zhang, Jun Saito, Zijun Zhao, Jingyi Yu, Lan Xu, Taku Komura
{"title":"Instant Gaussian Splatting Generation for High-Quality and Real-Time Facial Asset Rendering.","authors":"Dafei Qin, Hongyang Lin, Qixuan Zhang, Kaichun Qiao, Longwen Zhang, Jun Saito, Zijun Zhao, Jingyi Yu, Lan Xu, Taku Komura","doi":"10.1109/TPAMI.2025.3550195","DOIUrl":null,"url":null,"abstract":"<p><p>Traditional and AI-driven modeling techniques enable high-fidelity 3D asset generation from scans, videos, or text prompts. However, editing and rendering these assets often involves a trade-off between quality and speed. In this paper, we propose GauFace, a novel Gaussian Splatting representation, tailored for efficient rendering of facial mesh with textures. Then, we introduce TransGS, a diffusion transformer that instantly generates the GauFace assets from mesh, textures and lightning conditions. Specifically, we adopt a patch-based pipeline to handle the vast number of Gaussian Points, a novel texel-aligned sampling scheme with UV positional encoding to enhance the throughput of generating GauFace assets. Once trained, TransGS can generate GauFace assets in 5 seconds, delivering high fidelity and real-time facial interaction of 30fps@1440p to a Snapdragon 8 Gen 2 mobile platform. The rich conditional modalities further enable editing and animation capabilities reminiscent of traditional CG pipelines. We conduct extensive evaluations and user studies, compared to traditional renderers, as well as recent neural rendering methods. They demonstrate the superior performance of our approach for facial asset rendering. We also showcase diverse applications of facial assets using our TransGS approach and GauFace representation, across various platforms like PCs, phones, and VR headsets.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6000,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2025.3550195","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Traditional and AI-driven modeling techniques enable high-fidelity 3D asset generation from scans, videos, or text prompts. However, editing and rendering these assets often involves a trade-off between quality and speed. In this paper, we propose GauFace, a novel Gaussian Splatting representation, tailored for efficient rendering of facial mesh with textures. Then, we introduce TransGS, a diffusion transformer that instantly generates the GauFace assets from mesh, textures and lightning conditions. Specifically, we adopt a patch-based pipeline to handle the vast number of Gaussian Points, a novel texel-aligned sampling scheme with UV positional encoding to enhance the throughput of generating GauFace assets. Once trained, TransGS can generate GauFace assets in 5 seconds, delivering high fidelity and real-time facial interaction of 30fps@1440p to a Snapdragon 8 Gen 2 mobile platform. The rich conditional modalities further enable editing and animation capabilities reminiscent of traditional CG pipelines. We conduct extensive evaluations and user studies, compared to traditional renderers, as well as recent neural rendering methods. They demonstrate the superior performance of our approach for facial asset rendering. We also showcase diverse applications of facial assets using our TransGS approach and GauFace representation, across various platforms like PCs, phones, and VR headsets.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用于高质量实时面部资产渲染的即时高斯溅射生成技术
传统和人工智能驱动的建模技术可以从扫描、视频或文本提示中生成高保真的3D资产。然而,编辑和呈现这些资产通常涉及质量和速度之间的权衡。在本文中,我们提出了GauFace,一种新颖的高斯飞溅表示,专门用于有效渲染带有纹理的面部网格。然后,我们介绍TransGS,一个扩散变压器,立即从网格,纹理和闪电条件生成GauFace资产。具体来说,我们采用了基于patch的管道来处理大量的高斯点,一种新颖的纹理对齐采样方案与UV位置编码,以提高生成GauFace资产的吞吐量。经过训练后,TransGS可以在5秒内生成GauFace资产,为Snapdragon 8 Gen 2移动平台提供30fps@1440p的高保真度和实时面部交互。丰富的条件模式进一步使编辑和动画功能让人想起传统的CG管道。与传统的渲染器以及最近的神经渲染方法相比,我们进行了广泛的评估和用户研究。他们展示了我们的面部资源渲染方法的优越性能。我们还使用TransGS方法和GauFace表示,在pc、手机和VR头显等各种平台上展示面部资产的各种应用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Learning-Based Multi-View Stereo: A Survey. GrowSP++: Growing Superpoints and Primitives for Unsupervised 3D Semantic Segmentation. Unsupervised Gaze Representation Learning by Switching Features. H2OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers. MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1