Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos

Yuheng Jiang, Zhehao Shen, Yu Hong, Chengcheng Guo, Yize Wu, Yingliang Zhang, Jingyi Yu, Lan Xu
arXiv - CS - Graphics · Journal Article · Published 2024-09-12 · https://doi.org/arxiv-2409.08353
Citations: 0

Abstract

Volumetric video represents a transformative advancement in visual media, enabling users to freely navigate immersive virtual experiences and narrowing the gap between digital and real worlds. However, the need for extensive manual intervention to stabilize mesh sequences and the generation of excessively large assets in existing workflows impedes broader adoption. In this paper, we present a novel Gaussian-based approach, dubbed \textit{DualGS}, for real-time and high-fidelity playback of complex human performance with excellent compression ratios. Our key idea in DualGS is to separately represent motion and appearance using the corresponding skin and joint Gaussians. Such an explicit disentanglement can significantly reduce motion redundancy and enhance temporal coherence. We begin by initializing the DualGS and anchoring skin Gaussians to joint Gaussians at the first frame. Subsequently, we employ a coarse-to-fine training strategy for frame-by-frame human performance modeling. It includes a coarse alignment phase for overall motion prediction as well as a fine-grained optimization for robust tracking and high-fidelity rendering. To integrate volumetric video seamlessly into VR environments, we efficiently compress motion using entropy encoding and appearance using codec compression coupled with a persistent codebook. Our approach achieves a compression ratio of up to 120 times, only requiring approximately 350KB of storage per frame. We demonstrate the efficacy of our representation through photo-realistic, free-view experiences on VR headsets, enabling users to immersively watch musicians in performance and feel the rhythm of the notes at the performers' fingertips.
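The abstract only states the high-level idea of anchoring skin Gaussians to joint Gaussians at the first frame and then predicting coarse motion from the joints; the concrete update rules are not given here. Purely as an illustration of that anchoring idea, the sketch below assumes an LBS-style scheme in which each skin Gaussian blends the translations of its k nearest joint Gaussians. The function names (`anchor_skin_to_joints`, `drive_skin`), the inverse-distance weighting, and the pure-translation motion model are all our assumptions, not the paper's actual method:

```python
import numpy as np

def anchor_skin_to_joints(skin_pos, joint_pos, k=4):
    """Anchor each skin Gaussian to its k nearest joint Gaussians
    (first-frame binding), with inverse-distance blend weights."""
    d = np.linalg.norm(skin_pos[:, None, :] - joint_pos[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]                  # (S, k) nearest joint indices
    w = 1.0 / (np.take_along_axis(d, idx, axis=1) + 1e-8)
    w /= w.sum(axis=1, keepdims=True)                   # weights sum to 1 per skin Gaussian
    return idx, w

def drive_skin(skin_pos0, joint_pos0, joint_pos_t, idx, w):
    """Predict skin Gaussian positions at frame t by blending the
    translations of their anchor joints (coarse motion prediction)."""
    delta = joint_pos_t - joint_pos0                    # (J, 3) per-joint displacement
    return skin_pos0 + (w[..., None] * delta[idx]).sum(axis=1)

# toy example: 5 skin Gaussians bound to 2 joint Gaussians
rng = np.random.default_rng(0)
skin0 = rng.normal(size=(5, 3))
joints0 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
idx, w = anchor_skin_to_joints(skin0, joints0, k=2)
joints_t = joints0 + np.array([0.0, 0.1, 0.0])         # both joints translate together
skin_t = drive_skin(skin0, joints0, joints_t, idx, w)
assert np.allclose(skin_t - skin0, [0.0, 0.1, 0.0])    # skin follows the rigid motion
```

In the paper's terms, such a coarse joint-driven prediction would be followed by the fine-grained per-frame optimization for robust tracking and high-fidelity rendering. Note also that the quoted figures imply an uncompressed footprint of roughly 350 KB × 120 ≈ 42 MB per frame before the entropy/codec compression.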