This paper presents a new algorithm, Weighted Squared Volume Minimization (WSVM), for generating high-quality tetrahedral meshes from closed triangle meshes. Drawing inspiration from the principle of minimal surfaces that minimize squared surface area, WSVM employs a new energy function integrating weighted squared volumes for tetrahedral elements. When minimized with constant weights, this energy promotes uniform volumes among the tetrahedra; adjusting the weights to account for local geometry further achieves uniform dihedral angles within the mesh. The algorithm begins with an initial tetrahedral mesh generated via Delaunay tetrahedralization and proceeds by sequentially minimizing a volume-oriented and then a dihedral angle-oriented energy. At each stage, it iteratively alternates between optimizing vertex positions and refining mesh connectivity. The algorithm operates fully automatically and requires no parameter tuning. Evaluations on a variety of 3D models demonstrate that WSVM consistently produces tetrahedral meshes of higher quality than existing methods, with fewer slivers and enhanced uniformity. Further details are available at the project webpage: https://kaixinyu-hub.github.io/WSVM.github.io.
{"title":"Weighted Squared Volume Minimization (WSVM) for Generating Uniform Tetrahedral Meshes","authors":"Kaixin Yu, Yifu Wang, Peng Song, Xiangqiao Meng, Ying He, Jianjun Chen","doi":"arxiv-2409.05525","DOIUrl":"https://doi.org/arxiv-2409.05525","url":null,"abstract":"This paper presents a new algorithm, Weighted Squared Volume Minimization\u0000(WSVM), for generating high-quality tetrahedral meshes from closed triangle\u0000meshes. Drawing inspiration from the principle of minimal surfaces that\u0000minimize squared surface area, WSVM employs a new energy function integrating\u0000weighted squared volumes for tetrahedral elements. When minimized with constant\u0000weights, this energy promotes uniform volumes among the tetrahedra. Adjusting\u0000the weights to account for local geometry further achieves uniform dihedral\u0000angles within the mesh. The algorithm begins with an initial tetrahedral mesh\u0000generated via Delaunay tetrahedralization and proceeds by sequentially\u0000minimizing volume-oriented and then dihedral angle-oriented energies. At each\u0000stage, it alternates between optimizing vertex positions and refining mesh\u0000connectivity through the iterative process. The algorithm operates fully\u0000automatically and requires no parameter tuning. Evaluations on a variety of 3D\u0000models demonstrate that WSVM consistently produces tetrahedral meshes of higher\u0000quality, with fewer slivers and enhanced uniformity compared to existing\u0000methods. Check out further details at the project webpage:\u0000https://kaixinyu-hub.github.io/WSVM.github.io.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiangjun Tang, Linjun Wu, He Wang, Yiqian Wu, Bo Hu, Songnan Li, Xu Gong, Yuchen Liao, Qilong Kou, Xiaogang Jin
Motion style transfer changes the style of a motion while retaining its content and is useful in computer animation and games. Contact is an essential component of motion style transfer that should be controlled explicitly in order to express the style vividly while enhancing motion naturalness and quality. However, it is unknown how to decouple and control contact to achieve fine-grained control in motion style transfer. In this paper, we present a novel style transfer method for fine-grained control over contacts while achieving both motion naturalness and spatial-temporal variations of style. Based on our empirical evidence, we propose controlling contact indirectly through the hip velocity, which can be further decomposed into the trajectory and contact timing. To this end, we propose a new model that explicitly models the correlations between motions and trajectory/contact timing/style, allowing us to decouple and control each separately. Our approach is built around a motion manifold, where hip controls can be easily integrated into a Transformer-based decoder. It is versatile: it can generate motions directly and can also serve as post-processing for existing methods to improve quality and contact controllability. In addition, we propose a new metric that measures a correlation pattern of motions based on our empirical evidence and aligns well with human perception of motion naturalness. In extensive evaluations, our method outperforms existing methods in terms of style expressivity and motion quality.
{"title":"Decoupling Contact for Fine-Grained Motion Style Transfer","authors":"Xiangjun Tang, Linjun Wu, He Wang, Yiqian Wu, Bo Hu, Songnan Li, Xu Gong, Yuchen Liao, Qilong Kou, Xiaogang Jin","doi":"arxiv-2409.05387","DOIUrl":"https://doi.org/arxiv-2409.05387","url":null,"abstract":"Motion style transfer changes the style of a motion while retaining its\u0000content and is useful in computer animations and games. Contact is an essential\u0000component of motion style transfer that should be controlled explicitly in\u0000order to express the style vividly while enhancing motion naturalness and\u0000quality. However, it is unknown how to decouple and control contact to achieve\u0000fine-grained control in motion style transfer. In this paper, we present a\u0000novel style transfer method for fine-grained control over contacts while\u0000achieving both motion naturalness and spatial-temporal variations of style.\u0000Based on our empirical evidence, we propose controlling contact indirectly\u0000through the hip velocity, which can be further decomposed into the trajectory\u0000and contact timing, respectively. To this end, we propose a new model that\u0000explicitly models the correlations between motions and trajectory/contact\u0000timing/style, allowing us to decouple and control each separately. Our approach\u0000is built around a motion manifold, where hip controls can be easily integrated\u0000into a Transformer-based decoder. It is versatile in that it can generate\u0000motions directly as well as be used as post-processing for existing methods to\u0000improve quality and contact controllability. In addition, we propose a new\u0000metric that measures a correlation pattern of motions based on our empirical\u0000evidence, aligning well with human perception in terms of motion naturalness.\u0000Based on extensive evaluation, our method outperforms existing methods in terms\u0000of style expressivity and motion quality.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sheng Ye, Yuze He, Matthieu Lin, Jenny Sheng, Ruoyu Fan, Yiheng Han, Yubin Hu, Ran Yi, Yu-Hui Wen, Yong-Jin Liu, Wenping Wang
Neural implicit representations have revolutionized dense multi-view surface reconstruction, yet their performance diminishes significantly with sparse input views. A few pioneering works have sought to tackle the challenge of sparse-view reconstruction by leveraging additional geometric priors or multi-scene generalizability. However, they are still hindered by the imperfect choice of input views, relying on images captured under empirically determined viewpoints to provide considerable overlap. We propose PVP-Recon, a novel and effective sparse-view surface reconstruction method that progressively plans the next best views to form an optimal set of sparse viewpoints for image capturing. PVP-Recon starts surface reconstruction with as few as 3 views and progressively adds new views, which are selected based on a novel warping score that reflects the information gain of each newly added view. This progressive view planning process is interleaved with a neural SDF-based reconstruction module that utilizes multi-resolution hash features, enhanced by a progressive training scheme and a directional Hessian loss. Quantitative and qualitative experiments on three benchmark datasets show that our framework achieves high-quality reconstruction with a constrained input budget and outperforms existing baselines.
{"title":"PVP-Recon: Progressive View Planning via Warping Consistency for Sparse-View Surface Reconstruction","authors":"Sheng Ye, Yuze He, Matthieu Lin, Jenny Sheng, Ruoyu Fan, Yiheng Han, Yubin Hu, Ran Yi, Yu-Hui Wen, Yong-Jin Liu, Wenping Wang","doi":"arxiv-2409.05474","DOIUrl":"https://doi.org/arxiv-2409.05474","url":null,"abstract":"Neural implicit representations have revolutionized dense multi-view surface\u0000reconstruction, yet their performance significantly diminishes with sparse\u0000input views. A few pioneering works have sought to tackle the challenge of\u0000sparse-view reconstruction by leveraging additional geometric priors or\u0000multi-scene generalizability. However, they are still hindered by the imperfect\u0000choice of input views, using images under empirically determined viewpoints to\u0000provide considerable overlap. We propose PVP-Recon, a novel and effective\u0000sparse-view surface reconstruction method that progressively plans the next\u0000best views to form an optimal set of sparse viewpoints for image capturing.\u0000PVP-Recon starts initial surface reconstruction with as few as 3 views and\u0000progressively adds new views which are determined based on a novel warping\u0000score that reflects the information gain of each newly added view. This\u0000progressive view planning progress is interleaved with a neural SDF-based\u0000reconstruction module that utilizes multi-resolution hash features, enhanced by\u0000a progressive training scheme and a directional Hessian loss. Quantitative and\u0000qualitative experiments on three benchmark datasets show that our framework\u0000achieves high-quality reconstruction with a constrained input budget and\u0000outperforms existing baselines.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Congyi Zhang, Jinfan Yang, Eric Hedlin, Suzuran Takikawa, Nicholas Vining, Kwang Moo Yi, Wenping Wang, Alla Sheffer
Compressed representations of 3D shapes that are compact, accurate, and can be processed efficiently directly in compressed form are extremely useful for digital media applications. Recent approaches in this space focus on learned implicit or parametric representations. While implicits are well suited for tasks such as in-out queries, they lack a natural 2D parameterization, complicating tasks such as texture or normal mapping. Conversely, parametric representations support the latter tasks but are ill-suited for occupancy queries. We propose NESI, a novel learned alternative to these approaches, based on intersections of localized explicit, or height-field, surfaces. Since explicits can be trivially expressed both implicitly and parametrically, NESI directly supports a wider range of processing operations than implicit alternatives, including occupancy queries and parametric access. We represent input shapes using a collection of differently oriented height-field-bounded half-spaces combined using volumetric Boolean intersections. We first tightly bound each input using a pair of oppositely oriented height-fields, forming a Double Height-Field (DHF) Hull. We refine this hull by intersecting it with additional localized height-fields (HFs) that capture surface regions in its interior. We minimize the number of HFs necessary to accurately capture each input and compactly encode both the DHF hull and the local HFs as neural functions defined over subdomains of R^2. This reduced-dimensionality encoding delivers high-quality compact approximations. Given a similar parameter count, or storage capacity, NESI significantly reduces approximation error compared to the state of the art, especially at lower parameter counts.
{"title":"NESI: Shape Representation via Neural Explicit Surface Intersection","authors":"Congyi Zhang, Jinfan Yang, Eric Hedlin, Suzuran Takikawa, Nicholas Vining, Kwang Moo Yi, Wenping Wang, Alla Sheffer","doi":"arxiv-2409.06030","DOIUrl":"https://doi.org/arxiv-2409.06030","url":null,"abstract":"Compressed representations of 3D shapes that are compact, accurate, and can\u0000be processed efficiently directly in compressed form, are extremely useful for\u0000digital media applications. Recent approaches in this space focus on learned\u0000implicit or parametric representations. While implicits are well suited for\u0000tasks such as in-out queries, they lack natural 2D parameterization,\u0000complicating tasks such as texture or normal mapping. Conversely, parametric\u0000representations support the latter tasks but are ill-suited for occupancy\u0000queries. We propose a novel learned alternative to these approaches, based on\u0000intersections of localized explicit, or height-field, surfaces. Since explicits\u0000can be trivially expressed both implicitly and parametrically, NESI directly\u0000supports a wider range of processing operations than implicit alternatives,\u0000including occupancy queries and parametric access. We represent input shapes\u0000using a collection of differently oriented height-field bounded half-spaces\u0000combined using volumetric Boolean intersections. We first tightly bound each\u0000input using a pair of oppositely oriented height-fields, forming a Double\u0000Height-Field (DHF) Hull. We refine this hull by intersecting it with additional\u0000localized height-fields (HFs) that capture surface regions in its interior. We\u0000minimize the number of HFs necessary to accurately capture each input and\u0000compactly encode both the DHF hull and the local HFs as neural functions\u0000defined over subdomains of R^2. This reduced dimensionality encoding delivers\u0000high-quality compact approximations. Given similar parameter count, or storage\u0000capacity, NESI significantly reduces approximation error compared to the state\u0000of the art, especially at lower parameter counts.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benjamin Attal, Dor Verbin, Ben Mildenhall, Peter Hedman, Jonathan T. Barron, Matthew O'Toole, Pratul P. Srinivasan
State-of-the-art techniques for 3D reconstruction are largely based on volumetric scene representations, which require sampling multiple points to compute the color arriving along a ray. Using these representations for more general inverse rendering -- reconstructing geometry, materials, and lighting from observed images -- is challenging because recursively path-tracing such volumetric representations is expensive. Recent works alleviate this issue through the use of radiance caches: data structures that store the steady-state, infinite-bounce radiance arriving at any point from any direction. However, these solutions rely on approximations that introduce bias into the renderings and, more importantly, into the gradients used for optimization. We present a method that avoids these approximations while remaining computationally efficient. In particular, we leverage two techniques to reduce variance for unbiased estimators of the rendering equation: (1) an occlusion-aware importance sampler for incoming illumination and (2) a fast cache architecture that can be used as a control variate for the radiance from a high-quality, but more expensive, volumetric cache. We show that, by removing these biases, our approach improves the generality of radiance-cache-based inverse rendering and increases quality in the presence of challenging light transport effects such as specular reflections.
{"title":"Flash Cache: Reducing Bias in Radiance Cache Based Inverse Rendering","authors":"Benjamin Attal, Dor Verbin, Ben Mildenhall, Peter Hedman, Jonathan T. Barron, Matthew O'Toole, Pratul P. Srinivasan","doi":"arxiv-2409.05867","DOIUrl":"https://doi.org/arxiv-2409.05867","url":null,"abstract":"State-of-the-art techniques for 3D reconstruction are largely based on\u0000volumetric scene representations, which require sampling multiple points to\u0000compute the color arriving along a ray. Using these representations for more\u0000general inverse rendering -- reconstructing geometry, materials, and lighting\u0000from observed images -- is challenging because recursively path-tracing such\u0000volumetric representations is expensive. Recent works alleviate this issue\u0000through the use of radiance caches: data structures that store the\u0000steady-state, infinite-bounce radiance arriving at any point from any\u0000direction. However, these solutions rely on approximations that introduce bias\u0000into the renderings and, more importantly, into the gradients used for\u0000optimization. We present a method that avoids these approximations while\u0000remaining computationally efficient. In particular, we leverage two techniques\u0000to reduce variance for unbiased estimators of the rendering equation: (1) an\u0000occlusion-aware importance sampler for incoming illumination and (2) a fast\u0000cache architecture that can be used as a control variate for the radiance from\u0000a high-quality, but more expensive, volumetric cache. We show that by removing\u0000these biases our approach improves the generality of radiance cache based\u0000inverse rendering, as well as increasing quality in the presence of challenging\u0000light transport effects such as specular reflections.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Score Distillation Sampling (SDS) has emerged as a prevalent technique for text-to-3D generation, enabling 3D content creation by distilling view-dependent information from text-to-2D guidance. However, SDS-based methods frequently exhibit shortcomings such as over-saturated color and excess smoothness. In this paper, we conduct a thorough analysis of SDS and refine its formulation, finding that the core design is to model the distribution of rendered images. Following this insight, we introduce a novel strategy called Variational Distribution Mapping (VDM), which expedites the distribution modeling process by regarding the rendered images as instances of degradation from diffusion-based generation. This design enables efficient training of the variational distribution by skipping the calculation of the Jacobians in the diffusion U-Net. We also introduce timestep-dependent Distribution Coefficient Annealing (DCA) to further improve distillation precision. Leveraging VDM and DCA, we use Gaussian Splatting as the 3D representation and build a text-to-3D generation framework. Extensive experiments and evaluations demonstrate the capability of VDM and DCA to generate high-fidelity and realistic assets with optimization efficiency.
{"title":"DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping","authors":"Zeyu Cai, Duotun Wang, Yixun Liang, Zhijing Shao, Ying-Cong Chen, Xiaohang Zhan, Zeyu Wang","doi":"arxiv-2409.05099","DOIUrl":"https://doi.org/arxiv-2409.05099","url":null,"abstract":"Score Distillation Sampling (SDS) has emerged as a prevalent technique for\u0000text-to-3D generation, enabling 3D content creation by distilling\u0000view-dependent information from text-to-2D guidance. However, they frequently\u0000exhibit shortcomings such as over-saturated color and excess smoothness. In\u0000this paper, we conduct a thorough analysis of SDS and refine its formulation,\u0000finding that the core design is to model the distribution of rendered images.\u0000Following this insight, we introduce a novel strategy called Variational\u0000Distribution Mapping (VDM), which expedites the distribution modeling process\u0000by regarding the rendered images as instances of degradation from\u0000diffusion-based generation. This special design enables the efficient training\u0000of variational distribution by skipping the calculations of the Jacobians in\u0000the diffusion U-Net. We also introduce timestep-dependent Distribution\u0000Coefficient Annealing (DCA) to further improve distilling precision. Leveraging\u0000VDM and DCA, we use Gaussian Splatting as the 3D representation and build a\u0000text-to-3D generation framework. Extensive experiments and evaluations\u0000demonstrate the capability of VDM and DCA to generate high-fidelity and\u0000realistic assets with optimization efficiency.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the advancement of computer vision, dynamic 3D reconstruction techniques have seen significant progress and found applications in various fields. However, these techniques generate large amounts of 3D data sequences, necessitating efficient storage and transmission methods. Existing 3D model compression methods primarily focus on static models and do not consider inter-frame information, limiting their ability to reduce data size. Temporal mesh compression, which has received less attention, often requires all input meshes to have the same topology, a condition rarely met in real-world applications. This research proposes a method to compress mesh sequences with arbitrary topology using temporal correspondence and mesh deformation. The method establishes temporal correspondence between consecutive frames, applies a deformation model to transform the mesh from one frame to subsequent frames, and replaces the original meshes with deformed ones if the quality meets a tolerance threshold. Extensive experiments demonstrate that this method achieves state-of-the-art compression performance. The contributions of this paper include a geometry- and motion-based model for establishing temporal correspondence between meshes, a mesh quality assessment for temporal mesh sequences, an entropy-based encoding and corner-table-based method for compressing mesh sequences, and extensive experiments showing the effectiveness of the proposed method. All code will be open-sourced at https://github.com/lszhuhaichao/ultron.
{"title":"Ultron: Enabling Temporal Geometry Compression of 3D Mesh Sequences using Temporal Correspondence and Mesh Deformation","authors":"Haichao Zhu","doi":"arxiv-2409.05151","DOIUrl":"https://doi.org/arxiv-2409.05151","url":null,"abstract":"With the advancement of computer vision, dynamic 3D reconstruction techniques\u0000have seen significant progress and found applications in various fields.\u0000However, these techniques generate large amounts of 3D data sequences,\u0000necessitating efficient storage and transmission methods. Existing 3D model\u0000compression methods primarily focus on static models and do not consider\u0000inter-frame information, limiting their ability to reduce data size. Temporal\u0000mesh compression, which has received less attention, often requires all input\u0000meshes to have the same topology, a condition rarely met in real-world\u0000applications. This research proposes a method to compress mesh sequences with\u0000arbitrary topology using temporal correspondence and mesh deformation. The\u0000method establishes temporal correspondence between consecutive frames, applies\u0000a deformation model to transform the mesh from one frame to subsequent frames,\u0000and replaces the original meshes with deformed ones if the quality meets a\u0000tolerance threshold. Extensive experiments demonstrate that this method can\u0000achieve state-of-the-art performance in terms of compression performance. The\u0000contributions of this paper include a geometry and motion-based model for\u0000establishing temporal correspondence between meshes, a mesh quality assessment\u0000for temporal mesh sequences, an entropy-based encoding and corner table-based\u0000method for compressing mesh sequences, and extensive experiments showing the\u0000effectiveness of the proposed method. All the code will be open-sourced at\u0000https://github.com/lszhuhaichao/ultron.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zimu Liao, Siyan Chen, Rong Fu, Yi Wang, Zhongling Su, Hao Luo, Linning Xu, Bo Dai, Hengjie Li, Zhilin Pei, Xingcheng Zhang
Recently, 3D Gaussian Splatting (3DGS) has garnered attention for its high fidelity and real-time rendering. However, adapting 3DGS to different camera models, particularly fisheye lenses, poses challenges due to the unique 3D-to-2D projection calculation. Additionally, there are inefficiencies in the tile-based splatting, especially for the extreme curvature and wide field of view of fisheye lenses, which are crucial for its broader real-life applications. To tackle these challenges, we introduce Fisheye-GS. This innovative method recalculates the projection transformation and its gradients for fisheye cameras. Our approach can be seamlessly integrated as a module into other efficient 3D rendering methods, emphasizing its extensibility, lightweight nature, and modular design. Since only the projection component is modified, the method can also be easily adapted for use with different camera models. Compared to methods that train after undistortion, our approach demonstrates a clear improvement in visual quality.
{"title":"Fisheye-GS: Lightweight and Extensible Gaussian Splatting Module for Fisheye Cameras","authors":"Zimu Liao, Siyan Chen, Rong Fu, Yi Wang, Zhongling Su, Hao Luo, Linning Xu, Bo Dai, Hengjie Li, Zhilin Pei, Xingcheng Zhang","doi":"arxiv-2409.04751","DOIUrl":"https://doi.org/arxiv-2409.04751","url":null,"abstract":"Recently, 3D Gaussian Splatting (3DGS) has garnered attention for its high\u0000fidelity and real-time rendering. However, adapting 3DGS to different camera\u0000models, particularly fisheye lenses, poses challenges due to the unique 3D to\u00002D projection calculation. Additionally, there are inefficiencies in the\u0000tile-based splatting, especially for the extreme curvature and wide field of\u0000view of fisheye lenses, which are crucial for its broader real-life\u0000applications. To tackle these challenges, we introduce Fisheye-GS.This\u0000innovative method recalculates the projection transformation and its gradients\u0000for fisheye cameras. Our approach can be seamlessly integrated as a module into\u0000other efficient 3D rendering methods, emphasizing its extensibility,\u0000lightweight nature, and modular design. Since we only modified the projection\u0000component, it can also be easily adapted for use with different camera models.\u0000Compared to methods that train after undistortion, our approach demonstrates a\u0000clear improvement in visual quality.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yotam Erel, Or Kozlovsky-Mordenfeld, Daisuke Iwai, Kosuke Sato, Amit H. Bermano
We present a technique for dynamically projecting 3D content onto human hands with short perceived motion-to-photon latency. Computing the pose and shape of human hands accurately and quickly is a challenging task due to their articulated and deformable nature. We combine a slower 3D coarse estimation of the hand pose with high-speed 2D correction steps that improve the alignment of the projection to the hands, increase the projected surface area, and reduce perceived latency. Since our approach leverages a full 3D reconstruction of the hands, any arbitrary texture or reasonably performant effect can be applied, which was not possible before. We conducted two user studies to assess the benefits of our method. The results show that subjects are less sensitive to latency artifacts and perform an associated task faster and with greater ease than with the naive approach of directly projecting frames rendered from the 3D pose estimate. We demonstrate several novel use cases and applications.
{"title":"Casper DPM: Cascaded Perceptual Dynamic Projection Mapping onto Hands","authors":"Yotam Erel, Or Kozlovsky-Mordenfeld, Daisuke Iwai, Kosuke Sato, Amit H. Bermano","doi":"arxiv-2409.04397","DOIUrl":"https://doi.org/arxiv-2409.04397","url":null,"abstract":"We present a technique for dynamically projecting 3D content onto human hands\u0000with short perceived motion-to-photon latency. Computing the pose and shape of\u0000human hands accurately and quickly is a challenging task due to their\u0000articulated and deformable nature. We combine a slower 3D coarse estimation of\u0000the hand pose with high speed 2D correction steps which improve the alignment\u0000of the projection to the hands, increase the projected surface area, and reduce\u0000perceived latency. Since our approach leverages a full 3D reconstruction of the\u0000hands, any arbitrary texture or reasonably performant effect can be applied,\u0000which was not possible before. We conducted two user studies to assess the\u0000benefits of using our method. The results show subjects are less sensitive to\u0000latency artifacts and perform faster and with more ease a given associated task\u0000over the naive approach of directly projecting rendered frames from the 3D pose\u0000estimation. We demonstrate several novel use cases and applications.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stephan Olbrich, Andreas Beckert, Cécile Michel, Christian Schroer, Samaneh Ehteram, Andreas Schropp, Philipp Paetzold
Cuneiform is the earliest known system of writing, first developed for the Sumerian language of southern Mesopotamia in the second half of the 4th millennium BC. Cuneiform signs are obtained by impressing a stylus on fresh clay tablets. For certain purposes, e.g. authentication by seal imprint, some cuneiform tablets were enclosed in clay envelopes, which cannot be opened without destroying them. The aim of our interdisciplinary project is the non-invasive study of clay tablets. A portable X-ray micro-CT scanner is developed to acquire density data of such artifacts on a high-resolution, regular 3D grid at collection sites. The resulting volume data is processed through feature-preserving denoising, extraction of high-accuracy surfaces using a manifold dual marching cubes algorithm, and extraction of local features by enhanced curvature rendering and ambient occlusion. For the non-invasive study of cuneiform inscriptions, the tablet is virtually separated from its envelope by curvature-based segmentation. The computation- and data-intensive algorithms are optimized for near-real-time offline usage with limited resources at collection sites. To visualize the complexity-reduced, octree-based compressed representation of surfaces, we develop and implement an interactive application. To facilitate the analysis of such clay tablets, we implement shape-based feature extraction algorithms to enhance cuneiform recognition. Our workflow supports innovative 3D display and interaction techniques such as autostereoscopic displays and gesture control.
{"title":"Efficient Analysis and Visualization of High-Resolution Computed Tomography Data for the Exploration of Enclosed Cuneiform Tablets","authors":"Stephan Olbrich, Andreas Beckert, Cécile Michel, Christian Schroer, Samaneh Ehteram, Andreas Schropp, Philipp Paetzold","doi":"arxiv-2409.04236","DOIUrl":"https://doi.org/arxiv-2409.04236","url":null,"abstract":"Cuneiform is the earliest known system of writing, first developed for the\u0000Sumerian language of southern Mesopotamia in the second half of the 4th\u0000millennium BC. Cuneiform signs are obtained by impressing a stylus on fresh\u0000clay tablets. For certain purposes, e.g. authentication by seal imprint, some\u0000cuneiform tablets were enclosed in clay envelopes, which cannot be opened\u0000without destroying them. The aim of our interdisciplinary project is the\u0000non-invasive study of clay tablets. A portable X-ray micro-CT scanner is\u0000developed to acquire density data of such artifacts on a high-resolution,\u0000regular 3D grid at collection sites. The resulting volume data is processed\u0000through feature-preserving denoising, extraction of high-accuracy surfaces\u0000using a manifold dual marching cubes algorithm and extraction of local features\u0000by enhanced curvature rendering and ambient occlusion. For the non-invasive\u0000study of cuneiform inscriptions, the tablet is virtually separated from its\u0000envelope by curvature-based segmentation. The computational- and data-intensive\u0000algorithms are optimized or near-real-time offline usage with limited resources\u0000at collection sites. To visualize the complexity-reduced and octree-based\u0000compressed representation of surfaces, we develop and implement an interactive\u0000application. To facilitate the analysis of such clay tablets, we implement\u0000shape-based feature extraction algorithms to enhance cuneiform recognition. Our\u0000workflow supports innovative 3D display and interaction techniques such as\u0000autostereoscopic displays and gesture control.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}