The Local Moran's I statistic is a valuable tool for identifying localized patterns of spatial autocorrelation. Understanding these patterns is crucial in spatial analysis, but interpreting the statistic can be difficult. To simplify this process, we introduce three novel visualizations that enhance the interpretation of Local Moran's I results. These visualizations can be interactively linked to one another, and to established visualizations, to offer a more holistic exploration of the results. We provide a JavaScript library with implementations of these new visual elements, along with a web dashboard that demonstrates their integrated use.
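For context on what the new views are interpreting, here is a minimal sketch of the Local Moran's I computation and the Moran-scatterplot quadrant labels (High-High, Low-Low, etc.) that local-cluster views are typically built from. It follows the standard Anselin (1995) formulation and is not taken from the paper's JavaScript library; permutation-based significance testing is omitted.

```python
import numpy as np

def local_morans_i(x, W):
    """x: (n,) attribute values; W: (n, n) row-standardized spatial weights matrix."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()                      # deviations from the global mean
    m2 = (z ** 2).mean()                  # scaling term (variance of x)
    lag = W @ z                           # spatially lagged deviations
    I = (z / m2) * lag                    # one Local Moran's I value per observation
    # Moran scatterplot quadrants used in cluster maps: HH, HL, LH, LL
    quad = np.where(z >= 0,
                    np.where(lag >= 0, "HH", "HL"),
                    np.where(lag >= 0, "LH", "LL"))
    return I, quad
```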
{"title":"Demystifying Spatial Dependence: Interactive Visualizations for Interpreting Local Spatial Autocorrelation","authors":"Lee Mason, Blanaid Hicks, Jonas Almeida","doi":"arxiv-2408.02418","DOIUrl":"https://doi.org/arxiv-2408.02418","url":null,"abstract":"The Local Moran's I statistic is a valuable tool for identifying localized\u0000patterns of spatial autocorrelation. Understanding these patterns is crucial in\u0000spatial analysis, but interpreting the statistic can be difficult. To simplify\u0000this process, we introduce three novel visualizations that enhance the\u0000interpretation of Local Moran's I results. These visualizations can be\u0000interactively linked to one another, and to established visualizations, to\u0000offer a more holistic exploration of the results. We provide a JavaScript\u0000library with implementations of these new visual elements, along with a web\u0000dashboard that demonstrates their integrated use.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiwen Chen, Yikai Wang, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang, Guosheng Lin
We introduce MeshAnything V2, an autoregressive transformer that generates Artist-Created Meshes (AM) aligned to given shapes. It can be integrated with various 3D asset production pipelines to achieve high-quality, highly controllable AM generation. MeshAnything V2 surpasses previous methods in both efficiency and performance using models of the same size. These improvements are due to our newly proposed mesh tokenization method: Adjacent Mesh Tokenization (AMT). Unlike previous methods, which represent each face with three vertices, AMT uses a single vertex whenever possible, and on average it requires about half the token sequence length to represent the same mesh. Furthermore, the token sequences from AMT are more compact and well-structured, fundamentally benefiting AM generation. Our extensive experiments show that AMT significantly improves the efficiency and performance of AM generation. Project Page: https://buaacyw.github.io/meshanything-v2/
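As a rough illustration of the single-vertex idea behind Adjacent Mesh Tokenization, the toy tokenizer below emits only the unshared vertex when consecutive faces share an edge. The face ordering, special tokens, and vocabulary of the actual AMT are defined in the paper; the `NEW_FACE` marker here is a hypothetical stand-in.

```python
NEW_FACE = "&"  # hypothetical restart marker, not the paper's token set

def amt_tokenize(faces):
    """faces: list of (v0, v1, v2) vertex-index triples, assumed pre-ordered so
    that consecutive faces tend to share an edge."""
    tokens, prev = [], None
    for face in faces:
        if prev is not None and len(set(face) & set(prev)) == 2:
            (new_vertex,) = set(face) - set(prev)   # the single unshared vertex
            tokens.append(new_vertex)
        else:
            tokens.append(NEW_FACE)                 # restart: emit the full face
            tokens.extend(face)
        prev = face
    return tokens

# A strip of three adjacent triangles needs 3 + 1 + 1 vertex tokens instead of 9.
print(amt_tokenize([(0, 1, 2), (1, 2, 3), (2, 3, 4)]))   # ['&', 0, 1, 2, 3, 4]
```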
{"title":"MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization","authors":"Yiwen Chen, Yikai Wang, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang, Guosheng Lin","doi":"arxiv-2408.02555","DOIUrl":"https://doi.org/arxiv-2408.02555","url":null,"abstract":"We introduce MeshAnything V2, an autoregressive transformer that generates\u0000Artist-Created Meshes (AM) aligned to given shapes. It can be integrated with\u0000various 3D asset production pipelines to achieve high-quality, highly\u0000controllable AM generation. MeshAnything V2 surpasses previous methods in both\u0000efficiency and performance using models of the same size. These improvements\u0000are due to our newly proposed mesh tokenization method: Adjacent Mesh\u0000Tokenization (AMT). Different from previous methods that represent each face\u0000with three vertices, AMT uses a single vertex whenever possible. Compared to\u0000previous methods, AMT requires about half the token sequence length to\u0000represent the same mesh in average. Furthermore, the token sequences from AMT\u0000are more compact and well-structured, fundamentally benefiting AM generation.\u0000Our extensive experiments show that AMT significantly improves the efficiency\u0000and performance of AM generation. Project Page:\u0000https://buaacyw.github.io/meshanything-v2/","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gilad Deutch, Rinon Gal, Daniel Garibi, Or Patashnik, Daniel Cohen-Or
Diffusion models have opened the path to a wide range of text-based image editing frameworks. However, these typically build on the multi-step nature of the backward diffusion process, and adapting them to distilled, fast-sampling methods has proven surprisingly challenging. Here, we focus on a popular line of text-based editing frameworks: the "edit-friendly" DDPM-noise inversion approach. We analyze its application to fast sampling methods and categorize its failures into two classes: the appearance of visual artifacts, and insufficient editing strength. We trace the artifacts to mismatched noise statistics between inverted noises and the expected noise schedule, and suggest a shifted noise schedule that corrects for this offset. To increase editing strength, we propose a pseudo-guidance approach that efficiently increases the magnitude of edits without introducing new artifacts. All in all, our method enables text-based image editing with as few as three diffusion steps, while providing novel insights into the mechanisms behind popular text-based editing approaches.
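As a hedged sketch of the general pattern the abstract alludes to, the snippet below strengthens an edit by extrapolating from a source-conditioned noise prediction toward an edit-conditioned one at each of the few denoising steps. The paper's actual pseudo-guidance rule and shifted noise schedule are defined there; `denoiser` is a hypothetical stand-in for a distilled diffusion model.

```python
def guided_edit_prediction(denoiser, x_t, t, src_prompt, edit_prompt, strength=1.5):
    eps_src = denoiser(x_t, t, src_prompt)     # reconstruction (source) direction
    eps_edit = denoiser(x_t, t, edit_prompt)   # edit direction
    # Extrapolate past the edit prediction to amplify the edit's magnitude.
    return eps_src + strength * (eps_edit - eps_src)
```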
{"title":"TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models","authors":"Gilad Deutch, Rinon Gal, Daniel Garibi, Or Patashnik, Daniel Cohen-Or","doi":"arxiv-2408.00735","DOIUrl":"https://doi.org/arxiv-2408.00735","url":null,"abstract":"Diffusion models have opened the path to a wide range of text-based image\u0000editing frameworks. However, these typically build on the multi-step nature of\u0000the diffusion backwards process, and adapting them to distilled, fast-sampling\u0000methods has proven surprisingly challenging. Here, we focus on a popular line\u0000of text-based editing frameworks - the ``edit-friendly'' DDPM-noise inversion\u0000approach. We analyze its application to fast sampling methods and categorize\u0000its failures into two classes: the appearance of visual artifacts, and\u0000insufficient editing strength. We trace the artifacts to mismatched noise\u0000statistics between inverted noises and the expected noise schedule, and suggest\u0000a shifted noise schedule which corrects for this offset. To increase editing\u0000strength, we propose a pseudo-guidance approach that efficiently increases the\u0000magnitude of edits without introducing new artifacts. All in all, our method\u0000enables text-based image editing with as few as three diffusion steps, while\u0000providing novel insights into the mechanisms behind popular text-based editing\u0000approaches.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nikos Athanasiou, Alpár Ceske, Markos Diomataris, Michael J. Black, Gül Varol
The focus of this paper is 3D motion editing. Given a 3D human motion and a textual description of the desired modification, our goal is to generate an edited motion as described by the text. The challenges include the lack of training data and the design of a model that faithfully edits the source motion. In this paper, we address both these challenges. We build a methodology to semi-automatically collect a dataset of triplets in the form of (i) a source motion, (ii) a target motion, and (iii) an edit text, and create the new MotionFix dataset. Having access to such data allows us to train a conditional diffusion model, TMED, that takes both the source motion and the edit text as input. We further build various baselines trained only on text-motion pair datasets, and show the superior performance of our model trained on triplets. We introduce new retrieval-based metrics for motion editing and establish a new benchmark on the evaluation set of MotionFix. Our results are encouraging, paving the way for further research on fine-grained motion generation. Code and models will be made publicly available.
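A minimal sketch of the triplet structure the MotionFix dataset is described as containing; the field types and shapes are illustrative assumptions rather than the released data format.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class MotionEditTriplet:
    source_motion: np.ndarray  # (num_frames, pose_dim) source motion sequence
    target_motion: np.ndarray  # (num_frames, pose_dim) edited (target) motion sequence
    edit_text: str             # natural-language description of the modification

# The conditional diffusion model (TMED) is trained to map
# (source_motion, edit_text) -> target_motion over such triplets.
```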
{"title":"MotionFix: Text-Driven 3D Human Motion Editing","authors":"Nikos Athanasiou, Alpár Ceske, Markos Diomataris, Michael J. Black, Gül Varol","doi":"arxiv-2408.00712","DOIUrl":"https://doi.org/arxiv-2408.00712","url":null,"abstract":"The focus of this paper is 3D motion editing. Given a 3D human motion and a\u0000textual description of the desired modification, our goal is to generate an\u0000edited motion as described by the text. The challenges include the lack of\u0000training data and the design of a model that faithfully edits the source\u0000motion. In this paper, we address both these challenges. We build a methodology\u0000to semi-automatically collect a dataset of triplets in the form of (i) a source\u0000motion, (ii) a target motion, and (iii) an edit text, and create the new\u0000MotionFix dataset. Having access to such data allows us to train a conditional\u0000diffusion model, TMED, that takes both the source motion and the edit text as\u0000input. We further build various baselines trained only on text-motion pairs\u0000datasets, and show superior performance of our model trained on triplets. We\u0000introduce new retrieval-based metrics for motion editing and establish a new\u0000benchmark on the evaluation set of MotionFix. Our results are encouraging,\u0000paving the way for further research on finegrained motion generation. Code and\u0000models will be made publicly available.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mark Boss, Zixuan Huang, Aaryaman Vasishta, Varun Jampani
We present SF3D, a novel method for rapid and high-quality textured object mesh reconstruction from a single image in just 0.5 seconds. Unlike most existing approaches, SF3D is explicitly trained for mesh generation, incorporating a fast UV unwrapping technique that enables swift texture generation rather than relying on vertex colors. The method also learns to predict material parameters and normal maps to enhance the visual quality of the reconstructed 3D meshes. Furthermore, SF3D integrates a delighting step to effectively remove low-frequency illumination effects, ensuring that the reconstructed meshes can be easily used in novel illumination conditions. Experiments demonstrate the superior performance of SF3D over existing techniques. Project page: https://stable-fast-3d.github.io
{"title":"SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement","authors":"Mark Boss, Zixuan Huang, Aaryaman Vasishta, Varun Jampani","doi":"arxiv-2408.00653","DOIUrl":"https://doi.org/arxiv-2408.00653","url":null,"abstract":"We present SF3D, a novel method for rapid and high-quality textured object\u0000mesh reconstruction from a single image in just 0.5 seconds. Unlike most\u0000existing approaches, SF3D is explicitly trained for mesh generation,\u0000incorporating a fast UV unwrapping technique that enables swift texture\u0000generation rather than relying on vertex colors. The method also learns to\u0000predict material parameters and normal maps to enhance the visual quality of\u0000the reconstructed 3D meshes. Furthermore, SF3D integrates a delighting step to\u0000effectively remove low-frequency illumination effects, ensuring that the\u0000reconstructed meshes can be easily used in novel illumination conditions.\u0000Experiments demonstrate the superior performance of SF3D over the existing\u0000techniques. Project page: https://stable-fast-3d.github.io","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural implicit representation, the parameterization of a distance function as a coordinate-based neural field, has emerged as a promising lead in tackling surface reconstruction from unoriented point clouds. To enforce consistent orientation, existing methods focus on regularizing the gradient of the distance function, such as constraining it to have unit norm, minimizing its divergence, or aligning it with the eigenvector of the Hessian corresponding to the zero eigenvalue. However, in the presence of large scanning noise, they tend to either overfit the noisy input or produce an excessively smooth reconstruction. In this work, we propose to guide surface reconstruction with a new variant of neural field, the octahedral field, leveraging the spherical harmonics representation of octahedral frames originating in hexahedral meshing. Such a field automatically snaps to geometric features when constrained to be smooth, and naturally preserves sharp angles when interpolated over creases. By simultaneously fitting and smoothing the octahedral field alongside the implicit geometry, our method behaves analogously to bilateral filtering, resulting in smooth reconstructions that preserve sharp edges. Despite operating purely pointwise, our method outperforms various traditional and neural approaches across extensive experiments, and is highly competitive with methods that require normal and data priors. Our full implementation is available at: https://github.com/Ankbzpx/frame-field.
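For reference, the snippet below sketches two of the gradient regularizers the abstract mentions for neural signed-distance fields, the unit-norm (eikonal) term and a divergence term, in PyTorch; the paper's octahedral-field prior itself is not reproduced here.

```python
import torch

def eikonal_and_divergence_losses(sdf, points):
    """sdf: callable mapping (N, 3) points to (N, 1) signed distances."""
    points = points.clone().requires_grad_(True)
    d = sdf(points)
    grad = torch.autograd.grad(d.sum(), points, create_graph=True)[0]   # (N, 3)
    eikonal = ((grad.norm(dim=-1) - 1.0) ** 2).mean()   # encourage ||grad d|| = 1
    # Divergence of the gradient field, i.e. the Laplacian of d.
    div = 0.0
    for i in range(points.shape[-1]):
        div = div + torch.autograd.grad(grad[:, i].sum(), points,
                                        create_graph=True)[0][:, i]
    divergence = div.abs().mean()
    return eikonal, divergence
```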
{"title":"Neural Octahedral Field: Octahedral prior for simultaneous smoothing and sharp edge regularization","authors":"Ruichen Zheng, Tao Yu","doi":"arxiv-2408.00303","DOIUrl":"https://doi.org/arxiv-2408.00303","url":null,"abstract":"Neural implicit representation, the parameterization of distance function as\u0000a coordinate neural field, has emerged as a promising lead in tackling surface\u0000reconstruction from unoriented point clouds. To enforce consistent orientation,\u0000existing methods focus on regularizing the gradient of the distance function,\u0000such as constraining it to be of the unit norm, minimizing its divergence, or\u0000aligning it with the eigenvector of Hessian that corresponds to zero\u0000eigenvalue. However, under the presence of large scanning noise, they tend to\u0000either overfit the noise input or produce an excessively smooth reconstruction.\u0000In this work, we propose to guide the surface reconstruction under a new\u0000variant of neural field, the octahedral field, leveraging the spherical\u0000harmonics representation of octahedral frames originated in the hexahedral\u0000meshing. Such field automatically snaps to geometry features when constrained\u0000to be smooth, and naturally preserves sharp angles when interpolated over\u0000creases. By simultaneously fitting and smoothing the octahedral field alongside\u0000the implicit geometry, it behaves analogously to bilateral filtering, resulting\u0000in smooth reconstruction while preserving sharp edges. Despite being operated\u0000purely pointwise, our method outperforms various traditional and neural\u0000approaches across extensive experiments, and is very competitive with methods\u0000that require normal and data priors. Our full implementation is available at:\u0000https://github.com/Ankbzpx/frame-field.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Manuel Kansy, Jacek Naruniec, Christopher Schroers, Markus Gross, Romann M. Weber
Recent years have seen a tremendous improvement in the quality of video generation and editing approaches. While several techniques focus on editing appearance, few address motion. Current approaches using text, trajectories, or bounding boxes are limited to simple motions, so we specify motions with a single motion reference video instead. We further propose to use a pre-trained image-to-video model rather than a text-to-video model. This approach allows us to preserve the exact appearance and position of a target object or scene and helps disentangle appearance from motion. Our method, called motion-textual inversion, leverages our observation that image-to-video models extract appearance mainly from the (latent) image input, while the text/image embedding injected via cross-attention predominantly controls motion. We thus represent motion using text/image embedding tokens. By operating on an inflated motion-text embedding containing multiple text/image embedding tokens per frame, we achieve a high temporal motion granularity. Once optimized on the motion reference video, this embedding can be applied to various target images to generate videos with semantically similar motions. Our approach does not require spatial alignment between the motion reference video and target image, generalizes across various domains, and can be applied to various tasks such as full-body and face reenactment, as well as controlling the motion of inanimate objects and the camera. We empirically demonstrate the effectiveness of our method in the semantic video motion transfer task, significantly outperforming existing methods in this context.
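A hedged sketch of the optimization idea described here: a set of per-frame embedding tokens is fitted against a frozen image-to-video model so that it reconstructs the motion reference video. `i2v_denoising_loss`, the token count, and the embedding width are hypothetical stand-ins, not the model interface used in the paper.

```python
import torch

def fit_motion_embedding(i2v_denoising_loss, first_frame, ref_video,
                         num_frames, tokens_per_frame=4, embed_dim=768,
                         steps=1000, lr=1e-3):
    # Inflated motion-text embedding: several learnable tokens per frame.
    motion_emb = torch.randn(num_frames, tokens_per_frame, embed_dim,
                             requires_grad=True)
    opt = torch.optim.Adam([motion_emb], lr=lr)
    for _ in range(steps):
        # The frozen image-to-video model receives the (latent) first frame for
        # appearance and the learnable embedding via cross-attention for motion.
        loss = i2v_denoising_loss(first_frame, motion_emb, ref_video)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return motion_emb.detach()   # reusable on new target images with similar motion
```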
{"title":"Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion","authors":"Manuel Kansy, Jacek Naruniec, Christopher Schroers, Markus Gross, Romann M. Weber","doi":"arxiv-2408.00458","DOIUrl":"https://doi.org/arxiv-2408.00458","url":null,"abstract":"Recent years have seen a tremendous improvement in the quality of video\u0000generation and editing approaches. While several techniques focus on editing\u0000appearance, few address motion. Current approaches using text, trajectories, or\u0000bounding boxes are limited to simple motions, so we specify motions with a\u0000single motion reference video instead. We further propose to use a pre-trained\u0000image-to-video model rather than a text-to-video model. This approach allows us\u0000to preserve the exact appearance and position of a target object or scene and\u0000helps disentangle appearance from motion. Our method, called motion-textual\u0000inversion, leverages our observation that image-to-video models extract\u0000appearance mainly from the (latent) image input, while the text/image embedding\u0000injected via cross-attention predominantly controls motion. We thus represent\u0000motion using text/image embedding tokens. By operating on an inflated\u0000motion-text embedding containing multiple text/image embedding tokens per\u0000frame, we achieve a high temporal motion granularity. Once optimized on the\u0000motion reference video, this embedding can be applied to various target images\u0000to generate videos with semantically similar motions. Our approach does not\u0000require spatial alignment between the motion reference video and target image,\u0000generalizes across various domains, and can be applied to various tasks such as\u0000full-body and face reenactment, as well as controlling the motion of inanimate\u0000objects and the camera. We empirically demonstrate the effectiveness of our\u0000method in the semantic video motion transfer task, significantly outperforming\u0000existing methods in this context.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In volume visualization, visualization synthesis has attracted much attention due to its ability to generate novel visualizations without following the conventional rendering pipeline. However, existing solutions based on generative adversarial networks often require many training images and significant training time, and issues of low quality, limited consistency, and inflexibility persist. This paper introduces StyleRF-VolVis, an innovative style transfer framework for expressive volume visualization (VolVis) via neural radiance fields (NeRF). The expressiveness of StyleRF-VolVis rests on its ability to accurately separate the underlying scene geometry (i.e., content) from the color appearance (i.e., style), to conveniently modify the color, opacity, and lighting of the original rendering while maintaining visual content consistency across views, and to effectively transfer arbitrary styles from reference images to the reconstructed 3D scene. To achieve this, we design a base NeRF model for scene geometry extraction, a palette color network that classifies regions of the radiance field for photorealistic editing, and an unrestricted color network that lifts the color palette constraint via knowledge distillation for non-photorealistic editing. We demonstrate the superior quality, consistency, and flexibility of StyleRF-VolVis by experimenting with various volume rendering scenes and reference images, and by comparing StyleRF-VolVis against other image-based (AdaIN), video-based (ReReVST), and NeRF-based (ARF and SNeRF) style rendering solutions.
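As an illustrative toy of the palette-based recoloring idea, the module below assigns each radiance-field sample soft weights over a small set of learnable palette colors, so editing one palette entry recolors its region consistently across views. The paper's palette color network and training procedure are its own and are not reproduced here.

```python
import torch
import torch.nn as nn

class PaletteColorHead(nn.Module):
    def __init__(self, feature_dim=64, num_palette=6):
        super().__init__()
        self.palette = nn.Parameter(torch.rand(num_palette, 3))   # K editable RGB colors
        self.to_weights = nn.Sequential(
            nn.Linear(feature_dim, 64), nn.ReLU(),
            nn.Linear(64, num_palette),
        )

    def forward(self, features):              # features: (N, feature_dim) per sample
        w = torch.softmax(self.to_weights(features), dim=-1)   # (N, K) soft assignment
        return w @ self.palette                # (N, 3) blended per-sample colors
```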
{"title":"StyleRF-VolVis: Style Transfer of Neural Radiance Fields for Expressive Volume Visualization","authors":"Kaiyuan Tang, Chaoli Wang","doi":"arxiv-2408.00150","DOIUrl":"https://doi.org/arxiv-2408.00150","url":null,"abstract":"In volume visualization, visualization synthesis has attracted much attention\u0000due to its ability to generate novel visualizations without following the\u0000conventional rendering pipeline. However, existing solutions based on\u0000generative adversarial networks often require many training images and take\u0000significant training time. Still, issues such as low quality, consistency, and\u0000flexibility persist. This paper introduces StyleRF-VolVis, an innovative style\u0000transfer framework for expressive volume visualization (VolVis) via neural\u0000radiance field (NeRF). The expressiveness of StyleRF-VolVis is upheld by its\u0000ability to accurately separate the underlying scene geometry (i.e., content)\u0000and color appearance (i.e., style), conveniently modify color, opacity, and\u0000lighting of the original rendering while maintaining visual content consistency\u0000across the views, and effectively transfer arbitrary styles from reference\u0000images to the reconstructed 3D scene. To achieve these, we design a base NeRF\u0000model for scene geometry extraction, a palette color network to classify\u0000regions of the radiance field for photorealistic editing, and an unrestricted\u0000color network to lift the color palette constraint via knowledge distillation\u0000for non-photorealistic editing. We demonstrate the superior quality,\u0000consistency, and flexibility of StyleRF-VolVis by experimenting with various\u0000volume rendering scenes and reference images and comparing StyleRF-VolVis\u0000against other image-based (AdaIN), video-based (ReReVST), and NeRF-based (ARF\u0000and SNeRF) style rendering solutions.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We introduce a conceptually simple and efficient algorithm for seamless parametrization, a key element in constructing quad layouts and texture charts on surfaces. More specifically, we consider the construction of parametrizations with prescribed holonomy signatures, i.e., a set of angles at singularities and rotations along homology loops; preserving these is essential for constructing parametrizations that follow an input field, as well as for user control of the parametrization structure. Our algorithm performs exceptionally well on a large dataset based on Thingi10k [Zhou and Jacobson 2016] (16,156 meshes), as well as on the challenging smaller dataset of [Myles et al. 2014], converging, on average, in 9 iterations. Although the algorithm lacks a formal mathematical guarantee, the presented empirical evidence and the connections between convex optimization and closely related algorithms suggest that a similar formulation can be found for this algorithm in the future.
{"title":"Seamless Parametrization in Penner Coordinates","authors":"Ryan Capouellez, Denis Zorin","doi":"arxiv-2407.21342","DOIUrl":"https://doi.org/arxiv-2407.21342","url":null,"abstract":"We introduce a conceptually simple and efficient algorithm for seamless\u0000parametrization, a key element in constructing quad layouts and texture charts\u0000on surfaces. More specifically, we consider the construction of\u0000parametrizations with prescribed holonomy signatures i.e., a set of angles at\u0000singularities, and rotations along homology loops, preserving which is\u0000essential for constructing parametrizations following an input field, as well\u0000as for user control of the parametrization structure. Our algorithm performs\u0000exceptionally well on a large dataset based on Thingi10k [Zhou and Jacobson\u00002016], (16156 meshes) as well as on a challenging smaller dataset of [Myles et\u0000al. 2014], converging, on average, in 9 iterations. Although the algorithm\u0000lacks a formal mathematical guarantee, presented empirical evidence and the\u0000connections between convex optimization and closely related algorithms, suggest\u0000that a similar formulation can be found for this algorithm in the future.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Gaussian diffusion model, initially designed for image generation, has recently been adapted for 3D point cloud generation. However, these adaptations have not fully considered the intrinsic geometric characteristics of 3D shapes, thereby constraining the diffusion model's potential for 3D shape manipulation. To address this limitation, we introduce a novel deformable 3D shape diffusion model that facilitates comprehensive 3D shape manipulation, including point cloud generation, mesh deformation, and facial animation. Our approach innovatively incorporates a differential deformation kernel, which deconstructs the generation of geometric structures into successive non-rigid deformation stages. By leveraging a probabilistic diffusion model to simulate this step-by-step process, our method provides a versatile and efficient solution for a wide range of applications, spanning from graphics rendering to facial expression animation. Empirical evidence highlights the effectiveness of our approach, demonstrating state-of-the-art performance in point cloud generation and competitive results in mesh deformation. Additionally, extensive visual demonstrations reveal the significant potential of our approach for practical applications. Our method presents a unique pathway for advancing 3D shape manipulation and unlocking new opportunities in the realm of virtual reality.
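A hedged sketch of the high-level generation scheme described here: starting from noise, a network predicts a per-point displacement at each stage and the point cloud is deformed step by step. The differential deformation kernel and the exact probabilistic-diffusion parameterization are the paper's; `displacement_net`, the stage count, and the step size are hypothetical stand-ins.

```python
import torch

def generate_by_successive_deformation(displacement_net, num_points=2048,
                                       num_stages=100, step_size=0.05):
    points = torch.randn(num_points, 3)                  # start from Gaussian noise
    for stage in reversed(range(num_stages)):
        t = torch.full((num_points, 1), float(stage))    # stage index fed to the network
        offsets = displacement_net(points, t)            # (num_points, 3) predicted motion
        points = points + step_size * offsets            # one non-rigid deformation step
    return points
```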
{"title":"Deformable 3D Shape Diffusion Model","authors":"Dengsheng Chen, Jie Hu, Xiaoming Wei, Enhua Wu","doi":"arxiv-2407.21428","DOIUrl":"https://doi.org/arxiv-2407.21428","url":null,"abstract":"The Gaussian diffusion model, initially designed for image generation, has\u0000recently been adapted for 3D point cloud generation. However, these adaptations\u0000have not fully considered the intrinsic geometric characteristics of 3D shapes,\u0000thereby constraining the diffusion model's potential for 3D shape manipulation.\u0000To address this limitation, we introduce a novel deformable 3D shape diffusion\u0000model that facilitates comprehensive 3D shape manipulation, including point\u0000cloud generation, mesh deformation, and facial animation. Our approach\u0000innovatively incorporates a differential deformation kernel, which deconstructs\u0000the generation of geometric structures into successive non-rigid deformation\u0000stages. By leveraging a probabilistic diffusion model to simulate this\u0000step-by-step process, our method provides a versatile and efficient solution\u0000for a wide range of applications, spanning from graphics rendering to facial\u0000expression animation. Empirical evidence highlights the effectiveness of our\u0000approach, demonstrating state-of-the-art performance in point cloud generation\u0000and competitive results in mesh deformation. Additionally, extensive visual\u0000demonstrations reveal the significant potential of our approach for practical\u0000applications. Our method presents a unique pathway for advancing 3D shape\u0000manipulation and unlocking new opportunities in the realm of virtual reality.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}