Jan-Niklas Dihlmann, Arjun Majumdar, Andreas Engelhardt, Raphael Braun, Hendrik P. A. Lensch
3D reconstruction and relighting of objects made from scattering materials present a significant challenge due to the complex light transport beneath the surface. 3D Gaussian Splatting introduced high-quality novel view synthesis at real-time speeds. While 3D Gaussians efficiently approximate an object's surface, they fail to capture the volumetric properties of subsurface scattering. We propose a framework for optimizing an object's shape together with the radiance transfer field given multi-view OLAT (one light at a time) data. Our method decomposes the scene into an explicit surface represented as 3D Gaussians, with a spatially varying BRDF, and an implicit volumetric representation of the scattering component. A learned incident light field accounts for shadowing. We optimize all parameters jointly via ray-traced differentiable rendering. Our approach enables material editing, relighting, and novel view synthesis at interactive rates. We show successful application on synthetic data and introduce a newly acquired multi-view multi-light dataset of objects in a light-stage setup. Compared to previous work, we achieve comparable or better results at a fraction of the optimization and rendering time while enabling detailed control over material attributes. Project page: https://sss.jdihlmann.com/
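For context, the surface/volume split described above follows the classical decomposition of outgoing radiance into a local BRDF term plus a BSSRDF subsurface term; the generic statement of that split is given below (the paper's specific Gaussian and implicit-field parameterization is not reproduced here):

\[
L_o(\mathbf{x}_o, \omega_o) \;=\;
\int_{\Omega} f_r(\mathbf{x}_o, \omega_i, \omega_o)\, L_i(\mathbf{x}_o, \omega_i)\, (\mathbf{n} \cdot \omega_i)\, \mathrm{d}\omega_i
\;+\;
\int_{A} \int_{\Omega} S(\mathbf{x}_i, \omega_i; \mathbf{x}_o, \omega_o)\, L_i(\mathbf{x}_i, \omega_i)\, (\mathbf{n} \cdot \omega_i)\, \mathrm{d}\omega_i\, \mathrm{d}A(\mathbf{x}_i),
\]

where the first term is local surface reflection with a spatially varying BRDF \(f_r\), the second term is subsurface transport described by a BSSRDF \(S\), and \(L_i\) is the incident light field, which the method above learns in order to account for shadowing.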
{"title":"Subsurface Scattering for 3D Gaussian Splatting","authors":"Jan-Niklas Dihlmann, Arjun Majumdar, Andreas Engelhardt, Raphael Braun, Hendrik P. A. Lensch","doi":"arxiv-2408.12282","DOIUrl":"https://doi.org/arxiv-2408.12282","url":null,"abstract":"3D reconstruction and relighting of objects made from scattering materials\u0000present a significant challenge due to the complex light transport beneath the\u0000surface. 3D Gaussian Splatting introduced high-quality novel view synthesis at\u0000real-time speeds. While 3D Gaussians efficiently approximate an object's\u0000surface, they fail to capture the volumetric properties of subsurface\u0000scattering. We propose a framework for optimizing an object's shape together\u0000with the radiance transfer field given multi-view OLAT (one light at a time)\u0000data. Our method decomposes the scene into an explicit surface represented as\u00003D Gaussians, with a spatially varying BRDF, and an implicit volumetric\u0000representation of the scattering component. A learned incident light field\u0000accounts for shadowing. We optimize all parameters jointly via ray-traced\u0000differentiable rendering. Our approach enables material editing, relighting and\u0000novel view synthesis at interactive rates. We show successful application on\u0000synthetic data and introduce a newly acquired multi-view multi-light dataset of\u0000objects in a light-stage setup. Compared to previous work we achieve comparable\u0000or better results at a fraction of optimization and rendering time while\u0000enabling detailed control over material attributes. Project page\u0000https://sss.jdihlmann.com/","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent single-view 3D generative methods have made significant advancements by leveraging knowledge distilled from extensive 3D object datasets. However, challenges persist in the synthesis of 3D scenes from a single view, primarily due to the complexity of real-world environments and the limited availability of high-quality prior resources. In this paper, we introduce a novel approach called Pano2Room, designed to automatically reconstruct high-quality 3D indoor scenes from a single panoramic image. These panoramic images can be easily generated using a panoramic RGBD inpainter from captures at a single location with any camera. The key idea is to initially construct a preliminary mesh from the input panorama, and iteratively refine this mesh using a panoramic RGBD inpainter while collecting photo-realistic 3D-consistent pseudo novel views. Finally, the refined mesh is converted into a 3D Gaussian Splatting field and trained with the collected pseudo novel views. This pipeline enables the reconstruction of real-world 3D scenes, even in the presence of large occlusions, and facilitates the synthesis of photo-realistic novel views with detailed geometry. Extensive qualitative and quantitative experiments have been conducted to validate the superiority of our method in single-panorama indoor novel view synthesis compared to the state-of-the-art. Our code and data are available at https://github.com/TrickyGo/Pano2Room.
{"title":"Pano2Room: Novel View Synthesis from a Single Indoor Panorama","authors":"Guo Pu, Yiming Zhao, Zhouhui Lian","doi":"arxiv-2408.11413","DOIUrl":"https://doi.org/arxiv-2408.11413","url":null,"abstract":"Recent single-view 3D generative methods have made significant advancements\u0000by leveraging knowledge distilled from extensive 3D object datasets. However,\u0000challenges persist in the synthesis of 3D scenes from a single view, primarily\u0000due to the complexity of real-world environments and the limited availability\u0000of high-quality prior resources. In this paper, we introduce a novel approach\u0000called Pano2Room, designed to automatically reconstruct high-quality 3D indoor\u0000scenes from a single panoramic image. These panoramic images can be easily\u0000generated using a panoramic RGBD inpainter from captures at a single location\u0000with any camera. The key idea is to initially construct a preliminary mesh from\u0000the input panorama, and iteratively refine this mesh using a panoramic RGBD\u0000inpainter while collecting photo-realistic 3D-consistent pseudo novel views.\u0000Finally, the refined mesh is converted into a 3D Gaussian Splatting field and\u0000trained with the collected pseudo novel views. This pipeline enables the\u0000reconstruction of real-world 3D scenes, even in the presence of large\u0000occlusions, and facilitates the synthesis of photo-realistic novel views with\u0000detailed geometry. Extensive qualitative and quantitative experiments have been\u0000conducted to validate the superiority of our method in single-panorama indoor\u0000novel synthesis compared to the state-of-the-art. Our code and data are\u0000available at url{https://github.com/TrickyGo/Pano2Room}.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xuan Huang, Haichao Miao, Hyojin Kim, Andrew Townsend, Kyle Champley, Joseph Tringe, Valerio Pascucci, Peer-Timo Bremer
Advanced manufacturing creates increasingly complex objects with material compositions that are often difficult to characterize by a single modality. Our collaborating domain scientists are going beyond traditional methods by employing both X-ray and neutron computed tomography to obtain complementary representations expected to better resolve material boundaries. However, the use of two modalities creates its own challenges for visualization, requiring either complex adjustments of bimodal transfer functions or multiple views. Together with experts in nondestructive evaluation, we designed a novel interactive bimodal visualization approach to create a combined view of the co-registered X-ray and neutron acquisitions of industrial objects. Using an automatic topological segmentation of the bivariate histogram of X-ray and neutron values as a starting point, the system provides a simple yet effective interface to easily create, explore, and adjust a bimodal visualization. We propose a widget with simple brushing interactions that enables the user to quickly correct the segmented histogram results. Our semiautomated system enables domain experts to intuitively explore large bimodal datasets without the need for either advanced segmentation algorithms or knowledge of visualization techniques. We demonstrate our approach using synthetic examples.
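As a small illustration of the starting point described above, the following sketch builds the bivariate (X-ray value vs. neutron value) histogram from two co-registered volumes using NumPy; the array contents and bin counts are placeholders, and the topological segmentation itself is beyond this snippet.

import numpy as np

# Assume two co-registered reconstructions of the same object:
# xray and neutron are float32 volumes of identical shape (Z, Y, X).
rng = np.random.default_rng(0)
xray = rng.random((64, 64, 64), dtype=np.float32)      # placeholder data
neutron = rng.random((64, 64, 64), dtype=np.float32)   # placeholder data

# Bivariate histogram over (x-ray value, neutron value) pairs.
# Each voxel contributes one sample; peaks and ridges in this 2D histogram
# correspond to materials that a topological segmentation would isolate.
hist, xray_edges, neutron_edges = np.histogram2d(
    xray.ravel(), neutron.ravel(), bins=256,
    range=[[0.0, 1.0], [0.0, 1.0]],
)

# Log-scaled counts are commonly used for display, since material peaks
# can differ by orders of magnitude in voxel count.
hist_log = np.log1p(hist)
print(hist_log.shape, hist_log.max())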
{"title":"Bimodal Visualization of Industrial X-Ray and Neutron Computed Tomography Data","authors":"Xuan Huang, Haichao Miao, Hyojin Kim, Andrew Townsend, Kyle Champley, Joseph Tringe, Valerio Pascucci, Peer-Timo Bremer","doi":"arxiv-2408.11957","DOIUrl":"https://doi.org/arxiv-2408.11957","url":null,"abstract":"Advanced manufacturing creates increasingly complex objects with material\u0000compositions that are often difficult to characterize by a single modality. Our\u0000collaborating domain scientists are going beyond traditional methods by\u0000employing both X-ray and neutron computed tomography to obtain complementary\u0000representations expected to better resolve material boundaries. However, the\u0000use of two modalities creates its own challenges for visualization, requiring\u0000either complex adjustments of bimodal transfer functions or the need for\u0000multiple views. Together with experts in nondestructive evaluation, we designed\u0000a novel interactive bimodal visualization approach to create a combined view of\u0000the co-registered X-ray and neutron acquisitions of industrial objects. Using\u0000an automatic topological segmentation of the bivariate histogram of X-ray and\u0000neutron values as a starting point, the system provides a simple yet effective\u0000interface to easily create, explore, and adjust a bimodal visualization. We\u0000propose a widget with simple brushing interactions that enables the user to\u0000quickly correct the segmented histogram results. Our semiautomated system\u0000enables domain experts to intuitively explore large bimodal datasets without\u0000the need for either advanced segmentation algorithms or knowledge of\u0000visualization techniques. We demonstrate our approach using synthetic examp","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"434 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We address a persistent challenge in text-to-image models: accurately generating a specified number of objects. Current models, which learn from image-text pairs, inherently struggle with counting, as training data cannot depict every possible number of objects for any given object. To solve this, we propose optimizing the generated image based on a counting loss derived from a counting model that aggregates an object's potential. Employing an out-of-the-box counting model is challenging for two reasons: first, the model requires a scaling hyperparameter for the potential aggregation that varies depending on the viewpoint of the objects, and second, classifier guidance techniques require modified models that operate on noisy intermediate diffusion steps. To address these challenges, we propose an iterated online training mode that improves the accuracy of inferred images while altering the text conditioning embedding and dynamically adjusting hyperparameters. Our method offers three key advantages: (i) it can consider non-derivable counting techniques based on detection models, (ii) it is a zero-shot plug-and-play solution facilitating rapid changes to the counting techniques and image generation methods, and (iii) the optimized counting token can be reused to generate accurate images without additional optimization. We evaluate the generation of various objects and show significant improvements in accuracy. The project page is available at https://ozzafar.github.io/count_token.
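Below is a toy sketch of the core idea of optimizing a conditioning embedding against a differentiable counting loss; the generator and counter are stand-in modules, not the paper's diffusion model or detection-based counter, and the dimensions are arbitrary.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins: a "generator" mapping a conditioning embedding to an image-like
# tensor, and a differentiable "counter" estimating how many objects it shows.
generator = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 16 * 16))
counter = nn.Sequential(nn.Linear(16 * 16, 64), nn.ReLU(), nn.Linear(64, 1), nn.Softplus())
for p in list(generator.parameters()) + list(counter.parameters()):
    p.requires_grad_(False)  # both networks stay frozen

count_token = nn.Parameter(torch.randn(1, 32) * 0.1)  # the embedding being optimized
target_count = torch.tensor([[7.0]])

opt = torch.optim.Adam([count_token], lr=1e-2)
for step in range(200):
    image = generator(count_token)      # frozen generator
    predicted = counter(image)          # differentiable count estimate
    loss = (predicted - target_count).pow(2).mean()
    opt.zero_grad()
    loss.backward()                     # gradients flow only into count_token
    opt.step()

# The optimized embedding can be cached and reused, mirroring the idea that an
# optimized counting token transfers to later generations without re-optimizing.
print(float(counter(generator(count_token))))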
{"title":"Iterative Object Count Optimization for Text-to-image Diffusion Models","authors":"Oz Zafar, Lior Wolf, Idan Schwartz","doi":"arxiv-2408.11721","DOIUrl":"https://doi.org/arxiv-2408.11721","url":null,"abstract":"We address a persistent challenge in text-to-image models: accurately\u0000generating a specified number of objects. Current models, which learn from\u0000image-text pairs, inherently struggle with counting, as training data cannot\u0000depict every possible number of objects for any given object. To solve this, we\u0000propose optimizing the generated image based on a counting loss derived from a\u0000counting model that aggregates an object's potential. Employing an\u0000out-of-the-box counting model is challenging for two reasons: first, the model\u0000requires a scaling hyperparameter for the potential aggregation that varies\u0000depending on the viewpoint of the objects, and second, classifier guidance\u0000techniques require modified models that operate on noisy intermediate diffusion\u0000steps. To address these challenges, we propose an iterated online training mode\u0000that improves the accuracy of inferred images while altering the text\u0000conditioning embedding and dynamically adjusting hyperparameters. Our method\u0000offers three key advantages: (i) it can consider non-derivable counting\u0000techniques based on detection models, (ii) it is a zero-shot plug-and-play\u0000solution facilitating rapid changes to the counting techniques and image\u0000generation methods, and (iii) the optimized counting token can be reused to\u0000generate accurate images without additional optimization. We evaluate the\u0000generation of various objects and show significant improvements in accuracy.\u0000The project page is available at https://ozzafar.github.io/count_token.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"60 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhijing Shao, Duotun Wang, Qing-Yao Tian, Yao-Dong Yang, Hengyu Meng, Zeyu Cai, Bo Dong, Yu Zhang, Kang Zhang, Zeyu Wang
Although neural rendering has made significant advancements in creating lifelike, animatable full-body and head avatars, incorporating detailed expressions into full-body avatars remains largely unexplored. We present DEGAS, the first 3D Gaussian Splatting (3DGS)-based modeling method for full-body avatars with rich facial expressions. Trained on multiview videos of a given subject, our method learns a conditional variational autoencoder that takes both the body motion and facial expression as driving signals to generate Gaussian maps in the UV layout. To drive the facial expressions, instead of the commonly used 3D Morphable Models (3DMMs) in 3D head avatars, we propose to adopt the expression latent space trained solely on 2D portrait images, bridging the gap between 2D talking faces and 3D avatars. Leveraging the rendering capability of 3DGS and the rich expressiveness of the expression latent space, the learned avatars can be reenacted to reproduce photorealistic rendering images with subtle and accurate facial expressions. Experiments on an existing dataset and our newly proposed dataset of full-body talking avatars demonstrate the efficacy of our method. We also propose an audio-driven extension of our method with the help of 2D talking faces, opening new possibilities for interactive AI agents.
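A minimal sketch of the kind of conditional VAE described above, mapping driving signals to a UV-space Gaussian map: the dimensions and the 14-channel texel layout (position, rotation, scale, opacity, color) are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class ConditionalGaussianMapVAE(nn.Module):
    """Toy conditional VAE: (body motion, expression latent) -> UV Gaussian map.

    Assumed channel layout per texel, for illustration only:
    3 position offsets + 4 rotation (quaternion) + 3 scale + 1 opacity + 3 color = 14.
    """
    def __init__(self, uv_res=32, motion_dim=72, expr_dim=128, z_dim=64, channels=14):
        super().__init__()
        self.uv_res, self.channels = uv_res, channels
        cond_dim = motion_dim + expr_dim
        flat = channels * uv_res * uv_res
        self.encoder = nn.Sequential(nn.Linear(flat + cond_dim, 512), nn.ReLU(), nn.Linear(512, 2 * z_dim))
        self.decoder = nn.Sequential(nn.Linear(z_dim + cond_dim, 512), nn.ReLU(), nn.Linear(512, flat))

    def forward(self, gaussian_map, motion, expression):
        cond = torch.cat([motion, expression], dim=-1)
        h = self.encoder(torch.cat([gaussian_map.flatten(1), cond], dim=-1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        recon = self.decoder(torch.cat([z, cond], dim=-1))
        return recon.view(-1, self.channels, self.uv_res, self.uv_res), mu, logvar

# At test time the decoder can be driven by motion + expression alone, with z
# sampled from the prior, mirroring reenactment from driving signals.
model = ConditionalGaussianMapVAE()
maps = torch.randn(2, 14, 32, 32)
motion, expr = torch.randn(2, 72), torch.randn(2, 128)
recon, mu, logvar = model(maps, motion, expr)
print(recon.shape)  # torch.Size([2, 14, 32, 32])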
尽管神经渲染技术在创建栩栩如生、可动画化的全身和头部头像方面取得了重大进展,但在全身头像中融入细致表情的技术在很大程度上仍未得到探索。我们展示了首个基于 3D 高斯拼接(3DGS)的建模方法--DEGAS,用于制作面部表情丰富的全身头像。我们的方法在给定对象的多视角视频上进行训练,学习条件变异自动编码器,将身体运动和面部表情作为驱动信号,在 UV 布局中生成高斯图。为了驱动面部表情,我们建议采用仅在二维肖像图像上训练的表情潜空间,而不是三维头像中常用的三维可变形模型(3DMM),从而缩小了二维会说话的人脸和三维头像之间的差距。利用 3DGS 的增强能力和表情潜空间的丰富表现力,学习到的头像可以重现逼真的渲染图像,并带有微妙而准确的面部表情。在现有数据集和我们新提出的全身会说话的头像数据集上进行的实验证明了我们方法的有效性。我们还提出了一种音频驱动的扩展方法,借助二维会说话的人脸,为交互式人工智能代理开辟了新的可能性。
{"title":"DEGAS: Detailed Expressions on Full-Body Gaussian Avatars","authors":"Zhijing Shao, Duotun Wang, Qing-Yao Tian, Yao-Dong Yang, Hengyu Meng, Zeyu Cai, Bo Dong, Yu Zhang, Kang Zhang, Zeyu Wang","doi":"arxiv-2408.10588","DOIUrl":"https://doi.org/arxiv-2408.10588","url":null,"abstract":"Although neural rendering has made significant advancements in creating\u0000lifelike, animatable full-body and head avatars, incorporating detailed\u0000expressions into full-body avatars remains largely unexplored. We present\u0000DEGAS, the first 3D Gaussian Splatting (3DGS)-based modeling method for\u0000full-body avatars with rich facial expressions. Trained on multiview videos of\u0000a given subject, our method learns a conditional variational autoencoder that\u0000takes both the body motion and facial expression as driving signals to generate\u0000Gaussian maps in the UV layout. To drive the facial expressions, instead of the\u0000commonly used 3D Morphable Models (3DMMs) in 3D head avatars, we propose to\u0000adopt the expression latent space trained solely on 2D portrait images,\u0000bridging the gap between 2D talking faces and 3D avatars. Leveraging the\u0000rendering capability of 3DGS and the rich expressiveness of the expression\u0000latent space, the learned avatars can be reenacted to reproduce photorealistic\u0000rendering images with subtle and accurate facial expressions. Experiments on an\u0000existing dataset and our newly proposed dataset of full-body talking avatars\u0000demonstrate the efficacy of our method. We also propose an audio-driven\u0000extension of our method with the help of 2D talking faces, opening new\u0000possibilities to interactive AI agents.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Open-world 3D reconstruction models have recently garnered significant attention. However, without sufficient 3D inductive bias, existing methods typically entail expensive training costs and struggle to extract high-quality 3D meshes. In this work, we introduce MeshFormer, a sparse-view reconstruction model that explicitly leverages 3D native structure, input guidance, and training supervision. Specifically, instead of using a triplane representation, we store features in 3D sparse voxels and combine transformers with 3D convolutions to leverage an explicit 3D structure and projective bias. In addition to sparse-view RGB input, we require the network to take normal maps as input and to generate corresponding normal maps as output. The input normal maps can be predicted by 2D diffusion models, significantly aiding in the guidance and refinement of the geometry's learning. Moreover, by combining Signed Distance Function (SDF) supervision with surface rendering, we directly learn to generate high-quality meshes without the need for complex multi-stage training processes. By incorporating these explicit 3D biases, MeshFormer can be trained efficiently and deliver high-quality textured meshes with fine-grained geometric details. It can also be integrated with 2D diffusion models to enable fast single-image-to-3D and text-to-3D tasks. Project page: https://meshformer3d.github.io
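A toy sketch of combining a 3D convolution with transformer-style self-attention over voxel tokens, the hybrid the abstract describes; it uses a small dense feature volume for simplicity, whereas the paper operates on 3D sparse voxels, which this sketch does not model.

import torch
import torch.nn as nn

class VoxelConvAttentionBlock(nn.Module):
    """Toy hybrid block: a 3D convolution for local structure, followed by
    self-attention over flattened voxel tokens for global context."""
    def __init__(self, channels=32, heads=4):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv3d(channels, channels, 3, padding=1), nn.GELU())
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):                       # x: (B, C, D, H, W)
        x = x + self.conv(x)                    # local 3D inductive bias
        b, c, d, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, D*H*W, C) voxel tokens
        t = self.norm(tokens)
        tokens = tokens + self.attn(t, t, t, need_weights=False)[0]
        return tokens.transpose(1, 2).view(b, c, d, h, w)

block = VoxelConvAttentionBlock()
feats = torch.randn(1, 32, 8, 8, 8)   # a small dense feature volume
print(block(feats).shape)             # torch.Size([1, 32, 8, 8, 8])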
{"title":"MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model","authors":"Minghua Liu, Chong Zeng, Xinyue Wei, Ruoxi Shi, Linghao Chen, Chao Xu, Mengqi Zhang, Zhaoning Wang, Xiaoshuai Zhang, Isabella Liu, Hongzhi Wu, Hao Su","doi":"arxiv-2408.10198","DOIUrl":"https://doi.org/arxiv-2408.10198","url":null,"abstract":"Open-world 3D reconstruction models have recently garnered significant\u0000attention. However, without sufficient 3D inductive bias, existing methods\u0000typically entail expensive training costs and struggle to extract high-quality\u00003D meshes. In this work, we introduce MeshFormer, a sparse-view reconstruction\u0000model that explicitly leverages 3D native structure, input guidance, and\u0000training supervision. Specifically, instead of using a triplane representation,\u0000we store features in 3D sparse voxels and combine transformers with 3D\u0000convolutions to leverage an explicit 3D structure and projective bias. In\u0000addition to sparse-view RGB input, we require the network to take input and\u0000generate corresponding normal maps. The input normal maps can be predicted by\u00002D diffusion models, significantly aiding in the guidance and refinement of the\u0000geometry's learning. Moreover, by combining Signed Distance Function (SDF)\u0000supervision with surface rendering, we directly learn to generate high-quality\u0000meshes without the need for complex multi-stage training processes. By\u0000incorporating these explicit 3D biases, MeshFormer can be trained efficiently\u0000and deliver high-quality textured meshes with fine-grained geometric details.\u0000It can also be integrated with 2D diffusion models to enable fast\u0000single-image-to-3D and text-to-3D tasks. Project page:\u0000https://meshformer3d.github.io","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yue Chang, Otman Benchekroun, Maurizio M. Chiaramonte, Peter Yichen Chen, Eitan Grinspun
The eigenfunctions of the Laplace operator are essential in mathematical physics, engineering, and geometry processing. Typically, these are computed by discretizing the domain and performing eigendecomposition, tying the results to a specific mesh. However, this method is unsuitable for continuously-parameterized shapes. We propose a novel representation for eigenfunctions in continuously-parameterized shape spaces, where eigenfunctions are spatial fields with continuous dependence on shape parameters, defined by minimal Dirichlet energy, unit norm, and mutual orthogonality. We implement this with multilayer perceptrons trained as neural fields, mapping shape parameters and domain positions to eigenfunction values. A unique challenge is enforcing mutual orthogonality with respect to causality, where the causal ordering varies across the shape space. Our training method therefore requires three interwoven concepts: (1) learning $n$ eigenfunctions concurrently by minimizing Dirichlet energy with unit norm constraints; (2) filtering gradients during backpropagation to enforce causal orthogonality, preventing earlier eigenfunctions from being influenced by later ones; (3) dynamically sorting the causal ordering based on eigenvalues to track eigenvalue curve crossovers. We demonstrate our method on problems such as shape family analysis, predicting eigenfunctions for incomplete shapes, interactive shape manipulation, and computing higher-dimensional eigenfunctions, on all of which traditional methods fall short.
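For reference, the variational characterization the definition above appeals to: the k-th Laplacian eigenfunction minimizes the Dirichlet energy subject to unit norm and orthogonality to the earlier eigenfunctions (boundary conditions left implicit).

\[
\phi_k \;=\; \arg\min_{u} \int_{\Omega} \lVert \nabla u \rVert^2 \, \mathrm{d}x
\quad \text{subject to} \quad \int_{\Omega} u^2 \, \mathrm{d}x = 1, \qquad \int_{\Omega} u\, \phi_j \, \mathrm{d}x = 0 \ \ \text{for } j < k,
\]

and the attained energy is the corresponding eigenvalue, \(\lambda_k = \int_{\Omega} \lVert \nabla \phi_k \rVert^2 \, \mathrm{d}x\), with \(-\Delta \phi_k = \lambda_k \phi_k\). The method above enforces exactly these constraints while letting \(\phi_k\) also depend continuously on the shape parameters.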
{"title":"Neural Representation of Shape-Dependent Laplacian Eigenfunctions","authors":"Yue Chang, Otman Benchekroun, Maurizio M. Chiaramonte, Peter Yichen Chen, Eitan Grinspun","doi":"arxiv-2408.10099","DOIUrl":"https://doi.org/arxiv-2408.10099","url":null,"abstract":"The eigenfunctions of the Laplace operator are essential in mathematical\u0000physics, engineering, and geometry processing. Typically, these are computed by\u0000discretizing the domain and performing eigendecomposition, tying the results to\u0000a specific mesh. However, this method is unsuitable for\u0000continuously-parameterized shapes. We propose a novel representation for eigenfunctions in\u0000continuously-parameterized shape spaces, where eigenfunctions are spatial\u0000fields with continuous dependence on shape parameters, defined by minimal\u0000Dirichlet energy, unit norm, and mutual orthogonality. We implement this with\u0000multilayer perceptrons trained as neural fields, mapping shape parameters and\u0000domain positions to eigenfunction values. A unique challenge is enforcing mutual orthogonality with respect to\u0000causality, where the causal ordering varies across the shape space. Our\u0000training method therefore requires three interwoven concepts: (1) learning $n$\u0000eigenfunctions concurrently by minimizing Dirichlet energy with unit norm\u0000constraints; (2) filtering gradients during backpropagation to enforce causal\u0000orthogonality, preventing earlier eigenfunctions from being influenced by later\u0000ones; (3) dynamically sorting the causal ordering based on eigenvalues to track\u0000eigenvalue curve crossovers. We demonstrate our method on problems such as shape family analysis,\u0000predicting eigenfunctions for incomplete shapes, interactive shape\u0000manipulation, and computing higher-dimensional eigenfunctions, on all of which\u0000traditional methods fall short.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"284 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proper representation of data in graphical visualizations becomes challenging when high accuracy in data types is required, especially in situations where the difference between double-precision and single-precision floating-point values is significant. The limitations of using single precision instead of double precision include lower accuracy, which accumulates errors over time, and poor modeling of very large or very small numbers. In such scenarios, emulated double precision is often used as a workaround. The proposed methodology relies on a modern GPU pipeline and graphics library API specifications to access native double precision. In this research, the approach is implemented using the Vulkan API, C++, and GLSL. An experimental evaluation on a series of 2D and 3D point datasets demonstrates the effectiveness of the approach, comparing native double-precision implementations against emulated double-precision approaches with respect to rendering performance and accuracy. This study provides insight into the benefits of using native double precision in graphical applications and highlights the limitations and problems of emulated double precision. These results improve the general understanding of the precision involved in graphical visualizations and assist developers in deciding which precision methods to use in their applications.
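For background on what the paper compares against, here is a minimal sketch of the classic "double-single" (two-float) emulation of higher precision, written with NumPy float32 scalars for illustration rather than GLSL; the variable names and test values are arbitrary.

import numpy as np

f32 = np.float32  # all emulation arithmetic must stay in single precision

def two_sum(a, b):
    """Knuth's TwoSum: returns (s, e) with s + e == a + b exactly in float32."""
    s = f32(a + b)
    v = f32(s - a)
    e = f32(f32(a - f32(s - v)) + f32(b - v))
    return s, e

def quick_two_sum(a, b):
    """Faster variant, valid when |a| >= |b|."""
    s = f32(a + b)
    e = f32(b - f32(s - a))
    return s, e

def split(x):
    """Represent a Python double as an unevaluated sum of two float32 values."""
    hi = f32(x)
    lo = f32(x - np.float64(hi))
    return hi, lo

def ds_add(a, b):
    """Add two double-single numbers given as (hi, lo) pairs."""
    s, e = two_sum(a[0], b[0])
    e = f32(e + f32(a[1] + b[1]))
    return quick_two_sum(s, e)

x, y = np.pi, np.sqrt(2.0)
ds = ds_add(split(x), split(y))
emulated = np.float64(ds[0]) + np.float64(ds[1])
print("float32  :", f32(x) + f32(y))              # roughly 7 significant digits
print("emulated :", emulated)                      # roughly twice that
print("float64  :", np.float64(x) + np.float64(y))

The extra operations per arithmetic step are the cost the paper sidesteps by using native doubles through the Vulkan pipeline.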
{"title":"Double-Precision Floating-Point Data Visualizations Using Vulkan API","authors":"Nezihe Sozen","doi":"arxiv-2408.09699","DOIUrl":"https://doi.org/arxiv-2408.09699","url":null,"abstract":"Proper representation of data in graphical visualizations becomes challenging\u0000when high accuracy in data types is required, especially in those situations\u0000where the difference between double-precision floating-point and\u0000single-precision floating-point values makes a significant difference. Some of\u0000the limitations of using single-precision over double-precision include lesser\u0000accuracy, which accumulates errors over time, and poor modeling of large or\u0000small numbers. In such scenarios, emulated double precision is often used as a\u0000solution. The proposed methodology uses a modern GPU pipeline and graphics\u0000library API specifications to use native double precision. In this research,\u0000the approach is implemented using the Vulkan API, C++, and GLSL. Experimental\u0000evaluation with a series of experiments on 2D and 3D point datasets is proposed\u0000to indicate the effectiveness of the approach. This evaluates performance\u0000comparisons between native double-precision implementations against their\u0000emulated double-precision approaches with respect to rendering performance and\u0000accuracy. This study provides insight into the benefits of using native\u0000double-precision in graphical applications, denoting limitations and problems\u0000with emulated double-precision usages. These results improve the general\u0000understanding of the precision involved in graphical visualizations and assist\u0000developers in making decisions about which precision methods to use during\u0000their applications.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"40 2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on smooth vector graphics is separated into two independent research threads: one on interpolation-based gradient meshes and the other on diffusion-based curve formulations. With this paper, we propose a mathematical formulation that unifies gradient meshes and curve-based approaches as the solution to a Poisson problem. To combine these two well-known representations, we first generate a non-overlapping intermediate patch representation that specifies for each patch a target Laplacian and boundary conditions. Unifying the treatment of boundary conditions adds further artistic degrees of freedom to the existing formulations, such as Neumann conditions on diffusion curves. To synthesize a raster image for a given output resolution, we then rasterize boundary conditions and Laplacians for the respective patches and compute the final image as the solution to a Poisson problem. We evaluate the method on various test scenes containing gradient meshes and curve-based primitives. Since our mathematical formulation works with established smooth vector graphics primitives on the front-end, it is compatible with existing content creation pipelines and with established editing tools. Rather than continuing two separate research paths, we hope that a unification of the formulations will lead to new rasterization and vectorization tools in the future that utilize the strengths of both approaches.
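Concretely, the per-patch problem described above can be written in standard form as a Poisson equation with mixed boundary conditions (notation here is generic; the paper's discretization may differ):

\[
\Delta u = L_p \ \text{in } \Omega_p, \qquad
u = g \ \text{on } \partial\Omega_p^{\mathrm{D}}, \qquad
\frac{\partial u}{\partial n} = h \ \text{on } \partial\Omega_p^{\mathrm{N}},
\]

where \(u\) is the image channel being solved for on patch \(\Omega_p\), \(L_p\) is the patch's rasterized target Laplacian, and \(g\) and \(h\) are the rasterized Dirichlet and Neumann boundary data (e.g., diffusion-curve colors and cross-boundary derivatives). The final raster image is obtained by solving this problem over all patches at the chosen output resolution.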
{"title":"Unified Smooth Vector Graphics: Modeling Gradient Meshes and Curve-based Approaches Jointly as Poisson Problem","authors":"Xingze Tian, Tobias Günther","doi":"arxiv-2408.09211","DOIUrl":"https://doi.org/arxiv-2408.09211","url":null,"abstract":"Research on smooth vector graphics is separated into two independent research\u0000threads: one on interpolation-based gradient meshes and the other on\u0000diffusion-based curve formulations. With this paper, we propose a mathematical\u0000formulation that unifies gradient meshes and curve-based approaches as solution\u0000to a Poisson problem. To combine these two well-known representations, we first\u0000generate a non-overlapping intermediate patch representation that specifies for\u0000each patch a target Laplacian and boundary conditions. Unifying the treatment\u0000of boundary conditions adds further artistic degrees of freedoms to the\u0000existing formulations, such as Neumann conditions on diffusion curves. To\u0000synthesize a raster image for a given output resolution, we then rasterize\u0000boundary conditions and Laplacians for the respective patches and compute the\u0000final image as solution to a Poisson problem. We evaluate the method on various\u0000test scenes containing gradient meshes and curve-based primitives. Since our\u0000mathematical formulation works with established smooth vector graphics\u0000primitives on the front-end, it is compatible with existing content creation\u0000pipelines and with established editing tools. Rather than continuing two\u0000separate research paths, we hope that a unification of the formulations will\u0000lead to new rasterization and vectorization tools in the future that utilize\u0000the strengths of both approaches.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142221936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Topological abstractions offer a method to summarize the behavior of vector fields but computing them robustly can be challenging due to numerical precision issues. One alternative is to represent the vector field using a discrete approach, which constructs a collection of pairs of simplices in the input mesh that satisfies criteria introduced by Forman's discrete Morse theory. While numerous approaches exist to compute pairs in the restricted case of the gradient of a scalar field, state-of-the-art algorithms for the general case of vector fields require expensive optimization procedures. This paper introduces a fast, novel approach for pairing simplices of two-dimensional, triangulated vector fields that do not vary in time. The key insight of our approach is that we can employ a local evaluation, inspired by the approach used to construct a discrete gradient field, where every simplex in a mesh is considered by no more than one of its vertices. Specifically, we observe that for any edge in the input mesh, we can uniquely assign an outward direction of flow. We can further expand this consistent notion of outward flow at each vertex, which corresponds to the concept of a downhill flow in the case of scalar fields. Working with outward flow enables a linear-time algorithm that processes the (outward) neighborhoods of each vertex one-by-one, similar to the approach used for scalar fields. We couple our approach to constructing discrete vector fields with a method to extract, simplify, and visualize topological features. Empirical results on analytic and simulation data demonstrate drastic improvements in running time, produce features similar to the current state-of-the-art, and show the application of simplification to large, complex flows.
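The sketch below is a toy illustration of the "outward flow along an edge" notion only: the direction rule (sign of the field's projection onto the edge at its midpoint) and the tiny mesh are assumptions for demonstration, not the paper's criterion or its pairing algorithm.

import numpy as np

# Toy 2D triangulated domain: vertex positions and edges as index pairs.
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0], [1.5, 1.0]])
edges = np.array([[0, 1], [1, 2], [0, 2], [1, 3], [2, 3]])

def vector_field(p):
    """Analytic steady 2D field sampled at a point (rotation around the origin)."""
    return np.array([-p[1], p[0]])

def outward_endpoint(i, j):
    """Return the endpoint the flow leaves through: j if the field at the edge
    midpoint has positive projection onto (v_j - v_i), else i."""
    midpoint = 0.5 * (vertices[i] + vertices[j])
    projection = np.dot(vector_field(midpoint), vertices[j] - vertices[i])
    return j if projection > 0 else i

# Group edges by the vertex they flow out of; a localized pairing pass can then
# consider each vertex together with only its outgoing (outward) edges.
outgoing = {v: [] for v in range(len(vertices))}
for i, j in edges:
    tail = i if outward_endpoint(i, j) == j else j
    outgoing[tail].append((int(i), int(j)))

for v, es in outgoing.items():
    print(f"vertex {v}: outgoing edges {es}")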
{"title":"Localized Evaluation for Constructing Discrete Vector Fields","authors":"Tanner Finken, Julien Tierny, Joshua A Levine","doi":"arxiv-2408.04769","DOIUrl":"https://doi.org/arxiv-2408.04769","url":null,"abstract":"Topological abstractions offer a method to summarize the behavior of vector\u0000fields but computing them robustly can be challenging due to numerical\u0000precision issues. One alternative is to represent the vector field using a\u0000discrete approach, which constructs a collection of pairs of simplices in the\u0000input mesh that satisfies criteria introduced by Forman's discrete Morse\u0000theory. While numerous approaches exist to compute pairs in the restricted case\u0000of the gradient of a scalar field, state-of-the-art algorithms for the general\u0000case of vector fields require expensive optimization procedures. This paper\u0000introduces a fast, novel approach for pairing simplices of two-dimensional,\u0000triangulated vector fields that do not vary in time. The key insight of our\u0000approach is that we can employ a local evaluation, inspired by the approach\u0000used to construct a discrete gradient field, where every simplex in a mesh is\u0000considered by no more than one of its vertices. Specifically, we observe that\u0000for any edge in the input mesh, we can uniquely assign an outward direction of\u0000flow. We can further expand this consistent notion of outward flow at each\u0000vertex, which corresponds to the concept of a downhill flow in the case of\u0000scalar fields. Working with outward flow enables a linear-time algorithm that\u0000processes the (outward) neighborhoods of each vertex one-by-one, similar to the\u0000approach used for scalar fields. We couple our approach to constructing\u0000discrete vector fields with a method to extract, simplify, and visualize\u0000topological features. Empirical results on analytic and simulation data\u0000demonstrate drastic improvements in running time, produce features similar to\u0000the current state-of-the-art, and show the application of simplification to\u0000large, complex flows.","PeriodicalId":501174,"journal":{"name":"arXiv - CS - Graphics","volume":"71 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}