
arXiv - CS - Graphics: Latest Publications

2DGH: 2D Gaussian-Hermite Splatting for High-quality Rendering and Better Geometry Reconstruction
Pub Date : 2024-08-30 DOI: arxiv-2408.16982
Ruihan Yu, Tianyu Huang, Jingwang Ling, Feng Xu
2D Gaussian Splatting has recently emerged as a significant method in 3D reconstruction, enabling novel view synthesis and geometry reconstruction simultaneously. While the well-known Gaussian kernel is broadly used, its lack of anisotropy and deformation ability leads to dim and vague edges at object silhouettes, limiting the reconstruction quality of current Gaussian splatting methods. To enhance the representation power, we draw inspiration from quantum physics and propose to use the Gaussian-Hermite kernel as the new primitive in Gaussian splatting. The new kernel takes a unified mathematical form and extends the Gaussian function, which serves as the zero-rank term in the updated formulation. Our experiments demonstrate the extraordinary performance of the Gaussian-Hermite kernel in both geometry reconstruction and novel-view synthesis tasks. The proposed kernel outperforms traditional Gaussian Splatting kernels, showcasing its potential for high-quality 3D reconstruction and rendering.
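The abstract does not spell out the kernel's exact 2D formulation, but the idea of extending a Gaussian with Hermite terms can be illustrated with a minimal separable sketch; the function name, coefficient layout, and normalization below are assumptions for illustration, not the paper's definition.

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def gaussian_hermite_kernel(x, y, cx=(1.0,), cy=(1.0,), sigma=1.0):
    """Illustrative separable 2D Gaussian-Hermite kernel.

    Each axis evaluates a Hermite series sum_n c[n] * H_n(x / sigma) modulated
    by a Gaussian envelope. With cx = cy = (1.0,) only the zero-order term
    survives and the kernel reduces to a plain 2D Gaussian, mirroring the
    abstract's point that the Gaussian is the zero-rank term of the extended
    formulation.
    """
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return hermval(x / sigma, list(cx)) * hermval(y / sigma, list(cy)) * envelope

# Higher-order coefficients add sign-changing, anisotropic lobes that a plain
# Gaussian cannot represent -- the kind of extra flexibility aimed at sharper
# object silhouettes.
xs = np.linspace(-3.0, 3.0, 7)
print(gaussian_hermite_kernel(xs, np.zeros_like(xs), cx=(1.0, 0.5)))
```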
Citations: 0
Adaptive Multi-Resolution Encoding for Interactive Large-Scale Volume Visualization through Functional Approximation
Pub Date : 2024-08-30 DOI: arxiv-2409.00184
Jianxin Sun, David Lenz, Hongfeng Yu, Tom Peterka
Functional approximation as a high-order continuous representation provides a more accurate value and gradient query compared to the traditional discrete volume representation. Volume visualization directly rendered from functional approximation generates high-quality rendering results without high-order artifacts caused by trilinear interpolations. However, querying an encoded functional approximation is computationally expensive, especially when the input dataset is large, making functional approximation impractical for interactive visualization. In this paper, we propose a novel functional approximation multi-resolution representation, Adaptive-FAM, which is lightweight and fast to query. We also design a GPU-accelerated out-of-core multi-resolution volume visualization framework that directly utilizes the Adaptive-FAM representation to generate high-quality rendering with interactive responsiveness. Our method can not only dramatically decrease the caching time, one of the main contributors to input latency, but also effectively improve the cache hit rate through prefetching. Our approach significantly outperforms the traditional function approximation method in terms of input latency while maintaining comparable rendering quality.
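The abstract attributes the interactivity to caching and prefetching of the encoded representation; below is a minimal sketch of that general idea (an LRU block cache that speculatively loads neighbouring blocks). The block keys, the 6-neighbour prefetch pattern, and `load_block` are hypothetical placeholders, not Adaptive-FAM's actual data layout.

```python
from collections import OrderedDict

class BlockCache:
    """LRU cache over encoded blocks with neighbour prefetching (sketch)."""

    def __init__(self, load_block, capacity=256):
        self.load_block = load_block  # hypothetical: decodes one block of the
                                      # functional approximation from disk
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        """Return the block for key (i, j, k), loading and prefetching on a miss."""
        if key in self.cache:
            self.cache.move_to_end(key)      # cache hit: mark as recently used
            return self.cache[key]
        block = self._insert(key, self.load_block(key))
        self._prefetch(key)                  # hide the latency of likely next queries
        return block

    def _insert(self, key, block):
        self.cache[key] = block
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used block
        return block

    def _prefetch(self, key):
        i, j, k = key
        for di, dj, dk in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                           (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            nkey = (i + di, j + dj, k + dk)
            if nkey not in self.cache:
                self._insert(nkey, self.load_block(nkey))
```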
Citations: 0
RenDetNet: Weakly-supervised Shadow Detection with Shadow Caster Verification
Pub Date : 2024-08-30 DOI: arxiv-2408.17143
Nikolina Kubiak, Elliot Wortman, Armin Mustafa, Graeme Phillipson, Stephen Jolly, Simon Hadfield
Existing shadow detection models struggle to differentiate dark image areas from shadows. In this paper, we tackle this issue by verifying that all detected shadows are real, i.e. they have paired shadow casters. We perform this step in a physically-accurate manner by differentiably re-rendering the scene and observing the changes stemming from carving out estimated shadow casters. Thanks to this approach, the RenDetNet proposed in this paper is the first learning-based shadow detection model whose supervisory signals can be computed in a self-supervised manner. The developed system compares favourably against recent models trained on our data. As part of this publication, we release our code on github.
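As a rough reading of the verification step, a detected shadow counts as real only if carving out its estimated caster and re-rendering actually changes that region. The sketch below is a plain forward check with hypothetical `render_fn` / `remove_caster` hooks and an arbitrary threshold; the paper performs this differentiably so that it can serve as a training signal, which this sketch does not attempt.

```python
import numpy as np

def caster_verified(render_fn, remove_caster, scene, shadow_mask, caster_mask,
                    threshold=0.05):
    """Return True if the detected shadow responds to removing its caster.

    render_fn(scene) -> HxW image; remove_caster(scene, caster_mask) -> scene
    with that geometry carved out. Both are placeholders for whatever scene
    representation and renderer are actually used.
    """
    baseline = render_fn(scene)
    carved = render_fn(remove_caster(scene, caster_mask))
    change = np.abs(carved - baseline)[shadow_mask].mean()
    return change > threshold   # a real shadow brightens once its caster is gone
```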
Citations: 0
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model
Pub Date : 2024-08-29 DOI: arxiv-2408.16767
Fangfu Liu, Wenqiang Sun, Hanyang Wang, Yikai Wang, Haowen Sun, Junliang Ye, Jun Zhang, Yueqi Duan
Advancements in 3D scene reconstruction have transformed 2D images from the real world into 3D models, producing realistic 3D results from hundreds of input photos. Despite great success in dense-view reconstruction scenarios, rendering a detailed scene from insufficient captured views is still an ill-posed optimization problem, often resulting in artifacts and distortions in unseen areas. In this paper, we propose ReconX, a novel 3D scene reconstruction paradigm that reframes the ambiguous reconstruction challenge as a temporal generation task. The key insight is to unleash the strong generative prior of large pre-trained video diffusion models for sparse-view reconstruction. However, 3D view consistency struggles to be accurately preserved in video frames directly generated from pre-trained models. To address this, given limited input views, the proposed ReconX first constructs a global point cloud and encodes it into a contextual space as the 3D structure condition. Guided by the condition, the video diffusion model then synthesizes video frames that are both detail-preserved and exhibit a high degree of 3D consistency, ensuring the coherence of the scene from various perspectives. Finally, we recover the 3D scene from the generated video through a confidence-aware 3D Gaussian Splatting optimization scheme. Extensive experiments on various real-world datasets show the superiority of our ReconX over state-of-the-art methods in terms of quality and generalizability.
Citations: 0
Advancing Architectural Floorplan Design with Geometry-enhanced Graph Diffusion
Pub Date : 2024-08-29 DOI: arxiv-2408.16258
Sizhe Hu, Wenming Wu, Yuntao Wang, Benzhu Xu, Liping Zheng
Automating architectural floorplan design is vital for housing and interior design, offering a faster, cost-effective alternative to manual sketches by architects. However, existing methods, including rule-based and learning-based approaches, face challenges in design complexity and constrained generation, require extensive post-processing, and tend to produce obvious geometric inconsistencies such as misalignment, overlap, and gaps. In this work, we propose a novel generative framework for vector floorplan design via structural graph generation, called GSDiff, focusing on wall junction generation and wall segment prediction to capture both geometric and semantic aspects of structural graphs. To improve the geometric rationality of generated structural graphs, we propose two innovative geometry enhancement methods. In wall junction generation, we propose a novel alignment loss function to improve geometric consistency. In wall segment prediction, we propose a random self-supervision method to enhance the model's perception of the overall geometric structure, thereby promoting the generation of reasonable geometric structures. Employing the diffusion model and the Transformer model, as well as the geometry enhancement strategies, our framework can generate wall junctions, wall segments and room polygons with structural and semantic information, resulting in structural graphs that accurately represent floorplans. Extensive experiments show that the proposed method surpasses existing techniques, enabling both free generation and constrained generation, and marking a shift towards structure generation in architectural design.
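The abstract names an alignment loss for wall junctions without giving its form; one plausible, purely illustrative choice is to penalize wall segments whose two junction endpoints share neither an x nor a y coordinate, pushing segments towards axis alignment. The formulation below is a guess, not GSDiff's actual loss.

```python
import torch

def axis_alignment_loss(p, q):
    """Illustrative alignment penalty for wall segments with endpoints p, q.

    p, q: [N, 2] junction coordinates of N wall segments. A segment costs
    nothing when its endpoints share an x or a y coordinate (i.e. the wall is
    axis-aligned), and otherwise pays the smaller of the two deviations.
    """
    dx = (p[:, 0] - q[:, 0]).abs()
    dy = (p[:, 1] - q[:, 1]).abs()
    return torch.minimum(dx, dy).mean()
```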
Citations: 0
UV-free Texture Generation with Denoising and Geodesic Heat Diffusions
Pub Date : 2024-08-29 DOI: arxiv-2408.16762
Simone Foti, Stefanos Zafeiriou, Tolga Birdal
Seams, distortions, wasted UV space, vertex duplication, and varying resolution over the surface are the most prominent issues of the standard UV-based texturing of meshes. These issues are particularly acute when automatic UV-unwrapping techniques are used. For this reason, instead of generating textures in automatically generated UV-planes like most state-of-the-art methods, we propose to represent textures as coloured point-clouds whose colours are generated by a denoising diffusion probabilistic model constrained to operate on the surface of 3D objects. Our sampling- and resolution-agnostic generative model heavily relies on heat diffusion over the surface of the meshes for spatial communication between points. To enable processing of arbitrarily sampled point-cloud textures and ensure long-distance texture consistency, we introduce a fast re-sampling of the mesh spectral properties used during the heat diffusion and introduce a novel heat-diffusion-based self-attention mechanism. Our code and pre-trained models are available at github.com/simofoti/UV3-TeD.
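Heat diffusion over a mesh surface is commonly computed from the spectral properties of its Laplacian, which matches the abstract's mention of re-sampled mesh spectral properties. The sketch below shows only that basic operator under the usual cotangent-Laplacian setup; the paper's re-sampling for arbitrary point clouds and its heat-diffusion-based self-attention are not reproduced here.

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def heat_diffuse(L, M, signal, t, k=64):
    """Diffuse a per-vertex signal over a mesh for time t (spectral sketch).

    L: sparse (n, n) cotangent Laplacian, M: sparse (n, n) mass matrix,
    signal: (n, c) per-vertex values (e.g. colours), k: number of eigenpairs.
    """
    # Smallest modes of the generalized problem L @ phi = lam * M @ phi
    # (sigma slightly below zero keeps shift-invert well conditioned).
    lam, phi = eigsh(L, k=k, M=M, sigma=-1e-8)
    coeffs = phi.T @ (M @ signal)           # project onto the spectral basis
    decay = np.exp(-lam * t)                # heat-kernel decay per eigenvalue
    return phi @ (decay[:, None] * coeffs)
```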
Citations: 0
G-Style: Stylized Gaussian Splatting
Pub Date : 2024-08-28 DOI: arxiv-2408.15695
Áron Samuel Kovács, Pedro Hermosilla, Renata G. Raidou
We introduce G-Style, a novel algorithm designed to transfer the style of an image onto a 3D scene represented using Gaussian Splatting. Gaussian Splatting is a powerful 3D representation for novel view synthesis, as -- compared to other approaches based on Neural Radiance Fields -- it provides fast scene renderings and user control over the scene. Recent pre-prints have demonstrated that the style of Gaussian Splatting scenes can be modified using an image exemplar. However, since the scene geometry remains fixed during the stylization process, current solutions fall short of producing satisfactory results. Our algorithm aims to address these limitations by following a three-step process: In a pre-processing step, we remove undesirable Gaussians with large projection areas or highly elongated shapes. Subsequently, we combine several losses carefully designed to preserve different scales of the style in the image, while maintaining as much as possible the integrity of the original scene content. During the stylization process, and following the original design of Gaussian Splatting, we split Gaussians where additional detail is necessary within our scene by tracking the gradient of the stylized color. Our experiments demonstrate that G-Style generates high-quality stylizations within just a few minutes, outperforming existing methods both qualitatively and quantitatively.
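The pre-processing step is described only qualitatively; a minimal sketch of such a filter is given below, where the anisotropy measure (largest over smallest axis scale) and both thresholds are assumptions chosen for illustration rather than G-Style's actual criteria.

```python
import numpy as np

def keep_gaussian_mask(scales, projected_areas, max_anisotropy=10.0, max_area=0.01):
    """Boolean mask of Gaussians to keep before stylization (sketch).

    scales: (N, 3) per-Gaussian axis scales; projected_areas: (N,) screen-space
    footprint, e.g. as a fraction of the image area. Gaussians that are highly
    elongated or cover a large projected area are dropped.
    """
    anisotropy = scales.max(axis=1) / np.maximum(scales.min(axis=1), 1e-8)
    return (anisotropy <= max_anisotropy) & (projected_areas <= max_area)
```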
Citations: 0
Micro and macro facial expressions by driven animations in realistic Virtual Humans
Pub Date : 2024-08-28 DOI: arxiv-2408.16110
Rubens Halbig Montanha, Giovana Nascimento Raupp, Ana Carolina Policarpo Schmitt, Victor Flávio de Andrade Araujo, Soraia Raupp Musse
Computer Graphics (CG) advancements have allowed the creation of more realistic Virtual Humans (VH) through modern techniques for animating the VH body and face, thereby affecting perception. From traditional methods, including blend shapes, to driven animations using facial and body tracking, these advancements can potentially enhance the perception of comfort and realism in relation to VHs. Previously, Psychology studied facial movements in humans, with some works separating expressions into macro and micro expressions. Also, some previous CG studies have analyzed how macro and micro expressions are perceived, replicating psychology studies in VHs, encompassing studies with realistic and cartoon VHs, and exploring different VH technologies. However, instead of using facial tracking animation methods, these previous studies animated the VHs using blendshape interpolation. To understand how the facial tracking technique alters the perception of VHs, this paper extends the study to macro and micro expressions, employing two datasets to transfer real facial expressions to VHs and analyze how their expressions are perceived. Our findings suggest that transferring facial expressions from real actors to VHs significantly diminishes the accuracy of emotion perception compared to VH facial animations created by artists.
Citations: 0
Evaluating and Comparing Crowd Simulations: Perspectives from a Crowd Authoring Tool
Pub Date : 2024-08-28 DOI: arxiv-2408.15762
Gabriel Fonseca Silva, Paulo Ricardo Knob, Rubens Halbig Montanha, Soraia Raupp Musse
Crowd simulation is a research area widely used in diverse fields, including gaming and security, assessing virtual agent movements through metrics like time to reach their goals, speed, trajectories, and densities. This is relevant for security applications, for instance, as different crowd configurations can determine the time people spend in environments trying to evacuate them. In this work, we extend WebCrowds, an authoring tool for crowd simulation, to allow users to build scenarios and evaluate them through a set of metrics. The aim is to provide a quantitative metric that can, based on simulation data, select the best crowd configuration in a certain environment. We conduct experiments to validate our proposed metric in multiple crowd simulation scenarios and perform a comparison with another metric found in the literature. The results show that experts in the domain of crowd scenarios agree with our proposed quantitative metric.
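The metrics named in the abstract (time to reach the goal, speed, and so on) can be computed directly from simulated trajectories; the toy computation below shows two of them. The aggregation into the single ranking score the paper proposes is not given in the abstract and is therefore left out.

```python
import numpy as np

def crowd_metrics(trajectories, dt, goal, goal_radius=0.5):
    """Per-agent speed and time-to-goal from trajectories (toy example).

    trajectories: (num_agents, num_steps, 2) positions sampled every dt seconds;
    goal: (2,) target position; goal_radius: distance at which the goal counts
    as reached.
    """
    step_dist = np.linalg.norm(np.diff(trajectories, axis=1), axis=-1)   # (A, T-1)
    mean_speed = step_dist.sum(axis=1) / (dt * step_dist.shape[1])
    near_goal = np.linalg.norm(trajectories - goal, axis=-1) <= goal_radius
    time_to_goal = np.where(near_goal.any(axis=1),
                            near_goal.argmax(axis=1) * dt, np.inf)
    return {"mean_speed": mean_speed, "time_to_goal": time_to_goal}
```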
Citations: 0
OctFusion: Octree-based Diffusion Models for 3D Shape Generation
Pub Date : 2024-08-27 DOI: arxiv-2408.14732
Bojun Xiong, Si-Tong Wei, Xin-Yang Zheng, Yan-Pei Cao, Zhouhui Lian, Peng-Shuai Wang
Diffusion models have emerged as a popular method for 3D generation. However, it is still challenging for diffusion models to efficiently generate diverse and high-quality 3D shapes. In this paper, we introduce OctFusion, which can generate 3D shapes with arbitrary resolutions in 2.5 seconds on a single Nvidia 4090 GPU, and the extracted meshes are guaranteed to be continuous and manifold. The key components of OctFusion are the octree-based latent representation and the accompanying diffusion models. The representation combines the benefits of both implicit neural representations and explicit spatial octrees and is learned with an octree-based variational autoencoder. The proposed diffusion model is a unified multi-scale U-Net that enables weights and computation sharing across different octree levels and avoids the complexity of widely used cascaded diffusion schemes. We verify the effectiveness of OctFusion on the ShapeNet and Objaverse datasets and achieve state-of-the-art performances on shape generation tasks. We demonstrate that OctFusion is extendable and flexible by generating high-quality color fields for textured mesh generation and high-quality 3D shapes conditioned on text prompts, sketches, or category labels. Our code and pre-trained models are available at https://github.com/octree-nn/octfusion.
Citations: 0