We present MatSwap, a method for realistically transferring materials onto designated surfaces in an image. This task is non-trivial due to the strong entanglement of material appearance, geometry, and lighting in a photograph. Existing material editing methods typically rely on either cumbersome text-prompt engineering or extensive manual annotations that require artist expertise and 3D scene properties which are impractical to obtain. In contrast, we directly learn the relationship between the input material, as observed on a flat surface, and its appearance within the scene, without the need for explicit UV mapping. To achieve this, we rely on a custom light- and geometry-aware diffusion model. We fine-tune a large-scale pre-trained text-to-image model for material transfer on our synthetic dataset, preserving its strong priors to ensure effective generalization to real images. As a result, our method seamlessly integrates the desired material at the target location in the photograph while preserving the identity of the scene. Evaluated on both synthetic and real images, MatSwap compares favorably to recent work. Our code and data are publicly available at https://github.com/astra-vision/MatSwap.
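To make the idea of light- and geometry-aware conditioning concrete, the minimal sketch below concatenates an encoded flat material exemplar, surface normals, a target-surface mask, and a lighting cue with the noisy scene latent before a small denoising network. The network, tensor shapes, and concatenation scheme are assumptions chosen for illustration; they are not MatSwap's actual architecture.

```python
# Minimal sketch (not the authors' code): one denoising step of a latent
# diffusion model conditioned on geometry, lighting, an edit mask, and a
# flat material exemplar via channel-wise concatenation. Shapes are assumed.
import torch
import torch.nn as nn

class CondDenoiser(nn.Module):
    """Stand-in for a fine-tuned text-to-image U-Net whose first layer
    was widened to accept extra conditioning channels."""
    def __init__(self, latent_ch=4, cond_ch=4 + 3 + 1 + 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(latent_ch + cond_ch, 64, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(64, latent_ch, 3, padding=1),
        )

    def forward(self, noisy_latent, material_latent, normals, mask, irradiance):
        # Concatenate the noisy scene latent with all conditioning signals.
        cond = torch.cat([material_latent, normals, mask, irradiance], dim=1)
        return self.net(torch.cat([noisy_latent, cond], dim=1))

if __name__ == "__main__":
    b, h, w = 1, 64, 64
    model = CondDenoiser()
    eps = model(
        torch.randn(b, 4, h, w),   # noisy scene latent
        torch.randn(b, 4, h, w),   # encoded flat material exemplar
        torch.randn(b, 3, h, w),   # surface normals (geometry cue)
        torch.rand(b, 1, h, w),    # mask of the surface to edit
        torch.randn(b, 3, h, w),   # irradiance / lighting cue
    )
    print(eps.shape)  # predicted noise, same shape as the latent
```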
We leverage fine-tuned video diffusion models, intrinsic decomposition of videos, and physically based differentiable rendering to generate high-quality materials for 3D models from a text prompt or a single image. First, we condition a video diffusion model on the input geometry and lighting, producing multiple views of the given 3D model with coherent material properties. Second, we use a recent intrinsic decomposition model to extract intrinsics (base color, roughness, metallic) from the generated video. Finally, we feed the intrinsics and the generated video into a differentiable path tracer to robustly extract PBR materials that are directly compatible with common content creation tools.
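The final, differentiable-rendering stage can be pictured with the toy sketch below: PBR texture maps are initialized from the extracted intrinsics and optimized so that a differentiable rendering matches the generated frames. The shading function here is a crude stand-in for a path tracer, and all names, shapes, and hyperparameters are illustrative assumptions rather than the paper's implementation.

```python
# Toy sketch (not the paper's code) of inverse-rendering material extraction:
# optimize base color / roughness / metallic maps against generated frames.
import torch
import torch.nn.functional as F

H = W = 128
n_frames = 4

# Inputs assumed to come from earlier pipeline stages: generated video frames
# plus per-frame normals and light directions (random stand-ins here).
frames = torch.rand(n_frames, 3, H, W)
normals = F.normalize(torch.randn(n_frames, 3, H, W), dim=1)
light_dir = F.normalize(torch.randn(n_frames, 3, 1, 1), dim=1)

# Optimizable PBR maps, in practice initialized from the intrinsic decomposition.
base_color = torch.rand(3, H, W, requires_grad=True)
roughness = torch.rand(1, H, W, requires_grad=True)
metallic = torch.rand(1, H, W, requires_grad=True)

def toy_render(bc, rough, metal, n, l):
    """Very simplified differentiable shading, standing in for a path tracer."""
    ndotl = (n * l).sum(dim=1, keepdim=True).clamp(min=0.0)
    diffuse = bc * (1.0 - metal) * ndotl
    specular = metal * (1.0 - rough) * ndotl  # crude specular proxy
    return (diffuse + specular).clamp(0.0, 1.0)

opt = torch.optim.Adam([base_color, roughness, metallic], lr=1e-2)
for step in range(100):
    pred = toy_render(base_color.clamp(0, 1), roughness.clamp(0, 1),
                      metallic.clamp(0, 1), normals, light_dir)
    loss = (pred - frames).abs().mean()  # photometric loss over all frames
    opt.zero_grad()
    loss.backward()
    opt.step()
```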
Realistic hair rendering remains a significant challenge in computer graphics due to the intricate microstructure of hair fibers and their anisotropic scattering properties, which make rendered hair highly sensitive to noise. Although recent advances in image-space and 3D-space denoising and antialiasing techniques have enabled real-time rendering in simple scenes, existing methods still suffer from excessive blurring and artifacts, particularly in fine hair details such as flyaway strands. These issues arise because current techniques often fail to preserve sub-pixel continuity and lack directional sensitivity in the filtering process. To address these limitations, we introduce a novel real-time hair filtering technique that reconstructs fine fiber details while suppressing noise. Our method maintains strand-level detail while remaining computationally efficient, making it well suited for real-time applications in video games and in virtual reality (VR) and augmented reality (AR) environments.
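To illustrate what directionally sensitive filtering means in image space, the minimal sketch below averages each pixel with samples taken along its strand tangent, so noise is smoothed along fibers rather than across them. The tangent map, kernel size, and Gaussian weights are illustrative assumptions and do not reproduce the paper's filter.

```python
# Minimal sketch (not the paper's algorithm): a direction-aware filter that
# samples along a per-pixel strand tangent to smooth noise along hair fibers.
import numpy as np

def directional_hair_filter(color, tangent, radius=3, sigma=1.5):
    """color: (H, W, 3) noisy shaded hair; tangent: (H, W, 2) unit 2D strand
    direction per pixel. Returns the filtered image."""
    H, W, _ = color.shape
    out = np.zeros_like(color)
    weight_sum = np.zeros((H, W, 1))
    ys, xs = np.mgrid[0:H, 0:W]
    for t in range(-radius, radius + 1):
        w = np.exp(-(t * t) / (2.0 * sigma * sigma))
        # Step t pixels along the per-pixel tangent direction.
        sx = np.clip(np.round(xs + t * tangent[..., 0]).astype(int), 0, W - 1)
        sy = np.clip(np.round(ys + t * tangent[..., 1]).astype(int), 0, H - 1)
        out += w * color[sy, sx]
        weight_sum += w
    return out / weight_sum

if __name__ == "__main__":
    H, W = 64, 64
    noisy = np.random.rand(H, W, 3)
    # Dummy tangent field: all strands roughly diagonal.
    tangent = np.tile(np.array([0.707, 0.707]), (H, W, 1))
    filtered = directional_hair_filter(noisy, tangent)
    print(filtered.shape)
```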