
Latest publications in ACM Transactions on Graphics (TOG)

An Implicit Neural Representation for the Image Stack: Depth, All in Focus, and High Dynamic Range
Pub Date: 2023-12-04 DOI: 10.1145/3618367
Chao Wang, Ana Serrano, Xingang Pan, Krzysztof Wolski, Bin Chen, K. Myszkowski, Hans-Peter Seidel, Christian Theobalt, Thomas Leimkühler
In everyday photography, physical limitations of camera sensors and lenses frequently lead to a variety of degradations in captured images, such as saturation or defocus blur. A common approach to overcoming these limitations is to resort to image stack fusion, which involves capturing multiple images with different focal distances or exposures. For instance, to obtain an all-in-focus image, a set of multi-focus images is captured. Similarly, capturing multiple exposures allows for the reconstruction of high dynamic range. In this paper, we present a novel approach that combines neural fields with an expressive camera model to achieve a unified reconstruction of an all-in-focus high-dynamic-range image from an image stack. Our approach is composed of a set of specialized implicit neural representations tailored to specific sub-problems along our pipeline: we use neural implicits to predict flow, which overcomes misalignments arising from lens breathing; depth and all-in-focus images, which account for depth of field; and tonemapping, which deals with sensor responses and saturation. All are trained using a physically inspired supervision structure with a differentiable thin-lens model at its core. An important benefit of our approach is its ability to handle these tasks simultaneously or independently, providing flexible post-editing capabilities such as refocusing and exposure adjustment. By sampling the three primary factors in photography within our framework (focal distance, aperture, and exposure time), we conduct a thorough exploration to gain valuable insights into their significance and impact on overall reconstruction quality. Through extensive validation, we demonstrate that our method outperforms existing approaches in both depth-from-defocus and all-in-focus image reconstruction tasks. Moreover, our approach exhibits promising results in each of these three dimensions, showcasing its potential to enhance captured image quality and provide greater control in post-processing.
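The abstract does not spell out the differentiable thin-lens model used for supervision, but the classical thin-lens circle-of-confusion formula is the natural starting point. Below is a minimal NumPy sketch of that formula; the function name and argument conventions are hypothetical, and a real pipeline would implement it in an autodiff framework so that it stays differentiable.

```python
import numpy as np

def circle_of_confusion(depth, focus_dist, focal_len, aperture):
    """Classical thin-lens circle-of-confusion diameter (hypothetical helper).

    depth:      scene depth at a pixel (metres)
    focus_dist: distance the lens is focused at (metres)
    focal_len:  lens focal length (metres)
    aperture:   entrance-pupil diameter (metres)
    """
    # c = A * |d - d_f| / d * f / (d_f - f): zero at the focal plane,
    # growing with aperture and with distance from it.
    return (aperture * np.abs(depth - focus_dist) / depth
            * focal_len / (focus_dist - focal_len))

# Example: a 50 mm f/2 lens focused at 2 m, scene point at 3 m.
print(circle_of_confusion(3.0, 2.0, 0.05, 0.025))  # ~2.1e-4 m, i.e. ~0.21 mm
```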
Citations: 0
Computational Design of LEGO® Sketch Art
Pub Date: 2023-12-04 DOI: 10.1145/3618306
Mingjun Zhou, Jiahao Ge, Hao Xu, Chi-Wing Fu
This paper presents computational methods to aid the creation of LEGO® sketch models from simple input images. Beyond conventional LEGO® mosaics, we aim to improve the expressiveness of LEGO® models by utilizing LEGO® tiles with sloping and rounded edges, together with rectangular bricks, to reproduce smooth curves and sharp features in the input. This is a challenging task, as we have limited brick shapes to use and limited space to place bricks. Also, the search space is immense and combinatorial in nature. We approach the task by decoupling the LEGO® construction into two steps: first approximating the shape with a LEGO®-buildable contour, then filling the contour polygon with LEGO® bricks. We formulate this contour approximation as a graph optimization with our objective and constraints and effectively solve for the contour polygon that best approximates the input shape. Further, we extend our optimization model to handle multi-color and multi-layer regions, and formulate a grid-alignment process and various perceptual constraints to refine the results. We employ our method to create a large variety of LEGO® models and compare it with human designs and baseline methods to demonstrate its compelling quality and speed.
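The decoupling into contour approximation followed by brick filling invites a compact illustration. The Python sketch below implements only the filling step, in the crudest possible way: a greedy row-by-row tiling of a binary occupancy grid with 1xN bricks. The brick inventory, names, and greedy strategy are all assumptions made for illustration; the paper solves both stages with a proper graph optimization.

```python
import numpy as np

BRICKS = [8, 6, 4, 3, 2, 1]  # assumed 1xN brick inventory, longest first

def fill_contour(mask):
    """Greedily tile a binary occupancy grid with 1xN bricks, row by row.

    mask: 2D bool array, True where a cell lies inside the contour polygon.
    Returns a list of (row, col, length) brick placements covering the mask.
    """
    placements = []
    rows, cols = mask.shape
    for r in range(rows):
        c = 0
        while c < cols:
            if not mask[r, c]:
                c += 1
                continue
            run = 0  # length of the contiguous run of fillable cells
            while c + run < cols and mask[r, c + run]:
                run += 1
            length = next(b for b in BRICKS if b <= run)  # longest brick that fits
            placements.append((r, c, length))
            c += length
    return placements
```

A real solver would also score seam placement across rows and trade brick count against stability, which is where the optimization in the paper comes in.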
Citations: 0
High Density Ratio Multi-Fluid Simulation with Peridynamics
Pub Date: 2023-12-04 DOI: 10.1145/3618347
Han Yan, Bo-Ning Ren
Multi-fluid simulation has attracted wide research interest in recent years. Despite the impressive successes of current works, simulating scenes that contain mixing or unmixing of high-density-ratio phases with particle-based discretizations remains a challenging task. In this paper, we propose a peridynamic mixture-model theory that stably handles high-density-ratio multi-fluid simulations. With the assistance of novel scalar-valued volume-flow states, we propose a particle-based discretization scheme that calculates all the terms in the multi-phase Navier-Stokes equations in an integral form. We also design a novel mass-updating strategy that enhances phase mass conservation and reduces particle volume variations under high-density-ratio settings. As a result, we achieve significantly more stable mixture-model multi-fluid simulations involving mixing and unmixing of high-density-ratio phases. Various experiments and comparisons demonstrate the effectiveness of our approach.
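"In an integral form" refers to the peridynamic idea of replacing spatial derivatives with weighted sums over a finite neighbourhood (the horizon). The following one-dimensional sketch of a nonlocal derivative estimate is only meant to convey that idea; it is not the paper's multi-phase formulation, and the linear kernel is an arbitrary choice.

```python
import numpy as np

def nonlocal_gradient(x, f, h):
    """Peridynamic-style nonlocal estimate of df/dx at each particle (1D).

    x: (N,) particle positions;  f: (N,) sampled values;  h: horizon radius.
    Each derivative is a kernel-weighted average of pairwise difference
    quotients over all neighbours within the horizon.
    """
    grad = np.zeros_like(f, dtype=float)
    for i in range(len(x)):
        num = den = 0.0
        for j in range(len(x)):
            r = x[j] - x[i]
            if i == j or abs(r) > h:
                continue
            w = 1.0 - abs(r) / h           # linear kernel inside the horizon
            num += w * (f[j] - f[i]) / r   # pairwise difference quotient
            den += w
        grad[i] = num / den if den > 0.0 else 0.0
    return grad
```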
Citations: 0
Efficient Hybrid Zoom Using Camera Fusion on Mobile Phones
Pub Date: 2023-12-04 DOI: 10.1145/3618362
Xiaotong Wu, Wei-Sheng Lai, Yi-Chang Shih, Charles Herrmann, Michael Krainin, Deqing Sun, Chia-Kai Liang
DSLR cameras can achieve multiple zoom levels by shifting lens distances or swapping lens types. However, these techniques are not possible on smartphones due to space constraints. Most smartphone manufacturers adopt a hybrid zoom system: commonly a Wide (W) camera at a low zoom level and a Telephoto (T) camera at a high zoom level. To simulate zoom levels between W and T, these systems crop and digitally upsample images from W, leading to significant detail loss. In this paper, we propose an efficient system for hybrid zoom super-resolution on mobile devices, which captures a synchronous pair of W and T shots and leverages machine learning models to align and transfer details from T to W. We further develop an adaptive blending method that accounts for depth-of-field mismatches, scene occlusion, flow uncertainty, and alignment errors. To minimize the domain gap, we design a dual-phone camera rig to capture real-world inputs and ground truth for supervised training. Our method generates a 12-megapixel image in 500 ms on a mobile platform and compares favorably against state-of-the-art methods in extensive evaluations on real-world scenarios.
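As a rough sketch of what such an adaptive blend can look like, the snippet below assumes per-pixel occlusion and flow-uncertainty maps are already available and turns them into a confidence-weighted alpha blend. The weighting function and the constant k are invented for illustration, not taken from the paper.

```python
import numpy as np

def adaptive_blend(wide_up, tele_warped, occlusion, flow_var, k=8.0):
    """Blend an upsampled Wide image with detail warped from the Telephoto.

    wide_up, tele_warped: (H, W, 3) float images on the same pixel grid.
    occlusion: (H, W) in [0, 1], 1 where Telephoto content is missing.
    flow_var:  (H, W) alignment uncertainty; larger means a less reliable warp.
    Falls back to the Wide image wherever the Telephoto warp is unreliable.
    """
    confidence = (1.0 - occlusion) * np.exp(-k * flow_var)  # (H, W) in [0, 1]
    alpha = confidence[..., None]                           # broadcast over RGB
    return alpha * tele_warped + (1.0 - alpha) * wide_up
```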
Citations: 0
ToRoS: A Topology Optimization Approach for Designing Robotic Skins
Pub Date: 2023-12-04 DOI: 10.1145/3618382
Juan Montes Maestre, R. Hinchet, Stelian Coros, Bernhard Thomaszewski
Soft robotics offers unique advantages in manipulating fragile or deformable objects, human-robot interaction, and exploring inaccessible terrain. However, designing soft robots that produce large, targeted deformations is challenging. In this paper, we propose a new methodology for designing soft robots that combines optimization-based design with a simple and cost-efficient manufacturing process. Our approach is centered around the concept of robotic skins: thin fabrics with 3D-printed reinforcement patterns that augment and control plain silicone actuators. By decoupling shape control and actuation, our approach enables a simpler and cost-efficient manufacturing process. Unlike previous methods that rely on empirical design heuristics for generating desired deformations, our approach automatically discovers complex reinforcement patterns without any need for domain knowledge or human intervention. This is achieved by casting reinforcement design as a nonlinear constrained optimization problem and using a novel, three-field topology optimization approach tailored to fabrics with 3D-printed reinforcements. We demonstrate the potential of our approach by designing soft robotic actuators capable of various motions such as bending, contraction, twist, and combinations thereof. We also demonstrate applications of our robotic skins to robotic grasping with a soft three-finger gripper and locomotion tasks for a soft quadrupedal robot.
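For background, and hedged accordingly: the canonical "three-field" scheme in density-based topology optimization maps a raw design field through a smoothing filter and then a smoothed Heaviside projection to obtain near-binary layouts. The paper's three fields are tailored to fabric reinforcement and may differ; the sketch below shows only the canonical mapping.

```python
import numpy as np

def three_field(rho, radius, beta=8.0, eta=0.5):
    """Canonical three-field mapping: design -> filtered -> projected density.

    rho:    2D design field in [0, 1].
    radius: half-width of the smoothing window (enforces a length scale).
    beta:   projection sharpness; eta: projection threshold.
    """
    r = int(radius)
    pad = np.pad(rho, r, mode="edge")
    filtered = np.empty_like(rho)
    for i in range(rho.shape[0]):        # box filter as a cheap stand-in for
        for j in range(rho.shape[1]):    # the usual cone/density filter
            filtered[i, j] = pad[i:i + 2 * r + 1, j:j + 2 * r + 1].mean()
    # Smoothed Heaviside projection (tanh form), pushing values towards 0/1.
    num = np.tanh(beta * eta) + np.tanh(beta * (filtered - eta))
    den = np.tanh(beta * eta) + np.tanh(beta * (1.0 - eta))
    return num / den
```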
Citations: 0
Rectifying Strip Patterns
Pub Date: 2023-12-04 DOI: 10.1145/3618378
Bolun Wang, Hui Wang, E. Schling, Helmut Pottmann
Straight flat strips of inextensible material can be bent into curved strips aligned with arbitrary space curves. The large shape variety of these so-called rectifying strips makes them candidates for shape modeling, especially in applications such as architecture, where simple elements are preferred for the fabrication of complex shapes. In this paper, we provide computational tools for the design of shapes from rectifying strips. These strips can form various patterns and fulfill constraints required for specific applications such as gridshells or shading systems. The methodology is based on discrete models of rectifying strips, a discrete level-set formulation, and optimization-based constrained mesh design and editing. We also analyse the geometry at nodes and present remarkable quadrilateral arrangements of rectifying strips with torsion-free nodes.
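The classical object behind these strips is the rectifying developable of a space curve: a ruled surface whose rulings follow the Darboux vector, which makes it flat and therefore unrollable onto a straight strip. For a curve $r(s)$ with Frenet frame $(T, N, B)$, curvature $\kappa$, and torsion $\tau$, the standard parametrization is:

```latex
% Rectifying developable of r(s): rulings along the (unnormalized)
% Darboux vector D(s) = \tau(s) T(s) + \kappa(s) B(s); the surface has
% zero Gaussian curvature, so a straight flat strip can be bent onto it.
x(s, v) = r(s) + v \left( \tau(s)\, T(s) + \kappa(s)\, B(s) \right)
```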
Citations: 0
SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture
Pub Date: 2023-12-04 DOI: 10.1145/3618370
Zheng Dong, Ke Xu, Yaoan Gao, Qilin Sun, Hujun Bao, Weiwei Xu, Rynson W. H. Lau
Immersive user experiences in live VR/AR performances require a fast and accurate free-view rendering of the performers. Existing methods are mainly based on Pixel-aligned Implicit Functions (PIFu) or Neural Radiance Fields (NeRF). However, while PIFu-based methods usually fail to produce photorealistic view-dependent textures, NeRF-based methods typically lack local geometry accuracy and are computationally heavy (e.g., dense sampling of 3D points, additional fine-tuning, or pose estimation). In this work, we propose a novel generalizable method, named SAILOR, to create high-quality human free-view videos from very sparse RGBD live streams. To produce view-dependent textures while preserving locally accurate geometry, we integrate PIFu and NeRF such that they work synergistically by conditioning the PIFu on depth and then rendering view-dependent textures through NeRF. Specifically, we propose a novel network, named SRONet, for this hybrid representation. SRONet can handle unseen performers without fine-tuning. In addition, a neural blending-based ray interpolation approach, a tree-based voxel-denoising scheme, and a parallel computing pipeline are incorporated to reconstruct and render live free-view videos at 10 fps on average. To evaluate the rendering performance, we construct a real-captured RGBD benchmark from 40 performers. Experimental results show that SAILOR outperforms existing human reconstruction and performance capture methods.
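Whatever the occupancy conditioning adds on top, the NeRF side of SAILOR rests on the standard volume-rendering quadrature. For reference, here is that compositing step along a single ray in minimal NumPy form (the names are mine, not the paper's):

```python
import numpy as np

def composite(sigmas, colors, deltas):
    """Standard NeRF volume-rendering quadrature along one ray.

    sigmas: (N,) densities at the sampled points.
    colors: (N, 3) radiance at the sampled points.
    deltas: (N,) distances between consecutive samples.
    """
    alpha = 1.0 - np.exp(-sigmas * deltas)  # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha]))[:-1]
    weights = trans * alpha
    return (weights[:, None] * colors).sum(axis=0)
```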
Citations: 0
Reconstructing Close Human Interactions from Multiple Views
Pub Date: 2023-12-04 DOI: 10.1145/3618336
Qing Shuai, Zhiyuan Yu, Zhize Zhou, Lixin Fan, Haijun Yang, Can Yang, Xiaowei Zhou
This paper addresses the challenging task of reconstructing the poses of multiple individuals engaged in close interactions, captured by multiple calibrated cameras. The difficulty arises from the noisy or false 2D keypoint detections due to inter-person occlusion, the heavy ambiguity in associating keypoints with individuals due to the close interactions, and the scarcity of training data, as collecting and annotating motion data in crowded scenes is resource-intensive. We introduce a novel system to address these challenges. Our system integrates a learning-based pose estimation component and its corresponding training and inference strategies. The pose estimation component takes multi-view 2D keypoint heatmaps as input and reconstructs the pose of each individual using a 3D conditional volumetric network. As the network does not need images as input, we can leverage known camera parameters from test scenes and a large quantity of existing motion capture data to synthesize massive training data that mimics the real data distribution in test scenes. Extensive experiments demonstrate that our approach significantly surpasses previous approaches in terms of pose accuracy and is generalizable across various camera setups and population sizes. The code is available on our project page: https://github.com/zju3dv/CloseMoCap.
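One common recipe for feeding a conditional volumetric network with multi-view heatmaps, offered here as an assumption about the general approach rather than the paper's exact architecture, is to unproject per-view scores into a shared voxel grid:

```python
import numpy as np

def lift_heatmaps(voxels, projections, heatmaps):
    """Aggregate multi-view 2D keypoint heatmaps into a 3D score volume.

    voxels:      (V, 3) voxel-centre positions in world coordinates.
    projections: list of 3x4 camera projection matrices.
    heatmaps:    list of (H, W) score maps, one per camera.
    Returns (V,) per-voxel scores averaged over the views that see them.
    """
    scores = np.zeros(len(voxels))
    counts = np.zeros(len(voxels))
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])  # (V, 4)
    for P, hm in zip(projections, heatmaps):
        uvw = homog @ P.T                                   # (V, 3) pixels
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        ok = ((uvw[:, 2] > 0) & (u >= 0) & (u < hm.shape[1])
              & (v >= 0) & (v < hm.shape[0]))               # in front + in frame
        scores[ok] += hm[v[ok], u[ok]]
        counts[ok] += 1
    return scores / np.maximum(counts, 1)
```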
Citations: 0
C-Shells: Deployable Gridshells with Curved Beams
Pub Date: 2023-12-04 DOI: 10.1145/3618366
Quentin Becker, Seiichi Suzuki, Y. Ren, Davide Pellis, Julian Panetta, Mark Pauly
We introduce a computational pipeline for simulating and designing C-shells, a new class of planar-to-spatial deployable linkage structures. A C-shell is composed of curved flexible beams connected at rotational joints that can be assembled in a stress-free planar configuration. When actuated, the elastic beams deform and the assembly deploys towards the target 3D shape. We propose two alternative computational design approaches for C-shells: (i) Forward exploration simulates the deployed shape from a planar beam layout provided by the user. Once a satisfactory overall shape is found, a subsequent design optimization adapts the beam geometry to reduce the elastic energy of the linkage while preserving the target shape. (ii) Inverse design is facilitated by a new geometric flattening method that takes a design surface as input and computes an initial layout of piecewise straight linkage beams. Our design optimization algorithm then calculates the smooth curved beams to best reproduce the target shape at minimal elastic energy. We find that C-shells offer a rich space for design and show several studies that highlight new shape topologies that cannot be achieved with existing deployable linkage structures.
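Forward exploration hinges on simulating beams that resist deviating from their rest shape. As a toy stand-in for such an elastic model, with every detail hypothetical, a discrete bending energy over a polyline beam can be written as:

```python
import numpy as np

def bending_energy(points, rest_kappa, stiffness=1.0):
    """Discrete bending energy of a polyline beam centerline.

    points:     (N, 3) vertices of the deformed centerline.
    rest_kappa: (N-2,) rest turning angles (zero for a straight flat beam).
    Penalizes the squared deviation of each turning angle from rest.
    """
    e = np.diff(points, axis=0)                        # edge vectors
    t = e / np.linalg.norm(e, axis=1, keepdims=True)   # unit tangents
    cos = np.clip((t[:-1] * t[1:]).sum(axis=1), -1.0, 1.0)
    kappa = np.arccos(cos)                             # turning angle per joint
    return float(stiffness * ((kappa - rest_kappa) ** 2).sum())
```

A deployment simulator would minimize the total energy of all beams subject to the joint constraints; this sketch covers only the single-beam bending term.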
Citations: 0
NodeGit: Diffing and Merging Node Graphs
Pub Date: 2023-12-04 DOI: 10.1145/3618343
Eduardo Rinaldi, D. Sforza, Fabio Pellacini
The use of version control is pervasive in collaborative software projects. Version control systems are based on two primary operations: diffing two versions to compute the change between them and merging two versions edited concurrently. Recent works provide solutions to diff and merge graphics assets such as images, meshes and scenes. In this work, we present a practical algorithm to diff and merge procedural programs written as node graphs. To obtain more precise diffs, we version the graphs directly rather than their textual representations. Diffing graphs is equivalent to computing the graph edit distance, which is known to be computationally infeasible. Following prior work, we propose an approximate algorithm tailored to our problem domain. We validate the proposed algorithm by applying it both to manual edits and to a large set of randomized modifications of procedural shapes and materials. We compared our method with existing state-of-the-art algorithms, showing that our approach is the only one that reliably detects user edits.
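To make the approximation concrete, here is a deliberately naive node-graph diff in the same spirit: greedy node matching by type and parameter overlap, then reporting unmatched nodes and edges. The data layout and scoring are invented for illustration; the paper's algorithm is considerably more careful.

```python
def diff_graphs(a_nodes, b_nodes, a_edges, b_edges):
    """Toy node-graph diff approximating a graph edit script.

    a_nodes, b_nodes: dicts of node_id -> (node_type, params_dict).
    a_edges, b_edges: sets of (src_id, dst_id) pairs.
    """
    def similarity(x, y):
        if x[0] != y[0]:                 # different node types never match
            return -1.0
        px, py = x[1], y[1]
        shared = sum(1 for k in px if py.get(k) == px[k])
        return shared / max(len(px), len(py), 1)

    matched, used = {}, set()
    for ai, av in a_nodes.items():       # greedy best-first matching
        candidates = [(similarity(av, bv), bi)
                      for bi, bv in b_nodes.items() if bi not in used]
        if candidates:
            s, bi = max(candidates, key=lambda t: t[0])
            if s >= 0.0:
                matched[ai] = bi
                used.add(bi)

    image = {(matched.get(s), matched.get(d)) for s, d in a_edges}
    return {
        "removed_nodes": set(a_nodes) - set(matched),
        "added_nodes": set(b_nodes) - used,
        "removed_edges": {e for e in a_edges
                          if (matched.get(e[0]), matched.get(e[1])) not in b_edges},
        "added_edges": b_edges - image,
    }
```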
Citations: 0