Sketch-based content generation offers flexible controllability, making it a promising narrative avenue in film production. Directors often visualize their imagination by crafting storyboards using sketches and textual descriptions for each shot. However, current video generation methods suffer from three-dimensional inconsistencies, with notable artifacts during large motions or camera pans around scenes. A suitable solution is to directly generate 4D scenes, enabling the generation of consistent dynamic three-dimensional scenes. We define the Sketch-2-4D problem, aiming to enhance controllability and consistency in this context. We propose a novel Control Score Distillation Sampling (SDS-C) for sketch-based 4D scene generation, providing precise control over scene dynamics. We further design Spatial Consistency Modules and Temporal Consistency Modules to tackle the spatial and temporal inconsistencies introduced by sketch-based control, respectively. Extensive experiments demonstrate the effectiveness of our approach.
Feature lines play a pivotal role in the reconstruction of CAD models. However, a robust explicit reconstruction algorithm capable of recovering sharp features from noisy, non-uniform point clouds is still lacking. In this paper, we propose a feature-preserving CAD model surface reconstruction algorithm, named FACE. The algorithm first preprocesses the point cloud through denoising and resampling steps, producing a high-quality point cloud that is free of noise and uniformly distributed. Then, it employs discrete optimal transport to detect feature regions and subsequently generates dense points along potential feature lines to enhance features. Finally, the advancing-front surface reconstruction method, guided by normal vector directions, is applied to reconstruct the enhanced point cloud. Extensive experiments demonstrate that, for contaminated point clouds, the algorithm excels not only at reconstructing straight edges and corner points but also at handling curved edges and surfaces, surpassing existing methods.
Mesh-based image vectorization techniques have been studied for a long time, mostly owing to their compactness and flexibility in capturing image features. However, existing methods often produce relatively dense meshes, especially when applied to images with high-frequency details or textures. We present a novel method that automatically vectorizes an image into a sparse collection of Coons patches whose size adapts to image features. To balance the number of patches against the accuracy of feature alignment, we generate the layout from a harmonic cross field constrained by image features. We support T-junctions, which keep the number of patches low and ensure local adaptation to feature density, naturally complemented by a varying mesh-color resolution over the patches. Our experimental results demonstrate the utility, accuracy, and sparsity of our method.
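For readers unfamiliar with the primitive involved, a bilinearly blended Coons patch fills the interior of four boundary curves by blending the two ruled surfaces between opposite sides and subtracting the bilinear interpolation of the corners. A minimal evaluator (standard construction, not code from the paper):

```python
import numpy as np

def coons_patch(c0, c1, d0, d1, u, v):
    """Evaluate a bilinearly blended Coons patch at (u, v) in [0,1]^2.
    c0/c1: bottom/top boundary curves, functions of u;
    d0/d1: left/right boundary curves, functions of v.
    The curves must agree at the four shared corners."""
    p00, p10 = np.asarray(c0(0.0)), np.asarray(c0(1.0))
    p01, p11 = np.asarray(c1(0.0)), np.asarray(c1(1.0))
    # Sum of the two ruled surfaces between opposite boundary pairs...
    ruled = ((1 - v) * np.asarray(c0(u)) + v * np.asarray(c1(u))
             + (1 - u) * np.asarray(d0(v)) + u * np.asarray(d1(v)))
    # ...minus the bilinear interpolation of the corners (counted twice).
    bilinear = ((1 - u) * (1 - v) * p00 + u * (1 - v) * p10
                + (1 - u) * v * p01 + u * v * p11)
    return ruled - bilinear

# Straight boundaries of the unit square reproduce the identity map.
c0 = lambda u: (u, 0.0); c1 = lambda u: (u, 1.0)
d0 = lambda v: (0.0, v); d1 = lambda v: (1.0, v)
p = coons_patch(c0, c1, d0, d1, 0.3, 0.7)
```

The patch interpolates all four boundary curves exactly, which is what makes a layout of such patches suitable for feature-aligned vectorization.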
The restoration of digital images has practical significance because degradation of digital image data on the internet is common. State-of-the-art image restoration methods usually employ end-to-end trained networks. However, we argue that a network trained on diverse image pairs is not optimal for restoring line drawings, which have extensive plain backgrounds. We propose a line-drawing restoration framework that takes a restoration neural network as its backbone and processes an input degraded line drawing in two steps. First, a mask-predicting network predicts a line mask indicating the likely locations of foreground and background in the potential original line drawing. Next, we feed the degraded line drawing, together with the predicted line mask, into the backbone restoration network. The conventional loss for the backbone restoration network is replaced with a masked Mean Square Error (MSE) loss. We evaluate our framework on two classical image restoration tasks, JPEG restoration and super-resolution, and experiments demonstrate that it achieves better quantitative and visual results in most cases.
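The abstract does not give the exact form of the masked MSE. One plausible reading, sketched below with numpy, is a per-pixel weighted MSE in which the predicted line mask up-weights foreground (line) pixels relative to the plain background; the weights `fg_weight` and `bg_weight` are illustrative assumptions, not values from the paper:

```python
import numpy as np

def masked_mse(pred, target, mask, fg_weight=1.0, bg_weight=0.1):
    """Weighted MSE: squared error on pixels where mask indicates a line
    counts with fg_weight, background pixels with bg_weight."""
    w = np.where(mask > 0.5, fg_weight, bg_weight)
    return float((w * (pred - target) ** 2).sum() / w.sum())

# The same per-pixel error is penalized more on a line than on background.
target = np.zeros((2, 2))
mask = np.array([[1.0, 0.0], [0.0, 0.0]])
err_on_line = target.copy(); err_on_line[0, 0] = 1.0
err_on_bg = target.copy(); err_on_bg[0, 1] = 1.0
loss_line = masked_mse(err_on_line, target, mask)
loss_bg = masked_mse(err_on_bg, target, mask)
```

In a training setting the same weighting would be applied to framework tensors rather than numpy arrays, with the mask coming from the mask-predicting network.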

