
Computers & Graphics-Uk: Latest Articles

Supporting tailorability in augmented reality based remote assistance in the manufacturing industry: A user study
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-16 DOI: 10.1016/j.cag.2024.104095
Troels Rasmussen, Kaj Grønbæk, Weidong Huang
Research on remote assistance in real-world industries is sparse, as most research is conducted in the laboratory under controlled conditions. Consequently, little is known about how users tailor remote assistance technologies at work. Therefore, we developed an augmented reality-based remote assistance prototype called Remote Assist Kit (RAK). RAK is a component-based system, allowing us to study tailoring activities and the usefulness of tailorable remote assistance technologies. We conducted a user evaluation with employees from the plastic manufacturing industry. The employees configured the RAK to solve real-world problems in three collaborative scenarios: (1) troubleshooting a running injection molding machine, (2) tool maintenance, (3) solving a trigonometry problem. Our results show that the tailorability of RAK was perceived as useful, and users were able to successfully tailor RAK to the distinct properties of the scenarios. Specific findings and their implications for the design of tailorable remote assistance technologies are presented. Among other findings, requirements specific to remote assistance in the manufacturing industry were discussed, such as the importance of sharing machine sounds between the local operator and the remote helper.
Volume 125, Article 104095
Citations: 0
Generating implicit object fragment datasets for machine learning
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-15 DOI: 10.1016/j.cag.2024.104104
Alfonso López, Antonio J. Rueda, Rafael J. Segura, Carlos J. Ogayar, Pablo Navarro, José M. Fuertes
One of the primary challenges in using deep learning models is the scarcity and limited accessibility of datasets large enough to train these networks effectively. This is particularly significant in object detection, shape completion, and fracture assembly. Instead of scanning a large number of real-world fragments, it is possible to generate massive datasets with synthetic pieces. However, realistic fragmentation is computationally intensive in both preparation (e.g., pre-fractured models) and generation. Alternatively, simpler algorithms such as Voronoi diagrams provide faster processing at the expense of realism. It is therefore necessary to balance computational efficiency and realism. This paper introduces a GPU-based framework for the massive generation of voxelized fragments derived from high-resolution 3D models, specifically prepared for use as training sets for machine learning models. This rapid pipeline enables controlling how many pieces are produced, their dispersion, and the appearance of subtle effects such as erosion. We have tested our pipeline with an archaeological dataset, producing more than 1M fragmented pieces from 1,052 Iberian vessels (GitHub). Although this work primarily intends to provide pieces as implicit data represented by voxels, triangle meshes and point clouds can also be inferred from the initial implicit representation. To underscore the benefits of CPU and GPU acceleration in generating vast datasets, we compared against a realistic fragment generator, which highlights the potential of our approach in terms of both applicability and processing time. We also demonstrate the synergies between our pipeline and realistic simulators, which frequently cannot select the number and size of resulting pieces. To this end, a deep learning model was trained over realistic fragments and our dataset, showing similar results.
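The abstract contrasts realistic fracture simulation with simpler Voronoi-based splitting. As a rough illustration of the simpler option only, the following sketch partitions a voxelized solid by assigning every occupied voxel to its nearest seed. It is not the paper's GPU pipeline: the function name `voronoi_fragments`, the solid 8x8x8 block, and the seed-selection strategy are all assumptions made for this example.

```python
import random


def voronoi_fragments(voxels, num_pieces, seed=0):
    """Assign each occupied voxel to its nearest seed voxel (a Voronoi partition).

    voxels: set of (x, y, z) integer tuples marking occupied cells.
    Returns a dict mapping each voxel to a fragment index in [0, num_pieces).
    """
    rng = random.Random(seed)
    # Assumption: seeds are drawn uniformly from the occupied voxels.
    seeds = rng.sample(sorted(voxels), num_pieces)
    labels = {}
    for v in voxels:
        # Squared Euclidean distance to every seed; the nearest seed wins.
        labels[v] = min(
            range(num_pieces),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(v, seeds[i])),
        )
    return labels


# Hypothetical input: a solid 8x8x8 block standing in for a voxelized model.
block = {(x, y, z) for x in range(8) for y in range(8) for z in range(8)}
labels = voronoi_fragments(block, num_pieces=5)
sizes = [sum(1 for piece in labels.values() if piece == i) for i in range(5)]
print(sizes)  # number of voxels in each fragment
```

The number of pieces is an explicit parameter here, which is the kind of control the abstract says realistic simulators frequently lack. This brute-force loop costs one distance evaluation per voxel per seed, so it only suits small grids.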
Volume 125, Article 104104
Citations: 0
ADA-SCMS Net: A self-supervised clustering-based 3D mesh segmentation network with aggregation dual autoencoder
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-11 DOI: 10.1016/j.cag.2024.104100
Xue Jiao, Xiaohui Yang
Despite significant advances in 3D mesh segmentation techniques driven by deep learning, segmenting 3D meshes without exhaustive manual labeling remains challenging due to difficulties in acquiring high-quality labeled datasets. This paper introduces an aggregation dual autoencoder self-supervised clustering-based mesh segmentation network for unlabeled 3D meshes (ADA-SCMS Net). Expanding upon the previously proposed SCMS-Net, the ADA-SCMS Net enhances the segmentation process by incorporating a denoising autoencoder with an improved graph autoencoder as its basic structure. This modification prompts the segmentation network to concentrate on the primary structure of the input data during training, enabling the capture of robust features. In addition, the ADA-SCMS network introduces two new modules. The first, the branch aggregation module, combines the strengths of the two branches to create a semantic latent representation. The second, the aggregation self-supervised clustering module, facilitates end-to-end clustering training by iteratively updating each branch through mutual supervision. Extensive experiments on benchmark datasets validate the effectiveness of the ADA-SCMS network, demonstrating superior segmentation performance compared to the SCMS network.
Volume 124, Article 104100
Citations: 0
Comparative analysis of spatiotemporal playback manipulation on virtual reality training for External Ventricular Drainage
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-10 DOI: 10.1016/j.cag.2024.104106
Andreas Wrife, Renan Guarese, Alessandro Iop, Mario Romero
Extensive research has been conducted in multiple surgical specialities where Virtual Reality (VR) has been utilised, such as spinal neurosurgery. However, cranial neurosurgery remains relatively unexplored in this regard. This work explores the impact of adopting VR to study External Ventricular Drainage (EVD). In this study, pre-recorded motion capture data of an EVD procedure is visualised on a VR headset and compared with a desktop monitor condition. Participants (N=20) were tasked with identifying and marking a key moment in the recordings. Objective and subjective metrics were recorded, such as completion time, temporal and spatial error distances, workload, and usability. The results of the experiment showed that the task was completed on average twice as fast in VR as on the desktop. However, the desktop condition yielded less error-prone results. Subjective feedback showed a slightly higher preference for the VR environment concerning usability, while maintaining a comparable workload. Overall, VR displays are promising as an alternative tool for educational and training purposes in cranial surgery.
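The abstract lists temporal and spatial error distances among its objective metrics but does not define them. One plausible reading, offered only as an assumption, is the absolute time offset between the marked and reference moments and the Euclidean distance between the marked and reference positions. The function names and sample values below are hypothetical.

```python
import math


def temporal_error(marked_time_s, reference_time_s):
    """Absolute offset, in seconds, between the marked and reference moments."""
    return abs(marked_time_s - reference_time_s)


def spatial_error(marked_pos, reference_pos):
    """Euclidean distance between the marked and reference 3D positions."""
    return math.dist(marked_pos, reference_pos)


# Hypothetical trial: times in seconds, positions in arbitrary units.
t_err = temporal_error(12.5, 12.0)
s_err = spatial_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
print(t_err, s_err)
```

Keeping the two errors separate, rather than folding them into a single score, matches the abstract's wording and lets speed and accuracy be compared independently across the VR and desktop conditions.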
Volume 124, Article 104106
Citations: 0
Single-image SVBRDF estimation with auto-adaptive high-frequency feature extraction
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-09 DOI: 10.1016/j.cag.2024.104103
Jiamin Cheng, Li Wang, Lianghao Zhang, Fangzhou Gao, Jiawan Zhang
In this paper, we address the task of estimating spatially-varying bi-directional reflectance distribution functions (SVBRDF) of a near-planar surface from a single flash-lit image. Disentangling SVBRDF from the material appearance by deep learning has proven a formidable challenge. This difficulty is particularly pronounced when dealing with images lit by a point light source because the uneven distribution of irradiance in the scene interacts with the surface, leading to significant global luminance variations across the image. These variations may be overemphasized by the network and wrongly baked into the material property space. To tackle this issue, we propose a high-frequency path that contains an auto-adaptive subband “knob”. This path aims to extract crucial image textures and details while eliminating global luminance variations present in the original image. Furthermore, recognizing that color information is ignored in this path, we design a two-path strategy to jointly estimate material reflectance from both the high-frequency path and the original image. Extensive experiments on a substantial dataset have confirmed the effectiveness of our method. Our method outperforms state-of-the-art methods across a wide range of materials.
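The idea of a high-frequency path that keeps texture detail while discarding global luminance variation can be illustrated with a plain high-pass filter: subtract a blurred copy of the signal from the signal itself. The sketch below uses a fixed-radius box blur on a single row of luminance values. That is a deliberate simplification and an assumption of this example; the paper's auto-adaptive subband "knob" is learned and is not reproduced here.

```python
def box_blur(row, radius):
    """Low-pass filter: mean over a window of up to 2*radius+1 samples."""
    n = len(row)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(row[lo:hi]) / (hi - lo))
    return out


def high_frequency(row, radius):
    """High-pass residual: the signal minus its low-pass version."""
    low = box_blur(row, radius)
    return [value - smooth for value, smooth in zip(row, low)]


# Hypothetical scanline: a smooth luminance ramp (stand-in for flash falloff)
# plus a fine alternating texture.
ramp = [0.1 * i for i in range(20)]
texture = [0.05 if i % 2 == 0 else -0.05 for i in range(20)]
signal = [r + t for r, t in zip(ramp, texture)]

detail = high_frequency(signal, radius=2)
ramp_residual = high_frequency(ramp, radius=2)
```

Away from the borders, the ramp alone leaves essentially no residual, while the alternating texture survives in `detail` with its sign pattern intact. The residual also carries no absolute brightness, which mirrors the abstract's remark that color information is ignored in this path and motivates the second path over the original image.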
Volume 124, Article 104103
Citations: 0
An immersive labeling method for large point clouds
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-05 DOI: 10.1016/j.cag.2024.104101
Tianfang Lin, Zhongyuan Yu, Matthew McGinity, Stefan Gumhold
3D point clouds, such as those produced by 3D scanners, often require labeling – the accurate classification of each point into structural or semantic categories – before they can be used in their intended application. However, in the absence of fully automated methods, such labeling must be performed manually, which can prove extremely time- and labor-intensive. To address this, we present a virtual reality tool for accelerating and improving the manual labeling of very large 3D point clouds. The labeling tool provides a variety of 3D interactions for efficient viewing, selection and labeling of points using the controllers of consumer VR kits. The main contribution of our work is a mixed CPU/GPU-based data structure that supports rendering, selection and labeling with immediate visual feedback at the high frame rates necessary for a convenient VR experience. Our mixed CPU/GPU data structure supports fluid interaction with very large point clouds in VR, which is not possible with existing continuous level-of-detail rendering algorithms. We evaluate our method with 25 users on tasks involving point clouds of up to 50 million points and find convincing results that support the case for VR-based point cloud labeling.
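To make the selection-and-labeling interaction concrete, here is a minimal sketch of one common selection primitive: assigning a label to every point inside a sphere, such as a brush attached to a VR controller. The function name, the spherical brush, and the linear scan are assumptions for illustration; the abstract does not describe the tool's actual selection shapes or data structure.

```python
import math


def label_in_sphere(points, labels, center, radius, label):
    """Assign `label` to every point within `radius` of `center`.

    points: list of (x, y, z) tuples; labels: list of ints, modified in place.
    Returns the number of points whose label changed.
    """
    changed = 0
    for i, p in enumerate(points):
        if math.dist(p, center) <= radius and labels[i] != label:
            labels[i] = label
            changed += 1
    return changed


# Hypothetical three-point cloud and a brush centered at the origin.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
labels = [0, 0, 0]
changed = label_in_sphere(points, labels, center=(0.0, 0.0, 0.0), radius=1.5, label=7)
print(changed, labels)
```

A linear scan like this touches every point on every brush stroke, which would not sustain VR frame rates on clouds of tens of millions of points. That cost is the motivation the abstract gives for a dedicated mixed CPU/GPU data structure.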
Volume 124, Article 104101
Citations: 0
Advances in vision-based deep learning methods for interacting hands reconstruction: A survey
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-05 DOI: 10.1016/j.cag.2024.104102
Yu Miao, Yue Liu
Vision-based hand reconstructions have become noteworthy tools for enhancing interactive experiences in applications such as virtual reality, augmented reality, and autonomous driving, enabling sophisticated interactions by reconstructing the complex motions of human hands. Despite significant progress driven by deep-learning methodologies, the quest for high-fidelity interacting hands reconstruction faces challenges such as limited dataset diversity, lack of detailed hand representation, occlusions, and differentiation between similar hand structures. This survey thoroughly reviews deep learning-based methods, diverse datasets, loss functions, and evaluation metrics addressing the complexities of interacting hands reconstruction. Mainstream algorithms of the past five years are systematically classified into two main categories: algorithms that employ explicit representations, such as parametric meshes and 3D Gaussian splatting, and those that utilize implicit representations, including signed distance fields and neural radiance fields. Novel deep-learning models such as graph convolutional networks and transformers are applied to address the aforementioned challenges in hand reconstruction. Beyond summarizing these interaction-aware algorithms, this survey also briefly discusses hand tracking in virtual reality and augmented reality. To the best of our knowledge, this is the first survey specifically focusing on the reconstruction of both hands and their interactions with objects. The survey covers the various facets of hand modeling, deep learning approaches, and datasets, broadening the horizon of hand reconstruction research and future innovation in natural user interactions.
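The distinction between explicit and implicit representations can be made concrete with the simplest signed distance field: a function that returns, for any query point, its distance to a surface, negative inside and positive outside. The sketch below uses an analytic sphere as an assumed stand-in; the methods surveyed learn such fields with neural networks rather than writing them in closed form.

```python
import math


def sdf_sphere(point, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius


# Hypothetical unit sphere at the origin, queried at three points.
center, radius = (0.0, 0.0, 0.0), 1.0
inside = sdf_sphere((0.0, 0.0, 0.0), center, radius)
on_surface = sdf_sphere((1.0, 0.0, 0.0), center, radius)
outside = sdf_sphere((2.0, 0.0, 0.0), center, radius)
print(inside, on_surface, outside)
```

The surface is never stored as vertices or faces. It is the zero level set of the function, which is why implicit representations are resolution-independent and why a mesh has to be extracted from them when one is needed.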
Volume 124, Article 104102
Citations: 0
Diverse non-homogeneous texture synthesis from a single exemplar
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-04 DOI: 10.1016/j.cag.2024.104099
A. Phillips, J. Lang, D. Mould
Capturing non-local, long range features present in non-homogeneous textures is difficult to achieve with existing techniques. We introduce a new training method and architecture for single-exemplar texture synthesis that combines a Generative Adversarial Network (GAN) and a Variational Autoencoder (VAE). In the proposed architecture, the combined networks share information during training via structurally identical, independent blocks, facilitating highly diverse texture variations from a single image exemplar. Supporting this training method, we also include a similarity loss term that further encourages diverse output while also improving the overall quality. Using our approach, it is possible to produce diverse results over the entire sample size taken from a single model that can be trained in approximately 15 min. We show that our approach obtains superior performance when compared to SOTA texture synthesis methods and single image GAN methods using standard diversity and quality metrics.
Citations: 0
Geometric implicit neural representations for signed distance functions
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-10-01 | DOI: 10.1016/j.cag.2024.104085
Luiz Schirmer , Tiago Novello , Vinícius da Silva , Guilherme Schardong , Daniel Perazzo , Hélio Lopes , Nuno Gonçalves , Luiz Velho
Implicit neural representations (INRs) have emerged as a promising framework for representing signals in low-dimensional spaces. This survey reviews the existing literature on the specialized INR problem of approximating signed distance functions (SDFs) for surface scenes, using either oriented point clouds or a set of posed images. We refer to neural SDFs that incorporate differential geometry tools, such as normals and curvatures, in their loss functions as geometric INRs. The key idea behind this 3D reconstruction approach is to include additional regularization terms in the loss function, ensuring that the INR satisfies certain global properties that the function should hold — such as having unit gradient in the case of SDFs. We explore key methodological components, including the definition of INR, the construction of geometric loss functions, and sampling schemes from a differential geometry perspective. Our review highlights the significant advancements enabled by geometric INRs in surface reconstruction from oriented point clouds and posed images.
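The unit-gradient property mentioned in the abstract above is the Eikonal condition, ‖∇f‖ = 1, which a signed distance function satisfies wherever it is differentiable. The following is a minimal sketch of how a penalty built on that condition behaves, not the survey's own implementation: the function names, the sphere test case, the sample count, and the use of central finite differences instead of automatic differentiation are all assumptions made for illustration.

```python
import math
import random


def sphere_sdf(p, radius=1.0):
    """Exact signed distance from point p = [x, y, z] to a sphere centered at the origin."""
    return math.sqrt(sum(c * c for c in p)) - radius


def gradient_norm(f, p, h=1e-5):
    """Norm of the gradient of f at p, estimated by central differences (illustrative choice)."""
    grad = []
    for i in range(len(p)):
        fwd = list(p)
        bwd = list(p)
        fwd[i] += h
        bwd[i] -= h
        grad.append((f(fwd) - f(bwd)) / (2.0 * h))
    return math.sqrt(sum(g * g for g in grad))


def eikonal_penalty(f, points):
    """Mean of (||grad f|| - 1)^2 over the sample points; zero for an exact SDF."""
    return sum((gradient_norm(f, p) - 1.0) ** 2 for p in points) / len(points)


random.seed(0)
# Hypothetical sampling scheme: uniform points in a cube around the surface.
points = [[random.uniform(-2.0, 2.0) for _ in range(3)] for _ in range(500)]

# A true SDF gives a penalty close to 0.
sdf_penalty = eikonal_penalty(sphere_sdf, points)

# Scaling the SDF by 2 keeps the same zero level set but doubles the gradient
# norm, so the penalty is close to (2 - 1)^2 = 1.
scaled_penalty = eikonal_penalty(lambda q: 2.0 * sphere_sdf(q), points)

print(sdf_penalty, scaled_penalty)
```

The scaled function describes the same surface as the sphere SDF, so a loss that only fits the zero level set cannot tell the two apart; a regularization term of this kind is what separates them.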
Citations: 0
Flow style-aware network for arbitrary style transfer
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-09-29 | DOI: 10.1016/j.cag.2024.104098
Zhenshan Hu, Bin Ge, Chenxing Xia, Wenyan Wu, Guangao Zhou, Baotong Wang
Researchers have recently proposed arbitrary style transfer methods based on various model frameworks. Although all of them have achieved good results, they still face the problems of insufficient stylization, artifacts and inadequate retention of content structure. In order to solve these problems, we propose a flow style-aware network (FSANet) for arbitrary style transfer, which combines a VGG network and a flow network. FSANet consists of a flow style transfer module (FSTM), a dynamic regulation attention module (DRAM), and a style feature interaction module (SFIM). The flow style transfer module uses the reversible residue block features of the flow network to create a sample feature containing the target content and style. To adapt the FSTM to VGG networks, we design the dynamic regulation attention module and exploit the sample features both at the channel and pixel levels. The style feature interaction module computes a style tensor that optimizes the fused features. Extensive qualitative and quantitative experiments demonstrate that our proposed FSANet can effectively avoid artifacts and enhance the preservation of content details while migrating style features.
Citations: 0