
Computers & Graphics-UK: latest publications

Graph Transformer for 3D point clouds classification and semantic segmentation
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104050
Wei Zhou , Qian Wang , Weiwei Jin , Xinzhe Shi , Ying He

Recently, graph-based and Transformer-based deep learning have demonstrated excellent performance on various point cloud tasks. Most existing graph-based methods rely on a static graph, taking a fixed input to establish graph relations. Moreover, many graph-based methods aggregate neighboring features by max- or average-pooling, so that either only a single neighboring point affects the centroid's feature or all neighboring points exert the same influence on it, ignoring the correlations and differences between points. Most Transformer-based approaches extract point cloud features with global attention and lack feature learning on local neighbors. To address these issues of graph-based and Transformer-based models, we propose a new feature extraction block named Graph Transformer and construct a 3D point cloud learning network called GTNet to learn features of point clouds on local and global patterns. Graph Transformer integrates the advantages of graph-based and Transformer-based methods, and consists of a Local Transformer that uses intra-domain cross-attention and a Global Transformer that uses global self-attention. Finally, we use GTNet for shape classification, part segmentation and semantic segmentation tasks in this paper. The experimental results show that our model achieves good learning and prediction ability on most tasks. The source code and pre-trained model of GTNet will be released at https://github.com/NWUzhouwei/GTNet.
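The abstract contrasts two attention patterns: a Local Transformer that applies cross-attention inside each point's neighborhood, and a Global Transformer that applies self-attention over all points. The sketch below is a minimal, generic PyTorch illustration of those two patterns, assuming a simple kNN neighborhood and single-head local attention; it is not the authors' GTNet implementation.

```python
import torch
import torch.nn as nn

def knn_indices(xyz, k):
    """Indices of the k nearest neighbors of every point (itself included)."""
    dist = torch.cdist(xyz, xyz)                        # (B, N, N) pairwise distances
    return dist.topk(k, dim=-1, largest=False).indices  # (B, N, k)

class LocalCrossAttention(nn.Module):
    """Each centroid attends only to the features of its k nearest neighbors."""
    def __init__(self, dim, k=16):
        super().__init__()
        self.k = k
        self.to_q = nn.Linear(dim, dim)
        self.to_kv = nn.Linear(dim, 2 * dim)

    def forward(self, feats, xyz):
        # feats: (B, N, C) point features, xyz: (B, N, 3) point coordinates
        B, N, C = feats.shape
        idx = knn_indices(xyz, self.k)                                    # (B, N, k)
        batch = torch.arange(B, device=feats.device).view(B, 1, 1).expand(B, N, self.k)
        neigh = feats[batch, idx]                                         # (B, N, k, C)
        q = self.to_q(feats).unsqueeze(2)                                 # (B, N, 1, C)
        key, val = self.to_kv(neigh).chunk(2, dim=-1)                     # (B, N, k, C) each
        attn = torch.softmax((q * key).sum(-1) / C ** 0.5, dim=-1)        # (B, N, k)
        return (attn.unsqueeze(-1) * val).sum(dim=2)                      # (B, N, C)

class GlobalSelfAttention(nn.Module):
    """Every point attends to every other point."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats):
        out, _ = self.attn(feats, feats, feats)
        return out

if __name__ == "__main__":
    xyz = torch.rand(2, 1024, 3)
    feats = torch.rand(2, 1024, 64)
    local = LocalCrossAttention(64)(feats, xyz)   # (2, 1024, 64)
    global_ = GlobalSelfAttention(64)(local)      # (2, 1024, 64)
    print(local.shape, global_.shape)
```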

Citations: 0
Analyzing the effect of undermining on suture forces during simulated skin flap surgeries with a three-dimensional finite element method
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104057
Wenzhangzhi Guo , Allison Tsz Kwan Lau , Joel C. Davies , Vito Forte , Eitan Grinspun , Lueder Alexander Kahrs

Skin flaps are common procedures used by surgeons to cover an excised area during the reconstruction of a defect. It is often challenging for a surgeon to come up with the optimal design for a patient. In this paper, we set up a simulation system based on the finite element method for one of the most common flap types, the rhomboid flap. Instead of using the standard 2D planar patch, we constructed a 3D patch with multiple layers, which allowed us to investigate the impact of different undermining areas and depths. We compared the suture forces for each case and identified the vertices with the largest suture force. The shape of the final suture line is also visualized for each case, an important clue when deciding on the optimal skin flap orientation according to medical textbooks. We found that under the optimal undermining setup, the maximum suture force is around 0.7 N at the top of the undermined layer and 1.0 N at the bottom of the undermined layer. When measuring the difference in final suture line shape, the maximum normalized Hausdorff distance is 0.099, which suggests that different undermining regions can have a significant impact on the shape of the suture line, especially in the tail region. After analyzing the suture force plots, we provide recommendations on the optimal undermining region for rhomboid flaps.
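The reported shape comparison relies on a normalized Hausdorff distance between final suture lines. A minimal SciPy sketch of such a measure is shown below; the normalization by the reference curve's bounding-box diagonal is an assumption for illustration, since the abstract does not state the paper's exact convention.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def normalized_hausdorff(curve_a, curve_b):
    # curve_a: (N, 3) and curve_b: (M, 3) points sampled along two suture lines
    d_ab = directed_hausdorff(curve_a, curve_b)[0]
    d_ba = directed_hausdorff(curve_b, curve_a)[0]
    hausdorff = max(d_ab, d_ba)                              # symmetric Hausdorff distance
    diag = np.linalg.norm(curve_a.max(0) - curve_a.min(0))   # reference scale (assumed)
    return hausdorff / diag

if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 100)
    line_a = np.stack([t, np.zeros_like(t), np.zeros_like(t)], axis=1)
    line_b = np.stack([t, 0.05 * np.sin(4 * np.pi * t), np.zeros_like(t)], axis=1)
    print(normalized_hausdorff(line_a, line_b))  # small value for similar curves
```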

Citations: 0
Foreword to the special section on Shape Modeling International 2024 (SMI2024)
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104047
Georges-Pierre Bonneau, Tao Ju, Zichun Zhong
{"title":"Foreword to the special section on Shape Modeling International 2024 (SMI2024)","authors":"Georges-Pierre Bonneau,&nbsp;Tao Ju,&nbsp;Zichun Zhong","doi":"10.1016/j.cag.2024.104047","DOIUrl":"10.1016/j.cag.2024.104047","url":null,"abstract":"","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104047"},"PeriodicalIF":2.5,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142050396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
OpenECAD: An efficient visual language model for editable 3D-CAD design
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104048
Zhe Yuan , Jianqi Shi , Yanhong Huang

Computer-aided design (CAD) tools are utilized in the manufacturing industry for modeling everything from cups to spacecraft. These programs are complex to use and typically require years of training and experience to master. Structured and well-constrained 2D sketches and 3D constructions are crucial components of CAD modeling. A well-executed CAD model can be seamlessly integrated into the manufacturing process, thereby enhancing production efficiency. Deep generative models of 3D shapes and 3D object reconstruction models have garnered significant research interest. However, most of these models produce discrete forms of 3D objects that are not editable. Moreover, the few models based on CAD operations often have substantial input restrictions. In this work, we fine-tuned pre-trained models to create OpenECAD models (0.55B, 0.89B, 2.4B and 3.1B), leveraging the visual, logical, coding, and general capabilities of visual language models. OpenECAD models can process images of 3D designs as input and generate highly structured 2D sketches and 3D construction commands, ensuring that the designs are editable. These outputs can be directly used with existing CAD tools’ APIs to generate project files. To train our network, we created a series of OpenECAD datasets. These datasets are derived from existing public CAD datasets, adjusted and augmented to meet the specific requirements of vision language model (VLM) training. Additionally, we have introduced an approach that utilizes dependency relationships to define and generate sketches, further enriching the content and functionality of the datasets.
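As an illustration of what "highly structured 2D sketches and 3D construction commands" can look like as editable data, the sketch below defines a hypothetical sketch-and-extrude command sequence in plain Python. The class names and the to_script() helper are invented for this example; they do not represent the OpenECAD output format or any CAD tool's API.

```python
from dataclasses import dataclass, field

@dataclass
class Circle:
    cx: float
    cy: float
    r: float

@dataclass
class Sketch:
    plane: str                       # e.g. "XY"
    curves: list = field(default_factory=list)

@dataclass
class Extrude:
    sketch: Sketch
    distance: float
    operation: str = "new_body"      # or "join", "cut"

def to_script(ops):
    """Render the command list as readable pseudo-commands (hypothetical format)."""
    lines = []
    for op in ops:
        if isinstance(op, Extrude):
            curves = ", ".join(
                f"circle({c.cx}, {c.cy}, r={c.r})" for c in op.sketch.curves)
            lines.append(f"sketch on {op.sketch.plane}: {curves}")
            lines.append(f"extrude {op.distance} ({op.operation})")
    return "\n".join(lines)

if __name__ == "__main__":
    cup_body = Extrude(Sketch("XY", [Circle(0.0, 0.0, 40.0)]), distance=90.0)
    print(to_script([cup_body]))
```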

Citations: 0
Foreword to the Special Section on XR Technologies for Healthcare and Wellbeing
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-20 | DOI: 10.1016/j.cag.2024.104046
Anderson Maciel, Matias Volonte, Helena Mentis
{"title":"Foreword to the Special Section on XR Technologies for Healthcare and Wellbeing","authors":"Anderson Maciel,&nbsp;Matias Volonte,&nbsp;Helena Mentis","doi":"10.1016/j.cag.2024.104046","DOIUrl":"10.1016/j.cag.2024.104046","url":null,"abstract":"","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104046"},"PeriodicalIF":2.5,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142129160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LSGRNet: Local Spatial Latent Geometric Relation Learning Network for 3D point cloud semantic segmentation
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-20 | DOI: 10.1016/j.cag.2024.104053
Liguo Luo, Jian Lu, Xiaogai Chen, Kaibing Zhang, Jian Zhou

In recent years, the Transformer model has demonstrated a remarkable ability to capture long-range dependencies and improve point cloud segmentation performance. However, localized regions separated by conventional sampling architectures destroy the structural information of instances, and the potential geometric relationships between localized regions remain under-explored. To address this issue, a Local Spatial Latent Geometric Relation Learning Network (LSGRNet) is proposed in this paper, with the geometric properties of point clouds serving as a reference. Specifically, spatial transformation and gradient computation are performed on the local point cloud to uncover potential geometric relationships within the local neighborhood. Furthermore, a local relationship aggregator based on semantic and geometric relationships is constructed to enable the interaction of spatial geometric structure and information within the local neighborhood. Simultaneously, a boundary interaction feature learning module is employed to learn the boundary information of the point cloud, aiming to better describe the local structure. The experimental results indicate that the proposed LSGRNet exhibits excellent segmentation performance in benchmark tests on the indoor datasets S3DIS and ScanNetV2, as well as the outdoor datasets SemanticKITTI and Semantic3D.
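The local geometric relation learning described above builds on relative geometry inside each point's neighborhood. The following is a generic sketch, assuming a simple kNN neighborhood, of the kind of relation encoding (center, neighbor, offset, distance) such an aggregator could consume; it is not the authors' LSGRNet module.

```python
import torch

def local_geometric_relations(xyz, k=16):
    """Per-point relation encoding over a kNN neighborhood. xyz: (B, N, 3)."""
    B, N, _ = xyz.shape
    idx = torch.cdist(xyz, xyz).topk(k, dim=-1, largest=False).indices   # (B, N, k)
    batch = torch.arange(B, device=xyz.device).view(B, 1, 1).expand(B, N, k)
    neigh = xyz[batch, idx]                           # (B, N, k, 3) neighbor coordinates
    center = xyz.unsqueeze(2)                         # (B, N, 1, 3) centroid coordinates
    offset = neigh - center                           # relative positions
    dist = offset.norm(dim=-1, keepdim=True)          # distances to the centroid
    # (B, N, k, 10): centroid, neighbor, offset, distance concatenated per pair
    return torch.cat([center.expand_as(neigh), neigh, offset, dist], dim=-1)

if __name__ == "__main__":
    rel = local_geometric_relations(torch.rand(2, 2048, 3))
    print(rel.shape)   # torch.Size([2, 2048, 16, 10])
```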

Citations: 0
An impartial framework to investigate demosaicking input embedding options
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-16 | DOI: 10.1016/j.cag.2024.104044
Yan Niu , Xuanchen Li , Yang Tao , Bo Zhao

Convolutional Neural Networks (CNNs) have proven highly effective for demosaicking, transforming raw Color Filter Array (CFA) sensor samples into standard RGB images. Directly applying convolution to the CFA tensor can lead to misinterpretation of the color context, so existing demosaicking networks typically embed the CFA tensor into the Euclidean space before convolution. The most prevalent embedding options are Reordering and Pre-interpolation. However, it remains unclear which option is more advantageous for demosaicking. Moreover, no existing demosaicking network is suitable for conducting a fair comparison. As a result, in practice, the selection of these two embedding options is often based on intuition and heuristic approaches. This paper addresses the non-comparability between the two options and investigates whether pre-interpolation contributes additional knowledge to the demosaicking network. Based on rigorous mathematical derivation, we design pairs of end-to-end fully convolutional evaluation networks, ensuring that the performance difference between each pair of networks can be solely attributed to their differing CFA embedding strategies. Under strictly fair comparison conditions, we measure the performance contrast between the two embedding options across various scenarios. Our comprehensive evaluation reveals that the prior knowledge introduced by pre-interpolation benefits lightweight models. Additionally, pre-interpolation enhances the robustness to imaging artifacts for larger models. Our findings offer practical guidelines for designing imaging software or Image Signal Processors (ISPs) for RGB cameras.
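For concreteness, the snippet below sketches the two embedding options compared in the paper for an RGGB Bayer mosaic: Reordering packs the CFA into a half-resolution 4-channel tensor, while Pre-interpolation bilinearly fills each color plane to full resolution before convolution. The kernels are the standard bilinear demosaicking filters; the code is illustrative and not taken from the paper.

```python
import numpy as np
from scipy.ndimage import convolve

def reorder_rggb(cfa):
    """Pack an RGGB mosaic (H, W) into a half-resolution 4-channel tensor (4, H/2, W/2)."""
    return np.stack([cfa[0::2, 0::2],   # R
                     cfa[0::2, 1::2],   # G on red rows
                     cfa[1::2, 0::2],   # G on blue rows
                     cfa[1::2, 1::2]])  # B

def preinterpolate_rggb(cfa):
    """Bilinearly fill each sparse color plane to full resolution -> (3, H, W)."""
    H, W = cfa.shape
    r_mask = np.zeros((H, W))
    r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((H, W))
    b_mask[1::2, 1::2] = 1
    g_mask = 1 - r_mask - b_mask
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0    # bilinear kernel for green
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0   # bilinear kernel for red/blue
    r = convolve(cfa * r_mask, k_rb, mode="mirror")
    g = convolve(cfa * g_mask, k_g, mode="mirror")
    b = convolve(cfa * b_mask, k_rb, mode="mirror")
    return np.stack([r, g, b])

if __name__ == "__main__":
    cfa = np.random.rand(8, 8)
    print(reorder_rggb(cfa).shape)         # (4, 4, 4)
    print(preinterpolate_rggb(cfa).shape)  # (3, 8, 8)
```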

Citations: 0
Dual-COPE: A novel prior-based category-level object pose estimation network with dual Sim2Real unsupervised domain adaptation module
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-14 | DOI: 10.1016/j.cag.2024.104045
Xi Ren , Nan Guo , Zichen Zhu , Xinbei Jiang

Category-level pose estimation offers generalization to novel objects unseen during training, which has attracted increasing attention in recent years. Despite this advantage, annotating real-world data with pose labels is intricate and laborious. Although using synthetic data with free annotations can greatly reduce training costs, the Synthetic-to-Real (Sim2Real) domain gap can cause a sharp performance decline on real-world tests. In this paper, we propose Dual-COPE, a novel prior-based category-level object pose estimation method with dual Sim2Real domain adaptation that avoids expensive real pose annotations. First, we propose an estimation network featuring conjoined prior deformation and Transformer-based matching to realize high-precision pose prediction. Upon that, an efficient dual Sim2Real domain adaptation module is further designed to reduce the feature distribution discrepancy between synthetic and real-world data both semantically and geometrically, thus maintaining superior performance on real-world tests. Moreover, the adaptation module is loosely coupled with the estimation network, allowing easy integration with other methods without any additional inference overhead. Comprehensive experiments show that Dual-COPE outperforms existing unsupervised methods and achieves state-of-the-art precision under supervised settings.
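To make the idea of reducing the Sim2Real feature distribution discrepancy concrete, the sketch below shows one generic statistics-alignment loss between pooled synthetic and real features. It only illustrates the general principle of feature alignment; Dual-COPE's dual semantic and geometric adaptation module is more elaborate than this.

```python
import torch

def feature_alignment_loss(f_syn, f_real):
    """Penalize mismatched feature statistics between domains.
    f_syn, f_real: (B, C) pooled features from synthetic and real batches."""
    mean_gap = (f_syn.mean(0) - f_real.mean(0)).pow(2).sum()
    var_gap = (f_syn.var(0) - f_real.var(0)).pow(2).sum()
    return mean_gap + var_gap

if __name__ == "__main__":
    loss = feature_alignment_loss(torch.randn(32, 256), torch.randn(32, 256))
    print(loss.item())
```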

Citations: 0
Mesh-controllable multi-level-of-detail text-to-3D generation
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-10 | DOI: 10.1016/j.cag.2024.104039
Dongjin Huang , Nan Wang , Xinghan Huang , Jiantao Qu , Shiyu Zhang

Text-to-3D generation is a challenging but significant task that has gained widespread attention. Its capability to rapidly generate 3D digital assets holds great potential in fields such as film, video games, and virtual reality. However, current methods often face several drawbacks, including long generation times, the multi-face Janus problem, and issues such as chaotic topology and redundant structures during mesh extraction. Additionally, the lack of control over the generated results limits their utility in downstream applications. To address these problems, we propose a novel text-to-3D framework capable of generating meshes with high fidelity and controllability. Our approach can efficiently produce meshes and textures that match the text description and the desired level of detail (LOD) by specifying input text and LOD preferences. The framework consists of two stages. In the coarse stage, 3D Gaussians are employed to accelerate generation, and weighted positive and negative prompts from various observation perspectives are used to address the multi-face Janus problem in the generated results. In the refinement stage, mesh vertices and faces are iteratively refined to enhance surface quality and to output meshes and textures that meet the specified LOD requirements. Extensive experiments demonstrate that, compared to state-of-the-art text-to-3D methods, the proposed method performs better in solving the multi-face Janus problem and enables the rapid generation of 3D meshes with enhanced prompt adherence. Furthermore, the proposed framework can generate meshes with improved topology, offering controllable vertices and faces with UV-adapted textures to achieve multiple levels of detail (LODs). Specifically, the proposed method preserves the output's relevance to the input text during simplification, making it better suited for mesh editing and efficient rendering. User studies also indicate that our framework receives higher evaluations than other methods.
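The coarse stage's weighted positive and negative prompts from various observation perspectives echo a common view-dependent prompting trick in text-to-3D pipelines. The function below is a hypothetical sketch of such a scheme, with view labels, a negative prompt, and azimuth thresholds chosen only for illustration, not taken from the paper.

```python
def view_dependent_prompt(base_prompt, azimuth_deg,
                          negative="low quality, extra faces"):
    """Augment a text prompt with a view label and a guidance weight based on camera azimuth."""
    a = azimuth_deg % 360
    if a < 45 or a >= 315:
        view, weight = "front view", 1.0
    elif a < 135:
        view, weight = "side view", 0.9
    elif a < 225:
        view, weight = "back view", 0.7
    else:
        view, weight = "side view", 0.9
    return f"{base_prompt}, {view}", negative, weight

if __name__ == "__main__":
    for az in (0, 90, 180, 270):
        print(az, view_dependent_prompt("a ceramic mug", az))
```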

Citations: 0
A review of motion retargeting techniques for 3D character facial animation
IF 2.5 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-08 | DOI: 10.1016/j.cag.2024.104037
ChangAn Zhu, Chris Joslin

3D face animation has been a critical component of character animation in a wide range of media since the early 1990s. The conventional process for animating a 3D face is usually keyframe-based, which is labor-intensive. Therefore, the film and game industries have started using live-action actors' performances to animate the faces of 3D characters, a process also known as performance-driven facial animation. At the core of performance-driven facial animation is facial motion retargeting, which transfers the source facial motions to a target 3D face. However, facial motion retargeting still has many limitations that restrict its ability to further assist the facial animation process. Existing motion retargeting frameworks cannot accurately transfer the source motion's semantic information (i.e., the meaning and intensity of the motion), especially when applying the motion to non-human-like or stylized target characters. The retargeting quality relies on the parameterization of the target face, which is time-consuming to build and usually not generalizable across proportionally different faces. In this survey paper, we review the literature on 3D facial motion retargeting methods and the relevant topics within this area. We provide a systematic understanding of the essential modules of the retargeting pipeline, a taxonomy of the available approaches under these modules, and a thorough analysis of their advantages and limitations, along with research directions that could contribute to this area. We also contribute a 3D character categorization matrix, which has been used in this survey and might be useful for future research to evaluate the character compatibility of retargeting or face parameterization methods.

Citations: 0