Computers & Graphics-Uk: Latest Publications

MuSic-UDF: Learning Multi-Scale dynamic grid representation for high-fidelity surface reconstruction from point clouds
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-10 · DOI: 10.1016/j.cag.2024.104081

Surface reconstruction from point clouds is a central task in 3D modeling. Recently, attractive approaches have tackled this problem by learning neural implicit representations, e.g., unsigned distance functions (UDFs), from point clouds, and have achieved good performance. However, existing UDF-based methods still struggle to recover local geometric details. One difficulty arises from the inflexible representations used, which make it hard to capture local high-fidelity geometric details. In this paper, we propose a novel neural implicit representation, named MuSic-UDF, which leverages Multi-Scale dynamic grids for high-fidelity and flexible surface reconstruction from raw point clouds with arbitrary topologies. Specifically, we initialize a hierarchical voxel grid where each grid point stores a learnable 3D coordinate. Then, we optimize these grids such that different levels of geometric structure can be captured adaptively. To further explore geometric details, we introduce a frequency encoding strategy to hierarchically encode these coordinates. MuSic-UDF does not require any supervision such as ground-truth distance values or point normals. We conduct comprehensive experiments on widely used benchmarks, where the results demonstrate the superior performance of our method compared to state-of-the-art methods.
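The abstract includes no code; as a rough illustration of the frequency encoding idea it describes, here is a minimal Python/PyTorch sketch (all sizes, names, and the multi-scale setup are hypothetical) of NeRF-style sin/cos encoding applied to learnable per-level grid coordinates:

```python
import math
import torch

def frequency_encode(x, num_bands=6):
    """Encode 3D coordinates with sin/cos at increasing frequencies,
    in the style of NeRF positional encodings. x: (N, 3) in [-1, 1]."""
    feats = [x]
    for k in range(num_bands):
        freq = (2.0 ** k) * math.pi
        feats.append(torch.sin(freq * x))
        feats.append(torch.cos(freq * x))
    return torch.cat(feats, dim=-1)  # (N, 3 + 3 * 2 * num_bands)

# Hypothetical multi-scale setup: each level stores learnable 3D
# coordinates at its grid points, optimized jointly with the UDF network.
levels = [torch.nn.Parameter(torch.rand(res ** 3, 3) * 2 - 1)
          for res in (8, 16, 32)]
encoded = [frequency_encode(grid) for grid in levels]
print([e.shape for e in encoded])
```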

Citations: 0
Voice user interfaces for effortless navigation in medical virtual reality environments
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-07 · DOI: 10.1016/j.cag.2024.104069

In various situations, such as clinical environments with sterile conditions or when the hands are occupied with multiple devices, traditional methods of navigation and scene adjustment are impractical or even impossible. We explore a new solution that uses voice control to facilitate interaction in virtual worlds and avoid additional controllers. We investigate three scenarios (Object Orientation, Visualization Customization, and Analytical Tasks) and evaluate whether natural language interaction is possible and promising in each of them. In our quantitative user study, participants were able to control virtual environments effortlessly using verbal instructions, resulting in rapid orientation adjustments, adaptive visual aids, and accurate data analysis. In addition, user satisfaction and usability surveys showed consistently high levels of acceptance and ease of use. In conclusion, our study shows that natural language can be a promising alternative for improving user interaction in virtual environments. It enables intuitive interactions in virtual spaces, especially where traditional controls have limitations.
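As a toy illustration of the kind of voice-to-action mapping such a system needs (the command grammar and handlers below are invented, not the paper's implementation):

```python
import re

def rotate(direction, degrees):
    print(f"Rotating {direction} by {degrees} degrees")

def zoom(direction):
    print(f"Zooming {direction}")

# Hypothetical grammar: each pattern maps a recognized utterance to a
# scene operation. A real system would sit behind a speech recognizer
# and a richer dialogue model.
COMMANDS = [
    (re.compile(r"rotate (left|right) (\d+) degrees?"),
     lambda m: rotate(m.group(1), int(m.group(2)))),
    (re.compile(r"zoom (in|out)"),
     lambda m: zoom(m.group(1))),
]

def handle_utterance(text):
    for pattern, action in COMMANDS:
        m = pattern.fullmatch(text.strip().lower())
        if m:
            action(m)
            return True
    return False  # unrecognized: ask the user to rephrase

handle_utterance("Rotate left 45 degrees")
handle_utterance("zoom in")
```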

Citations: 0
Synthetic surface mesh generation of aortic dissections using statistical shape modeling
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-06 · DOI: 10.1016/j.cag.2024.104070

Aortic dissection is a rare disease in which the layers of the aortic wall separate, splitting the aortic lumen into two flow channels: the true and the false lumen. The rarity of the disease leads to a scarcity of available datasets and thus little training data for in-silico studies or for training machine learning algorithms. To mitigate this issue, we use statistical shape modeling to create a database of Stanford type B dissection surface meshes. We account for the complex disease anatomy by modeling two separate flow channels in the aorta, the true and the false lumen. Former approaches mainly modeled the aortic arch including its branches, but not two separate flow channels inside the aorta. To our knowledge, our approach is the first to attempt generating synthetic aortic dissection surface meshes. For the statistical shape model, the aorta is parameterized using the centerlines of the respective lumina and corresponding ellipses describing the lumen cross-sections, aligned along the centerline using rotation-minimizing frames. To evaluate our approach, we introduce disease-specific quality criteria by investigating the torsion and twist of the true lumen.
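Rotation-minimizing frames are commonly computed with the double-reflection method of Wang et al.; the sketch below shows that standard algorithm on a toy centerline, and is not the paper's actual code:

```python
import numpy as np

def rotation_minimizing_frames(points, tangents):
    """Propagate a reference normal along a centerline with the
    double-reflection method, minimizing twist between frames.
    points, tangents: (N, 3) arrays; tangents assumed unit length
    and consecutive points assumed distinct."""
    # Pick any initial normal perpendicular to the first tangent.
    t0 = tangents[0]
    arbitrary = np.array([1.0, 0.0, 0.0])
    if abs(np.dot(arbitrary, t0)) > 0.9:
        arbitrary = np.array([0.0, 1.0, 0.0])
    r = np.cross(t0, arbitrary)
    r /= np.linalg.norm(r)

    frames = [r]
    for i in range(len(points) - 1):
        v1 = points[i + 1] - points[i]          # first reflection plane
        c1 = np.dot(v1, v1)
        r_l = frames[-1] - (2.0 / c1) * np.dot(v1, frames[-1]) * v1
        t_l = tangents[i] - (2.0 / c1) * np.dot(v1, tangents[i]) * v1
        v2 = tangents[i + 1] - t_l              # second reflection plane
        c2 = np.dot(v2, v2)
        frames.append(r_l - (2.0 / c2) * np.dot(v2, r_l) * v2)
    return np.array(frames)  # one normal per centerline point

# Toy centerline: a quarter circle in the xy-plane.
theta = np.linspace(0.0, np.pi / 2.0, 20)
pts = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
tans = np.stack([-np.sin(theta), np.cos(theta), np.zeros_like(theta)], axis=1)
print(rotation_minimizing_frames(pts, tans)[:3])
```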

Citations: 0
A semantic edge-aware parameter efficient image filtering technique
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-06 · DOI: 10.1016/j.cag.2024.104068

The success of a structure-preserving filtering technique relies on its capability to recognize the structures and textures present in the input image. In this paper, a novel structure-preserving filtering technique is presented that first generates an edge-map of the input image by exploiting semantic information. Then, an edge-aware adaptive recursive median filter is utilized to produce the filtered image. The technique provides satisfactory results for a wide variety of images with minimal fine-tuning of its parameters. Moreover, along with various computer graphics applications, the proposed technique also proves robust at incorporating spatial information for spectral-spatial classification of hyperspectral images. A MATLAB implementation of the proposed technique is available at https://www.github.com/K-Pradhan/A-semantic-edge-aware-parameter-efficient-image-filtering-technique
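The abstract does not specify the filter's internals; one plausible reading, sketched below with illustrative parameters, is a median filter that excludes neighbors flagged in the edge-map and runs recursively, each pass filtering the previous output:

```python
import numpy as np

def edge_aware_median(image, edge_map, radius=1, passes=3):
    """Recursive median filter that ignores neighbors marked as edges,
    so smoothing does not mix values across structure boundaries.
    image: 2D float array; edge_map: 2D bool array (True = edge pixel)."""
    out = image.astype(float).copy()
    h, w = image.shape
    for _ in range(passes):  # recursive: each pass filters the last output
        prev = out.copy()
        for y in range(h):
            for x in range(w):
                y0, y1 = max(0, y - radius), min(h, y + radius + 1)
                x0, x1 = max(0, x - radius), min(w, x + radius + 1)
                window = prev[y0:y1, x0:x1]
                mask = ~edge_map[y0:y1, x0:x1]  # keep non-edge neighbors
                vals = window[mask]
                if vals.size:
                    out[y, x] = np.median(vals)
    return out

rng = np.random.default_rng(0)
img = rng.random((16, 16))
edges = np.zeros((16, 16), dtype=bool)
edges[8, :] = True  # a horizontal edge the filter will not smooth across
print(edge_aware_median(img, edges).shape)
```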

Citations: 0
TPVis: A visual analytics system for exploring test case prioritization methods
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-05 · DOI: 10.1016/j.cag.2024.104064

Software testing is a vital tool to ensure the quality and trustworthiness of the software produced. Test suites are often large, which makes testing software costly and time-consuming. In this context, test case prioritization (TCP) methods play an important role by ranking test cases to enable early fault detection and, hence, quicker problem fixes. Evaluating such methods is difficult due to the variety of methods and objectives. To address this issue, we present TPVis, a visual analytics framework, designed in collaboration with experts in software testing, that enables the evaluation and comparison of TCP methods. Our solution is an open-source web application that provides a variety of analytical tools to assist in the exploration of test suites and prioritization algorithms. Furthermore, TPVis provides dashboard presets, validated with our domain collaborators, that support common analysis goals. We illustrate the usefulness of TPVis through a series of use cases that demonstrate the system's flexibility in addressing different problems in the analysis of TCP methods. Finally, we report feedback from the domain experts indicating the effectiveness of TPVis. TPVis is available at https://github.com/vixe-cin-ufpe/TPVis.
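For context on what TPVis visualizes, here is a minimal sketch of one classic TCP baseline, the greedy "additional" coverage strategy; the coverage data is illustrative and this is not part of the paper itself:

```python
def additional_greedy(coverage):
    """Order test cases so each pick covers the most still-uncovered
    statements. coverage: dict mapping test id -> set of covered stmts."""
    remaining = dict(coverage)
    covered = set()
    order = []
    while remaining:
        # Pick the test adding the most new coverage (ties: smallest id).
        best = max(sorted(remaining),
                   key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
        if all(not (s - covered) for s in remaining.values()):
            # No remaining test adds coverage; append the rest stably.
            order.extend(sorted(remaining))
            break
    return order

suite = {
    "t1": {1, 2, 3},
    "t2": {3, 4},
    "t3": {5},
    "t4": {1, 2},
}
print(additional_greedy(suite))  # ['t1', 't2', 't3', 't4']
```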

Citations: 0
Transferring transfer functions (TTF): A guided approach to transfer function optimization in volume visualization
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-04 · DOI: 10.1016/j.cag.2024.104067

In volume visualization, a transfer function tailored for one volume usually does not work for other, similar volumes without careful tuning. This process can be tedious and time-consuming for a large set of volumes. In this work, we present a novel approach to transfer function optimization based on the differentiable volume rendering of a reference volume and its corresponding transfer function. Using two fully connected neural networks, our approach learns a continuous 2D separable transfer function that visualizes the features of interest with consistent visual properties across volumes. Because many volume visualization software packages support separable transfer functions, users can export the optimized transfer function into a domain-specific application for further interaction. Together with domain experts' input and assessments, we present two use cases to demonstrate the effectiveness of our approach. The first use case tracks the effect of an asteroid blast near the ocean surface. In this application, a volume and its corresponding transfer function seed our method, cascading transfer function optimization through the subsequent time steps. The second use case focuses on the visualization of white matter, gray matter, and cerebrospinal fluid in magnetic resonance imaging (MRI) volumes. We optimize an intensity-gradient transfer function for one volume from its segmentation and then use the results to visualize other brain volumes with different intensity ranges acquired on different MRI machines.
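A heavily simplified sketch of the underlying idea (not the paper's two-network architecture): a small learnable transfer function optimized by gradient descent through a differentiable front-to-back compositing renderer, with toy data standing in for the reference volume and rendering:

```python
import torch

def render(volume, tf):
    """Front-to-back emission-absorption compositing along axis 0.
    volume: (D, H, W) intensities in [0, 1]; tf maps intensity -> RGBA."""
    rgba = tf(volume.reshape(-1, 1)).reshape(*volume.shape, 4)
    color, alpha = rgba[..., :3], rgba[..., 3:]
    out = torch.zeros(volume.shape[1], volume.shape[2], 3)
    trans = torch.ones(volume.shape[1], volume.shape[2], 1)
    for d in range(volume.shape[0]):  # march front to back
        out = out + trans * alpha[d] * color[d]
        trans = trans * (1.0 - alpha[d])
    return out

tf = torch.nn.Sequential(  # tiny learnable 1D transfer function
    torch.nn.Linear(1, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 4), torch.nn.Sigmoid())

volume = torch.rand(16, 32, 32)
target = torch.rand(32, 32, 3)  # stand-in for a reference rendering
opt = torch.optim.Adam(tf.parameters(), lr=1e-2)
for step in range(50):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(render(volume, tf), target)
    loss.backward()
    opt.step()
print(float(loss))
```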

Citations: 0
Empowering sign language communication: Integrating sentiment and semantics for facial expression synthesis
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-03 · DOI: 10.1016/j.cag.2024.104065

Translating written sentences from oral languages into a sequence of manual and non-manual gestures plays a crucial role in building a more inclusive society for deaf and hard-of-hearing people. Facial expressions (non-manual gestures), in particular, are responsible for encoding the grammar of the sentence to be spoken, applying punctuation and pronouns, or emphasizing signs. These non-manual gestures are closely related to the semantics of the sentence being spoken and to the expression of the speaker's emotions. However, most Sign Language Production (SLP) approaches center on synthesizing manual gestures and do not model the speaker's expression. This paper introduces a new method focused on synthesizing facial expressions for sign language. Our goal is to improve sign language production by integrating sentiment information into facial expression generation. The approach leverages a sentence's sentiment and semantic features to sample from a meaningful representation space, integrating the bias of the non-manual components into the sign language production process. To evaluate our method, we extend the Fréchet Gesture Distance (FGD), propose a new metric called the Fréchet Expression Distance (FED), and apply an extensive set of metrics to assess the quality of specific regions of the face. The experimental results show that our method achieves state-of-the-art performance, outperforming competitors on the How2Sign and PHOENIX14T datasets. Moreover, our architecture is based on a carefully designed graph pyramid that makes it simpler, easier to train, and capable of leveraging emotions to produce facial expressions. Our code and pretrained models will be available at: https://github.com/verlab/empowering-sign-language.
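Fréchet metrics of this family follow the same recipe as the well-known FID: fit Gaussians to feature embeddings of real and generated samples and compute the Fréchet distance between them. A sketch of that computation, with random features standing in for a real expression-feature extractor:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fit to two feature sets:
    ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2))."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 16))  # stand-in features
fake = rng.normal(0.3, 1.1, size=(500, 16))
print(frechet_distance(real, fake))
```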

Citations: 0
Human-in-the-loop: Using classifier decision boundary maps to improve pseudo labels
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-08-30 · DOI: 10.1016/j.cag.2024.104062

For classification tasks, several strategies aim to tackle the problem of insufficient labeled data, usually by automatic labeling or by fully passing this task to a user. Automatic labeling is simple to apply but can fail in complex situations where human insight may be required to decide the correct labels. Conversely, manual labeling leverages the expertise of specialists but may waste precious effort on cases that automatic methods could handle. More specifically, automatic solutions can be improved by combining an active learning loop with manual labeling assisted by visual depictions of a classifier's behavior. We propose to include the human in the labeling loop by using manual labeling in feature spaces produced by a deep feature annotation (DeepFA) technique. To assist manual labeling, we provide users with visual insights into the classifier's decision boundaries. Finally, we use the manual and automatically computed labels jointly to retrain the classifier in an active learning (AL) loop. Experiments on a toy dataset and a real-world application dataset show that the proposed combination of visualization-supported manual labeling and automatic labeling can yield a significant increase in classifier performance with quite limited user effort.
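A minimal sketch of the general loop the paper builds on, using scikit-learn with an oracle standing in for the human annotator (the decision-boundary-map visualization that guides the real user is omitted):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression

X, y = make_moons(n_samples=600, noise=0.25, random_state=0)
labeled = np.zeros(len(y), dtype=bool)
labeled[:20] = True  # small initial labeled pool
pseudo = y.copy()    # working labels; unlabeled entries get overwritten

clf = LogisticRegression()
for round_ in range(5):
    clf.fit(X[labeled], pseudo[labeled])
    probs = clf.predict_proba(X[~labeled])
    conf = probs.max(axis=1)
    idx_unlabeled = np.flatnonzero(~labeled)

    # Auto-accept confident pseudo labels...
    sure = idx_unlabeled[conf > 0.95]
    pseudo[sure] = clf.predict(X[sure])
    labeled[sure] = True

    # ...and send the most uncertain points to the "human" (oracle here).
    ask = idx_unlabeled[np.argsort(conf)[:10]]
    pseudo[ask] = y[ask]  # ground truth stands in for manual labeling
    labeled[ask] = True

print("labeled:", labeled.sum(), "accuracy:", clf.score(X, y))
```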

Citations: 0
SingVisio: Visual analytics of diffusion model for singing voice conversion
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-08-30 · DOI: 10.1016/j.cag.2024.104058

In this study, we present SingVisio, an interactive visual analysis system that aims to explain the diffusion model used in singing voice conversion. SingVisio provides a visual display of the generation process in diffusion models, showcasing the step-by-step denoising of the noisy spectrum and its transformation into a clean spectrum that captures the desired singer’s timbre. The system also facilitates side-by-side comparisons of different conditions, such as source content, melody, and target timbre, highlighting the impact of these conditions on the diffusion generation process and resulting conversions. Through comparative and comprehensive evaluations, SingVisio demonstrates its effectiveness in terms of system design, functionality, explainability, and user-friendliness. It offers users of various backgrounds valuable learning experiences and insights into the diffusion model for singing voice conversion.
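The step-by-step denoising that SingVisio displays can be captured generically by storing each intermediate state of the reverse diffusion process; in the sketch below the noise schedule is arbitrary and a toy function stands in for the trained denoiser:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50
betas = np.linspace(1e-4, 0.05, T)       # toy variance schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predicted_noise(x, t):
    # Stand-in for the trained denoiser; a real system would call the
    # acoustic model conditioned on content, melody, and target timbre.
    return x * 0.1

x = rng.normal(size=(80, 100))  # noisy "mel-spectrogram"
trajectory = [x.copy()]         # intermediate states a tool could display
for t in reversed(range(T)):    # DDPM-style reverse steps
    eps = predicted_noise(x, t)
    mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    noise = rng.normal(size=x.shape) if t > 0 else 0.0
    x = mean + np.sqrt(betas[t]) * noise
    trajectory.append(x.copy())

print(len(trajectory), "states captured for step-by-step visualization")
```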

Citations: 0
Virtual reality inspection of chromatin 3D and 2D data
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-08-30 · DOI: 10.1016/j.cag.2024.104059

Understanding the packing of long DNA strands into chromatin is one of the ultimate challenges in genomic research. An intrinsic part of this complex problem is studying the chromatin's spatial structure. Biologists reconstruct 3D models of chromatin from experimental data, yet the exploration and analysis of such 3D structures are limited in existing genomic data visualization tools. To improve this situation, we investigated the current options for immersive methods and designed a prototypical VR visualization tool for 3D chromatin models that leverages virtual reality to deal with the spatial data. We showcase the tool in three primary use cases. First, we provide an overall 3D shape overview of the chromatin to facilitate the identification of regions of interest and their selection for further investigation. Second, we include the option to export the selected regions and elements in the BED format, which can be loaded into common analytical tools. Third, we integrate epigenetic modification data along the sequence that influence gene expression, either as in-world 2D charts or overlaid on the 3D structure itself. We developed our application in collaboration with two domain experts and gathered insights from two informal studies with five other experts.
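BED itself is a simple tab-separated format (chromosome, start, end; 0-based, half-open intervals). A sketch of how selected model bins might be mapped back to genomic intervals and written out, with the bin size and selection hypothetical:

```python
def export_bed(selected_bins, chrom, bin_size, path):
    """Write selected chromatin-model bins as BED intervals.
    BED uses 0-based, half-open [start, end) coordinates."""
    # Merge runs of consecutive bins into single intervals.
    intervals = []
    for b in sorted(selected_bins):
        if intervals and b == intervals[-1][1]:
            intervals[-1][1] = b + 1
        else:
            intervals.append([b, b + 1])
    with open(path, "w") as f:
        for start_bin, end_bin in intervals:
            f.write(f"{chrom}\t{start_bin * bin_size}\t{end_bin * bin_size}\n")

# Hypothetical selection of 10 kb bins picked in the VR view.
export_bed({3, 4, 5, 12}, chrom="chr1", bin_size=10_000, path="selection.bed")
```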

Citations: 0