
Computers & Graphics-UK: Latest Publications

Ten years of immersive education: Overview of a Virtual and Augmented Reality course at postgraduate level
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-20 | DOI: 10.1016/j.cag.2024.104088
Bernardo Marques, Beatriz Sousa Santos, Paulo Dias
In recent years, the market has seen the emergence of numerous affordable sensors, interaction devices, and displays, which have greatly facilitated the adoption of Virtual and Augmented Reality (VR/AR) across various applications. However, developing these applications requires a solid understanding of the field and specific technical skills, which are often lacking in current Computer Science and Engineering education programs. This work is an extended version of a Eurographics 2024 Education Paper, reporting on a postgraduate-level course that has been taught over the past ten years to almost 200 students across several Master’s programs. The course introduces students to the fundamental principles, methods, and tools of VR/AR. Its primary objective is to equip students with the knowledge necessary to understand, create, implement, and evaluate applications using these technologies. The paper provides insights into the course structure, key topics covered, assessment methods, as well as the devices and infrastructure utilized. It also includes an overview of various practical projects completed over the years. Among other reflections, we discuss the challenges of teaching this course, particularly due to the rapid evolution of the field, which necessitates constant updates to the curriculum. Finally, future perspectives for the course are outlined.
Citations: 0
Efficient image generation with Contour Wavelet Diffusion
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-20 | DOI: 10.1016/j.cag.2024.104087
Dimeng Zhang, JiaYao Li, Zilong Chen, Yuntao Zou
The burgeoning field of image generation has captivated academia and industry with its potential to produce high-quality images, facilitating applications like text-to-image conversion, image translation, and recovery. These advancements have notably propelled the growth of the metaverse, where virtual environments constructed from generated images offer new interactive experiences, especially in conjunction with digital libraries. The technology creates detailed high-quality images, enabling immersive experiences. Despite diffusion models showing promise with superior image quality and mode coverage over GANs, their slow training and inference speeds have hindered broader adoption. To counter this, we introduce the Contour Wavelet Diffusion Model, which accelerates the process by decomposing features and employing multi-directional, anisotropic analysis. This model integrates an attention mechanism to focus on high-frequency details and a reconstruction loss function to ensure image consistency and accelerate convergence. The result is a significant reduction in training and inference times without sacrificing image quality, making diffusion models viable for large-scale applications and enhancing their practicality in the evolving digital landscape.
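The abstract does not give implementation details, but its efficiency argument — decompose features with a wavelet transform, attend to the high-frequency sub-bands, and do the expensive generative work at reduced resolution — can be illustrated with a minimal sketch. The Haar transform, the attention module, and all names below are illustrative assumptions, not the paper's actual Contour Wavelet design.

```python
import torch
import torch.nn as nn

def haar_dwt2d(x):
    """One-level 2D Haar transform: (B, C, H, W) -> four half-resolution sub-bands."""
    a, b = x[:, :, 0::2, 0::2], x[:, :, 0::2, 1::2]
    c, d = x[:, :, 1::2, 0::2], x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2          # low-frequency approximation
    lh = (a - b + c - d) / 2          # horizontal detail
    hl = (a + b - c - d) / 2          # vertical detail
    hh = (a - b - c + d) / 2          # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2d(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2d; a reconstruction loss would compare its output to the input."""
    a = (ll + lh + hl + hh) / 2
    b = (ll - lh + hl - hh) / 2
    c = (ll + lh - hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    B, C, H, W = ll.shape
    out = ll.new_zeros(B, C, 2 * H, 2 * W)
    out[:, :, 0::2, 0::2], out[:, :, 0::2, 1::2] = a, b
    out[:, :, 1::2, 0::2], out[:, :, 1::2, 1::2] = c, d
    return out

class HighFreqAttention(nn.Module):
    """Hypothetical channel attention over the concatenated detail sub-bands."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(3 * channels, 3 * channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, lh, hl, hh):
        details = torch.cat([lh, hl, hh], dim=1)
        return (details * self.gate(details)).chunk(3, dim=1)

x = torch.randn(1, 3, 64, 64)
ll, lh, hl, hh = haar_dwt2d(x)                 # a denoiser would run on the 32x32 bands
lh, hl, hh = HighFreqAttention(3)(lh, hl, hh)  # emphasise high-frequency detail
recon = haar_idwt2d(ll, lh, hl, hh)
print(recon.shape)                             # torch.Size([1, 3, 64, 64])
```

Because each sub-band holds a quarter of the pixels, a denoiser operating there does proportionally less work per diffusion step, which is where the claimed training and inference speedup would come from.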
Citations: 0
Supporting motion-capture acting with collaborative Mixed Reality
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-19 | DOI: 10.1016/j.cag.2024.104090
Alberto Cannavò, Francesco Bottino, Fabrizio Lamberti
Technologies such as chroma-key, LED walls, motion capture (mocap), 3D visual storyboards, and simulcams are revolutionizing how films featuring visual effects are produced. Despite their popularity, these technologies have introduced new challenges for actors. An increased workload is faced when digital characters are animated via mocap, since actors are requested to use their imagination to envision what characters see and do on set. This work investigates how Mixed Reality (MR) technology can support actors during mocap sessions by presenting a collaborative MR system named CoMR-MoCap, which allows actors to rehearse scenes by overlaying digital contents onto the real set. Using a Video See-Through Head Mounted Display (VST-HMD), actors can see digital representations of performers in mocap suits and digital scene contents in real time. The system supports collaboration, enabling multiple actors to wear both mocap suits to animate digital characters and VST-HMDs to visualize the digital contents. A user study involving 24 participants compared CoMR-MoCap to the traditional method using physical props and visual cues. The results showed that CoMR-MoCap significantly improved actors’ ability to position themselves and direct their gaze, and it offered advantages in terms of usability, spatial and social presence, embodiment, and perceived effectiveness over the traditional method.
Citations: 0
LightingFormer: Transformer-CNN hybrid network for low-light image enhancement
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-18 | DOI: 10.1016/j.cag.2024.104089
Cong Bi, Wenhua Qian, Jinde Cao, Xue Wang
Recent deep-learning methods have shown promising results in low-light image enhancement. However, current methods often suffer from noise and artifacts, and most are based on convolutional neural networks, which have limitations in capturing long-range dependencies resulting in insufficient recovery of extremely dark parts in low-light images. To tackle these issues, this paper proposes a novel Transformer-based low-light image enhancement network called LightingFormer. Specifically, we propose a novel Transformer-CNN hybrid block that captures global and local information via mixed attention. It combines the advantages of the Transformer in capturing long-range dependencies and the advantages of CNNs in extracting low-level features and enhancing locality to recover extremely dark parts and enhance local details in low-light images. Moreover, we adopt the U-Net discriminator to enhance different regions in low-light images adaptively, avoiding overexposure or underexposure, and suppressing noise and artifacts. Extensive experiments show that our method outperforms the state-of-the-art methods quantitatively and qualitatively. Furthermore, the application to object detection demonstrates the potential of our method in high-level vision tasks.
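As a rough illustration of what a "Transformer-CNN hybrid block" mixing global and local attention can look like, here is a minimal PyTorch sketch: a self-attention branch supplies long-range context, a depthwise-convolution branch supplies local detail, and the two are fused with a residual connection. The structure and names are assumptions for illustration, not the published LightingFormer block.

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Hypothetical Transformer-CNN hybrid block: attention for global context,
    depthwise convolution for local texture, fused by a 1x1 convolution."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.local = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1, groups=dim),  # depthwise: local detail
            nn.GELU(),
            nn.Conv2d(dim, dim, 1),                         # pointwise: channel mixing
        )
        self.fuse = nn.Conv2d(2 * dim, dim, 1)

    def forward(self, x):                        # x: (B, C, H, W)
        B, C, H, W = x.shape
        tokens = x.flatten(2).transpose(1, 2)    # (B, H*W, C) token sequence
        t = self.norm(tokens)
        g, _ = self.attn(t, t, t)                # global branch (long-range dependencies)
        g = g.transpose(1, 2).reshape(B, C, H, W)
        loc = self.local(x)                      # local branch
        return x + self.fuse(torch.cat([g, loc], dim=1))

x = torch.randn(2, 32, 64, 64)
print(HybridBlock(32)(x).shape)                  # torch.Size([2, 32, 64, 64])
```

In an enhancement network, several such blocks would typically sit inside an encoder-decoder; the U-Net discriminator mentioned in the abstract is a separate adversarial component not sketched here.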
Citations: 0
APE-GAN: A colorization method for focal areas of infrared images guided by an improved attention mask mechanism
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-18 | DOI: 10.1016/j.cag.2024.104086
Wenchao Ren, Liangfu Li, Shiyi Wen, Lingmei Ai
Due to their minimal susceptibility to environmental changes, infrared images are widely applicable across various fields, particularly in the realm of traffic. Nonetheless, a common drawback of infrared images lies in their limited chroma and detail information, posing challenges for clear information retrieval. While extensive research has been conducted on colorizing infrared images in recent years, existing methods primarily focus on overall translation without adequately addressing the foreground area containing crucial details. To address this issue, we propose a novel approach that distinguishes and colors the foreground content with important information and the background content with less significant details separately before fusing them into a colored image. Consequently, we introduce an enhanced generative adversarial network based on Attention mask to meticulously translate the foreground content containing vital information more comprehensively. Furthermore, we have carefully designed a new composite loss function to optimize high-level detail generation and improve image colorization at a finer granularity. Detailed testing on IRVI datasets validates the effectiveness of our proposed method in solving the problem of infrared image coloring.
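The key idea in the abstract — colorize the information-rich foreground and the less important background separately under a learned attention mask, then fuse them — can be sketched generically as follows. The generator below is a toy stand-in (the paper's network, losses, and discriminator are not specified here), so treat every module name and size as hypothetical.

```python
import torch
import torch.nn as nn

class MaskGuidedColorizer(nn.Module):
    """Illustrative mask-guided colorization: a foreground branch, a background
    branch, and a learned soft mask that blends their RGB outputs."""
    def __init__(self, ch=32):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, 3, 3, padding=1), nn.Tanh(),     # RGB in [-1, 1]
            )
        self.fg = branch()
        self.bg = branch()
        self.mask_head = nn.Sequential(                        # soft foreground mask
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, ir):                    # ir: (B, 1, H, W) infrared input
        m = self.mask_head(ir)                # (B, 1, H, W), 1 = foreground
        return m * self.fg(ir) + (1 - m) * self.bg(ir)

ir = torch.rand(1, 1, 128, 128)
print(MaskGuidedColorizer()(ir).shape)        # torch.Size([1, 3, 128, 128])
```

In a full GAN setup the fused output would additionally be driven by adversarial terms and the composite loss the abstract describes.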
Citations: 0
ST2SI: Image Style Transfer via Vision Transformer using Spatial Interaction
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-16 | DOI: 10.1016/j.cag.2024.104084
Wenshu Li, Yinliang Chen, Xiaoying Guo, Xiaoyu He
While retaining the original content structure, image style transfer renders a content image with a style image to obtain stylized results with artistic features. Because the content image contains different detail units and the style image has various style patterns, the stylized image is prone to distortion. We propose a new Style Transfer based on Vision Transformer using Spatial Interaction (ST2SI), which takes advantage of Spatial Interactive Convolution (SIC) and Spatial Unit Attention (SUA) to further enhance the content and style representation, so that the encoder can not only better learn the features of the content domain and the style domain, but also maintain the structural integrity of the image content and the effective integration of style features. Concretely, the high-order spatial interaction ability of Spatial Interactive Convolution can capture complex style patterns, while Spatial Unit Attention can balance the content information of different detail units through changes in attention weights, thus solving the problem of image distortion. Comprehensive qualitative and quantitative experiments prove the efficacy of our approach.
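For intuition, the two ingredients named in the abstract — a high-order (multiplicative/gated) spatial interaction and a per-location spatial unit attention — might be wired roughly like the sketch below. This is a generic gated-convolution block in the spirit of the description, not the authors' SIC/SUA implementation; all names and kernel sizes are assumptions.

```python
import torch
import torch.nn as nn

class SpatialInteractionBlock(nn.Module):
    """Hypothetical block: (1) a gated, multiplicative interaction between two
    convolutional branches, and (2) a spatial attention map that re-weights
    detail units per location before the residual connection."""
    def __init__(self, dim):
        super().__init__()
        self.proj_in = nn.Conv2d(dim, 2 * dim, 1)
        self.dw = nn.Conv2d(dim, dim, 7, padding=3, groups=dim)  # large-kernel depthwise
        self.proj_out = nn.Conv2d(dim, dim, 1)
        self.spatial_attn = nn.Sequential(                        # per-pixel weights
            nn.Conv2d(dim, 1, 7, padding=3), nn.Sigmoid()
        )

    def forward(self, x):
        a, b = self.proj_in(x).chunk(2, dim=1)
        y = a * self.dw(b)                 # multiplicative spatial interaction
        y = self.proj_out(y)
        return x + self.spatial_attn(y) * y

x = torch.randn(1, 64, 32, 32)
print(SpatialInteractionBlock(64)(x).shape)   # torch.Size([1, 64, 32, 32])
```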
Citations: 0
Editorial Note Computers & Graphics Issue 123
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-13 | DOI: 10.1016/j.cag.2024.104072
{"title":"Editorial Note Computers & Graphics Issue 123","authors":"","doi":"10.1016/j.cag.2024.104072","DOIUrl":"10.1016/j.cag.2024.104072","url":null,"abstract":"","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104072"},"PeriodicalIF":2.5,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142229895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SHAPE: A visual computing pipeline for interactive landmarking of 3D photograms and patient reporting for assessing craniosynostosis
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-12 | DOI: 10.1016/j.cag.2024.104056
Carsten Görg, Connor Elkhill, Jasmine Chaij, Kristin Royalty, Phuong D. Nguyen, Brooke French, Ines A. Cruz-Guerrero, Antonio R. Porras
3D photogrammetry is a cost-effective, non-invasive imaging modality that does not require the use of ionizing radiation or sedation. Therefore, it is specifically valuable in pediatrics and is used to support the diagnosis and longitudinal study of craniofacial developmental pathologies such as craniosynostosis — the premature fusion of one or more cranial sutures resulting in local cranial growth restrictions and cranial malformations. Analysis of 3D photogrammetry requires the identification of craniofacial landmarks to segment the head surface and compute metrics to quantify anomalies. Unfortunately, commercial 3D photogrammetry software requires intensive manual landmark placements, which is time-consuming and prone to errors. We designed and implemented SHAPE, a System for Head-shape Analysis and Pediatric Evaluation. It integrates our previously developed automated landmarking method in a visual computing pipeline to evaluate a patient’s 3D photogram while allowing for manual confirmation and correction. It also automatically computes advanced metrics to quantify craniofacial anomalies and automatically creates a report that can be uploaded to the patient’s electronic health record. We conducted a user study with a professional clinical photographer to compare SHAPE to the existing clinical workflow. We found that SHAPE allows for the evaluation of a craniofacial 3D photogram more than three times faster than the current clinical workflow (3.85±0.99 vs. 13.07±5.29 minutes, p<0.001). Our qualitative study findings indicate that the SHAPE workflow is well aligned with the existing clinical workflow and that SHAPE has useful features and is easy to learn.
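To make "identify craniofacial landmarks ... and compute metrics to quantify anomalies" concrete, here is a toy example of one classic landmark-derived cranial metric, the cephalic index. It only illustrates the kind of computation such a pipeline automates; the abstract does not state which metrics SHAPE actually reports, and the landmark names and coordinates below are made up for the example.

```python
import numpy as np

def cephalic_index(landmarks):
    """Cephalic index = 100 * maximum head width / maximum head length,
    computed from craniometric landmarks given as 3-D points (millimetres)."""
    width = np.linalg.norm(landmarks["eurion_right"] - landmarks["eurion_left"])
    length = np.linalg.norm(landmarks["opisthocranion"] - landmarks["glabella"])
    return 100.0 * width / length

# Made-up landmark positions purely for demonstration.
lm = {
    "eurion_left": np.array([-75.0, 0.0, 0.0]),
    "eurion_right": np.array([80.0, 0.0, 0.0]),
    "glabella": np.array([0.0, 95.0, 0.0]),
    "opisthocranion": np.array([0.0, -90.0, 0.0]),
}
print(f"Cephalic index: {cephalic_index(lm):.1f}")   # ~83.8
```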
Citations: 0
GIC-Flow: Appearance flow estimation via global information correlation for virtual try-on under large deformation
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-12 | DOI: 10.1016/j.cag.2024.104071
Peng Zhang, Jiamei Zhan, Kexin Sun, Jie Zhang, Meng Wei, Kexin Wang

The primary aim of image-based virtual try-on is to seamlessly deform the target garment image to align with the human body. Owing to the inherent non-rigid nature of garments, current methods prioritise flexible deformation through appearance flow with high degrees of freedom. However, existing appearance flow estimation methods solely focus on the correlation of local feature information. While this strategy successfully avoids the extensive computational effort associated with the direct computation of the global information correlation of feature maps, it leads to challenges in garments adapting to large deformation scenarios. To overcome these limitations, we propose the GIC-Flow framework, which obtains appearance flow by calculating the global information correlation while reducing computational regression. Specifically, our proposed global streak information matching module is designed to decompose the appearance flow into horizontal and vertical vectors, effectively propagating global information in both directions. This innovative approach considerably diminishes computational requirements, contributing to an enhanced and efficient process. In addition, to ensure the accurate deformation of local texture in garments, we propose the local aggregate information matching module to aggregate information from the nearest neighbours before computing the global correlation and to enhance weak semantic information. Comprehensive experiments conducted using our method on the VITON and VITON-HD datasets show that GIC-Flow outperforms existing state-of-the-art algorithms, particularly in cases involving complex garment deformation.
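The central idea — replace a dense 2-D global correlation with horizontal and vertical components that propagate global information along each axis — can be sketched as two 1-D soft-argmax matches, one per row and one per column. This is a simplified reading of the abstract, not the paper's exact module; the function name and tensor layout are assumptions.

```python
import torch
import torch.nn.functional as F

def axial_flow(garment_feat, person_feat):
    """Sketch: estimate a 2-channel appearance flow by correlating features along
    rows (horizontal displacement) and columns (vertical displacement) separately."""
    B, C, H, W = garment_feat.shape
    g = F.normalize(garment_feat, dim=1)
    p = F.normalize(person_feat, dim=1)

    # Horizontal: each target pixel is matched against all source columns in its row.
    corr_h = torch.einsum('bchw,bchv->bhwv', p, g)          # (B, H, W, W)
    pos_w = torch.arange(W, dtype=g.dtype, device=g.device)
    match_x = (corr_h.softmax(dim=-1) * pos_w).sum(-1)      # expected source column
    flow_x = match_x - pos_w.view(1, 1, W)

    # Vertical: each target pixel is matched against all source rows in its column.
    corr_v = torch.einsum('bchw,bcuw->bhwu', p, g)          # (B, H, W, H)
    pos_h = torch.arange(H, dtype=g.dtype, device=g.device)
    match_y = (corr_v.softmax(dim=-1) * pos_h).sum(-1)
    flow_y = match_y - pos_h.view(1, H, 1)

    return torch.stack([flow_x, flow_y], dim=1)             # (B, 2, H, W)

flow = axial_flow(torch.randn(1, 16, 24, 24), torch.randn(1, 16, 24, 24))
print(flow.shape)   # torch.Size([1, 2, 24, 24])
```

A full dense correlation costs on the order of H²W² entries per feature map, while the two axial correlations cost HW(H+W), which is the kind of computational saving the abstract alludes to.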

Citations: 0
MuSic-UDF: Learning Multi-Scale dynamic grid representation for high-fidelity surface reconstruction from point clouds
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-09-10 | DOI: 10.1016/j.cag.2024.104081
Chuan Jin, Tieru Wu, Yu-Shen Liu, Junsheng Zhou

Surface reconstruction for point clouds is a central task in 3D modeling. Recently, attractive approaches have tackled this problem by learning neural implicit representations, e.g., unsigned distance functions (UDFs), from point clouds, and have achieved good performance. However, the existing UDF-based methods still struggle to recover local geometric details. One of the difficulties arises from the inflexible representations used, which make it hard to capture local high-fidelity geometry details. In this paper, we propose a novel neural implicit representation, named MuSic-UDF, which leverages Multi-Scale dynamic grids for high-fidelity and flexible surface reconstruction from raw point clouds with arbitrary topologies. Specifically, we initialize a hierarchical voxel grid where each grid point stores a learnable 3D coordinate. Then, we optimize these grids such that different levels of geometry structures can be captured adaptively. To further explore the geometry details, we introduce a frequency encoding strategy to hierarchically encode these coordinates. MuSic-UDF does not require any supervision such as ground-truth distance values or point normals. We conduct comprehensive experiments on widely-used benchmarks, where the results demonstrate the superior performance of our proposed method compared to the state-of-the-art methods.
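Two components named in the abstract — a frequency encoding of coordinates and a network that outputs an unsigned (non-negative) distance — are standard and easy to sketch. The NeRF-style encoding and tiny MLP below are generic stand-ins under that assumption; they are not the paper's multi-scale dynamic-grid architecture.

```python
import math
import torch
import torch.nn as nn

def frequency_encode(xyz, num_bands=6):
    """NeRF-style frequency encoding of 3-D coordinates: sin/cos at increasing
    frequencies. Used here as a generic stand-in for a hierarchical coordinate
    encoding; the exact scheme in the paper may differ."""
    feats = [xyz]
    for k in range(num_bands):
        feats.append(torch.sin((2.0 ** k) * math.pi * xyz))
        feats.append(torch.cos((2.0 ** k) * math.pi * xyz))
    return torch.cat(feats, dim=-1)            # (..., 3 + 6 * num_bands)

class TinyUDF(nn.Module):
    """Minimal unsigned-distance network: encoded point -> non-negative distance."""
    def __init__(self, num_bands=6, hidden=128):
        super().__init__()
        in_dim = 3 + 6 * num_bands
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 1), nn.Softplus(),   # UDF is >= 0 by construction
        )

    def forward(self, xyz):                        # xyz: (N, 3) query points
        return self.net(frequency_encode(xyz))     # (N, 1) unsigned distances

pts = torch.rand(1024, 3) * 2 - 1                  # queries in [-1, 1]^3
print(TinyUDF()(pts).shape)                        # torch.Size([1024, 1])
```

In the actual method the encoded inputs would come from the learnable coordinates stored in the hierarchical voxel grid rather than raw query points; only the encoding-plus-UDF-head pattern is shown here.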

Citations: 0