
Latest publications in Computer Animation and Virtual Worlds

Hybrid attention adaptive sampling network for human pose estimation in videos
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-20 | DOI: 10.1002/cav.2244
Qianyun Song, Hao Zhang, Yanan Liu, Shouzheng Sun, Dan Xu

Human pose estimation in videos often relies on sampling strategies such as sparse uniform sampling and keyframe selection. Sparse uniform sampling can miss spatial-temporal relationships, while CNN-based keyframe selection struggles to fully capture these relationships and is costly. Neither strategy ensures the reliability of pose data produced by single-frame estimators. To address these issues, this article proposes an efficient and effective hybrid attention adaptive sampling network. The network includes a dynamic attention module and a pose quality attention module, which jointly consider the dynamic information and the quality of pose data. Additionally, the network improves efficiency through compact uniform sampling and the parallelism of multi-head self-attention. Our network is compatible with various video-based pose estimation frameworks and is more robust under heavy occlusion, motion blur, and illumination changes, achieving state-of-the-art performance on the Sub-JHMDB dataset.
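The core idea of scoring frames with multi-head self-attention and keeping only the most informative ones can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the authors' network: the identity projections, head count, and top-k selection rule are assumptions made for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_scores(feats, num_heads=4):
    """Score each frame by how much attention it receives from all other
    frames under multi-head scaled dot-product self-attention.
    feats: (T, D) per-frame feature vectors, D divisible by num_heads."""
    T, D = feats.shape
    assert D % num_heads == 0
    d = D // num_heads
    scores = np.zeros(T)
    for h in range(num_heads):
        q = k = feats[:, h * d:(h + 1) * d]   # identity Q/K projections for brevity
        att = softmax(q @ k.T / np.sqrt(d))   # (T, T) attention matrix
        scores += att.sum(axis=0)             # total attention each frame receives
    return scores / num_heads

def adaptive_sample(feats, k):
    """Keep the k highest-scoring frames, returned in temporal order."""
    idx = np.argsort(attention_scores(feats))[-k:]
    return np.sort(idx)
```

In the paper's setting, the selected frames would then feed a video-based pose estimator; here the scoring simply prefers frames that the whole clip attends to most.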

Citations: 0
Nadine: A large language model-driven intelligent social robot with affective capabilities and human-like memory
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-15 | DOI: 10.1002/cav.2290
Hangyeol Kang, Maher Ben Moussa, Nadia Magnenat Thalmann

In this work, we describe our approach to developing an intelligent and robust social robotic system for the Nadine social robot platform. We achieve this by integrating large language models (LLMs) and leveraging their powerful reasoning and instruction-following capabilities to realize advanced human-like affective and cognitive capabilities. This approach is novel compared with current state-of-the-art LLM-based agents, which implement neither human-like long-term memory nor sophisticated emotional capabilities. We built a social robot system that generates appropriate behaviors through multimodal input processing, retrieves episodic memories tied to the recognized user, and simulates the emotional states induced in the robot by interaction with its human partner. In particular, we introduce an LLM-agent framework for social robots, social robotics reasoning and acting, which serves as the core component of the interaction module in our system. This design advances social robots and aims to improve the quality of human–robot interaction.
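The abstract describes episodic memories that are retrieved per recognized user. A toy sketch of that idea is below; the class name, storage layout, and keyword-overlap retrieval are all hypothetical, since the abstract does not specify the actual memory mechanism.

```python
from collections import defaultdict

class EpisodicMemory:
    """Toy per-user episodic memory: store interaction episodes keyed by the
    recognized user and recall the episodes most relevant to a new utterance.
    Retrieval here is naive keyword overlap, standing in for whatever
    similarity measure the real system uses."""

    def __init__(self):
        self.episodes = defaultdict(list)

    def store(self, user, text):
        self.episodes[user].append(text)

    def recall(self, user, query, k=2):
        q = set(query.lower().split())
        ranked = sorted(self.episodes[user],
                        key=lambda e: len(q & set(e.lower().split())),
                        reverse=True)
        return ranked[:k]
```

An LLM-agent would then inject the recalled episodes into its prompt so the conversation stays consistent with what the robot remembers about that user.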

Citations: 0
Enhancing virtual reality exposure therapy: Optimizing treatment outcomes for agoraphobia through advanced simulation and comparative analysis
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-13 | DOI: 10.1002/cav.2291
Jackson Yang, Xiaoping Che, Chenxin Qu, Xiaofei Di, Haiming Liu

This paper investigates the application of Virtual Reality Exposure Therapy (VRET) to treat agoraphobia, focusing on two pivotal research questions derived from identified gaps in current therapeutic approaches. The first question (RQ1) addresses the development of complex VR environments that enhance the therapy's effectiveness by simulating real-world anxiety triggers. The second question (RQ2) examines the differential impact of these VR environments on agoraphobic and nonagoraphobic participants through rigorous comparative analyses using t-tests. Methodologies include advanced processing of electrodermal activity (EDA) and eye-tracking metrics to assess the anxiety levels induced by these environments. Additionally, qualitative methods such as structured interviews and questionnaires complement these measurements, providing deeper insight into participants' subjective experiences. Video recordings of sessions captured with Unity software offer an additional layer of data, enabling the study to replay and meticulously analyze interactions within the VR environment. The experimental results confirm the efficacy of the VR settings in eliciting significant physiological and psychological responses from participants, substantiating the VR scenarios' potential as a therapeutic tool. This study contributes to the broader discourse on the viability and optimization of VR technologies in clinical settings, offering a methodologically sound approach to making exposure therapies for anxiety disorders practical and accessible.
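The group comparison described for RQ2 is a standard two-sample t-test. As a reminder of the computation, here is a stdlib sketch of Welch's unequal-variance variant (the abstract does not say which variant the authors used, so this is an assumption; the sample values are made up):

```python
import math

def welch_t(a, b):
    """Welch's two-sample t statistic and degrees of freedom, as one would
    use to compare, e.g., mean EDA between agoraphobic and control groups."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se2 = va / na + vb / nb                         # squared standard error
    t = (ma - mb) / math.sqrt(se2)
    # Welch-Satterthwaite degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```

In practice one would hand `t` and `df` to a t-distribution CDF (e.g., `scipy.stats`) to obtain the p-value.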

Citations: 0
Real-time simulation of thin-film interference with surface thickness variation using the shallow water equations
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-08 | DOI: 10.1002/cav.2289
Mingyi Gu, Jiajia Dai, Jiazhou Chen, Ke Yan, Jing Huang

Thin-film interference is a significant optical phenomenon. In this study, we employ the transfer matrix method to pre-compute the reflectance of thin films at visible wavelengths. The reflectance is stored as a texture via a color space transformation, which makes real-time rendering of thin-film interference feasible. Furthermore, we propose using the shallow water equations to simulate the morphological evolution of liquid thin films, which facilitates interpreting and predicting their behavior and thickness variations. We also introduce a viscosity term into the shallow water equations to simulate thin-film behavior more accurately, facilitating the creation of authentic interference patterns.
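The transfer matrix computation for a single film reduces, at normal incidence, to a 2x2 characteristic matrix. A minimal sketch of that standard optics calculation is below; the ambient and substrate indices are illustrative defaults, and this is a textbook single-layer version, not the authors' full pipeline:

```python
import numpy as np

def film_reflectance(n_film, thickness_nm, wavelength_nm, n_in=1.0, n_sub=1.5):
    """Reflectance of a single thin film at normal incidence via the 2x2
    characteristic (transfer) matrix of the layer."""
    delta = 2 * np.pi * n_film * thickness_nm / wavelength_nm  # phase thickness
    m00 = np.cos(delta)
    m01 = 1j * np.sin(delta) / n_film
    m10 = 1j * n_film * np.sin(delta)
    m11 = np.cos(delta)
    # Amplitude reflection coefficient from the stacked-matrix formula
    num = n_in * (m00 + m01 * n_sub) - (m10 + m11 * n_sub)
    den = n_in * (m00 + m01 * n_sub) + (m10 + m11 * n_sub)
    return abs(num / den) ** 2
```

Evaluating this over visible wavelengths and a range of thicknesses, then converting to RGB, yields the kind of precomputed reflectance texture the paper describes. Two sanity checks: at zero thickness it reduces to the bare Fresnel reflectance, and at quarter-wave thickness it matches the closed-form quarter-wave coating result.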

Citations: 0
Frontal person image generation based on arbitrary-view human images
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-07-23 | DOI: 10.1002/cav.2234
Yong Zhang, Yuqing Zhang, Lufei Chen, Baocai Yin, Yongliang Sun

Frontal person images contain the richest detailed features of humans, which can effectively assist behavioral recognition, virtual dress fitting, and other applications. While many remarkable networks are devoted to person image generation, most of them need accurate target poses as network inputs. However, target pose annotation is difficult and time-consuming. In this work, we propose the first frontal person image generation network based on the proposed anchor pose set and a generative adversarial network. Specifically, our method first assigns a rough frontal pose to the input human image based on the anchor pose set, then regresses all key points of the rough frontal pose to estimate an accurate frontal pose. We then treat the estimated frontal pose as the target pose and construct a two-stream generator, based on the generative adversarial network, that updates the person's shape and appearance features in a crossing manner to generate a realistic frontal person image. Experiments on the challenging CMU Panoptic dataset show that our method can generate realistic frontal images from arbitrary-view human images.

Citations: 0
Peridynamic-based modeling of elastoplasticity and fracture dynamics
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-07-16 | DOI: 10.1002/cav.2242
Haoping Wang, Xiaokun Wang, Yanrui Xu, Yalan Zhang, Chao Yao, Yu Guo, Xiaojuan Ban

This paper introduces a particle-based framework, grounded in Peridynamic theory, for simulating the behavior of elastoplastic materials and the formation of fractures. Traditional approaches to modeling elastic materials, such as the Finite Element Method (FEM) and Smoothed Particle Hydrodynamics (SPH), have primarily relied on discretization techniques and continuous constitutive models. However, accurately capturing fracture and crack development in elastoplastic materials poses significant challenges for these conventional models. Our approach integrates a Peridynamic-based elastic model with a density constraint, enhancing stability and realism. We adopt the Von Mises yield criterion and a bond stretch criterion to simulate plastic deformation and fracture formation, respectively. The proposed method stabilizes the elastic model through a density-based position constraint, while plasticity is modeled with the Von Mises yield criterion applied to the bonds between particle pairs. Fracturing and the generation of fine fragments are handled by the fracture criterion and complementarity operations applied to the inter-particle connections. Our experimental results demonstrate the efficacy of our framework in realistically depicting a wide range of material behaviors, including elasticity, plasticity, and fracturing, across various scenarios.
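The bond stretch criterion mentioned above is a standard ingredient of bond-based peridynamics: a bond breaks once its relative elongation exceeds a critical stretch. A minimal sketch, with an illustrative critical stretch value rather than anything taken from the paper:

```python
import numpy as np

def bond_stretch(x_i, x_j, y_i, y_j):
    """Peridynamic bond stretch: relative elongation of the bond between
    particles i and j, from reference positions x to deformed positions y."""
    ref = np.linalg.norm(x_j - x_i)
    cur = np.linalg.norm(y_j - y_i)
    return (cur - ref) / ref

def update_bonds(bonds, x, y, s_crit=0.1):
    """Keep only bonds whose stretch stays at or below the critical value
    s_crit; broken bonds model crack formation."""
    return [(i, j) for (i, j) in bonds
            if bond_stretch(x[i], x[j], y[i], y[j]) <= s_crit]
```

Fragments then emerge naturally as connected components of the surviving bond graph, which is what makes the criterion attractive for fracture animation.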

Citations: 0
GPSwap: High-resolution face swapping based on StyleGAN prior
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-07-11 | DOI: 10.1002/cav.2238
Dongjin Huang, Chuanman Liu, Jinhua Liu

Existing high-resolution face-swapping works still struggle to preserve identity consistency while maintaining high visual quality. We present GPSwap, a novel high-resolution face-swapping method based on a StyleGAN prior. To better preserve identity consistency, the proposed facial feature recombination network fully leverages the properties of both the w space and the encoders to decouple identities. Furthermore, we present an image reconstruction module that aligns and blends images in FS space, further supplementing facial details and achieving natural blending. This not only improves image resolution but also optimizes visual quality. Extensive experiments and user studies demonstrate that GPSwap is superior to state-of-the-art high-resolution face-swapping methods in terms of image quality and identity consistency. In addition, GPSwap saves nearly 80% of the training cost compared with other high-resolution face-swapping works.

Citations: 0
Neural foveated super-resolution for real-time VR rendering
IF 0.9 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-07-11 | DOI: 10.1002/cav.2287
Jiannan Ye, Xiaoxu Meng, Daiyun Guo, Cheng Shang, Haotian Mao, Xubo Yang

As virtual reality display technologies advance, resolutions and refresh rates continue to approach human perceptual limits, presenting a challenge for real-time rendering algorithms. Neural super-resolution is promising for reducing computation cost and boosting the visual experience by scaling up low-resolution renderings; however, the added workload of running neural networks cannot be neglected. In this article, we alleviate that burden by exploiting the foveated nature of the human visual system: rather than applying uniform super-resolution, we upscale the coarse input heterogeneously, following the rapid decrease of visual acuity from the focal point to the periphery. With the help of dynamic and geometric information inherently available in real-time rendering (i.e., pixel-wise motion vectors, depth, and camera transformation), we propose a neural accumulator that recurrently aggregates the amortized low-resolution visual information from frame to frame. Leveraging a partition-assemble scheme, we use a neural super-resolution module to upsample the low-resolution image tiles to different qualities according to their perceptual importance and reconstruct the final output adaptively. Perceptually high-fidelity foveated high-resolution frames are generated in real time, surpassing the quality of other foveated super-resolution methods.
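The tile-quality assignment driven by eccentricity can be sketched in a few lines. The radius thresholds and quality levels below are invented for illustration; the paper's actual mapping from perceptual importance to upsampling quality is not specified in the abstract:

```python
import math

def tile_quality(tile_center, gaze, fovea_radius=0.1, levels=(1.0, 0.5, 0.25)):
    """Map a screen-space tile to a super-resolution quality level based on
    its distance (eccentricity) from the gaze point: full quality inside the
    fovea, coarser quality toward the periphery. Coordinates are normalized
    to [0, 1]; thresholds are illustrative."""
    ecc = math.dist(tile_center, gaze)
    if ecc < fovea_radius:          # foveal region: highest quality
        return levels[0]
    if ecc < 3 * fovea_radius:      # parafoveal ring: medium quality
        return levels[1]
    return levels[2]                # periphery: cheapest upsampling
```

A partition-assemble pipeline would run the expensive super-resolution path only on tiles mapped to the top level and cheaper paths elsewhere, then stitch the tiles into the final frame.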

Citations: 0
Design and development of a mixed reality teaching systems for IV cannulation and clinical instruction
IF 1.1 | CAS Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-06-15 | DOI: 10.1002/cav.2288
Wei Xiong, Yingda Peng

Intravenous (IV) cannulation is a common technique used in clinical infusion. This study developed a mixed reality IV cannulation teaching system based on the HoloLens 2 platform. The paper integrates the cognitive-affective theory of learning with media (CATLM) and investigates cognitive engagement and willingness to use the system from the learners' perspective. Through an experimental study of 125 subjects, the variables affecting learners' cognitive engagement and intention to use were determined. On the basis of CATLM, three new mixed reality attributes (immersion, system verisimilitude, and response time) were introduced, and their relationships with cognitive engagement and willingness to use were determined. The results show that the high immersion of mixed reality technology promotes higher cognitive engagement among students; however, this high immersion does not significantly affect learners' intention to use mixed reality technology for learning. Overall, the cognitive and affective theories hold in mixed reality environments, and the model adapts well. This study provides a reference for applying mixed reality technology in medical education.

Citations: 0
Mastering broom-like tools for object transportation animation using deep reinforcement learning
IF 1.1 · CAS Tier 4, Computer Science · JCR Q4, COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-14 · DOI: 10.1002/cav.2255
Guan-Ting Liu, Sai-Keung Wong

In this paper, we propose a deep reinforcement learning-based approach to generate an animation of an agent using a broom-like tool to transport a target object. The tool is attached to the agent, so when the agent moves, the tool moves as well. The challenge is to control the agent to move and use the tool to push the target while avoiding obstacles. We propose a direction sensor to guide the agent's movement direction in environments with static obstacles. Furthermore, different rewards and curriculum learning are implemented so that the agent efficiently learns skills for manipulating the tool. Experimental results show that the agent can naturally control tools of different shapes to transport target objects. Ablation tests reveal the impact of the rewards and of certain state components.
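The reward shaping and direction-sensor idea described in this abstract can be sketched as follows. This is a minimal illustration with hypothetical function names and weights, not the authors' implementation: it rewards progress of the target toward the goal, keeps the tool tip near the target, and penalizes headings that point into obstacles.

```python
import numpy as np

def shaped_reward(agent_pos, tool_tip_pos, target_pos, goal_pos,
                  obstacle_dirs, prev_target_dist):
    """Hypothetical shaped reward for a broom-pushing agent.

    obstacle_dirs: unit vectors from the agent toward nearby obstacles
    prev_target_dist: target-to-goal distance at the previous step
    """
    target_dist = np.linalg.norm(goal_pos - target_pos)
    progress = prev_target_dist - target_dist        # did the target move closer?
    contact = -0.1 * np.linalg.norm(tool_tip_pos - target_pos)
    # direction-sensor term: penalize a heading that points into an obstacle
    heading = goal_pos - agent_pos
    heading = heading / (np.linalg.norm(heading) + 1e-8)
    blocked = max((float(np.dot(heading, d)) for d in obstacle_dirs), default=0.0)
    return 5.0 * progress + contact - 0.5 * max(blocked, 0.0)
```

A curriculum, as mentioned in the abstract, would then start training with nearby goals and no obstacles and gradually increase distance and obstacle density.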

Citations: 0