
Latest publications in Computer Animation and Virtual Worlds

De-NeRF: Ultra-high-definition NeRF with deformable net alignment
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-30 · DOI: 10.1002/cav.2240
Jianing Hou, Runjie Zhang, Zhongqi Wu, Weiliang Meng, Xiaopeng Zhang, Jianwei Guo

Neural Radiance Field (NeRF) can render complex 3D scenes with viewpoint-dependent effects. However, little work has explored its limitations in high-resolution environments, especially when upscaled to ultra-high resolution (e.g., 4K). Specifically, existing NeRF-based methods face severe limitations in reconstructing high-resolution real scenes, for example, a large number of parameters, misalignment of the input data, and over-smoothing of details. In this paper, we present a novel and effective framework, called De-NeRF, based on NeRF and a deformable convolutional network, to achieve high-fidelity view synthesis in ultra-high-resolution scenes: (1) incorporating a deformable convolution unit that resolves the misalignment of high-resolution input data; (2) presenting a density-based sparse-voxel approach that greatly reduces training time while rendering results with higher accuracy. Compared to existing high-resolution NeRF methods, our approach improves the rendering quality of high-frequency details and achieves better visual effects in 4K high-resolution scenes.
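The misalignment fix described above rests on deformable sampling: instead of reading a feature map at fixed grid positions, each output location reads from a position shifted by a learned offset. A minimal NumPy sketch of that core operation — the authors' network is not reproduced here, and `deformable_align` with externally supplied offsets is only illustrative (real deformable convolutions predict the offsets with a small conv branch):

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample img at fractional coordinates (y, x) with bilinear interpolation."""
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    wy, wx = y - y0, x - x0
    y0c, x0c = np.clip(y0, 0, h - 1), np.clip(x0, 0, w - 1)
    y1c, x1c = np.clip(y0 + 1, 0, h - 1), np.clip(x0 + 1, 0, w - 1)
    top = (1 - wx) * img[y0c, x0c] + wx * img[y0c, x1c]
    bot = (1 - wx) * img[y1c, x0c] + wx * img[y1c, x1c]
    return (1 - wy) * top + wy * bot

def deformable_align(feat, offsets):
    """Resample a feature map at offset-shifted positions.

    offsets has shape (H, W, 2): a per-pixel (dy, dx). In a deformable
    convolution these offsets are learned; here they are just an input.
    """
    h, w = feat.shape
    out = np.empty_like(feat)
    for i in range(h):
        for j in range(w):
            dy, dx = offsets[i, j]
            out[i, j] = bilinear_sample(feat, i + dy, j + dx)
    return out
```

With zero offsets the operation is the identity; a constant offset of one pixel reproduces a shifted copy, which is exactly the behaviour that lets the network compensate misaligned high-resolution inputs.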

Citations: 0
Screen-space Streamline Seeding Method for Visualizing Unsteady Flow in Augmented Reality
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-30 · DOI: 10.1002/cav.2250
Hyunmo Kang, JungHyun Han

Streamlines are a popular choice in many flow visualization techniques due to their simplicity and intuitiveness. This paper presents a novel streamline seeding method tailored for visualizing unsteady flow in augmented reality (AR). Our method prioritizes the visible part of the flow field to enhance the quality of the flow representation and reduce the computational cost. Being an image-based method, it evenly samples 2D seeds from the screen space. Then, a ray is fired through each 2D seed, and the point on the ray with the largest entropy is selected as the 3D seed for a streamline. By advecting such 3D seeds in the velocity field, which is continuously updated in real time, the unsteady flow is visualized more naturally, and temporal coherence is achieved with no extra effort. Our method is tested using an AR application that visualizes airflow from a virtual air conditioner. Comparison with the baseline methods shows that our method is suitable for visualizing unsteady flow in AR.
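The seeding step can be sketched directly: fire a ray through each screen-space seed, probe the flow around a few sample points along the ray, and keep the sample whose local velocity directions have the highest Shannon entropy. A hedged NumPy illustration — the entropy estimator and the random probe scheme are simplified stand-ins, not the paper's exact formulation:

```python
import numpy as np

def direction_entropy(vectors, bins=8):
    """Shannon entropy (bits) of the in-plane direction distribution of velocities."""
    angles = np.arctan2(vectors[:, 1], vectors[:, 0])
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def pick_seed(ray_points, velocity_at, n_probe=16, radius=0.1, seed=0):
    """Return the point on the ray whose neighbourhood has the most varied flow."""
    rng = np.random.default_rng(seed)
    best, best_h = ray_points[0], -1.0
    for p in ray_points:
        probes = p + rng.uniform(-radius, radius, size=(n_probe, 3))
        vels = np.array([velocity_at(q) for q in probes])
        h = direction_entropy(vels)
        if h > best_h:
            best, best_h = p, h
    return np.asarray(best)
```

On a synthetic field that is uniform everywhere except for a small vortex, the ray point nearest the vortex wins, because a uniform neighbourhood has zero direction entropy.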

Citations: 0
PR3D: Precise and realistic 3D face reconstruction from a single image
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-30 · DOI: 10.1002/cav.2254
Zhangjin Huang, Xing Wu

Reconstructing the three-dimensional (3D) shape and texture of the face from a single image is a significant and challenging task in computer vision and graphics. In recent years, learning-based reconstruction methods have exhibited outstanding performance, but their effectiveness is severely constrained by the scarcity of available training data with 3D annotations. To address this issue, we present the PR3D (Precise and Realistic 3D face reconstruction) method, which consists of high-precision shape reconstruction based on semi-supervised learning and high-fidelity texture reconstruction based on StyleGAN2. In shape reconstruction, we use in-the-wild face images and 3D annotated datasets to train the auxiliary encoder and the identity encoder, encoding the input image into parameters of FLAME (a parametric 3D face model). Simultaneously, a novel semi-supervised hybrid landmark loss is designed to more effectively learn from in-the-wild face images and 3D annotated datasets. Furthermore, to meet the real-time requirements in practical applications, a lightweight shape reconstruction model called fast-PR3D is distilled through teacher–student learning. In texture reconstruction, we propose a texture extraction method based on face reenactment in StyleGAN2 style space, extracting texture from the source and reenacted face images to constitute a facial texture map. Extensive experiments have demonstrated the state-of-the-art performance of our method.
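The semi-supervised idea — always supervise with 2D landmarks, and add a 3D term only when a sample carries 3D annotations — can be sketched as follows. The pinhole `project`, the weights, and the squared-error form are illustrative assumptions, not the paper's exact hybrid landmark loss:

```python
import numpy as np

def project(points3d, f=500.0, c=(112.0, 112.0)):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) pixels."""
    return f * points3d[:, :2] / points3d[:, 2:3] + np.array(c)

def hybrid_landmark_loss(pred3d, lm2d_gt, lm3d_gt=None, w2d=1.0, w3d=1.0):
    """2D reprojection error on every sample; 3D error only where labels exist."""
    loss = w2d * np.mean(np.sum((project(pred3d) - lm2d_gt) ** 2, axis=1))
    if lm3d_gt is not None:  # in-the-wild images simply pass lm3d_gt=None
        loss += w3d * np.mean(np.sum((pred3d - lm3d_gt) ** 2, axis=1))
    return float(loss)
```

In-the-wild images contribute only the first term, while 3D-annotated datasets contribute both, so one loss covers the mixed training set.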

Citations: 0
Design of a lightweight and easy-to-wear hand glove with multi-modal tactile perception for digital human
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-30 · DOI: 10.1002/cav.2258
Zhigeng Pan, Hongyi Ren, Chang Liu, Ming Chen, Mithun Mukherjee, Wenzhen Yang

Within the field of human–computer interaction, data gloves play an essential role in establishing a connection between virtual and physical environments for the realization of digital humans. To enhance the credibility of human–virtual hand interactions, we aim to develop a system incorporating data-glove-embedded technology. Our proposed system collects a wide range of information (temperature, bending, and pressure of the fingers) arising during natural interactions and afterwards reproduces it within the virtual environment. Furthermore, we implement a novel traversal polling technique to facilitate the streamlined aggregation of multi-channel sensors, which mitigates the hardware complexity of the embedded system. The experimental results indicate that the data glove acquires real-time hand interaction information with a high degree of precision and effectively displays hand posture in real time using Unity3D. The data glove's lightweight and compact design facilitates its versatile use in virtual reality interactions.
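The traversal polling idea — one loop visiting every sensor channel in a fixed order over a shared bus, rather than a dedicated line per sensor — might look like this. The channel layout and the `read_channel` callback are hypothetical, since the glove's firmware interface is not described in the abstract:

```python
from typing import Callable, Dict, List

class TraversalPoller:
    """Aggregate many sensor channels by polling them round-robin over one bus."""

    def __init__(self, read_channel: Callable[[str, int], float],
                 layout: Dict[str, int]):
        # read_channel(kind, idx) stands in for mux-select + ADC read on the MCU
        self.read_channel = read_channel
        self.layout = layout  # e.g. {"temperature": 5, "bend": 5, "pressure": 5}

    def poll_once(self) -> Dict[str, List[float]]:
        """One traversal: visit every (kind, index) pair in a fixed order."""
        return {kind: [self.read_channel(kind, i) for i in range(count)]
                for kind, count in self.layout.items()}
```

Because every sensor shares one read path, adding a modality only extends the layout table instead of adding wiring — the hardware-complexity saving the abstract mentions.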

Citations: 0
Soccer match broadcast video analysis method based on detection and tracking
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-29 · DOI: 10.1002/cav.2259
Hongyu Li, Meng Yang, Chao Yang, Jianglang Kang, Xiang Suo, Weiliang Meng, Zhen Li, Lijuan Mao, Bin Sheng, Jun Qi

We propose a comprehensive soccer match video analysis pipeline tailored for broadcast footage, which encompasses three pivotal stages: soccer field localization, player tracking, and soccer ball detection. First, we introduce sports camera calibration to seamlessly map soccer field images from match videos onto a standardized two-dimensional soccer field template. This addresses the challenge of consistent analysis across video frames amid continuous camera angle changes. Second, given challenges such as occlusions, high-speed movements, and dynamic camera perspectives, obtaining accurate position data for the players and the ball is non-trivial. To mitigate this, we curate a large-scale, high-precision soccer ball detection dataset and devise a robust detection model, which achieves an mAP50–95 of 80.9%. Additionally, we develop a high-speed, efficient, and lightweight tracking model to ensure precise player tracking. Through the integration of these modules, our pipeline focuses on real-time analysis of the current camera lens content during matches, facilitating rapid and accurate computation and analysis while offering intuitive visualizations.
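The field-localization stage amounts to estimating a homography from the broadcast frame to the 2D template; with four or more point correspondences (e.g., line intersections on the pitch), the standard direct linear transform recovers it. A sketch under that assumption — this is the classical DLT, not the paper's calibration model:

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: H such that dst ~ H @ [x, y, 1] (4+ point pairs)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows))
    h = vt[-1]                      # null-space vector = flattened H
    return (h / h[-1]).reshape(3, 3)

def to_template(H, pts):
    """Map (N, 2) image points into template coordinates."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]
```

Once H is fitted per frame, every detected player or ball position can be pushed through `to_template`, giving template coordinates that stay consistent while the broadcast camera pans and zooms.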

Citations: 0
Graph-based control framework for motion propagation and pattern preservation in swarm flight simulations
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-29 · DOI: 10.1002/cav.2276
Feixiang Qi, Bojian Wang, Meili Wang

Simulation of swarm motion is a crucial research area in computer graphics and animation, widely used in applications such as biological behavior research, robotic swarm control, and the entertainment industry. In this paper, we address the challenge of preserving structural relations among the individuals in swarm flight simulations by proposing an innovative motion control framework that uses a graph-based hierarchy to describe patterns within a swarm and allows the swarm to perform flight motions along externally specified paths. In addition, this study designs motion propagation strategies with different focuses for varied application scenarios, analyzes the effects of information transfer latencies on pattern preservation under these strategies, and optimizes the control algorithms at the mathematical level. This study not only establishes a complete set of control methods for group flight simulations but also offers excellent scalability: it can be combined with other techniques in this field to provide new solutions for group behavior simulations.
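One way to picture the graph-based hierarchy and its latency analysis: treat the swarm as a tree rooted at a leader, give each member a fixed offset from its parent, and let each edge introduce one step of information-transfer delay. A toy NumPy sketch — the tree, the fixed offsets, and the one-step-per-edge delay are illustrative assumptions, not the paper's framework:

```python
import numpy as np

def propagate(parents, offsets, leader_path):
    """Propagate a leader trajectory down a tree with one step of delay per edge.

    parents: dict child -> parent (node 0 is the leader; children are numbered
             so that every parent is processed before its children)
    offsets: dict child -> fixed 2D offset relative to its parent
    leader_path: (T, 2) array of leader positions
    Returns dict node -> (T, 2) trajectory.
    """
    traj = {0: np.asarray(leader_path, dtype=float)}
    for node in sorted(parents):
        p = traj[parents[node]]
        delayed = np.vstack([p[:1], p[:-1]])      # parent's pose one step ago
        traj[node] = delayed + np.asarray(offsets[node], dtype=float)
    return traj
```

In a chain 0 → 1 → 2, node 2 trails the leader by two steps while offsets accumulate — exactly the kind of pattern distortion a latency analysis has to quantify and compensate.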

Citations: 0
SADNet: Generating immersive virtual reality avatars by real-time monocular pose estimation
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-29 · DOI: 10.1002/cav.2233
Ling Jiang, Yuan Xiong, Qianqian Wang, Tong Chen, Wei Wu, Zhong Zhou

Generating immersive virtual reality avatars is a challenging task in VR/AR applications, which map physical human body poses to avatars in virtual scenes for an immersive user experience. However, most existing work is time-consuming and limited by datasets, and thus does not satisfy the immersive and real-time requirements of VR systems. In this paper, we aim to generate 3D real-time virtual reality avatars from a monocular camera to solve these problems. Specifically, we first design a self-attention distillation network (SADNet) for effective human pose estimation, guided by a pre-trained teacher. Secondly, we propose a lightweight pose mapping method for human avatars that utilizes the camera model to map 2D poses to 3D avatar keypoints, generating real-time human avatars with pose consistency. Finally, we integrate our framework into a VR system, displaying the generated 3D pose-driven avatars on helmet-mounted display devices for an immersive user experience. We evaluate SADNet on two publicly available datasets. Experimental results show that SADNet achieves a state-of-the-art trade-off between speed and accuracy. In addition, we conducted a user experience study on the performance and immersion of virtual reality avatars. Results show that the pose-driven 3D human avatars generated by our method are smooth and attractive.
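The camera-model mapping from 2D poses to 3D avatar keypoints can be illustrated with pinhole back-projection: given intrinsics and a depth per joint, each pixel lifts to a unique camera-space point. A minimal sketch — the intrinsics and per-joint depths here are assumed inputs, not SADNet's learned quantities:

```python
import numpy as np

def project(pts3d, f=600.0, c=(320.0, 240.0)):
    """Pinhole projection: (N, 3) camera-space points -> (N, 2) pixels."""
    return pts3d[:, :2] * f / pts3d[:, 2:3] + np.array(c)

def backproject(kpts2d, depth, f=600.0, c=(320.0, 240.0)):
    """Lift (N, 2) pixel keypoints to (N, 3) camera-space points at depth(s)."""
    uv = np.asarray(kpts2d, dtype=float) - np.array(c)
    z = np.broadcast_to(np.reshape(depth, (-1, 1)).astype(float), (len(uv), 1))
    return np.hstack([uv * z / f, z])
```

`backproject(project(p), p[:, 2])` round-trips exactly, which is the consistency property that keeps the driven avatar's pose aligned with the estimated 2D pose.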

Citations: 0
S-LASSIE: Structure and smoothness enhanced learning from sparse image ensemble for 3D articulated shape reconstruction
IF 1.1 · CAS Tier 4 (Computer Science) · Q4 Computer Science, Software Engineering · Pub Date: 2024-05-29 · DOI: 10.1002/cav.2277
Jingze Feng, Chong He, Guorui Wang, Meili Wang

In computer vision, the task of 3D reconstruction from monocular sparse images poses significant challenges, particularly in the field of animal modelling. The diverse morphology of animals, their varied postures, and the variable conditions of image acquisition significantly complicate the task of accurately reconstructing their 3D shape and pose from a monocular image. To address these complexities, we propose S-LASSIE, a novel technique for 3D reconstruction of quadrupeds from monocular sparse images. It requires only 10–30 images of similar breeds for training. To effectively mitigate depth ambiguities inherent in monocular reconstructions, S-LASSIE employs a multi-angle projection loss function. In addition, our approach, which involves fusion and smoothing of bone structures, resolves issues related to disjointed topological structures and uneven connections at junctions, resulting in 3D models with comprehensive topologies and improved visual fidelity. Our extensive experiments on the Pascal-Part and LASSIE datasets demonstrate significant improvements in keypoint transfer, overall 2D IOU and visual quality, with an average keypoint transfer and overall 2D IOU of 59.6% and 86.3%, respectively, which are superior to existing techniques in the field.
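A multi-angle projection loss can be pictured as rendering the predicted keypoints from several camera angles and penalising the 2D error in each: a depth error that one view barely sees becomes a lateral error in another, which is how extra views mitigate monocular depth ambiguity. A hedged sketch — the viewpoints, camera model, and averaging are illustrative, not the paper's exact loss:

```python
import numpy as np

def rot_y(theta):
    """Rotation about the y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(pts, R, t, f=500.0):
    """Pinhole projection after a rigid camera transform."""
    cam = pts @ R.T + t
    return f * cam[:, :2] / cam[:, 2:3]

def multi_angle_loss(pred3d, gt3d, angles=(0.0, np.pi / 6, -np.pi / 6)):
    """Mean squared 2D reprojection error over several synthetic viewpoints."""
    t = np.array([0.0, 0.0, 5.0])  # cameras orbit the object at distance 5
    loss = 0.0
    for a in angles:
        R = rot_y(a)
        d = project(pred3d, R, t) - project(gt3d, R, t)
        loss += np.mean(np.sum(d ** 2, axis=1))
    return float(loss / len(angles))
```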

Citations: 0
Face attribute translation with multiple feature perceptual reconstruction assisted by style translator
IF 1.1 CAS Tier 4 (Computer Science) Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-05-29 DOI: 10.1002/cav.2273
Shuqi Zhu, Jiuzhen Liang, Hao Liu

Improving the accuracy and disentanglement of attribute translation, and maintaining the consistency of face identity have been hot topics in face attribute translation. Recent approaches employ attention mechanisms to enable attribute translation in facial images. However, due to the lack of accuracy in the extraction of style code, the attention mechanism alone is not precise enough for the translation of attributes. To tackle this, we introduce a style translator module, which partitions the style code into attribute-related and unrelated components, enhancing latent space disentanglement for more accurate attribute manipulation. Additionally, many current methods use per-pixel loss functions to preserve face identity. However, this can sacrifice crucial high-level features and textures in the target image. To address this limitation, we propose a multiple-perceptual reconstruction loss to better maintain image fidelity. Extensive qualitative and quantitative experiments in this article demonstrate significant improvements over state-of-the-art methods, validating the effectiveness of our approach.
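The multiple-perceptual reconstruction idea — comparing images at several feature levels rather than pixel-by-pixel only — can be sketched with toy average-pooled "features". A real implementation would compare feature maps from a pretrained network, so everything below (function names, pooling factors) is illustrative:

```python
def downsample(img, factor):
    # Average-pool a 2D grayscale image (list of rows) by `factor`.
    h, w = len(img), len(img[0])
    return [[sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
             / factor ** 2
             for x in range(0, w - factor + 1, factor)]
            for y in range(0, h - factor + 1, factor)]

def multi_level_perceptual_loss(a, b, factors=(1, 2, 4)):
    # Compare the two images at several resolutions and average the
    # per-level mean absolute differences. Coarse levels respond to
    # structure that a single per-pixel term would treat the same as
    # high-frequency noise.
    total = 0.0
    for f in factors:
        fa, fb = downsample(a, f), downsample(b, f)
        n = len(fa) * len(fa[0])
        total += sum(abs(x - y) for ra, rb in zip(fa, fb)
                     for x, y in zip(ra, rb)) / n
    return total / len(factors)
```

Identical images score zero at every level; mismatched images are penalized at every scale, which is the sense in which a multi-level loss preserves high-level features that a purely per-pixel loss can sacrifice.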

{"title":"Face attribute translation with multiple feature perceptual reconstruction assisted by style translator","authors":"Shuqi Zhu,&nbsp;Jiuzhen Liang,&nbsp;Hao Liu","doi":"10.1002/cav.2273","DOIUrl":"https://doi.org/10.1002/cav.2273","url":null,"abstract":"<p>Improving the accuracy and disentanglement of attribute translation, and maintaining the consistency of face identity have been hot topics in face attribute translation. Recent approaches employ attention mechanisms to enable attribute translation in facial images. However, due to the lack of accuracy in the extraction of style code, the attention mechanism alone is not precise enough for the translation of attributes. To tackle this, we introduce a style translator module, which partitions the style code into attribute-related and unrelated components, enhancing latent space disentanglement for more accurate attribute manipulation. Additionally, many current methods use per-pixel loss functions to preserve face identity. However, this can sacrifice crucial high-level features and textures in the target image. To address this limitation, we propose a multiple-perceptual reconstruction loss to better maintain image fidelity. Extensive qualitative and quantitative experiments in this article demonstrate significant improvements over state-of-the-art methods, validating the effectiveness of our approach.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141187608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
KDPM: Knowledge-driven dynamic perception model for evacuation scene simulation
IF 1.1 CAS Tier 4 (Computer Science) Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-05-29 DOI: 10.1002/cav.2279
Kecheng Tang, Jiawen Zhang, Yuji Shen, Chen Li, Gaoqi He

Evacuation scene simulation has become an important approach for public safety decision-making. Although existing research has considered various factors, including social forces and panic emotions, how complex environmental factors affect human psychology and behavior remains underexplored. The main idea of this paper is to model complex evacuation environmental factors from the perspective of knowledge and to explore pedestrians' emergency response mechanisms to this knowledge. Thus, this paper proposes a knowledge-driven dynamic perception model (KDPM) for evacuation scene simulation. The model combines three modules: knowledge dissemination, dynamic scene perception, and stress response. Both scenario knowledge and hazard-source knowledge are extracted and expressed. An improved intelligent-agent perception model is designed based on position determination. Moreover, a general adaptation syndrome (GAS) model is presented for the first time by introducing a modified stress-system model. Experimental results show that the proposed model agrees more closely with real data sets.
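The abstract does not give the stress-response equations, but one plausible reading of a GAS-style stress update is a leaky integrator: stress accumulates under a perceived hazard stimulus and decays toward baseline once the stimulus is removed. All parameters below are invented for illustration, not taken from the paper:

```python
def step_stress(stress, stimulus, dt=0.1, gain=2.0, decay=0.8):
    # One Euler step of a leaky-integrator stress response: stress rises
    # in proportion to the perceived hazard stimulus and relaxes toward
    # baseline when the stimulus disappears; output is clamped to [0, 1].
    stress += dt * (gain * stimulus - decay * stress)
    return min(max(stress, 0.0), 1.0)
```

Driving the update with a sustained stimulus saturates stress near its ceiling (roughly the alarm/resistance phases of GAS), while removing the stimulus yields exponential recovery.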

{"title":"KDPM: Knowledge-driven dynamic perception model for evacuation scene simulation","authors":"Kecheng Tang,&nbsp;Jiawen Zhang,&nbsp;Yuji Shen,&nbsp;Chen Li,&nbsp;Gaoqi He","doi":"10.1002/cav.2279","DOIUrl":"https://doi.org/10.1002/cav.2279","url":null,"abstract":"<p>Evacuation scene simulation has become one important approach for public safety decision-making. Although existing research has considered various factors, including social forces, panic emotions, and so forth, there is a lack of consideration of how complex environmental factors affect human psychology and behavior. The main idea of this paper is to model complex evacuation environmental factors from the perspective of knowledge and explore pedestrians' emergency response mechanisms to this knowledge. Thus, a knowledge-driven dynamic perception model (KDPM) for evacuation scene simulation is proposed in this paper. This model combines three modules: knowledge dissemination, dynamic scene perception, and stress response. Both scenario knowledge and hazard source knowledge are extracted and expressed. The improved intelligent agent perception model is designed by adopting position determination. Moreover, a general adaptation syndrome (GAS) model is first presented by introducing a modified stress system model. Experimental results show that the proposed model is closer to the reality of real data sets.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141187610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Journal: Computer Animation and Virtual Worlds