Computer Animation and Virtual Worlds: Latest Articles

Weisfeiler-Lehman Kernel Augmented Product Representation for Queries on Large-Scale BIM Scenes
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-26, DOI: 10.1002/cav.70043
Huiqiang Hu, Changyan He, Xiaojun Liu, Jinyuan Jia, Ting Yu

To achieve efficient querying of Building Information Modeling (BIM) products in large-scale virtual scenes, this study introduces a Weisfeiler-Lehman (WL) kernel augmented representation for BIM products based on Product Attributed Graphs (PAGs). Unlike conventional data-driven approaches that demand extensive labeling and preprocessing, our method directly processes raw BIM product data to extract stable semantic and geometric features. First, a PAG is constructed to encapsulate product features. Next, a WL-kernel-enhanced multi-channel node aggregation strategy is employed to integrate BIM product attributes. Leveraging the bijective relationship in graph isomorphism, an unsupervised convergence mechanism based on attribute value differences is established. Experiments demonstrate that our method converges within an average of 3 iterations, completes graph isomorphism testing in minimal time, and attains an average query accuracy of 95%. This approach outperforms 1-WL and 3-WL methods, especially in handling products with topologically isomorphic but oppositely attributed spaces.
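
To make the idea of attribute-aware WL refinement concrete, here is a minimal sketch of plain 1-WL colour refinement seeded with node attributes. It is not the paper's multi-channel aggregation or its convergence test; the function names, the fixed iteration count, and the toy "wall"/"door" attributes are assumptions made purely for illustration.

```python
from collections import Counter


def wl_signature(adj, attrs, iters=3):
    """Return a colour histogram after `iters` rounds of 1-WL refinement.

    adj:   dict mapping node -> list of neighbour nodes (undirected graph)
    attrs: dict mapping node -> attribute used as the initial colour
    """
    colours = {v: repr(attrs[v]) for v in adj}
    for _ in range(iters):
        # New colour = own colour plus the sorted multiset of neighbour colours.
        colours = {
            v: repr((colours[v], sorted(colours[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colours.values())


def may_be_isomorphic(g1, g2, iters=3):
    """Necessary (not sufficient) isomorphism test: equal WL histograms."""
    return wl_signature(*g1, iters) == wl_signature(*g2, iters)


# Toy example (hypothetical data): same topology, opposite attributes.
path = {0: [1], 1: [0, 2], 2: [1]}
g_a = (path, {0: "wall", 1: "door", 2: "wall"})
g_b = (path, {0: "door", 1: "wall", 2: "door"})
print(may_be_isomorphic(g_a, g_b))
```

Seeding the colours with attributes is what lets the two toy graphs be told apart even though their topology is identical; with uniform initial colours they would receive the same histogram.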

Citations: 0
Risk-Aware Pedestrian Behavior Using Reinforcement Learning in Mixed Traffic
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-25, DOI: 10.1002/cav.70031
Cheng-En Cai, Sai-Keung Wong, Tzu-Yu Chen

This paper introduces a reinforcement learning method to simulate agents crossing roads in unsignalized, mixed-traffic environments. These agents represent individual pedestrians or small groups. The method ensures that agents adopt safe interactions with nearby dynamic obstacles (bikes, motorcycles, or cars) by considering factors such as conflict zones and post-encroachment times. Risk assessments based on interaction times encourage agents to avoid hazardous behaviors. Additionally, risk-informed reward terms incentivize agents to perform safe actions, while collision penalties deter collisions. The method achieved collision-free crossings and demonstrated normal, conservative, and aggressive pedestrian behaviors in various scenarios. Finally, ablation tests revealed the impact of reward weights, reward terms, and key agent state components. The weights of reward terms can be adjusted to achieve either conservative or aggressive pedestrian crossing behaviors, balancing road crossing efficiency and safety.
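
As a rough illustration of how a post-encroachment time (PET) can feed a risk-informed reward, consider the sketch below. PET is taken in its usual sense: the time between the first road user leaving a conflict zone and the second one entering it. The reward shape, the weights, the `safe_pet` threshold, and the function names are assumptions for illustration only, not the reward used in the paper.

```python
def post_encroachment_time(first_exit_time, second_entry_time):
    """Seconds between the first user leaving and the second entering the zone."""
    return second_entry_time - first_exit_time


def step_reward(progress, pet, collided,
                w_progress=1.0, w_risk=0.5,
                safe_pet=3.0, collision_penalty=10.0):
    """Hypothetical per-step reward: progress minus a PET-based risk term."""
    if collided:
        return -collision_penalty
    # Risk is 0 when pet >= safe_pet and grows linearly to 1 as pet -> 0.
    risk = min(1.0, max(0.0, (safe_pet - pet) / safe_pet))
    return w_progress * progress - w_risk * risk


pet = post_encroachment_time(first_exit_time=4.0, second_entry_time=5.5)
print(pet, step_reward(progress=1.0, pet=pet, collided=False))
```

Raising `w_risk` in this sketch pushes the agent toward conservative crossings, while lowering it favours efficiency, which mirrors the weight adjustment the abstract describes.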

Citations: 0
Motion In-Betweening via Recursive Keyframe Prediction
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-25, DOI: 10.1002/cav.70035
Rui Zeng, Ju Dai, Junxuan Bai, Junjun Pan

Motion in-betweening is a flexible and efficient technique for generating 3-dimensional animations. In this paper, we propose a keyframe-driven method that effectively addresses the pose ambiguity issue and achieves robust in-betweening performance. We introduce a keyframe-driven synthesis framework. At each recursion, the key poses at both ends are used to predict a new key pose at the midpoint. The recursive breakdown reduces motion ambiguities by treating the in-betweening sequence as the integration of short clips. The hybrid positional encoding scales the hidden states to adapt to long- and short-term dependencies. Additionally, we employ a temporal refinement network to capture the local motion relationships, thereby enhancing the consistency of the predicted pose sequence. Through comprehensive evaluations that include both quantitative and qualitative comparisons, the proposed model demonstrates its competitiveness in prediction accuracy and in-betweening flexibility.
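
The recursive midpoint scheme can be sketched in a few lines. In the sketch below, `predict_mid` stands in for the learned predictor; the linear-interpolation placeholder `lerp_mid`, the dictionary layout, and all names are assumptions for illustration and say nothing about the paper's network.

```python
def recursive_inbetween(poses, start, end, predict_mid):
    """Fill poses[start+1 .. end-1] by repeatedly predicting the midpoint.

    poses: dict mapping frame index -> pose (list of floats); the entries
           at `start` and `end` must already be present.
    """
    if end - start < 2:
        return  # no frame left between the two key poses
    mid = (start + end) // 2
    poses[mid] = predict_mid(poses[start], poses[end])
    # Each half is now a shorter clip bounded by known key poses.
    recursive_inbetween(poses, start, mid, predict_mid)
    recursive_inbetween(poses, mid, end, predict_mid)


def lerp_mid(pose_a, pose_b):
    """Placeholder predictor: average of the two end poses."""
    return [(a + b) / 2 for a, b in zip(pose_a, pose_b)]


poses = {0: [0.0], 4: [4.0]}
recursive_inbetween(poses, 0, 4, lerp_mid)
print(sorted(poses.items()))
```

Frame 2 is predicted first, then frames 1 and 3, so every prediction is conditioned on two poses that are at most half as far apart as in the previous level.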

Citations: 0
GSFaceMorpher: High-Fidelity 3D Face Morphing via Gaussian Splatting
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-23, DOI: 10.1002/cav.70036
Xiwen Shi, Hao Zhao, Yi Jiang, Hao Xu, Ziyi Yang, Yiqian Wu, Qingbiao Wu, Xiaogang Jin

High-fidelity 3D face morphing aims to achieve seamless transitions between realistic 3D facial representations of different identities. Although 3D Gaussian Splatting (3DGS) excels in high-quality rendering, its application to morphing is hindered by the lack of Gaussian primitive correspondence and variations in primitive quantities. To address this, we propose GSFaceMorpher, which is a novel framework for high-fidelity 3D face morphing based on 3DGS. Our method constructs an auxiliary model that bridges the source and target face models by aligning the geometry through Radial Basis Function (RBF) warping and optimizing the appearance in the image space. This auxiliary model enables smooth parameter interpolation, whereas a diffusion-based refinement step enhances critical facial details through attention replacement from the reference faces. Experiments demonstrate that our method produces visually coherent and high-fidelity morphing sequences, significantly outperforming NeRF-based baselines in terms of both quantitative metrics and user preferences. Our work establishes a new benchmark for high-fidelity 3D face morphing with applications in visual effects, animation, and immersive experiences.
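
For readers unfamiliar with RBF warping, the following sketch fits a displacement field that moves a set of source landmarks exactly onto target landmarks and can then be applied to any other points (for example, Gaussian centres). The Gaussian kernel, the `sigma` value, the landmark coordinates, and the function name are assumptions for illustration; the paper's actual kernel and correspondence setup may differ.

```python
import numpy as np


def fit_rbf_warp(src, dst, sigma=1.0):
    """Return a function warping 3D points so that src landmarks land on dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Pairwise squared distances between landmarks -> Gaussian kernel matrix.
    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(axis=-1)
    phi = np.exp(-d2 / (2.0 * sigma ** 2))
    # One weight vector per landmark, solving phi @ weights = displacement.
    weights = np.linalg.solve(phi, dst - src)

    def warp(points):
        points = np.asarray(points, dtype=float)
        d2 = ((points[:, None, :] - src[None, :, :]) ** 2).sum(axis=-1)
        return points + np.exp(-d2 / (2.0 * sigma ** 2)) @ weights

    return warp


# Hypothetical landmarks on a source and a target face.
src = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
dst = [[0, 0, 0.1], [1.2, 0, 0], [0, 0.9, 0], [0, 0, 1]]
warp = fit_rbf_warp(src, dst)
print(np.round(warp(src), 3))
```

Points far from every landmark receive almost no displacement with this kernel, so `sigma` controls how locally the alignment acts.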

Citations: 0
A Control Simulation of Multiple Bubbles for Representing Desired Shapes
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-23, DOI: 10.1002/cav.70037
Naruo Nishio, Syuhei Sato, Kaisei Sakurai, Keiko Nakamoto

This paper presents a control simulation that represents user-desired shapes using multiple connected soap bubbles. A previous method attempted to control a single soap bubble using external forces. However, because strong surface tension pulls each bubble toward a sphere, elongated shapes could not be achieved. To address this issue, this paper aims to develop a control simulation that achieves diverse soap bubble shapes by dividing the target shape into connected soap bubbles. In our approach, we first generate an initial soap bubble configuration composed of multiple bubbles to represent the target shape. Then, by applying external forces to each bubble, we simulate the bubbles so that they maintain their shape along the target form. We use an implicit-function-like representation for the connected soap bubbles and develop a new polygonizer that produces shapes including the internal faces of the bubbles. By demonstrating examples with various target shapes such as objects and text, we show the effectiveness of our proposed control method.

Citations: 0
Talk With Socrates: Relation Between Perceived Agent Personality and User Personality in LLM-Based Natural Language Dialogue Using Virtual Reality
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-22, DOI: 10.1002/cav.70033
Mehmet Efe Sak, Sinan Sonlu, Uğur Güdükbay

Large Language Models (LLMs) offer almost immediate, human-like responses to user queries. Conversational agent systems support natural language dialogues utilizing LLM backends in combination with Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) technologies, enabling life-like characters in virtual environments. This study investigates the relationship between user personality and perceived agent personality in LLM-based natural language dialogue. We adopt a Virtual Reality (VR) setting where the user can talk with an agent that assumes the role of Socrates, the famous philosopher. To this end, we utilize a three-dimensional (3D) avatar model resembling Socrates and use specific LLM prompts to get stylistic answers from OpenAI's Chat Completions Application Programming Interface (API). Our user study measures the agent's personality and the system's ease of use, quality, realism, and immersion in relation to the user's self-reported personality. The results suggest that the user's conscientiousness, extraversion, and emotional stability have a moderate effect on certain personality factors and system qualities. User conscientiousness affects the perceived ease of use, quality, and realism, while user extraversion affects perceived agent conscientiousness, system realism, and immersion. Additionally, the user's emotional stability correlates with perceived extraversion and agreeableness.

Citations: 0
MemorIA, an Architecture for Creating Interactive AI Historical Agents in Educational Contexts
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-22, DOI: 10.1002/cav.70032
Antoine Oger, Geoffrey Gorisse, Sylvain Fleury, Olivier Christmann

This article presents the architecture of MemorIA, an integrative system that combines existing AI technologies into a coherent educational framework for creating interactive historical agents, with the aim of fostering students' learning interest. MemorIA generates animated digital portraits of historical figures, synchronizing facial expressions with synthesized speech to enable natural conversations with students. The system leverages NVIDIA Audio2Face for real-time facial animation together with a first-order motion model for portrait manipulation, achieving fluid interaction through low-latency audio-visual streaming. To assess our architecture in a field situation, we conducted a pilot study in middle school history classes, where students and teachers engaged in direct conversation with a virtual Julius Caesar during Roman history lessons. Students asked questions about ancient Rome, receiving contextually appropriate responses. While qualitative feedback suggests a positive trend toward increased student participation, some weaknesses and ethical considerations emerged. Based on this assessment, we discuss implementation challenges, suggest architectural improvements, and explore potential applications across various disciplines.

Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cav.70032
Citations: 0
Fuzzy Sampling With Qualified Uniformity Properties for Implicitly Defined Curves and Surfaces
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-21, DOI: 10.1002/cav.70022
Mingxiao Hu, Linlin Ge, Xujie Li

Sampled point clouds, particularly with prelabeled annotations and ground truth metrics, are frequently used in computer graphics and machine learning. In this work, we focus on a fuzzy sampling approach for such point clouds with qualified uniformity properties. After abstracting the uniformity requirements, a novel approach to sampling point clouds from implicitly defined curves/surfaces is proposed. The approach deliberately combines techniques including isodeviation dispatch, curvature compensation, and normalized distance blue noise. The experimental results show various sampled point clouds with uniform visual effects and statistical metrics. Moreover, comparisons with state-of-the-art methods in terms of distance, density, and thickness uniformity exhibit the approach's advantages. Owing to its low cost, available ground truth, and ease of annotation, the method is expected to apply readily to deep learning and computer animation.
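
As background for sampling an implicitly defined curve f(x, y) = 0, the sketch below uses two generic building blocks: Newton-style projection of random candidates onto the zero set, followed by a minimum-distance rejection test that gives a blue-noise-like spacing. This is a baseline illustration only, not the isodeviation dispatch or curvature compensation described in the abstract; the unit circle, the sampling box, and all names and parameters are assumptions.

```python
import math
import random


def project_to_curve(p, f, grad, steps=50):
    """Move p toward the zero set of f by repeated steps along the gradient."""
    x, y = p
    for _ in range(steps):
        gx, gy = grad(x, y)
        g2 = gx * gx + gy * gy
        if g2 == 0.0:
            break  # gradient vanished; projection is undefined here
        t = f(x, y) / g2
        x, y = x - t * gx, y - t * gy
    return x, y


def sample_curve(f, grad, n_candidates, min_dist, seed=0, tol=1e-9):
    """Keep projected candidates that stay at least min_dist from kept ones."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_candidates):
        q = project_to_curve((rng.uniform(-2, 2), rng.uniform(-2, 2)), f, grad)
        if abs(f(*q)) > tol:
            continue  # projection did not converge; discard the candidate
        if all(math.dist(q, k) >= min_dist for k in kept):
            kept.append(q)
    return kept


def circle(x, y):
    return x * x + y * y - 1.0


def circle_grad(x, y):
    return 2.0 * x, 2.0 * y


points = sample_curve(circle, circle_grad, n_candidates=200, min_dist=0.3)
print(len(points))
```

The rejection test enforces a lower bound on spacing but does not by itself guarantee even density, which is the kind of gap a dedicated uniformity-aware method aims to close.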

Citations: 0
Research on Multi-Feature Fusion Shadow Puppet Motifs Generation Based on CSPMotifsGAN and Cultural Heritage Preservation
IF 0.9, Zone 4 (Computer Science), Q4 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-05-21, DOI: 10.1002/cav.70047
Hui Liang, Rui Wang

As quintessential cultural symbols in traditional shadow puppetry, artistic motifs encapsulate profound historical narratives and serve as vital conduits for intangible cultural heritage preservation. However, this craft confronts existential threats from digital entertainment proliferation and practitioner attrition. To address these challenges, this study proposes CSPMotifsGAN, an enhanced CycleGAN framework for constructing a motif data set through three-stage processing: adaptive denoising, hierarchical classification, and multi-branch feature extraction (contour, texture, color). By integrating adversarial loss, cycle-consistency loss, and identity preservation loss, the model effectively resolves color distortion and textural degradation inherent in conventional CycleGAN. Experimental results demonstrate significant improvements in Fréchet Inception Distance (FID), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM), validated through both subjective evaluations and statistical analysis.
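
Two of the quantities named above are simple enough to sketch. The first function combines the three loss terms as a weighted sum, which is the usual CycleGAN-style form; the weights shown are illustrative defaults and not values reported for CSPMotifsGAN. The second computes PSNR from its standard definition, 10·log10(MAX² / MSE), on flat lists of pixel values. All function and parameter names are assumptions.

```python
import math


def total_generator_loss(adversarial, cycle, identity,
                         lambda_cycle=10.0, lambda_identity=5.0):
    """Weighted sum of the three loss terms (weights are assumptions)."""
    return adversarial + lambda_cycle * cycle + lambda_identity * identity


def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(generated):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


print(total_generator_loss(1.0, 0.2, 0.1))
print(psnr([100, 100, 100, 100], [110, 110, 110, 110]))
```

Higher PSNR means the generated image is closer to the reference, whereas for FID lower is better, so an "improvement" runs in opposite directions for the two metrics.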

Citations: 0
Toward Fluoroscopy Guided Robotic Needle Insertion for Radio Frequency Ablation
IF 0.9 Zone 4 Computer Science Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-05-19 DOI: 10.1002/cav.70025
Thuc-Long Ha, Juan Verde, Julien Bert, Hadrien Courtecuisse

This article presents a fluoroscopy image-based registration method along with a comprehensive protocol for robotic needle insertion in radiofrequency ablation (RFA) to treat liver cancer. The proposed method uses real-time fluoroscopic images acquired from a C-ARM system and integrates an inverse finite element (FE) simulation to compute robotic commands for accurate and adaptive needle steering. The registration procedure is fully automated and involves the injection of multiple radiopaque markers into the liver, enabling precise anatomical registration and targeted tumor localization. A key challenge addressed in this work is the integration of this image-based registration with the inverse biomechanical simulation used to guide the robot during insertion. We describe how registration constraints can be mapped onto the surface of the biomechanical model to ensure consistent alignment between image data and robotic actuation. Designed to be adaptable to varying levels of radiologist expertise and applicable across a wide range of tumor locations, this method provides a robust and versatile solution for improving the accuracy and safety of minimally invasive liver cancer treatments.
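As a simplified illustration of marker-based registration, the sketch below estimates a rigid transform from corresponding 3D marker positions using the Kabsch (SVD) method. This is a stand-in technique chosen for clarity, not the authors' method: the abstract describes registration from fluoroscopic images coupled with an inverse finite element simulation of a deformable organ, which a rigid fit does not capture. The function name `rigid_register` and the assumption that marker correspondences are already known are hypothetical.

```python
import numpy as np


def rigid_register(source, target):
    """Least-squares rigid transform (R, t) with R @ source_i + t ~ target_i.

    source, target: (N, 3) arrays of corresponding marker positions,
    with at least three non-collinear markers.
    """
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (source - src_centroid).T @ (target - tgt_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ src_centroid
    return R, t


# Usage: recover a known 30-degree rotation about z plus a translation.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([5.0, -2.0, 1.5])
markers = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]], dtype=float)
observed = markers @ R_true.T + t_true
R_est, t_est = rigid_register(markers, observed)
```

A rigid fit like this would at most give an initial alignment; handling liver deformation is where the biomechanical model described in the abstract comes in.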

Citations: 0