
Latest Publications in Computer Animation and Virtual Worlds

Towards Extended Reality in Emergency Response: Guidelines and Challenges for First Responder Friendly Augmented Interfaces
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-11 | DOI: 10.1002/cav.70056
Fatih Oztank, Selim Balcisoy

As Extended Reality (XR) technologies continue gaining popularity, various domains seek to integrate them into their workflows to enhance performance and user satisfaction. However, integrating XR technologies into emergency response presents unique challenges. Unlike other fields, such as healthcare, entertainment, or education, emergency response involves physically demanding environments and information-intensive tasks that first responders (FRs) must perform. Augmented reality (AR) head-mounted displays (HMDs) present promising solutions for improving situational awareness and reducing the cognitive load of FRs. However, limited research has focused on the specific needs of FRs. Moreover, existing studies investigating FR needs have primarily been conducted in controlled laboratory settings, revealing a significant gap in the literature concerning FR requirements in real-life scenarios. This work addresses this gap through a comprehensive user study with subject matter experts (SMEs) and FRs. User studies were conducted after two different real-life scenarios using AR HMDs. To further understand FR needs, we extensively reviewed the literature for similar studies that reported FR needs, explicitly focusing on studies including interviews with SMEs and FRs. Our findings identified key design guidelines for FR-friendly AR interfaces while also highlighting directions for future research to improve the FR user experience.

Citations: 0
An Improved Social Force Model-Driven Multi-Agent Generative Adversarial Imitation Learning Framework for Pedestrian Trajectory Prediction
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-09 | DOI: 10.1002/cav.70058
Wen Zhou, Wangyu Shen, Xinyi Meng

Recently, crowd trajectory prediction has attracted increasing attention. In particular, simulating pedestrian movement in scenarios such as crowd evacuation has drawn growing interest. The social force model is a promising and effective method for predicting the stochastic movement of pedestrians. However, individual heterogeneity, group-driven cooperation, and poor self-adaptive environmental interaction capabilities have not been comprehensively considered, which often makes real scenarios difficult to reproduce. Therefore, a group-enabled social force model-driven multi-agent generative adversarial imitation learning framework, namely SFMAGAIL, is proposed. Specifically, (1) a group-enabled individual heterogeneity schema is utilized to obtain related expert trajectories, which are fully incorporated into the desire-force and group-enabled paradigms; (2) a joint policy is used to exploit the connection between the agents and the environment; and (3) to explore the intrinsic features of expert trajectories, an actor–critic-based multi-agent adversarial imitation learning framework is presented to generate effective trajectories. Finally, extensive experiments based on 2D and 3D virtual scenarios are conducted to validate our method. The results show that our proposed method is superior to the compared methods.
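
The classical social force model this work builds on combines a goal-directed driving force with exponential repulsion between pedestrians. A minimal Helbing-style baseline sketch follows; the function name and parameter values are illustrative defaults, not SFMAGAIL's learned quantities:

```python
import numpy as np

def social_force(pos, vel, goal, others_pos, radius=0.3,
                 v0=1.3, tau=0.5, A=2.0, B=0.3):
    """One pedestrian's acceleration under a basic social force model.

    pos, vel, goal: (2,) arrays for this agent; others_pos: (N, 2) array
    of other pedestrians. All parameter values are illustrative.
    """
    # Driving force: relax the current velocity toward the desired
    # velocity v0 along the unit direction to the goal.
    e = (goal - pos) / (np.linalg.norm(goal - pos) + 1e-9)
    f_drive = (v0 * e - vel) / tau

    # Repulsive forces from other pedestrians, exponential in overlap.
    f_rep = np.zeros(2)
    for p in others_pos:
        d_vec = pos - p
        d = np.linalg.norm(d_vec) + 1e-9
        n = d_vec / d                       # unit vector away from p
        f_rep += A * np.exp((2 * radius - d) / B) * n
    return f_drive + f_rep
```

Integrating this acceleration per agent per time step yields the stochastic pedestrian motion that the paper's imitation-learning framework then refines with expert trajectories.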

Citations: 0
Optimized Multiuser Panoramic Video Transmission in VR: A Machine Learning-Driven Approach
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-07 | DOI: 10.1002/cav.70060
Wei Xun, Songlin Zhang

In this paper, we propose a machine learning-driven model to optimize panoramic video transmission for multiple users in virtual reality environments. The model predicts users' future field of view (FOV) using historical head orientation data and video saliency information, enabling targeted video delivery based on individual perspectives. By segmenting panoramic videos into tiles and applying a pyramid coding scheme, we adaptively transmit high-quality content within users' FOVs while utilizing lower-quality transmissions for peripheral regions. This approach effectively reduces bandwidth consumption while maintaining a high-quality viewing experience. Our experimental results demonstrate that combining user viewpoint data with video saliency features significantly improves long-term FOV prediction accuracy, leading to a more efficient and user-centric transmission model. The proposed method holds great potential for enhancing the immersive experience of panoramic video streaming in VR, particularly in bandwidth-constrained environments.
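
The tile-plus-pyramid idea can be illustrated with a toy quality assignment: tiles whose centers fall inside the predicted FOV stream at the top quality level, and quality decays with angular distance toward the periphery. The tile layout, level counts, and function name below are assumptions for illustration, not the paper's actual coding scheme:

```python
import numpy as np

def tile_quality(pred_yaw_deg, n_tiles=12, fov_deg=90, levels=(3, 2, 1)):
    """Assign a quality level to each longitudinal tile of a panorama.

    pred_yaw_deg: predicted viewing direction (degrees). Tiles inside
    the predicted FOV get the top level; a margin band gets the middle
    level; the periphery gets the lowest ("pyramid" falloff).
    """
    # Tile centers, evenly spaced around 360 degrees.
    centers = np.arange(n_tiles) * (360.0 / n_tiles) + 360.0 / (2 * n_tiles)
    # Angular distance to the predicted yaw, wrapped into [0, 180].
    d = np.abs((centers - pred_yaw_deg + 180) % 360 - 180)
    q = np.empty(n_tiles, dtype=int)
    q[d <= fov_deg / 2] = levels[0]                      # inside FOV
    q[(d > fov_deg / 2) & (d <= fov_deg)] = levels[1]    # margin band
    q[d > fov_deg] = levels[2]                           # periphery
    return q
```

With an accurate FOV predictor, only a few tiles per user need the high-bitrate level, which is where the bandwidth savings come from.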

Citations: 0
CLPFusion: A Latent Diffusion Model Framework for Realistic Chinese Landscape Painting Style Transfer
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-07 | DOI: 10.1002/cav.70053
Jiahui Pan, Frederick W. B. Li, Bailin Yang, Fangzhe Nan

This study focuses on transforming real-world scenery into Chinese landscape painting masterpieces through style transfer. Traditional methods using convolutional neural networks (CNNs) and generative adversarial networks (GANs) often yield inconsistent patterns and artifacts. The rise of diffusion models (DMs) presents new opportunities for realistic image generation, but their inherent noise characteristics make it challenging to synthesize pure white or black images. Consequently, existing DM-based methods struggle to capture the unique style and color information of Chinese landscape paintings. To overcome these limitations, we propose CLPFusion, a novel framework that leverages pre-trained diffusion models for artistic style transfer. A key innovation is the Bidirectional State Space Models-CrossAttention (BiSSM-CA) module, which efficiently learns and retains the distinct styles of Chinese landscape paintings. Additionally, we introduce two latent space feature adjustment methods, Latent-AdaIN and Latent-WCT, to enhance style modulation during inference. Experiments demonstrate that CLPFusion produces more realistic and artistic Chinese landscape paintings than existing approaches, showcasing its effectiveness and uniqueness in the field.
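
Latent-AdaIN presumably applies the standard AdaIN statistics transfer inside the diffusion latent space: re-normalize the content latent to match the per-channel mean and standard deviation of the style latent. A NumPy sketch of that underlying formula follows; how CLPFusion wires it into the denoising process is the paper's contribution and is not reproduced here:

```python
import numpy as np

def latent_adain(content, style, eps=1e-5):
    """AdaIN over latent feature maps of shape (B, C, H, W):
    out = sigma(style) * (content - mu(content)) / sigma(content) + mu(style),
    computed per channel over the spatial dimensions.
    """
    c_mean = content.mean(axis=(2, 3), keepdims=True)
    c_std = content.std(axis=(2, 3), keepdims=True) + eps
    s_mean = style.mean(axis=(2, 3), keepdims=True)
    s_std = style.std(axis=(2, 3), keepdims=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean
```

After the transfer, each channel of the output carries the style latent's first- and second-order statistics while keeping the content latent's spatial structure.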

Citations: 0
Folding by Skinning
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-05 | DOI: 10.1002/cav.70055
Chunyang Ma, Lifeng Zhu

We propose a novel method, entitled “Folding by Skinning”, which creatively integrates skinning techniques with folding simulations. This method allows users to specify a two-dimensional crease pattern along with the desired folding angles for each crease. Based on this input, the system computes the final three-dimensional shape of the fold. Rather than employing costly physics-based simulations, we explore the skinning method, noted for its effectiveness in handling the geometry of the folded shape. We recommend extracting the skinning weights directly from the user-defined crease patterns. By combining the obtained skinning weights with the user-input folding angles, the initial shape undergoes dual quaternion skinning to produce the folding result. Users can further optimize the shape using post-processing and targeted filtering of weights to generate more realistic results. Our experimental results demonstrate that “Folding by Skinning” yields high-quality outcomes and offers relatively fast computation, making it an effective tool for computer-aided design, animation, and fabrication applications.

Citations: 0
BACH: Bi-Stage Data-Driven Piano Performance Animation for Controllable Hand Motion
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-04 | DOI: 10.1002/cav.70044
Jihui Jiao, Rui Zeng, Ju Dai, Junjun Pan

This paper presents a novel framework for generating piano performance animations using a two-stage deep learning model. By using discrete musical score data, the framework transforms sparse control signals into continuous, natural hand motions. Specifically, in the first stage, by incorporating musical temporal context, the keyframe predictor is leveraged to learn keyframe motion guidance. Meanwhile, the second stage synthesizes smooth transitions between these keyframes via an inter-frame sequence generator. Additionally, a Laplacian operator-based motion retargeting technique is introduced, ensuring that the generated animations can be adapted to different digital human models. We demonstrate the effectiveness of the system through an audiovisual multimedia application. Our approach provides an efficient, scalable method for generating realistic piano animations and holds promise for broader applications in animation tasks driven by sparse control signals.
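
The two-stage split — keyframe prediction followed by in-betweening — can be grounded with a simple baseline: given predicted keyframe poses at known times, fill the intermediate frames by interpolation. BACH learns this in-betweening with its inter-frame sequence generator; the plain linear interpolation below is only a stand-in sketch, with a hypothetical function name:

```python
import numpy as np

def interpolate_keyframes(keyframes, times, fps=30):
    """Fill inter-frame hand poses between keyframes.

    keyframes: (K, D) pose vectors at the given times (seconds,
    ascending). Returns (T, D) poses sampled at the target frame rate,
    linearly interpolated per pose dimension.
    """
    t_out = np.arange(times[0], times[-1], 1.0 / fps)
    return np.stack([np.interp(t_out, times, keyframes[:, d])
                     for d in range(keyframes.shape[1])], axis=1)
```

A learned generator replaces the linear ramp with motion that respects hand dynamics (finger coordination, velocity profiles), but the interface — sparse keyframes in, dense frames out — is the same.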

Citations: 0
Interaction With Virtual Objects Using Human Pose and Shape Estimation
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-04 | DOI: 10.1002/cav.70046
Hong Son Nguyen, DaEun Cheong, Andrew Chalmers, Myoung Gon Kim, Taehyun Rhee, JungHyun Han

In this article, we propose an AR system that facilitates a user's natural interaction with virtual objects in an augmented reality environment. The system consists of three modules: human pose and shape estimation, camera-space calibration, and physics simulation. The first module estimates a user's 3D pose and shape from a single RGB video stream, thereby reducing the system setup cost and broadening potential applications. The camera-space calibration module estimates the user's camera-space position to align the user with the input RGB image. The physics simulation enables seamless and physically natural interaction with virtual objects. Two prototyping applications built upon the system prove an enhancement in the quality of interaction, fostering a more immersive and intuitive user experience.

Citations: 0
Going Further With Vertex Block Descent
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-03 | DOI: 10.1002/cav.70039
B. Saillant, F. Zara, F. Jaillet, G. Damiand

Vertex Block Descent (VBD) is a fast and robust method for the real-time simulation of deformable objects using the finite element method. Originally, the method was designed for the popular linear tetrahedral elements. However, these elements have low accuracy and introduce locking artifacts. In this context, we propose an extension of VBD to: (i) support any isoparametric elements and shape function orders; (ii) improve its performance even for high-order elements; and (iii) enhance convergence through the use of sub-stepping. Overall, using other types of elements enables more accurate results while maintaining a computational cost comparable to linear tetrahedra.

Citations: 0
SCNet: A Dual-Branch Network for Strong Noisy Image Denoising Based on Swin Transformer and ConvNeXt
IF 0.9 | CAS Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-06-03 | DOI: 10.1002/cav.70030
Chuchao Lin, Changjun Zou, Hangbin Xu

Image denoising plays a vital role in restoring high-quality images from noisy inputs and directly impacts downstream vision tasks. Traditional methods often fail under strong noise, causing detail loss or excessive smoothing. While recent Convolutional Neural Networks-based and Transformer-based models have shown progress, they struggle to jointly capture global structure and preserve local details. To address this, we propose SCNet, a dual-branch fusion network tailored for strong-noise denoising. It combines a Swin Transformer branch for global context modeling and a ConvNeXt branch for fine-grained local feature extraction. Their outputs are adaptively merged via a Feature Fusion Block using joint spatial and channel attention, ensuring semantic consistency and texture fidelity. A multi-scale upsampling module and the Charbonnier loss further improve structural accuracy and visual quality. Extensive experiments on four benchmark datasets show that SCNet outperforms state-of-the-art methods, especially under severe noise, and proves effective in real-world tasks such as mural image restoration.
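
The Charbonnier loss mentioned above is a smooth variant of L1, sqrt(diff^2 + eps^2), which stays differentiable at zero while remaining robust to large residuals — useful for preserving edges under strong noise. A minimal sketch (the eps value is an illustrative default):

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier (smooth-L1) loss, averaged over all elements:
    mean(sqrt((pred - target)^2 + eps^2)).
    """
    diff = pred - target
    return np.mean(np.sqrt(diff * diff + eps * eps))
```

For residuals much larger than eps the loss behaves like L1 (linear, outlier-robust); near zero it behaves like L2 (smooth gradient), which is why it is a common choice in image restoration.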

引用次数: 0
AIKII: An AI-Enhanced Knowledge Interactive Interface for Knowledge Representation in Educational Games AIKII:用于教育游戏中知识表示的ai增强知识交互界面
IF 0.9 · CAS Tier 4, Computer Science · Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-06-02 · DOI: 10.1002/cav.70052
Dake Liu, Huiwen Zhao, Wen Tang, Wenwen Yang

The use of generative AI to create responsive and adaptive game content has attracted considerable interest within the educational game design community, highlighting its potential as a tool for enhancing players' understanding of in-game knowledge. However, designing effective player-AI interaction to support knowledge representation remains unexplored. This paper presents AIKII, an AI-enhanced Knowledge Interaction Interface designed to facilitate knowledge representation in educational games. AIKII employs various interaction channels to represent in-game knowledge and support player engagement. To investigate its effectiveness and user learning experience, we implemented AIKII into The Journey of Poetry, an educational game centered on learning Chinese poetry, and conducted interviews with university students. The results demonstrated that our method fosters contextual and reflective connections between players and in-game knowledge, enhancing player autonomy and immersion.
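One way an AI-enhanced knowledge interface can keep generated responses tied to curricular content is to inject the relevant in-game knowledge entry into the model's prompt as grounding context. The sketch below illustrates that idea only; the function name, knowledge-base structure, and prompt wording are hypothetical and not taken from the AIKII paper:

```python
def build_knowledge_prompt(knowledge_base: dict, topic: str, player_query: str) -> str:
    """Assemble a grounded prompt for a generative model.

    The in-game knowledge entry is placed ahead of the player's query
    so the model's response stays anchored to the educational content
    rather than drifting into free-form generation.
    """
    entry = knowledge_base.get(topic, "")
    return (
        f"Context (in-game knowledge): {entry}\n"
        f"Player: {player_query}\n"
        "Respond in character, referencing only the context above."
    )

# Illustrative knowledge base for a poetry-learning game.
kb = {"tang_poetry": "Li Bai (701-762) was a Tang dynasty poet known for romanticism."}
prompt = build_knowledge_prompt(kb, "tang_poetry", "Who wrote 'Quiet Night Thought'?")
```

The assembled `prompt` string would then be sent to whatever generative backend the game uses; retrieval of the right `topic` key (e.g., from the player's current quest state) is left out of this sketch.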

Computer Animation and Virtual Worlds, 36(3) · Citations: 0