Qianyun Song, Hao Zhang, Yanan Liu, Shouzheng Sun, Dan Xu
Human pose estimation in videos often uses sampling strategies like sparse uniform sampling and keyframe selection. Sparse uniform sampling can miss spatial-temporal relationships, while keyframe selection using CNNs struggles to fully capture these relationships and is costly. Neither strategy ensures the reliability of pose data from single-frame estimators. To address these issues, this article proposes an efficient and effective hybrid attention adaptive sampling network. This network includes a dynamic attention module and a pose quality attention module, which together account for both the dynamic information and the quality of the pose data. Additionally, the network improves efficiency through compact uniform sampling and the parallel mechanism of multi-head self-attention. Our network is compatible with various video-based pose estimation frameworks, demonstrates greater robustness under heavy occlusion, motion blur, and illumination changes, and achieves state-of-the-art performance on the Sub-JHMDB dataset.
{"title":"Hybrid attention adaptive sampling network for human pose estimation in videos","authors":"Qianyun Song, Hao Zhang, Yanan Liu, Shouzheng Sun, Dan Xu","doi":"10.1002/cav.2244","DOIUrl":"https://doi.org/10.1002/cav.2244","url":null,"abstract":"<p>Human pose estimation in videos often uses sampling strategies like sparse uniform sampling and keyframe selection. Sparse uniform sampling can miss spatial-temporal relationships, while keyframe selection using CNNs struggles to fully capture these relationships and is costly. Neither strategy ensures the reliability of pose data from single-frame estimators. To address these issues, this article proposes an efficient and effective hybrid attention adaptive sampling network. This network includes a dynamic attention module and a pose quality attention module, which comprehensively consider the dynamic information and the quality of pose data. Additionally, the network improves efficiency through compact uniform sampling and parallel mechanism of multi-head self-attention. Our network is compatible with various video-based pose estimation frameworks and demonstrates greater robustness in high degree of occlusion, motion blur, and illumination changes, achieving state-of-the-art performance on Sub-JHMDB dataset.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142013622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hangyeol Kang, Maher Ben Moussa, Nadia Magnenat Thalmann
In this work, we describe our approach to developing an intelligent and robust social robotic system for the Nadine social robot platform. We achieve this by integrating large language models (LLMs) and leveraging their powerful reasoning and instruction-following capabilities to achieve advanced human-like affective and cognitive capabilities. This approach is novel compared with current state-of-the-art LLM-based agents, which do not implement human-like long-term memory or sophisticated emotional capabilities. We built a social robot system that generates appropriate behaviors through multimodal input processing, retrieves episodic memories relevant to the recognized user, and simulates the emotional states induced in the robot by interaction with its human partner. In particular, we introduce an LLM-agent framework for social robots, social robotics reasoning and acting, which serves as the core component of the interaction module in our system. This design advances the state of social robots and aims to increase the quality of human–robot interaction.
{"title":"Nadine: A large language model-driven intelligent social robot with affective capabilities and human-like memory","authors":"Hangyeol Kang, Maher Ben Moussa, Nadia Magnenat Thalmann","doi":"10.1002/cav.2290","DOIUrl":"https://doi.org/10.1002/cav.2290","url":null,"abstract":"<p>In this work, we describe our approach to developing an intelligent and robust social robotic system for the Nadine social robot platform. We achieve this by integrating large language models (LLMs) and skillfully leveraging the powerful reasoning and instruction-following capabilities of these types of models to achieve advanced human-like affective and cognitive capabilities. This approach is novel compared to the current state-of-the-art LLM-based agents which do not implement human-like long-term memory or sophisticated emotional capabilities. We built a social robot system that enables generating appropriate behaviors through multimodal input processing, bringing episodic memories accordingly to the recognized user, and simulating the emotional states of the robot induced by the interaction with the human partner. In particular, we introduce an LLM-agent frame for social robots, social robotics reasoning and acting, serving as a core component for the interaction module in our system. This design has brought forth the advancement of social robots and aims to increase the quality of human–robot interaction.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cav.2290","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141986061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jackson Yang, Xiaoping Che, Chenxin Qu, Xiaofei Di, Haiming Liu
This paper investigates the application of Virtual Reality Exposure Therapy (VRET) to treat agoraphobia, focusing on two pivotal research questions derived from identified gaps in current therapeutic approaches. The first question (RQ1) addresses the development of complex VR environments to enhance therapy's effectiveness by simulating real-world anxiety triggers. The second question (RQ2) examines the differential impact of these VR environments on agoraphobic and nonagoraphobic participants through rigorous comparative analyses using t-tests. Methodologies include advanced data processing techniques for electrodermal activity (EDA) and eye-tracking metrics to assess the anxiety levels induced by these environments. Additionally, qualitative methods such as structured interviews and questionnaires complement these measurements, providing deeper insights into the subjective experiences of participants. Video recordings of sessions using Unity software offer a layer of data, enabling the study to replay and analyze interactions within the VR environment meticulously. The experimental results confirm the efficacy of VR settings in eliciting significant physiological and psychological responses from participants, substantiating the VR scenarios' potential as a therapeutic tool. This study contributes to the broader discourse on the viability and optimization of VR technologies in clinical settings, offering a methodologically sound approach to the practicality and accessibility of exposure therapies for anxiety disorders.
{"title":"Enhancing virtual reality exposure therapy: Optimizing treatment outcomes for agoraphobia through advanced simulation and comparative analysis","authors":"Jackson Yang, Xiaoping Che, Chenxin Qu, Xiaofei Di, Haiming Liu","doi":"10.1002/cav.2291","DOIUrl":"https://doi.org/10.1002/cav.2291","url":null,"abstract":"<p>This paper investigates the application of Virtual Reality Exposure Therapy (VRET) to treat agoraphobia, focusing on two pivotal research questions derived from identified gaps in current therapeutic approaches. The first question (RQ1) addresses the development of complex VR environments to enhance therapy's effectiveness by simulating real-world anxiety triggers. The second question (RQ2) examines the differential impact of these VR environments on agoraphobic and nonagoraphobic participants through rigorous comparative analyses using <i>t</i>-tests. Methodologies include advanced data processing techniques for electrodermal activity (EDA) and eye-tracking metrics to assess the anxiety levels induced by these environments. Additionally, qualitative methods such as structured interviews and questionnaires complement these measurements, providing deeper insights into the subjective experiences of participants. Video recordings of sessions using Unity software offer a layer of data, enabling the study to replay and analyze interactions within the VR environment meticulously. The experimental results confirm the efficacy of VR settings in eliciting significant physiological and psychological responses from participants, substantiating the VR scenarios' potential as a therapeutic tool. This study contributes to the broader discourse on the viability and optimization of VR technologies in clinical settings, offering a methodologically sound approach to the practicality and accessibility of exposure therapies for anxiety disorders.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mingyi Gu, Jiajia Dai, Jiazhou Chen, Ke Yan, Jing Huang
Thin-film interference is a significant optical phenomenon. In this study, we employed the transfer matrix method to pre-calculate the reflectance of thin films at visible light wavelengths. The reflectance is saved as a texture through a color space transformation, which makes real-time rendering of thin-film interference feasible. Furthermore, we proposed using the shallow water equations to simulate the morphological evolution of liquid thin films, which facilitates interpreting and predicting the behaviors and thickness variations of liquid thin films. We also introduced a viscosity term into the shallow water equations to simulate the behavior of thin films more accurately, thus facilitating the creation of authentic interference patterns.
{"title":"Real-time simulation of thin-film interference with surface thickness variation using the shallow water equations","authors":"Mingyi Gu, Jiajia Dai, Jiazhou Chen, Ke Yan, Jing Huang","doi":"10.1002/cav.2289","DOIUrl":"https://doi.org/10.1002/cav.2289","url":null,"abstract":"<p>Thin-film interference is a significant optical phenomenon. In this study, we employed the transfer matrix method to pre-calculate the reflectance of thin-films at visible light wavelengths. The reflectance is saved as a texture through color space transformation. This advancement has made real-time rendering of thin-film interference feasible. Furthermore, we proposed the implementation of shallow water equations to simulate the morphological evolution of liquid thin-films. This approach facilitates the interpretation and prediction of behaviors and thickness variations in liquid thin-films. We also introduced a viscosity term into the shallow water equations to more accurately simulate the behavior of thin-films, thus facilitating the creation of authentic interference patterns.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141966580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yong Zhang, Yuqing Zhang, Lufei Chen, Baocai Yin, Yongliang Sun
Frontal person images contain the richest detailed features of humans, which can effectively assist behavioral recognition, virtual dress fitting, and other applications. While many remarkable networks are devoted to the person image generation task, most of them need accurate target poses as network inputs. However, target pose annotation is difficult and time-consuming. In this work, we propose the first frontal person image generation network based on a proposed anchor pose set and a generative adversarial network. Specifically, our method first assigns a rough frontal pose to the input human image based on the anchor pose set, and then regresses all keypoints of the rough frontal pose to estimate an accurate frontal pose. Taking the estimated frontal pose as the target pose, we construct a two-stream generator based on the generative adversarial network that updates the person's shape and appearance features in a crossing manner and generates a realistic frontal person image. Experiments on the challenging CMU Panoptic dataset show that our method can generate realistic frontal images from arbitrary-view human images.
{"title":"Frontal person image generation based on arbitrary-view human images","authors":"Yong Zhang, Yuqing Zhang, Lufei Chen, Baocai Yin, Yongliang Sun","doi":"10.1002/cav.2234","DOIUrl":"10.1002/cav.2234","url":null,"abstract":"<p>Frontal person images contain the richest detailed features of humans, which can effectively assist in behavioral recognition, virtual dress fitting and other applications. While many remarkable networks are devoted to the person image generation task, most of them need accurate target poses as the network inputs. However, the target pose annotation is difficult and time-consuming. In this work, we proposed a first frontal person image generation network based on the proposed anchor pose set and the generative adversarial network. Specifically, our method first classify a rough frontal pose to the input human image based on the proposed anchor pose set, and regress all key points of the rough frontal pose to estimate an accurate frontal pose. Then, we consider the estimated frontal pose as the target pose, and construct a two-stream generator based on the generative adversarial network to update the person's shape and appearance feature in a crossing way and generate a realistic frontal person image. Experiments on the challenging CMU Panoptic dataset show that our method can generate realistic frontal images from arbitrary-view human images.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141772579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haoping Wang, Xiaokun Wang, Yanrui Xu, Yalan Zhang, Chao Yao, Yu Guo, Xiaojuan Ban
This paper introduces a particle-based framework, grounded in peridynamic theory, for simulating the behavior of elastoplastic materials and the formation of fractures. Traditional approaches to modeling elastic materials, such as the Finite Element Method (FEM) and Smoothed Particle Hydrodynamics (SPH), have primarily relied on discretization techniques and continuous constitutive models. However, accurately capturing fracture and crack development in elastoplastic materials poses significant challenges for these conventional models. Our approach integrates a peridynamic-based elastic model with a density constraint, enhancing stability and realism. We adopt the von Mises yield criterion and a bond stretch criterion to simulate plastic deformation and fracture formation, respectively. The proposed method stabilizes the elastic model through a density-based position constraint, while plasticity is modeled with the von Mises yield criterion applied to the bonds of particle pairs. Fracturing and the generation of fine fragments are handled by the fracture criterion and complementarity operations on the inter-particle connections. Our experimental results demonstrate the efficacy of our framework in realistically depicting a wide range of material behaviors, including elasticity, plasticity, and fracturing, across various scenarios.
{"title":"Peridynamic-based modeling of elastoplasticity and fracture dynamics","authors":"Haoping Wang, Xiaokun Wang, Yanrui Xu, Yalan Zhang, Chao Yao, Yu Guo, Xiaojuan Ban","doi":"10.1002/cav.2242","DOIUrl":"https://doi.org/10.1002/cav.2242","url":null,"abstract":"<p>This paper introduces a particle-based framework for simulating the behavior of elastoplastic materials and the formation of fractures, grounded in Peridynamic theory. Traditional approaches, such as the Finite Element Method (FEM) and Smoothed Particle Hydrodynamics (SPH), to modeling elastic materials have primarily relied on discretization techniques and continuous constitutive model. However, accurately capturing fracture and crack development in elastoplastic materials poses significant challenges for these conventional models. Our approach integrates a Peridynamic-based elastic model with a density constraint, enhancing stability and realism. We adopt the Von Mises yield criterion and a bond stretch criterion to simulate plastic deformation and fracture formation, respectively. The proposed method stabilizes the elastic model through a density-based position constraint, while plasticity is modeled using the Von Mises yield criterion within the bond of particle paris. Fracturing and the generation of fine fragments are facilitated by the fracture criterion and the application of complementarity operations to the inter-particle connections. Our experimental results demonstrate the efficacy of our framework in realistically depicting a wide range of material behaviors, including elasticity, plasticity, and fracturing, across various scenarios.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141631145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dongjin Huang, Chuanman Liu, Jinhua Liu
Existing high-resolution face-swapping works still struggle to preserve identity consistency while maintaining high visual quality. We present GPSwap, a novel high-resolution face-swapping method based on a StyleGAN prior. To better preserve identity consistency, the proposed facial feature recombination network fully leverages the properties of both the w space and the encoders to decouple identities. Furthermore, we present an image reconstruction module that aligns and blends images in FS space, further supplementing facial details and achieving natural blending; it not only improves image resolution but also optimizes visual quality. Extensive experiments and user studies demonstrate that GPSwap is superior to state-of-the-art high-resolution face-swapping methods in terms of image quality and identity consistency. In addition, GPSwap saves nearly 80% of training costs compared with other high-resolution face-swapping works.
{"title":"GPSwap: High-resolution face swapping based on StyleGAN prior","authors":"Dongjin Huang, Chuanman Liu, Jinhua Liu","doi":"10.1002/cav.2238","DOIUrl":"https://doi.org/10.1002/cav.2238","url":null,"abstract":"<p>Existing high-resolution face-swapping works are still challenges in preserving identity consistency while maintaining high visual quality. We present a novel high-resolution face-swapping method GPSwap, which is based on StyleGAN prior. To better preserves identity consistency, the proposed facial feature recombination network fully leverages the properties of both <i>w</i> space and encoders to decouple identities. Furthermore, we presents the image reconstruction module aligns and blends images in <i>FS</i> space, which further supplements facial details and achieves natural blending. It not only improves image resolution but also optimizes visual quality. Extensive experiments and user studies demonstrate that GPSwap is superior to state-of-the-art high-resolution face-swapping methods in terms of image quality and identity consistency. In addition, GPSwap saves nearly 80% of training costs compared to other high-resolution face-swapping works.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141608039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiannan Ye, Xiaoxu Meng, Daiyun Guo, Cheng Shang, Haotian Mao, Xubo Yang
As virtual reality display technologies advance, resolutions and refresh rates continue to approach human perceptual limits, presenting a challenge for real-time rendering algorithms. Neural super-resolution is promising for reducing computation cost and improving the visual experience by scaling up low-resolution renderings. However, the added workload of running neural networks cannot be neglected. In this article, we alleviate this burden by exploiting the foveated nature of the human visual system: we upscale the coarse input heterogeneously rather than uniformly, following visual acuity, which decreases rapidly from the focal point to the periphery. With the help of dynamic and geometric information (i.e., pixel-wise motion vectors, depth, and camera transformation) inherently available in real-time rendering, we propose a neural accumulator that recurrently aggregates the amortized low-resolution visual information from frame to frame. Leveraging a partition-assemble scheme, we use a neural super-resolution module to upsample low-resolution image tiles to different qualities according to their perceptual importance and reconstruct the final output adaptively. Perceptually high-fidelity foveated high-resolution frames are generated in real time, surpassing the quality of other foveated super-resolution methods.
{"title":"Neural foveated super-resolution for real-time VR rendering","authors":"Jiannan Ye, Xiaoxu Meng, Daiyun Guo, Cheng Shang, Haotian Mao, Xubo Yang","doi":"10.1002/cav.2287","DOIUrl":"https://doi.org/10.1002/cav.2287","url":null,"abstract":"<p>As virtual reality display technologies advance, resolutions and refresh rates continue to approach human perceptual limits, presenting a challenge for real-time rendering algorithms. Neural super-resolution is promising in reducing the computation cost and boosting the visual experience by scaling up low-resolution renderings. However, the added workload of running neural networks cannot be neglected. In this article, we try to alleviate the burden by exploiting the foveated nature of the human visual system, in a way that we upscale the coarse input in a heterogeneous manner instead of uniform super-resolution according to the visual acuity decreasing rapidly from the focal point to the periphery. With the help of dynamic and geometric information (i.e., pixel-wise motion vectors, depth, and camera transformation) available inherently in the real-time rendering content, we propose a neural accumulator to effectively aggregate the amortizedly rendered low-resolution visual information from frame to frame recurrently. By leveraging a partition-assemble scheme, we use a neural super-resolution module to upsample the low-resolution image tiles to different qualities according to their perceptual importance and reconstruct the final output adaptively. Perceptually high-fidelity foveated high-resolution frames are generated in real-time, surpassing the quality of other foveated super-resolution methods.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141608012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wei Xiong, Yingda Peng
Intravenous (IV) cannulation is a common technique used in clinical infusion. This study developed a mixed reality IV cannulation teaching system based on the HoloLens 2 platform. The paper builds on the cognitive-affective theory of learning with media (CATLM) and investigates cognitive engagement and willingness to use the system from the learners' perspective. Through an experimental study with 125 subjects, the variables affecting learners' cognitive engagement and intention to use were determined. On the basis of CATLM, three new mixed reality attributes (immersion, system verisimilitude, and response time) were introduced, and their relationships with cognitive engagement and willingness to use were determined. The results show that the high immersion of mixed reality technology promotes higher cognitive engagement in students; however, this high immersion does not significantly affect learners' intention to use mixed reality technology for learning. Overall, cognitive and affective theories remain effective in mixed reality environments, and the model shows good adaptability. This study provides a reference for the application of mixed reality technology in medical education.
{"title":"Design and development of a mixed reality teaching systems for IV cannulation and clinical instruction","authors":"Wei Xiong, Yingda Peng","doi":"10.1002/cav.2288","DOIUrl":"https://doi.org/10.1002/cav.2288","url":null,"abstract":"<p>Intravenous cannulation (IV) is a common technique used in clinical infusion. This study developed a mixed reality IV cannulation teaching system based on the Hololens2 platform. The paper integrates cognitive-affective theory of learning with media (CATLM) and investigates the cognitive engagement and willingness to use the system from the learners' perspective. Through experimental research on 125 subjects, the variables affecting learners' cognitive engagement and intention to use were determined. On the basis of CATLM, three new mixed reality attributes, immersion, system verisimilitude, and response time, were introduced, and their relationships with cognitive participation and willingness to use were determined. The results show that high immersion of mixed reality technology promotes students' higher cognitive engagement; however, this high immersion does not significantly affect learners' intention to use mixed reality technology for learning. Overall, cognitive and emotional theories are effective in mixed reality environments, and the model has good adaptability. This study provides a reference for the application of mixed reality technology in medical education.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141329416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}