
Computer Animation and Virtual Worlds — Latest Publications

A Virtual Instructor-Led System for Assessing and Guiding Middle School Physics Experiments
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-02-04 | DOI: 10.1002/cav.70090
Fengming Wang, Zhigeng Pan, Fuchang Liu, Yu Lu

Interactive computer technology is deeply integrated into traditional teaching methods. The traditional teaching of physics experiments in secondary schools suffers from the inability of teachers to provide timely guidance to students, the difficulty of controlling experimental variables, and the lack of uniformity in evaluation criteria. To address these issues, we have developed an innovative system to improve secondary school physics education using computer vision-based interaction with virtual humans and sensors. The proposed system captures experimental data in real time so that student performance can be accurately monitored and assessed. Teachers can effortlessly configure experiments through simple coding, while the system leverages a multimodal large language model to offer contextual feedback and guidance. The system generates a virtual teacher that offers step-by-step guidance and real-time feedback. Usability tests indicate that the system significantly improves student engagement and comprehension of complex physics concepts, highlighting its potential to transform traditional science education. The advantage of real-time assessment and guidance in secondary school physics experiments is that it enables students to grasp abstract concepts in a more intuitive and comprehensible manner.
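To make the assessment loop concrete, below is a minimal Python sketch (not the authors' code) of rule-based, real-time step checking of the kind described above; the step names, sensor fields, and thresholds are hypothetical, and the actual system additionally routes feedback through a multimodal large language model and a virtual teacher.

```python
# Minimal sketch of real-time step assessment. All names, sensor fields,
# and thresholds are hypothetical placeholders, not the paper's system.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    name: str
    check: Callable[[dict], bool]  # maps one sensor reading to pass/fail
    hint: str                      # guidance voiced when the check fails

def assess(readings, steps):
    """Pair each experiment step with its sensor reading and yield
    (step name, passed, hint) tuples for the virtual teacher to voice."""
    for step, reading in zip(steps, readings):
        ok = step.check(reading)
        yield step.name, ok, None if ok else step.hint

steps = [
    Step("zero the ammeter", lambda r: abs(r["current_mA"]) < 0.5,
         "Adjust the ammeter to zero before closing the circuit."),
    Step("set supply voltage", lambda r: 2.9 <= r["voltage_V"] <= 3.1,
         "Set the power supply close to 3.0 V."),
]
readings = [{"current_mA": 0.2}, {"voltage_V": 3.4}]
for name, ok, hint in assess(readings, steps):
    print(name, "OK" if ok else f"-> {hint}")
```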

Citations: 0
Predicting Learners' Attention Under Audiovisual Cues in Virtual Reality With a Deep Learning Model
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-02-01 | DOI: 10.1002/cav.70099
Chen Kang, Kunyan Li

Effective audiovisual cueing can significantly enhance learners' attention to educational resources in virtual reality (VR). However, predicting the impact of multimodal cueing on learners' attention in immersive teaching environments remains a challenging task. To address this, we propose a deep learning model named Attention Prediction Model (APM). This model employs RevFCN to extract visual and auditory cue features and incorporates a tailored Upsample-Aggregation Fusion Module (UAFM) to integrate multimodal representations. Additionally, an SANet is introduced to effectively combine the advantages of spatial and channel attention. Trained on our constructed dataset, APM achieved an attention prediction accuracy of 81.6%. These findings offer both theoretical and practical implications for the application of multimodal cueing in VR-based instructional design.
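The abstract does not detail SANet's internals; the PyTorch sketch below shows one standard way to combine channel and spatial attention (in the CBAM style), with illustrative layer sizes, as a rough stand-in for that component rather than the paper's architecture.

```python
# Channel attention reweights feature maps; spatial attention reweights
# locations. Layer sizes here are illustrative only.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite channels.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: 7x7 conv over mean- and max-pooled channel maps.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_mlp(x)                    # reweight channels
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial_conv(pooled)           # reweight locations

feat = torch.randn(2, 32, 16, 16)
print(ChannelSpatialAttention(32)(feat).shape)  # torch.Size([2, 32, 16, 16])
```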

Citations: 0
Development of a DT-Driven Virtual Reality for Human-Robot Collaborative Safety Education Using Formwork Design
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-28 | DOI: 10.1002/cav.70098
Jeremy S. Liang, Cindy Lin

The vision of Industry 5.0, which emphasizes humanistic, adaptable, and sustainable methods, has evolved rapidly, particularly in digital twin (DT)-driven industrial applications such as BMW factories and Tata Steel. Nevertheless, developing DT-powered virtual reality (VR) environments remains labor-intensive, which indicates untapped potential. Thus, a reusable DT model is required to speed up the creation and scaling of DT-driven VR environments. The objective of this study is to find solutions that make DT-driven VR contexts faster to build in an industrial environment; in particular, it presents a way to develop such settings more effectively. The highlight of this study, and of the related industrial instances, is the development of safety education for human-robot collaboration in manufacturing contexts using DT-driven VR settings. A formwork model is introduced, comprising specific formworks and an overall framework for deploying them, so that DT-driven VR environments can be amended more quickly to satisfy the requirements of a particular instance. The formwork is validated on two different industrial instances drawn from manufacturing scenarios.
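The reusable-formwork idea can be illustrated as configuration templating: a base scene description that a particular industrial instance amends in one call rather than rebuilding the VR scene. The sketch below is a minimal Python rendering of that pattern; all fields and values are hypothetical.

```python
# Minimal sketch of a reusable "formwork": a base DT-driven VR safety
# scene specialized per industrial instance. All fields are hypothetical.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SceneFormwork:
    robot_model: str
    safety_zone_m: float
    hazard_events: tuple
    twin_endpoint: str   # where the digital twin streams state from

base = SceneFormwork(
    robot_model="generic-6dof-arm",
    safety_zone_m=1.5,
    hazard_events=("pinch_point", "unexpected_restart"),
    twin_endpoint="opc.tcp://twin.example:4840",  # placeholder address
)

# Amending the formwork for a specific line is one call instead of
# rebuilding the VR environment from scratch.
press_line = replace(base, robot_model="press-tending-arm", safety_zone_m=2.0)
print(press_line)
```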

Citations: 0
Multimodal Dance Generation With Multi-Granularity Style Control and Text Guidance
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-25 | DOI: 10.1002/cav.70097
Mengmeng Wang, Lili Wan, Bo Peng, Wanru Xu, Shenghui Wang

Dance generation is a significant research area in computer arts and artificial intelligence. This study proposes a novel framework to enhance dance controllability and personalization through multimodal and multi-granularity control. The framework establishes global choreographic control of long sequences via music and dance style factors, while accommodating local style variations. Simultaneously, it enables fine-grained local control using style, text, and temporal factors for motion refinement. We develop two cross-modal Transformers: the LS-M2D model merges music and dance style features for local style-controllable dance generation, and the LT-SM2D model integrates textual guidance with music and dance style features for time-constrained local control. Experimental results demonstrate enhanced motion quality, effective multi-granularity style control, and precise text-guided flexibility. This provides valuable technical support for personalized intelligent dance generation systems.
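The cross-modal conditioning pattern described above — motion tokens attending over concatenated music, style, and text tokens — can be sketched in a few lines of PyTorch; the dimensions are illustrative, and this is not a reproduction of LS-M2D or LT-SM2D.

```python
# Motion frames act as queries; music/style/text embeddings act as the
# keys and values of one cross-attention step. Dimensions are illustrative.
import torch
import torch.nn as nn

d = 64
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

motion = torch.randn(1, 120, d)   # 120 motion frames (queries)
music  = torch.randn(1, 120, d)   # per-frame music features
style  = torch.randn(1, 1,   d)   # one global dance-style token
text   = torch.randn(1, 8,   d)   # encoded text-guidance tokens

conditions = torch.cat([music, style, text], dim=1)   # keys/values
fused, _ = attn(motion, conditions, conditions)
print(fused.shape)  # torch.Size([1, 120, 64])
```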

Citations: 0
A Knowledge Visualization Method Based on Knowledge Cube for Virtual Reality Learning
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-25 | DOI: 10.1002/cav.70096
Yi Lin, Jingjing Chen, Feng Chen, Zijie Zheng, Jieming Ke

Advances in microelectronic components and high-speed networks have enabled the widespread application of virtual reality (VR) technology in education. However, insufficient attention to knowledge visualization in VR learning has resulted in disorganized knowledge structures, comprehension difficulties, and mismatches between user experience and learning achievement. Therefore, we propose a Knowledge Cube (KC)-based visualization method to standardize knowledge encoding in VR learning. During courseware development, the instructor defines discrete knowledge as Events, organizes them into Event Groups, and populates data into a KC model to generate VR courseware. In subsequent VR learning, when learners search for task-relevant knowledge using the provided retrieval method, the KC model presents the corresponding events within interactive scenarios according to its predefined structure. Comparative experiments on different knowledge visualization methods revealed that, in VR learning, the KC method outperforms other VR approaches in both learning performance and efficiency. This method effectively guided learners to focus on the learning content and optimized the knowledge encoding in VR learning. This provides an operational framework for knowledge encoding in VR courseware design and emphasizes the importance of supporting effective learning behaviors over merely pursuing immersion, presenting a new perspective for refining the design approach of VR courseware.
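The Event / Event Group / Knowledge Cube organization can be sketched as a small data structure with tag-based retrieval. The schema below is inferred from the abstract, not taken from the paper.

```python
# Hypothetical schema: discrete knowledge as Events, grouped into Event
# Groups, retrievable by the learner's query terms.
from dataclasses import dataclass, field

@dataclass
class Event:
    name: str
    content: str
    tags: set

@dataclass
class EventGroup:
    topic: str
    events: list = field(default_factory=list)

class KnowledgeCube:
    def __init__(self):
        self.groups = []

    def search(self, query):
        """Return events whose tags overlap the learner's query terms."""
        terms = set(query.lower().split())
        return [e for g in self.groups for e in g.events if e.tags & terms]

kc = KnowledgeCube()
optics = EventGroup("optics", [Event("refraction demo",
                                     "Show light bending at the interface.",
                                     {"refraction", "light"})])
kc.groups.append(optics)
print([e.name for e in kc.search("light refraction")])  # ['refraction demo']
```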

Citations: 0
Advanced Sign Language Translation: A Holistic Network for Hand Gesture Recognition Using Deep Learning
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-20 | DOI: 10.1002/cav.70084
S. L. Reeja, P. S. Deepthi, T. Soumya

Sign language recognition (SLR) requires interpreting dynamic hand gestures with complex variations in shape, orientation, motion, and spatial configuration. Conventional models such as U-Net and ResNet offer strengths in segmentation and feature extraction, respectively, but face critical limitations. U-Net struggles with retaining fine spatial details in cluttered backgrounds and lacks temporal modeling, while ResNet can lose motion continuity and suffers from vanishing gradient issues in deeper architectures. To overcome these challenges, we propose the holistic sign language interpretation network (HSLIN), a novel deep learning framework tailored for Indian sign language (ISL) recognition. HSLIN incorporates three key innovations: Uniformed frame isolation and augmentation (UFIA) for standardized preprocessing and noise removal, synaptic gesture movement analysis (SGMA) for capturing detailed motion using keypoint detection and optical flow, and a hybrid architecture combining U-Net-based segmentation with an enhanced ResNet-TC50V2 backbone. The novelty lies in fusing spatial precision with deep temporal modeling through bottleneck layers and temporal convolutional layers (TCL), enabling the model to effectively learn gesture patterns over time. Experimental results on the ISL-CSLTR dataset demonstrate that the proposed method achieves an accuracy of 99.9%, a precision of 100%, recall of 99.9%, and an F1-score of 100% across 14 word-level sign classes. Furthermore, an ablation study confirms the critical role of each architectural component in achieving optimal performance. These outcomes clearly establish the robustness, efficiency, and uniqueness of the proposed HSLIN framework, positioning it as a powerful solution for real-world ISL recognition and communication accessibility for the deaf and hard-of-hearing community.
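One named ingredient, the temporal convolutional layer (TCL), is straightforward to sketch in PyTorch over a per-frame keypoint feature sequence; channel counts and shapes below are illustrative only, not the paper's configuration.

```python
# A temporal convolution slides over the time axis of a keypoint
# sequence, letting the network learn motion patterns across frames.
import torch
import torch.nn as nn

class TemporalConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, k, padding=k // 2),  # keeps length
            nn.BatchNorm1d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):          # x: (batch, features, time)
        return self.net(x)

# e.g., 42 hand keypoints x 2 coords = 84 features per frame, 64 frames.
seq = torch.randn(8, 84, 64)
print(TemporalConvBlock(84, 128)(seq).shape)  # torch.Size([8, 128, 64])
```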

Citations: 0
Correction to “Artist-Directable Motion Generation Using Overlapping Minimum Jerk Trajectories for Interactive Embodied Social Agent Adapting to Dynamic Environments”
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-16 | DOI: 10.1002/cav.70093

H. Sato, H. Mitake, and S. Hasegawa, “Artist-Directable Motion Generation Using Overlapping Minimum Jerk Trajectories for Interactive Embodied Social Agent Adapting to Dynamic Environments,” Computer Animation and Virtual Worlds 36, no. 6 (2025): e70075.

The name of one of the authors was incorrectly given as “Shouichi Hasegawa.” The correct spelling is “Shoichi Hasegawa.”

We apologize for this error.

Citations: 0
SinMDGan: A Hybrid Deep Learning Framework for Single Motion Synthesis Using Diffusion-GAN Models
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-16 | DOI: 10.1002/cav.70091
Qiang Chen, Binsong Zuo, Tingsong Lu, Yuming Fang, Xiaolu Mu, Chao Cai, Xiaogang Jin

Generating diverse and realistic movements has long been a central challenge in computer graphics. Generative Adversarial Networks (GANs) remain a compelling solution due to their ability to perform well even with limited training data. However, traditional GANs generate samples directly, which can lead to the omission of certain data patterns. To address this limitation, we introduce SinMDGan, a hybrid deep learning framework for single-motion synthesis that leverages a Diffusion-GAN model. Our approach integrates the strengths of GANs, which capture global motion characteristics, with diffusion techniques, which refine local details, ensuring both authenticity and diversity in generated movements. Unlike conventional cascaded GANs, our framework employs a single generator-discriminator pair, utilizing different diffusion time steps to synthesize novel and diverse motions from a single short sequence. Experimental evaluations demonstrate the effectiveness of our model in achieving stable data distribution coverage and enhancing output diversity. Additionally, we showcase various applications, including motion composition and long-sequence generation, highlighting the versatility of our approach.
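The core Diffusion-GAN move — forward-diffusing both real and generated samples to a shared random step before the discriminator sees them — can be sketched as follows; the linear noise schedule is chosen for illustration and is not necessarily the one used in the paper.

```python
# Forward diffusion q(x_t | x_0): x_t = sqrt(a_bar_t) x_0
#                                      + sqrt(1 - a_bar_t) * noise.
import torch

T = 100
betas = torch.linspace(1e-4, 2e-2, T)          # illustrative schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def diffuse(x, t):
    """Forward-diffuse a motion tensor x to step t."""
    a = alpha_bar[t].sqrt()
    s = (1.0 - alpha_bar[t]).sqrt()
    return a * x + s * torch.randn_like(x)

real_motion = torch.randn(4, 60, 69)       # batch, frames, joint dims
t = torch.randint(0, T, (1,)).item()       # shared step for this batch
noisy = diffuse(real_motion, t)            # what the discriminator sees
print(t, noisy.shape)
```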

Citations: 0
Text-Driven High-Quality 3D Human Generation via Variational Gradient Estimation and Latent Reward Models
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-08 | DOI: 10.1002/cav.70089
Pengfei Zhou, Xukun Shen, Yong Hu

Recent advances in Score Distillation Sampling (SDS) have enabled text-driven 3D human generation, yet the standard classifier-free guidance (CFG) framework struggles with semantic misalignment and texture oversaturation due to limited model capacity. We propose a novel framework that decouples conditional and unconditional guidance via a dual-model strategy: a pretrained diffusion model ensures geometric stability, while a preference-tuned latent reward model enhances semantic fidelity. To further refine noise estimation, we introduce a lightweight U-shaped Swin Transformer (U-Swin) that regularizes predicted noise against the reward model, reducing gradient bias and local artifacts. Additionally, we design a time-varying noise weighting mechanism to dynamically balance the two guidance signals during denoising, improving stability and texture realism. Extensive experiments show that our method significantly improves alignment with textual descriptions, enhances texture details, and outperforms state-of-the-art baselines in both visual quality and semantic consistency.
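For reference, the classifier-free guidance baseline that the paper modifies blends the unconditional and conditional noise predictions as eps = eps_uncond + w * (eps_cond - eps_uncond); in the dual-model strategy above, the two predictions would come from different networks. A minimal sketch:

```python
# Standard CFG combination of noise estimates. In the paper's dual-model
# variant, eps_uncond and eps_cond would come from separate networks.
import torch

def cfg_noise(eps_uncond, eps_cond, guidance_scale=7.5):
    """eps = eps_uncond + w * (eps_cond - eps_uncond)."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_u = torch.randn(1, 4, 64, 64)   # unconditional prediction (latent)
eps_c = torch.randn(1, 4, 64, 64)   # text-conditioned prediction
print(cfg_noise(eps_u, eps_c).shape)  # torch.Size([1, 4, 64, 64])
```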

Citations: 0
Environmental Design Elements in Library Spaces: A Virtual Reality Study of Psychophysiological Responses to Color, Material, and Lighting in Built Environments
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2026-01-07 | DOI: 10.1002/cav.70092
Mengyan Lin, Ning Li

This study explores how environmental design elements in library spaces influence human psychophysiological responses using virtual reality (VR). Thirty participants experienced VR simulations of library reading areas, with variations in wall color, flooring material, and lighting intensity, while electroencephalography (EEG) and galvanic skin response (GSR) recorded physiological reactions alongside subjective ratings. Moderate lighting (20,000–30,000 cd) minimized arousal and supported attention, while white walls enhanced relaxation via increased alpha brain activity. Green plant walls slightly boosted attention-related beta activity, and wood flooring was rated highest for comfort and naturalness. VR enabled precise control of design variables, advancing environmental psychology research. These findings offer evidence-based guidelines for designing public spaces like libraries to enhance user well-being and cognitive performance, with implications for educational and public buildings.
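As background on the alpha-activity finding, relative alpha-band (8–12 Hz) power is a standard EEG relaxation marker. The sketch below computes it for a single channel with Welch's method on a synthetic signal; the study's actual analysis pipeline is not described in the abstract.

```python
# Relative alpha power = alpha-band PSD / total PSD for one channel.
# The synthetic 10 Hz signal plus noise stands in for real EEG data.
import numpy as np
from scipy.signal import welch

fs = 250                                    # sampling rate, Hz
t = np.arange(0, 10, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)

freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)
alpha = (freqs >= 8) & (freqs <= 12)
rel_alpha = psd[alpha].sum() / psd.sum()
print(f"relative alpha power: {rel_alpha:.2f}")
```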

Citations: 0