Computer Animation and Virtual Worlds: Latest Articles

Skeleton-Based Motion Recognition for Labanotation Generation Based on the Fusion of Neural Networks
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-09-08 | DOI: 10.1002/cav.70073
Jiasheng Du, Jiaji Wang, Jianpo Li

Labanotation is a scientific method for documenting dance movements that has been widely adopted globally. Existing methods for Labanotation action recognition perform poorly in handling complex movements and integrating spatiotemporal information. To address this, we propose a multi-branch spatiotemporal fusion network with attention mechanisms aimed at accurately recognizing Labanotation actions from motion capture data. Initially, we convert motion capture data into three-dimensional coordinates and extract skeleton vector features. Subsequently, we enhance feature representation by extracting temporal difference features and skeleton angle features from the skeleton vectors. These features are processed using gated recurrent units and residual networks to effectively integrate spatiotemporal information. Finally, attention mechanisms are applied in the model to differentiate the importance of different positions in the features. This method effectively models spatiotemporal relationships, thereby improving the accuracy of Labanotation action recognition. We conducted experiments on two segmented motion capture datasets, demonstrating the effectiveness of each module. Compared to existing methods, our approach shows superior performance and strong generalization ability. Given the relative simplicity of upper limb action recognition, our focus primarily lies on lower limb action recognition. Notably, this marks the first application of skeleton angle features in the field of Labanotation action recognition.
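
The three feature types named in the abstract (skeleton vectors, temporal differences, and skeleton angles) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-joint leg, the `BONES` layout, and all function names are hypothetical, since the abstract does not specify the joint set or the exact feature definitions.

```python
import math

# Hypothetical toy skeleton: (parent, child) joint index pairs forming bones.
# The real joint layout of the paper's datasets is not given in the abstract.
BONES = [(0, 1), (1, 2)]  # hip -> knee, knee -> ankle


def bone_vectors(frame):
    """Skeleton vectors: child joint position minus parent joint position."""
    return [tuple(c - p for p, c in zip(frame[a], frame[b])) for a, b in BONES]


def temporal_difference(vectors_t, vectors_prev):
    """Frame-to-frame change of each bone vector."""
    return [tuple(x - y for x, y in zip(v, w))
            for v, w in zip(vectors_t, vectors_prev)]


def angle_between(u, v):
    """Angle in radians between two bone vectors (0.0 if either has zero length)."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    cos = sum(x * y for x, y in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


# Two frames of 3D joint coordinates: a straight leg, then the shin swung sideways.
frame0 = [(0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)]
frame1 = [(0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.5, 0.5, 0.0)]

v0, v1 = bone_vectors(frame0), bone_vectors(frame1)
diff = temporal_difference(v1, v0)        # motion of each bone between frames
knee_angle = angle_between(v1[0], v1[1])  # angle between thigh and shin vectors
```

In a pipeline like the one described, per-frame features of this kind would be stacked over time and passed to the recurrent and residual branches; how the paper normalizes or combines them is not stated in the abstract.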

Citations: 0
More Than Following: Introducing Reversing Behavior for Irregular-Aware Traffic Simulation by Interactive Editing
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-09-05 | DOI: 10.1002/cav.70071
Yi Han, He Wang, Xiaogang Jin

Though current traffic simulation methods can produce impressive results, reversing behavior is always ignored, potentially reducing the diversity and plausibility of simulation data. Furthermore, while common traffic behaviors like following-the-leader and lane changing can be easily simulated, efficiently generating irregular cases in a human-in-the-loop manner with specific motions based on user desires is less discussed. To address the gap, we present a novel interactive traffic editing and simulation framework that enables users to regulate vehicles via simple inputs to introduce reversing and generate desired trajectory data with both car-following and irregular driving behaviors. With key states specified, lane-level navigation, including forward/backward directions, is planned through heuristic search. The customized navigation brings the vehicles' new trajectories with both car-following and reversing, and their surrounding neighbors are also adjusted accordingly. To provide smooth and plausible motions after editing, vehicles are updated via the optimization-based simulation method, which takes vehicle kinematics, self-motivation, path keeping, collision avoidance, and special interaction rules into account. We demonstrate that our framework can generate uncommon traffic cases and validate it through extensive experiments.
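
As an illustration of lane-level navigation planned by heuristic search with forward and backward moves, the sketch below runs A* over a tiny lane graph. Everything in it is an assumption made for the example: the graph, the node names, the `REVERSE_PENALTY` weight, and the straight-line heuristic are hypothetical, and the abstract does not say which search algorithm or cost terms the framework actually uses.

```python
import heapq
import math

# Hypothetical lane graph: node -> (x, y) position of a lane segment.
POS = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (1, 1)}

# Directed moves as (neighbor, gear); "F" is forward driving, "R" is reversing.
MOVES = {
    "A": [("B", "F")],
    "B": [("C", "F"), ("A", "R"), ("D", "F")],
    "C": [("B", "R")],
    "D": [],
}

REVERSE_PENALTY = 2.0  # assumed cost multiplier that discourages reversing


def dist(a, b):
    (x1, y1), (x2, y2) = POS[a], POS[b]
    return math.hypot(x2 - x1, y2 - y1)


def plan(start, goal):
    """A* search; returns a list of (node, gear) steps, or None if unreachable."""
    frontier = [(dist(start, goal), 0.0, start, [])]
    best = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for nxt, gear in MOVES[node]:
            step = dist(node, nxt) * (REVERSE_PENALTY if gear == "R" else 1.0)
            new_cost = cost + step
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                # Straight-line distance never overestimates, since penalty >= 1.
                priority = new_cost + dist(nxt, goal)
                heapq.heappush(frontier, (priority, new_cost, nxt, path + [(nxt, gear)]))
    return None


# A vehicle at C can only reach D by backing up to B and then driving forward.
route = plan("C", "D")
```

Here `route` is a sequence of lane segments tagged with the gear used to reach each one, which is the kind of plan a downstream simulation step could then smooth into a trajectory.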

Citations: 0
Integration Method for Generating Complete Hierarchical Layouts From Incomplete Virtual Scene Trees
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-08-12 | DOI: 10.1002/cav.70069
Emir Cogo, Ehlimana Cogo, Damir Pozderac, Razija Turčinhodžić Mulahasanović, Selma Rizvić

Procedural modeling methods are used to automatically generate virtual scenes. There is a large number of available top-down methods for generating partial content for specific purposes. However, little research was done on enabling the generation of content in the presence of manually modeled elements, from the bottom-up direction, or without significant assistance from the user. No existing approach provides a platform that can combine the results of different methods, which leaves them isolated. This paper presents an integration approach that generates complete virtual space organizations by combining the usage of top-down and bottom-up procedural generation of content, with support for the placement of manually modeled content. The integration is made possible by using shape conversion to match the input and output shape types of different methods. The evaluation of the proposed approach was performed on a 2D polygon dataset by using four different scenarios, validating that it works as intended. Additional testing was performed by using a case study of organizing 3D virtual space around the manually modeled element of virtual heritage Tašlihan to demonstrate all capabilities of the integration approach and the different outputs depending on the level of user interaction and the desired results.

Citations: 0
Enhancing Dragline Operational Area Visualization and Positional Analysis for Its Effective Guidance Through Immersive Virtual Reality Inspection
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-08-04 | DOI: 10.1002/cav.70067
Piyush Singh, V. M. S. R. Murthy, Dheeraj Kumar, Simit Raval

This study presents an innovative approach to dragline mine operations using 3D virtual reality technology, specifically developed at the virtual reality mines simulation facility at IIT (ISM), Dhanbad. Focused on Northern Coalfields Limited, Singrauli, this research opens up the use of immersive VR for virtual inspection and analysis of dragline operations, aiding its effective deployment. The methodology involves capturing geo-spatial data using drones and integrating it into a commissioned dragline simulator workbench. The integration process combines drone imagery to accurately reconstruct the physical mine site. Enhanced by the Structure-from-Motion technique, the resulting model is a photorealistic point cloud, offering unprecedented visualization accuracy. The developed system enables complex manipulation of datasets, including functions such as scaling, rotation, and translation. It also includes geometric measurement tools for determining length, area, and volume, essential for precise operational planning of draglines. The application of Structure-from-Motion-MultiView Stereo technology is particularly noteworthy for its role in tracking safety concerns and monitoring dragline progress with respect to the stipulated balance diagram. The proposed approach surpasses traditional mine visualization methods, providing superior tools for onsite planning and comprehensive asset management and significant contributions for establishing a new benchmark for monitoring and visualizing the dragline excavation process in the mining industry.

Citations: 0
An HBIM Framework and Virtual Reconstruction for the Preservation of Confucian Temple Heritage
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-07-30 | DOI: 10.1002/cav.70068
Jin Liu

The current 3D modeling methods for analysis, documentation, and preservation of cultural heritage sites are tedious, and the realistic expression ability of the model is insufficient. It is urgent to break through the bottleneck of professional model creation and architectural layout analysis by non-professional users. Based on the study of architectural regulations of Confucian temples, this paper presents an HBIM (Heritage Building Information Modeling) framework for management and spatial analysis of the historical buildings within the space of Confucian temples. This paper solves the problems of low degree of automation in 3D modeling and lack of detail expression ability in virtual architectural layout analysis for Confucian temples. The framework can be an important tool to analyze the architectural space form. Through case studies of Confucian temples in Qufu and Beijing, the impact of the study on enhancing non-professional users' modeling experience is revealed. The results of the study strengthen the digital presentation of structure and hierarchy of ancient Chinese society, thus become the starting point of further research for Confucian temples all over the world.

Citations: 0
Yolov8-HAC: Safety Helmet Detection Model for Complex Underground Coal Mine Scene
IF 1.7 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-07-29 | DOI: 10.1002/cav.70051
Rui Liu, Fangbo Lu, Wanchuang Luo, Tianjian Cao, Hailian Xue, Meili Wang

The underground coal mine working environment is complicated, and the detection of safety helmet wearing is vital for assuring worker safety. This article proposes an improved YOLOv8n safety helmet detection model, YOLOv8-HAC, to address the issues of coexisting strong light exposure and low illumination, equipment occlusions that result in partial target loss, and the missed detection of small targets due to limited surveillance perspectives in underground coal mines. The model substitutes the suggested HAC-Net for the C2f module in YOLOv8n's backbone network to improve feature extraction and detection performance for targets with motion blur and low-resolution images. To improve detection stability in complicated situations and lessen background interference, the AGC-Block module is also included for dynamic feature selection. Additionally, a tiny target detection layer is included to increase the long-range identification rate of tiny safety helmets. According to experimental data, the enhanced model outperforms existing popular object detection algorithms, with a mAP of 94.8% and a recall rate of 90.4%. This demonstrates how well the suggested approach works to identify safety helmets in situations with complicated lighting and low-resolution photos.

Citations: 0
A Real-Time Virtual-Real Fusion Rendering Framework in Cloud-Edge Environments
IF 0.9 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-07-21 | DOI: 10.1002/cav.70049
Yuxi Zhou, Bowen Gao, Hongxin Zhang, Wei Chen, Xiaoliang Luo, Lvchun Wang

This paper introduces a cloud-edge collaborative framework for real-time virtual-real fusion rendering in augmented reality (AR). By integrating Visual Simultaneous Localization and Mapping (VSLAM) with Neural Radiance Fields (NeRF), the proposed method achieves high-fidelity virtual object placement and shadow synthesis in real-world scenes. The cloud server handles computationally intensive tasks, including offline NeRF-based 3D reconstruction and online illumination estimation, while edge devices perform real-time data acquisition, SLAM-based plane detection, and rendering. To enhance realism, the system employs an improved soft shadow generation technique that dynamically adjusts shadow parameters based on light source information. Experimental results across diverse indoor environments demonstrate the system's effectiveness, with consistent real-time performance, accurate illumination estimation, and high-quality shadow rendering. The proposed method reduces the computational burden on edge devices, enabling immersive AR experiences on resource-constrained hardware, such as mobile and wearable devices.

Citations: 0
A Retrieval-Augmented Generation System for Accurate and Contextual Historical Analysis: AI-Agent for the Annals of the Joseon Dynasty
IF 0.9 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-07-20 | DOI: 10.1002/cav.70048
Jeong Ha Lee, Ghazanfar Ali, Jae-In Hwang

In this article, we propose an AI-agent that integrates a large language model (LLM) with a retrieval-augmented generation (RAG) system to deliver reliable historical information from the Annals of the Joseon Dynasty through both objective facts and contextual analysis, achieving significant performance improvements over existing models. For an AI-agent using the Annals of the Joseon Dynasty to deliver reliable historical information, clear source citations and systematic analysis are essential. The Annals, an official record spanning 472 years (1392–1897), offer a dense, chronological account of daily events and state administration that shaped Korea's cultural, political, and social foundations. We propose integrating a LLM with a RAG system to generate highly accurate responses based on this extensive dataset. This approach provides both objective information about historical figures and events from specific periods and subjective contextual analysis of the era, helping users gain a broader understanding. Our experiments demonstrate improvements of approximately 23 to 50 points on a 100-point scale compared with the GPT-4o and OpenAI AI-Assistant v2 models.
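
The retrieval-then-generate pattern with source citations can be sketched in a few lines. This is only a toy: the corpus entries and their IDs are placeholders rather than real Annals text, bag-of-words cosine similarity stands in for whatever retriever the authors use, and no language model is actually called; the sketch stops at building the prompt.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; entries and IDs are placeholders, not real Annals text.
CORPUS = {
    "doc-1": "the king ordered a survey of farmland in the southern provinces",
    "doc-2": "envoys arrived at court bearing letters from a neighboring state",
    "doc-3": "officials reported flood damage to farmland along the river",
}


def vectorize(text):
    """Bag-of-words term counts (a stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, k=2):
    """Return the k passages most similar to the query as (doc_id, text) pairs."""
    q = vectorize(query)
    ranked = sorted(CORPUS.items(),
                    key=lambda item: cosine(q, vectorize(item[1])),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Attach retrieved passages with their IDs so the answer can cite sources."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return ("Answer using only the sources below and cite their IDs.\n"
            f"{context}\nQuestion: {query}")


query = "flood damage to farmland"
hits = retrieve(query)
prompt = build_prompt(query, hits)
```

Keeping the document ID next to each passage is what allows a generated answer to point back to a specific record, which the abstract identifies as essential for reliable historical information.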

Citations: 0
Botanical-Based Simulation of Fruit Shape Change During Growth
IF 0.9 Zone 4 (Computer Science) Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-07-16 DOI: 10.1002/cav.70064
Yixin Xu, Shiguang Liu

Fruit growth is an interesting time-lapse process. Simulating this process with computer graphics has many applications in areas such as films, games, and agriculture. Although some methods exist for modeling the shape of a fruit, it remains challenging to simulate its growth process accurately, including shape changes. We propose a botanical-based framework to address this problem. By combining the growth pattern function and the exponential model in botany, we propose a mesh scaling method that can accurately simulate the increase in fruit volume. Specifically, the RGR (relative growth rate) in the exponential model is automatically calculated from the user's input growth pattern function or from real size data. In addition, we model and simulate fruit shape changes by integrating axial, longitudinal, and latitudinal shape parameters into the RGR function. Various defective fruits can be simulated by adjusting these parameters. Inspired by the principle of root curvature, we propose a deformation-based approach that works in conjunction with our volume increase approach to simulate the bending growth of fruits such as cucumber. Various experiments show that our framework can effectively simulate the growth process of a wide range of fruits with shape change or bending.
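
The link between the exponential model, the RGR, and mesh scaling can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method. It uses the standard definition RGR = d(ln V)/dt, so that V(t + Δt) = V(t)·exp(RGR·Δt), and it assumes uniform scaling about a fixed centre, where the linear scale factor is the cube root of the volume ratio. The function names are hypothetical, and the paper's axial, longitudinal, and latitudinal shape parameters are not modelled.

```python
import math


def rgr_from_sizes(times, volumes):
    """Per-interval RGR from measured sizes: (ln V1 - ln V0) / (t1 - t0)."""
    samples = list(zip(times, volumes))
    return [(math.log(v1) - math.log(v0)) / (t1 - t0)
            for (t0, v0), (t1, v1) in zip(samples, samples[1:])]


def grow_volume(v0, rgr, dt):
    """Exponential model: volume after dt at a constant relative growth rate."""
    return v0 * math.exp(rgr * dt)


def uniform_scale_factor(v_old, v_new):
    """Linear scale factor giving the volume ratio v_new / v_old (uniform scaling)."""
    return (v_new / v_old) ** (1.0 / 3.0)


def scale_vertices(vertices, centre, factor):
    """Scale mesh vertices about a centre point by a uniform linear factor."""
    cx, cy, cz = centre
    return [(cx + factor * (x - cx),
             cy + factor * (y - cy),
             cz + factor * (z - cz)) for x, y, z in vertices]
```

For one time step, a caller would compute the new volume with `grow_volume`, convert the volume ratio to a linear factor with `uniform_scale_factor`, and apply it with `scale_vertices`. Non-uniform shape change would need separate factors per direction, which this sketch leaves out.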

Citations: 0
CoPadSAR: A Spatial Augmented Reality Interaction Approach for Collaborative Design via Pad-Based Cross-Device Interaction
IF 0.9 Zone 4 (Computer Science) Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-07-16 DOI: 10.1002/cav.70065
Keming Chen, Qihao Yang, Qingshu Yuan, Jin Xu, Zhengwei Yao, Zhigeng Pan

Augmented reality (AR) is a technology that superimposes digital information onto the real world. As one of the three major forms of AR, spatial augmented reality (SAR) projects virtual content into public spaces, making it accessible to collaborators. Because of its large shared display area, SAR has significant potential for collaborative design. However, existing SAR interaction methods may suffer from inefficiency and a poor collaborative experience. To address this issue, we propose CoPadSAR, a Pad-based cross-device interaction method. It maps 2D operations from each Pad onto 3D objects within the SAR environment, allowing users to collaborate using multiple Pads. We also present a prototype that supports collaborative painting, annotation, and object creation, and we conduct a comparative study involving 40 participants (20 pairs). The results indicate that CoPadSAR achieves better group performance than controller-based, gesture-based, and tangible interaction. It also has greater usability and provides a better collaborative experience. Interviews further confirm the users' preference for it. This study contributes to expanding the application of SAR in collaborative design.
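
One simple way to picture mapping a 2D Pad operation into a 3D scene is a planar mapping, sketched below. The abstract does not describe CoPadSAR's actual mapping, so everything here is an assumption for illustration: the target is taken to be a flat quad defined by an origin and two edge vectors, the touch position is normalised by the Pad resolution, and `pad_to_surface` is a hypothetical name.

```python
def pad_to_surface(touch_px, pad_size_px, origin, edge_u, edge_v):
    """Map a touch position in pixels to a 3D point on a planar quad.

    The touch is normalised to (u, v) in [0, 1] x [0, 1], clamped at the
    borders, and the result is origin + u * edge_u + v * edge_v.
    """
    u = min(max(touch_px[0] / pad_size_px[0], 0.0), 1.0)
    v = min(max(touch_px[1] / pad_size_px[1], 0.0), 1.0)
    return tuple(o + u * eu + v * ev
                 for o, eu, ev in zip(origin, edge_u, edge_v))


if __name__ == "__main__":
    # Centre of a 1024 x 768 Pad mapped onto a 2 x 1 quad in the z = 0 plane.
    print(pad_to_surface((512, 384), (1024, 768),
                         (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

With several Pads, each device would call the same mapping against the shared scene. Curved or arbitrary 3D objects would need ray casting or a texture-space lookup instead of this planar form.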

Citations: 0