
Computer Animation and Virtual Worlds: Latest Publications

Augmented Reality-Based Interactive Scheme for Robot-Assisted Percutaneous Renal Puncture Navigation
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-02-07 | DOI: 10.1002/cav.70009
Yiwei Zhuang, Shuyi Wang, Hua Xie, Wei Qing, Haoliang Li, Yuhan Shen, Yichun Shen

In this paper, we present an Augmented Reality (AR)-based application combined with a robotic system for percutaneous renal puncture navigation and demonstrate its technical feasibility. Our system provides an intuitive interaction scheme between the surgeon and the robot without traditional external input devices, and applies an image-target-based 3D registration scheme to transform coordinates between the HoloLens 2 and the robot without additional tracking devices. By wearing the HoloLens 2, users can visualize the abdominal puncture phantom, obtain 3D depth information of the lesion site, and control the robot directly using buttons or gestures. To investigate the accuracy and feasibility of the proposed interaction scheme, six subjects were recruited to complete 3D registration alignment experiments and puncture positioning experiments under ultrasound freehand navigation, AR freehand navigation, and AR robotic navigation. The results showed that the average alignment error of 3D registration was 3.61 ± 1.05 mm. The average positioning errors of ultrasound freehand navigation, AR freehand navigation, and AR robotic navigation were 7.67 ± 2.00 mm, 6.13 ± 1.07 mm, and 5.52 ± 0.37 mm, respectively; the average puncture times were 34.86 ± 1.67 s, 22.40 ± 2.07 s, and 29.41 ± 1.37 s.
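The registration step described above reduces to chaining homogeneous transforms through a shared image target. Below is a minimal sketch of that idea in Python; the variable names, the identity rotations, and the example numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical calibration inputs (illustrative values, not the paper's):
# T_hl_target: the image target's pose in the HoloLens 2 world frame
# T_rb_target: the same target's pose in the robot base frame
T_hl_target = make_transform(np.eye(3), np.array([0.2, 0.0, 0.5]))
T_rb_target = make_transform(np.eye(3), np.array([0.6, -0.1, 0.3]))

# Chaining through the shared target gives the HoloLens-to-robot transform.
T_rb_hl = T_rb_target @ np.linalg.inv(T_hl_target)

# A lesion point seen in HoloLens coordinates, expressed in robot coordinates:
p_hl = np.array([0.25, 0.05, 0.45, 1.0])  # homogeneous point
print((T_rb_hl @ p_hl)[:3])
```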

{"title":"Augmented Reality-Based Interactive Scheme for Robot-Assisted Percutaneous Renal Puncture Navigation","authors":"Yiwei Zhuang,&nbsp;Shuyi Wang,&nbsp;Hua Xie,&nbsp;Wei Qing,&nbsp;Haoliang Li,&nbsp;Yuhan Shen,&nbsp;Yichun Shen","doi":"10.1002/cav.70009","DOIUrl":"https://doi.org/10.1002/cav.70009","url":null,"abstract":"<div>\u0000 \u0000 <p>In this paper, we present an Augmented Reality (AR)-based application combined with a robotic system for percutaneous renal puncture navigation interaction and demonstrate its technical feasibility. Our system provides an intuitive interaction scheme between the surgeon and the robot without the need for traditional external input devices, and applies an image-target-based 3D registration scheme to transform the coordinate system between Hololens2 and the robot without using additional tracking devices. Users can visualize the abdominal puncture phantom and obtain 3D depth information of the lesion site by wearing Hololens2 and control the robot directly using buttons or gestures. To investigate the accuracy and feasibility of the proposed interaction scheme, six subjects were recruited to complete 3D registration alignment accuracy experiments, and puncture positioning accuracy experiments using ultrasound unaided navigation, AR unaided navigation and AR robotic navigation. The results showed that the average alignment error of 3D registration was 3.61 ± 1.05 mm. The average positioning errors of ultrasound freehand navigation, AR freehand navigation and AR robotic navigation were 7.67 ± 2.00 mm, 6.13 ± 1.07 mm and 5.52 ± 0.37 mm, respectively; the average puncture times were 34.86 ± 1.67 s, 22.40 ± 2.07 s, and 29.41 ± 1.37 s.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143362755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Advanced Gesture Recognition Method Based on Fractional Fourier Transform and Relevance Vector Machine for Smart Home Appliances
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-30 | DOI: 10.1002/cav.70011
Xie Hong-qin, Zhao Yuan-yuan

Addressing the challenges of low feature extraction dimensionality and insufficient discriminative information for gesture differentiation in smart home appliances, this article proposes an innovative gesture recognition algorithm that integrates the fractional Fourier transform (FrFT) with a relevance vector machine (RVM). The process uses the FrFT to transform raw gesture data into the fractional domain, thereby expanding the dimensions available for information extraction. Subsequently, high-dimensional feature vectors are created from the fractional domain, and RVM classifiers are employed for joint optimization of feature selection and the classification decision function, achieving optimal classification performance. A dataset of five different gesture types recorded on the TI millimeter-wave radar platform was constructed to validate the method. The experimental results show that the RVM selected an optimal FrFT order of 0.6, with the best feature set comprising fractional spectral entropy, peak factor, and second-order central moment. Recognition rates for each gesture exceeded 96.2%, with an average rate of 98.5%. This performance surpasses three comparative methods in both recognition accuracy and real-time processing, indicating high potential for future applications.
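For readers who want a concrete picture of the reported feature set, the sketch below computes the three winning features from a fractional-domain spectrum. It assumes the FrFT (e.g., at the selected order 0.6) has already been applied by some external routine; the epsilon guards and the random stand-in input are our own additions.

```python
import numpy as np

def fractional_features(X_frac):
    """Compute the three features reported as optimal, from a fractional-domain
    spectrum X_frac (complex array). The FrFT itself is assumed precomputed."""
    mag = np.abs(X_frac)

    # Fractional spectral entropy: Shannon entropy of the normalized magnitude spectrum.
    p = mag / (mag.sum() + 1e-12)
    spectral_entropy = -np.sum(p * np.log2(p + 1e-12))

    # Peak (crest) factor: peak magnitude over RMS magnitude.
    peak_factor = mag.max() / (np.sqrt(np.mean(mag**2)) + 1e-12)

    # Second-order central moment of the magnitude distribution along the axis.
    n = np.arange(len(mag))
    mean_n = np.sum(n * p)
    central_moment2 = np.sum(((n - mean_n) ** 2) * p)

    return np.array([spectral_entropy, peak_factor, central_moment2])

# Usage with a random stand-in spectrum (a real pipeline would pass FrFT output):
rng = np.random.default_rng(0)
print(fractional_features(rng.normal(size=256) + 1j * rng.normal(size=256)))
```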

{"title":"Advanced Gesture Recognition Method Based on Fractional Fourier Transform and Relevance Vector Machine for Smart Home Appliances","authors":"Xie Hong-qin,&nbsp;Zhao Yuan-yuan","doi":"10.1002/cav.70011","DOIUrl":"https://doi.org/10.1002/cav.70011","url":null,"abstract":"<div>\u0000 \u0000 <p>Addressing the challenges of low feature extraction dimensions and insufficient distinct information for gesture differentiation for smart home appliances, this article proposed an innovative gesture recognition algorithm, integrating fractional Fourier transform (FrFT) with relevance vector machine (RVM). The process involves using FrFT to transform raw gesture data into the fractional domain, thereby expanding the dimensions of information extraction. Subsequently, high-dimensional feature vectors are created from fractional domain, and RVM classifiers are employed for joint optimization of feature selection and classification decision functions, achieving optimal classification performance. A dataset was constructed using five different types of gestures recorded on the TI millimeter-wave radar platform to validate the effectiveness of this method. The experimental results demonstrate that the RVM selected the optimal FrFT order of 0.6, with the best feature set comprising fractional spectral entropy, peak factor, and second-order central moment. Recognition rates for each gesture exceeded 96.2%, with an average rate of 98.5%. This performance surpasses three comparative methods in both recognition accuracy and real-time processing, indicating high potential for future applications.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Visual Expansion and Real-Time Calibration for Pan-Tilt-Zoom Cameras Assisted by Panoramic Models
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-28 | DOI: 10.1002/cav.70015
Liangliang Cai, Zhong Zhou

Pan-tilt-zoom (PTZ) cameras, which dynamically adjust their field of view (FOV), are pervasive in large-scale scenes such as train stations, squares, and airports. In real scenarios, PTZ cameras must quickly judge their orientation using contextual clues from the surrounding environment. To this end, some works project camera video onto three-dimensional (3D) models or panoramas and allow operators to establish spatial relationships. However, these works face several challenges in real-time processing, localization accuracy, and realistic reference. To address this problem, we propose a visual expansion and real-time calibration method for PTZ cameras assisted by panoramic models. The calibration method consists of three parts: providing a realistic environment background by building a panoramic model; meeting real-time processing requirements by establishing a PTZ camera motion estimation model; and achieving high-precision alignment between PTZ images and the panoramic model using only two feature point pairs. Our method was validated on both a public dataset and our own Scene dataset. The experimental results indicate that our method outperforms other state-of-the-art methods in real-time processing, accuracy, and robustness.
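The motion-estimation part of such a pipeline rests on the fact that a PTZ camera rotates about a fixed center, so a panorama ray maps to an image point through rotation and intrinsics alone. The sketch below shows that projection model under one common pan/tilt axis convention; the focal length, principal point, and angles are made-up example values, and the paper's actual estimation model may differ.

```python
import numpy as np

def ptz_rotation(pan_deg, tilt_deg):
    """Rotation of a pan-tilt camera: pan about the world Y axis, then tilt
    about the camera X axis (one common convention)."""
    p, t = np.radians(pan_deg), np.radians(tilt_deg)
    Ry = np.array([[np.cos(p), 0, np.sin(p)],
                   [0, 1, 0],
                   [-np.sin(p), 0, np.cos(p)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(t), -np.sin(t)],
                   [0, np.sin(t), np.cos(t)]])
    return Rx @ Ry

def project(K, R, ray):
    """Project a world direction (ray from the shared rotation center) into the
    PTZ image; no translation term because the camera center is fixed."""
    x = K @ (R @ ray)
    return x[:2] / x[2]

f = 1200.0  # focal length in pixels at the current zoom (assumed)
K = np.array([[f, 0, 960], [0, f, 540], [0, 0, 1]])
print(project(K, ptz_rotation(pan_deg=15, tilt_deg=-5), np.array([0.1, 0.02, 1.0])))
```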

{"title":"Visual Expansion and Real-Time Calibration for Pan-Tilt-Zoom Cameras Assisted by Panoramic Models","authors":"Liangliang Cai,&nbsp;Zhong Zhou","doi":"10.1002/cav.70015","DOIUrl":"https://doi.org/10.1002/cav.70015","url":null,"abstract":"<div>\u0000 \u0000 <p>Pan-tilt-zoom (PTZ) cameras, which dynamically adjust their field of view (FOV), are pervasive in large-scale scenes, such as train stations, squares, and airports. In real scenarios, PTZ cameras are required to quickly judge their directions using contextual clues from the surrounding environment. To achieve this goal, some research projects camera videos into three-dimensional (3D) models or panoramas and allows operators to establish spatial relationships. However, these works face several challenges in terms of real-time processing, localization accuracy, and realistic reference. To address this problem, a visual expansion and real-time calibration for PTZ cameras assisted by panoramic models is proposed. The calibration method consists of three parts: Providing a real environment background by building a panoramic model, meeting the needs of real-time processing by establishing a PTZ camera motion estimation model and achieving high-precision alignment between PTZ images and panoramic models using only two feature point pairs. Our methods were validated using both the public and our Scene dataset. The experimental results indicate that our method outperforms other state-of-the-art methods in terms of real-time processing, accuracy, and robustness.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Creating an Anthropomorphic Folktale Animal: A Pilot Study on Character Design Creativity Derived From Autonomous Behavior Generation Powered by Reinforcement Learning
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-28 | DOI: 10.1002/cav.70013
Hongju Yang, Seung Wan Hong

Popular in fantasy films, games, and extended reality, anthropomorphic animals often rely on animator creativity and observation of real animals for behavior visualization. This artistic approach captures emotional traits but fails to uncover diverse, unanticipated behaviors beyond the creators' concepts. To enrich character design, this study employs reinforcement learning (RL) agent simulation to explore the autonomous behavior and unexpected responses of the nine-tailed Fox Sister from Korean folklore. As a method, the agent, equipped with a physics-based controller and skeletal joints, uses hybrid action control to transition between bipedal and quadrupedal actions based on the environment. As a result, the RL character frequently exhibits behavioral shifts, including unexpected actions in response to training steps and terrain complexities such as slopes and hurdles, distinguishing it from animation-based finite-state machines. Additionally, this study validates the impact of the RL character on character design creativity. To investigate these unknown impacts, a comparative pilot study was conducted with five character designers under use and non-use scenarios of the RL character. Analysis indicates that the RL character promotes creativity in character design, conceptualization, and the development of scenarios and character attributes. This study highlights RL's potential for visualizing diverse, inspirational behaviors of folkloric creatures by simulating interactions between body structure, motion, and environment.
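As a rough illustration of what a hybrid action controller's mode switch could look like, the toy function below picks a locomotion mode from terrain cues; the thresholds and cue names are invented for illustration and are not taken from the paper.

```python
def select_locomotion_mode(terrain_slope_deg: float, obstacle_height_m: float) -> str:
    """Toy stand-in for hybrid action control: choose between bipedal and
    quadrupedal action spaces from terrain cues. Thresholds are invented."""
    if terrain_slope_deg > 25.0 or obstacle_height_m > 0.4:
        return "quadrupedal"  # drop to all fours on steep slopes or high hurdles
    return "bipedal"

# Example: a 30-degree slope triggers the quadrupedal action space.
print(select_locomotion_mode(30.0, 0.1))  # -> "quadrupedal"
```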

{"title":"Creating an Anthropomorphic Folktale Animal: A Pilot Study on Character Design Creativity Derived From Autonomous Behavior Generation Powered by Reinforcement Learning","authors":"Hongju Yang,&nbsp;Seung Wan Hong","doi":"10.1002/cav.70013","DOIUrl":"https://doi.org/10.1002/cav.70013","url":null,"abstract":"<div>\u0000 \u0000 <p>Popular in fantasy films, games, and extended reality, anthropomorphic animals often rely on animator creativity and real animal observation for behavior visualization. This artistic approach captures emotional traits but lacks uncovering diverse, unanticipated behaviors beyond creators' concepts. To enrich character design, this study employs reinforcement learning (RL) agent simulation to explore the autonomous behavior and unexpected responses of the nine-tailed Fox Sister from Korean folklore. As a method, the agent, with a physics-based controller and skeletal joints, uses hybrid action control to transition between bipedal and quadrupedal actions based on the environment. In result, RL character frequently exhibits behavioral shifts, including unexpected actions in response to training steps and terrain complexities like slopes and hurdles, distinguishing them from animation-based finite-state machines. Additionally, this study validates impacts of RL character on character design creativity. To investigate such unknown impacts, this study conducts a comparative pilot study that recruits five character designers under use and nonuse scenario of RL character. Analysis indicates that RL character promotes creativity of character design, conceptualization, and development of scenario and character's attribute. This study highlights RL's potential for visualizing diverse inspirational behaviors of folkloric creatures by simulating interactions between body structure, motion, and environment.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Multi-Model Approach for Attention Prediction in Gaming Environments for Autistic Children
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-28 | DOI: 10.1002/cav.70010
P. Valarmathi, A. Packialatha

Autism spectrum disorder (ASD) is a neurological condition that affects an individual's mental development. This research implements a multimodal-input, virtual reality (VR)-enabled attention prediction approach in gaming for children with autism. Initially, the multimodal inputs, such as face images, electroencephalogram (EEG) signals, and other data, are individually processed through preprocessing and feature extraction procedures. Subsequently, a hybrid classification model with classifiers such as an improved deep convolutional neural network (IDCNN) and long short-term memory (LSTM) is used for expression detection by concatenating the features obtained from the feature extraction procedure. Here, the conventional deep convolutional neural network (DCNN) approach is improved by novel block-knowledge-based processing with a proposed sine-hinge loss function. Finally, an improved weighted mutual information process is employed for attention prediction. The proposed attention prediction model is evaluated through simulation and experimental analyses, whose results demonstrate its effectiveness.
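To make the fusion step concrete, here is a generic late-fusion sketch in PyTorch: a small CNN branch for face images and an LSTM branch for EEG sequences, concatenated and classified. The layer sizes are illustrative and this is not the paper's IDCNN architecture; the proposed sine-hinge loss is likewise not reproduced, since its formula is not given in the abstract.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Generic late-fusion sketch: concatenate an image feature and an EEG
    sequence feature, then classify. All sizes are illustrative assumptions."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())          # -> 16-d image feature
        self.lstm = nn.LSTM(input_size=32, hidden_size=32, batch_first=True)
        self.head = nn.Linear(16 + 32, n_classes)

    def forward(self, face, eeg):
        f_img = self.cnn(face)                    # (B, 16)
        _, (h, _) = self.lstm(eeg)                # h: (1, B, 32)
        fused = torch.cat([f_img, h[-1]], dim=1)  # concatenate branch features
        return self.head(fused)

logits = LateFusionClassifier()(torch.randn(4, 3, 64, 64), torch.randn(4, 100, 32))
print(logits.shape)  # torch.Size([4, 2])
```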

{"title":"A Multi-Model Approach for Attention Prediction in Gaming Environments for Autistic Children","authors":"P. Valarmathi,&nbsp;A. Packialatha","doi":"10.1002/cav.70010","DOIUrl":"https://doi.org/10.1002/cav.70010","url":null,"abstract":"<div>\u0000 \u0000 <p>Autism spectrum disorder (ASD) is a neurological condition that affects an individual's mental development. This research work implements a multimodality input-based virtual reality (VR)-enabled attention prediction approach in gaming for children with autism. Initially, the multimodal inputs such as face image, electroencephalogram (EEG) signal, and data are individually processed by both the preprocessing and feature extraction procedures. Subsequently, a hybrid classification model with classifiers such as improved deep convolutional neural network (IDCNN) and long short term memory (LSTM) is utilized in expression detection by concatenating the resultant features obtained from the feature extraction procedure. Here, the conventional deep convolutional neural network (DCNN) approach is improved by a novel block-knowledge-based processing with a proposed sine-hinge loss function. Finally, an improved weighted mutual information process is employed in attention prediction. Moreover, this proposed attention prediction model is analyzed by simulation and experimental analyses. The effectiveness of the proposed model is significantly proved by the experimental results obtained from various analyses.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
WDANet: Exploring Stylized Animation via Diffusion Model for Woodcut-Style Design
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-08 | DOI: 10.1002/cav.70007
Yangchunxue Ou, Jingjun Xu

Stylized animation strives for innovation and bold visual creativity. Integrating the inherent strong visual impact and color contrast of woodcut style into such animations is both appealing and challenging, especially during the design phase. Traditional woodcut methods, hand-drawing, and previous computer-aided techniques face challenges such as dwindling design inspiration, lengthy production times, and complex adjustment procedures. To address these issues, we propose a novel network framework, the Woodcut-style Design Assistant Network (WDANet). Our research is the first to use diffusion models to streamline the woodcut-style design process. We curate the Woodcut-62 dataset, which features works from 62 renowned historical artists, to train WDANet in capturing and learning the aesthetic nuances of woodcut prints. WDANet, based on the denoising U-Net, effectively decouples content and style features. It allows users to input or slightly modify a text description to quickly generate accurate, high-quality woodcut-style designs, saving time and offering flexibility. Quantitative and qualitative analyses, along with user studies, confirm that WDANet outperforms current state-of-the-art methods in generating woodcut-style images, demonstrating its value as a design aid.
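One widely used mechanism for decoupling content from style in feature space is adaptive instance normalization (AdaIN); whether WDANet uses exactly this operator is not stated in the abstract, so treat the sketch below as background rather than the paper's method.

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalization over (C, H, W) feature maps: re-scale
    content statistics to match style statistics, channel by channel."""
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    return s_std * (content_feat - c_mean) / (c_std + eps) + s_mean

# Usage with random stand-in features:
rng = np.random.default_rng(0)
out = adain(rng.normal(size=(8, 16, 16)), rng.normal(size=(8, 16, 16)))
print(out.shape)  # (8, 16, 16)
```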

{"title":"WDANet: Exploring Stylized Animation via Diffusion Model for Woodcut-Style Design","authors":"Yangchunxue Ou,&nbsp;Jingjun Xu","doi":"10.1002/cav.70007","DOIUrl":"https://doi.org/10.1002/cav.70007","url":null,"abstract":"<div>\u0000 \u0000 <p>Stylized animation strives for innovation and bold visual creativity. Integrating the inherent strong visual impact and color contrast of woodcut style into such animations is both appealing and challenging, especially during the design phase. Traditional woodcut methods, hand-drawing, and previous computer-aided techniques face challenges such as dwindling design inspiration, lengthy production times, and complex adjustment procedures. To address these issues, we propose a novel network framework, the Woodcut-style Design Assistant Network (WDANet). Our research is the first to use diffusion models to streamline the woodcut-style design process. We curate the Woodcut-62 dataset, which features works from 62 renowned historical artists, to train WDANet in capturing and learning the aesthetic nuances of woodcut prints. WDANet, based on the denoising U-Net, effectively decouples content and style features. It allows users to input or slightly modify a text description to quickly generate accurate, high-quality woodcut-style designs, saving time and offering flexibility. Quantitative and qualitative analyses, along with user studies, confirm that WDANet outperforms current state-of-the-art methods in generating woodcut-style images, demonstrating its value as a design aid.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143113147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Novel View Synthesis Based on Similar Perspective
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-07 | DOI: 10.1002/cav.70006
Wenkang Huang

Neural radiance fields (NeRF) technology has garnered significant attention due to its exceptional performance in generating high-quality novel-view images. In this study, we propose an innovative method that leverages the similarity between views to enhance the quality of novel-view image generation. Initially, a pre-trained NeRF model generates an initial novel-view image. The reference view most similar to this initial image is then selected from the training dataset. We designed a texture transfer module that employs a coarse-to-fine strategy, effectively integrating salient features from the reference view into the initial image and thus producing more realistic novel-view images. By using similar views, this approach not only improves the quality of novel-perspective images but also incorporates the training dataset as a dynamic information pool into the view synthesis process, allowing useful information to be continuously acquired and exploited from the training data throughout synthesis. Extensive experimental validation shows that using similar views to provide scene information significantly outperforms existing neural rendering techniques in enhancing the realism and accuracy of novel-view images.
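A minimal version of the reference-view selection step might score training views by how well their camera pose agrees with the novel pose. The abstract does not spell out the similarity measure, so the direction-plus-distance score and its 0.1 weight below are assumptions.

```python
import numpy as np

def most_similar_view(novel_c2w, train_c2ws):
    """Score each training view against the novel camera pose by viewing-direction
    agreement minus a weighted camera-center distance; return the best index."""
    novel_dir = novel_c2w[:3, 2]   # camera forward axis (one common convention)
    novel_pos = novel_c2w[:3, 3]
    best_idx, best_score = -1, -np.inf
    for i, c2w in enumerate(train_c2ws):
        angle_sim = float(novel_dir @ c2w[:3, 2])             # cosine of direction angle
        dist = float(np.linalg.norm(novel_pos - c2w[:3, 3]))  # camera-center distance
        score = angle_sim - 0.1 * dist                        # weight is arbitrary
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx

# Example with 4x4 camera-to-world matrices; the co-located pose (index 0) wins.
poses = [np.eye(4), np.eye(4)]
poses[1][:3, 3] = [0.0, 0.0, 2.0]
print(most_similar_view(np.eye(4), poses))  # -> 0
```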

{"title":"Novel View Synthesis Based on Similar Perspective","authors":"Wenkang Huang","doi":"10.1002/cav.70006","DOIUrl":"https://doi.org/10.1002/cav.70006","url":null,"abstract":"<div>\u0000 \u0000 <p>Neural radiance fields (NeRF) technology has garnered significant attention due to its exceptional performance in generating high-quality novel view images. In this study, we propose an innovative method that leverages the similarity between views to enhance the quality of novel view image generation. Initially, a pre-trained NeRF model generates an initial novel view image, which is subsequently compared and subjected to feature transfer with the most similar reference view from the training dataset. Following this, the reference view that is most similar to the initial novel view is selected from the training dataset. We designed a texture transfer module that employs a strategy progressing from coarse-to-fine, effectively integrating salient features from the reference view into the initial image, thus producing more realistic novel view images. By using similar views, this approach not only improves the quality of novel perspective images but also incorporates the training dataset as a dynamic information pool into the novel view integration process. This allows for the continuous acquisition and utilization of useful information from the training data throughout the synthesis process. Extensive experimental validation shows that using similar views to provide scene information significantly outperforms existing neural rendering techniques in enhancing the realism and accuracy of novel view images.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143112801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Body Part Segmentation of Anime Characters
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-12-17 | DOI: 10.1002/cav.2295
Zhenhua Ou, Xueting Liu, Chengze Li, Zhenkun Wen, Ping Li, Zhijian Gao, Huisi Wu

Semantic segmentation is an important approach to presenting the perceptual semantic understanding of an image and is of significant use in various applications. In particular, body part segmentation is designed to segment the body parts of human characters to assist different editing tasks, such as style editing, pose transfer, and animation production. Since segmentation requires pixel-level precision in semantic labeling, classic heuristics-based methods generally have unstable performance. With the deployment of deep learning, a great step has been taken in segmenting the body parts of human characters in natural photographs. However, the existing models are trained purely on natural photographs and generally produce incorrect segmentation results when applied to anime character images, due to the large visual gap between training data and testing data. In this article, we present a novel approach to body part segmentation of cartoon characters via a pose-based graph-cut formulation. We demonstrate the use of the acquired body part segmentation map in various image editing tasks, including conditional generation, style manipulation, pose transfer, and video-to-anime.
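For context, a pose-based graph-cut formulation of this kind typically minimizes a standard labeling energy; the abstract does not give the exact terms, so the unary/pairwise reading below is a conventional guess rather than the paper's formulation.

```latex
% Standard graph-cut labeling energy over pixels P with neighborhood system N:
% l_i is the body-part label of pixel i; a pose prior can enter through U_i.
\[
E(L) = \sum_{i \in \mathcal{P}} U_i(l_i)
     + \lambda \sum_{(i,j) \in \mathcal{N}} V_{ij}(l_i, l_j)
\]
% U_i(l_i): cost of assigning part l_i to pixel i (e.g., distance to the pose skeleton);
% V_{ij}: smoothness cost discouraging label changes between similar neighbors.
```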

{"title":"Body Part Segmentation of Anime Characters","authors":"Zhenhua Ou,&nbsp;Xueting Liu,&nbsp;Chengze Li,&nbsp;Zhenkun Wen,&nbsp;Ping Li,&nbsp;Zhijian Gao,&nbsp;Huisi Wu","doi":"10.1002/cav.2295","DOIUrl":"https://doi.org/10.1002/cav.2295","url":null,"abstract":"<div>\u0000 \u0000 <p>Semantic segmentation is an important approach to present the perceptual semantic understanding of an image, which is of significant usage in various applications. Especially, body part segmentation is designed for segmenting body parts of human characters to assist different editing tasks, such as style editing, pose transfer, and animation production. Since segmentation requires pixel-level precision in semantic labeling, classic heuristics-based methods generally have unstable performance. With the deployment of deep learning, a great step has been taken in segmenting body parts of human characters in natural photographs. However, the existing models are purely trained on natural photographs and generally obtain incorrect segmentation results when applied on anime character images, due to the large visual gap between training data and testing data. In this article, we present a novel approach to achieving body part segmentation of cartoon characters via a pose-based graph-cut formulation. We demonstrate the use of the acquired body part segmentation map in various image editing tasks, including conditional generation, style manipulation, pose transfer, and video-to-anime.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 6","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fast and Incremental 3D Model Renewal for Urban Scenes With Appearance Changes
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-12-11 | DOI: 10.1002/cav.70004
Yuan Xiong, Zhong Zhou

Urban 3D models with high-resolution details are the basis of various mixed reality and geographic information systems. Fast and accurate urban reconstruction from aerial photographs has therefore attracted intense attention. Existing methods exploit multi-view geometry information from landscape patterns with similar illumination conditions and terrain appearance. In practice, urban models become obsolete over time due to human activities, yet mainstream reconstruction pipelines rebuild the whole scene even when most of it remains unchanged. This paper proposes a novel wrapping-based incremental modeling framework that reuses existing models and efficiently renews them with new meshes. The paper presents a pose optimization method with illumination-based augmentation and virtual bundle adjustment. In addition, a high-performance wrapping-based meshing method is proposed for fast reconstruction. Experimental results show that the proposed method achieves higher performance and quality than state-of-the-art methods.
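The pose-optimization component builds on the classical bundle adjustment objective, shown below for reference; the "virtual" variant and the illumination-based augmentation are the paper's contributions and are not formalized here.

```latex
% Classical bundle adjustment: jointly refine camera poses (R_k, t_k) and 3D
% points X_j by minimizing robust reprojection error against observations x_kj.
\[
\min_{\{R_k, t_k\},\, \{X_j\}} \sum_{k,j}
\rho\!\left( \left\| \pi\!\big(K (R_k X_j + t_k)\big) - x_{kj} \right\|^2 \right)
\]
% K: camera intrinsics; \pi: perspective division; \rho: a robust loss such as Huber.
```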

{"title":"Fast and Incremental 3D Model Renewal for Urban Scenes With Appearance Changes","authors":"Yuan Xiong,&nbsp;Zhong Zhou","doi":"10.1002/cav.70004","DOIUrl":"https://doi.org/10.1002/cav.70004","url":null,"abstract":"<div>\u0000 \u0000 <p>Urban 3D models with high-resolution details are the basis of various mixed reality and geographic information systems. Fast and accurate urban reconstruction from aerial photographs has attracted intense attention. Existing methods exploit multi-view geometry information from landscape patterns with similar illumination conditions and terrain appearance. In practice, urban models become obsolete over time due to human activities. Mainstream reconstruction pipelines rebuild the whole scene even if the main part of them remains unchanged. This paper proposes a novel wrapping-based incremental modeling framework to reuse existing models and renew them with new meshes efficiently. The paper illustrates a pose optimization method with illumination-based augmentation and virtual bundle adjustment. Besides, a high-performance wrapping-based meshing method is proposed for fast reconstruction. Experimental results show that the proposed method can achieve higher performance and quality than state-of-the-art methods.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 6","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142851361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Diverse Motions and Responses in Crowd Simulation
IF 0.9 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-11-26 | DOI: 10.1002/cav.70002
Yiwen Ma, Tingting Liu, Zhen Liu

A challenge in crowd simulation is generating diverse pedestrian motions in virtual environments. Crowd simulation now places greater emphasis on the diversity and authenticity of pedestrian movements, while most traditional models focus primarily on collision avoidance and motion continuity. Recent studies have enhanced realism through data-driven approaches that exploit the movement patterns of pedestrians in real data for trajectory prediction; however, they do not take the body-part motions of pedestrians into account. Unlike these approaches, we utilize learning-based character motion and physics animation to enhance the diversity of pedestrian motions in crowd simulation. The proposed method offers a promising avenue toward more diverse crowds and is realized through a novel framework that deeply integrates motion synthesis and physics animation with crowd simulation. The framework consists of three main components: a learning-based motion generator, responsible for generating diverse character motions; a hybrid simulation, which ensures the physical realism of pedestrian motions; and a velocity-based interface, which helps integrate navigation algorithms with the motion generator. Experiments verify the effectiveness of the proposed method from different aspects, and the visual results demonstrate the feasibility of our approach.
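The "velocity-based interface" can be pictured as a thin data contract between layers: navigation emits a desired velocity, and the motion generator consumes it. The sketch below shows such a contract; all names, the crowd-damping rule, and the printed stub are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class VelocityCommand:
    """The hand-off between layers: navigation fills this in, the motion
    generator reads it. Field names are our own, not the paper's."""
    linear: np.ndarray  # desired planar velocity (vx, vz) in m/s
    facing: float       # desired heading in radians

def navigation_step(agent_pos, goal_pos, neighbors, max_speed=1.4):
    """Toy navigation layer: steer toward the goal and damp speed in dense
    crowds, standing in for a real collision-avoidance algorithm."""
    to_goal = goal_pos - agent_pos
    dist = np.linalg.norm(to_goal)
    direction = to_goal / (dist + 1e-9)
    crowd_factor = 1.0 / (1.0 + len(neighbors))  # slower when crowded
    v = direction * min(max_speed, dist) * crowd_factor
    return VelocityCommand(linear=v, facing=float(np.arctan2(direction[1], direction[0])))

def motion_generator_step(cmd: VelocityCommand):
    """Stub for the learning-based motion generator, which would condition
    character animation on the commanded velocity."""
    print(f"animate: speed={np.linalg.norm(cmd.linear):.2f} m/s, heading={cmd.facing:.2f} rad")

motion_generator_step(navigation_step(np.array([0.0, 0.0]), np.array([5.0, 2.0]), neighbors=[1, 2]))
```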

{"title":"Diverse Motions and Responses in Crowd Simulation","authors":"Yiwen Ma,&nbsp;Tingting Liu,&nbsp;Zhen Liu","doi":"10.1002/cav.70002","DOIUrl":"https://doi.org/10.1002/cav.70002","url":null,"abstract":"<div>\u0000 \u0000 <p>A challenge in crowd simulation is to generate diverse pedestrian motions in virtual environments. Nowadays, there is a greater emphasis on the diversity and authenticity of pedestrian movements in crowd simulation, while most traditional models primarily focus on collision avoidance and motion continuity. Recent studies have enhanced realism through data-driven approaches that exploit the movement patterns of pedestrians from real data for trajectory prediction. However, they have not taken into account the body-part motions of pedestrians. Differing from these approaches, we innovatively utilize learning-based character motion and physics animation to enhance the diversity of pedestrian motions in crowd simulation. The proposed method can provide a promising avenue for more diverse crowds and is realized by a novel framework that deeply integrates motion synthesis and physics animation with crowd simulation. The framework consists of three main components: the learning-based motion generator, which is responsible for generating diverse character motions; the hybrid simulation, which ensures the physical realism of pedestrian motions; and the velocity-based interface, which assists in integrating navigation algorithms with the motion generator. Experiments have been conducted to verify the effectiveness of the proposed method in different aspects. The visual results demonstrate the feasibility of our approach.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 6","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142737568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0