Neural Radiance Fields (NeRF) can render complex 3D scenes with viewpoint-dependent effects. However, little work has explored their limitations in high-resolution environments, especially at ultra-high resolutions such as 4K. Existing NeRF-based methods face severe limitations when reconstructing high-resolution real scenes, including a large number of parameters, misalignment of the input data, and over-smoothing of details. In this paper, we present a novel and effective framework, called De-NeRF, which combines NeRF with a deformable convolutional network to achieve high-fidelity view synthesis in ultra-high-resolution scenes: (1) we incorporate a deformable convolution unit that resolves the misalignment of high-resolution input data, and (2) we present a density-based sparse voxel approach that greatly reduces training time while rendering results with higher accuracy. Compared with existing high-resolution NeRF methods, our approach improves the rendering quality of high-frequency details and achieves better visual effects in 4K scenes.
{"title":"De-NeRF: Ultra-high-definition NeRF with deformable net alignment","authors":"Jianing Hou, Runjie Zhang, Zhongqi Wu, Weiliang Meng, Xiaopeng Zhang, Jianwei Guo","doi":"10.1002/cav.2240","DOIUrl":"https://doi.org/10.1002/cav.2240","url":null,"abstract":"<p>Neural Radiance Field (NeRF) can render complex 3D scenes with viewpoint-dependent effects. However, less work has been devoted to exploring its limitations in high-resolution environments, especially when upscaled to ultra-high resolution (e.g., 4k). Specifically, existing NeRF-based methods face severe limitations in reconstructing high-resolution real scenes, for example, a large number of parameters, misalignment of the input data, and over-smoothing of details. In this paper, we present a novel and effective framework, called <i>De-NeRF</i>, based on NeRF and deformable convolutional network, to achieve high-fidelity view synthesis in ultra-high resolution scenes: (1) marrying the deformable convolution unit which can solve the problem of misaligned input of the high-resolution data. (2) Presenting a density sparse voxel-based approach which can greatly reduce the training time while rendering results with higher accuracy. Compared to existing high-resolution NeRF methods, our approach improves the rendering quality of high-frequency details and achieves better visual effects in 4K high-resolution scenes.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141187630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Streamlines are a popular choice in many flow visualization techniques due to their simplicity and intuitiveness. This paper presents a novel streamline seeding method tailored for visualizing unsteady flow in augmented reality (AR). Our method prioritizes the visible part of the flow field, improving the quality of the flow representation while reducing computational cost. Being an image-based method, it evenly samples 2D seeds from the screen space. A ray is then fired through each 2D seed, and the point along the ray with the largest entropy is selected as the 3D seed for a streamline. By advecting these 3D seeds in the velocity field, which is continuously updated in real time, unsteady flow is visualized more naturally, and temporal coherence is achieved with no extra effort. Our method is tested in an AR application that visualizes airflow from a virtual air conditioner. Comparison with baseline methods shows that our method is well suited for visualizing unsteady flow in AR.
{"title":"Screen-space Streamline Seeding Method for Visualizing Unsteady Flow in Augmented Reality","authors":"Hyunmo Kang, JungHyun Han","doi":"10.1002/cav.2250","DOIUrl":"https://doi.org/10.1002/cav.2250","url":null,"abstract":"<p>Streamlines are a popular method of choice in many flow visualization techniques due to their simplicity and intuitiveness. This paper presents a novel streamline seeding method, which is tailored for visualizing unsteady flow in augmented reality (AR). Our method prioritizes visualizing the visible part of the flow field to enhance the flow representation's quality and reduce the computational cost. Being an image-based method, it evenly samples 2D seeds from the screen space. Then, a ray is fired toward each 2D seed, and the on-the-ray point, which has the largest entropy, is selected. It is taken as the 3D seed for a streamline. By advecting such 3D seeds in the velocity field, which is continuously updated in real time, the unsteady flow is visualized more naturally, and the temporal coherence is achieved with no extra efforts. Our method is tested using an AR application for visualizing airflow from a virtual air conditioner. Comparison with the baseline methods shows that our method is suitable for visualizing unsteady flow in AR.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141187629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reconstructing the three-dimensional (3D) shape and texture of a face from a single image is a significant and challenging task in computer vision and graphics. In recent years, learning-based reconstruction methods have exhibited outstanding performance, but their effectiveness is severely constrained by the scarcity of training data with 3D annotations. To address this issue, we present PR3D (Precise and Realistic 3D face reconstruction), which consists of high-precision shape reconstruction based on semi-supervised learning and high-fidelity texture reconstruction based on StyleGAN2. For shape reconstruction, we use in-the-wild face images and 3D-annotated datasets to train an auxiliary encoder and an identity encoder that encode the input image into parameters of FLAME (a parametric 3D face model). In addition, a novel semi-supervised hybrid landmark loss is designed to learn more effectively from both in-the-wild face images and 3D-annotated datasets. Furthermore, to meet real-time requirements in practical applications, a lightweight shape reconstruction model called fast-PR3D is distilled through teacher–student learning. For texture reconstruction, we propose a texture extraction method based on face reenactment in the StyleGAN2 style space, extracting texture from the source and reenacted face images to form a facial texture map. Extensive experiments demonstrate the state-of-the-art performance of our method.
{"title":"PR3D: Precise and realistic 3D face reconstruction from a single image","authors":"Zhangjin Huang, Xing Wu","doi":"10.1002/cav.2254","DOIUrl":"https://doi.org/10.1002/cav.2254","url":null,"abstract":"<p>Reconstructing the three-dimensional (3D) shape and texture of the face from a single image is a significant and challenging task in computer vision and graphics. In recent years, learning-based reconstruction methods have exhibited outstanding performance, but their effectiveness is severely constrained by the scarcity of available training data with 3D annotations. To address this issue, we present the PR3D (Precise and Realistic 3D face reconstruction) method, which consists of high-precision shape reconstruction based on semi-supervised learning and high-fidelity texture reconstruction based on StyleGAN2. In shape reconstruction, we use in-the-wild face images and 3D annotated datasets to train the auxiliary encoder and the identity encoder, encoding the input image into parameters of FLAME (a parametric 3D face model). Simultaneously, a novel semi-supervised hybrid landmark loss is designed to more effectively learn from in-the-wild face images and 3D annotated datasets. Furthermore, to meet the real-time requirements in practical applications, a lightweight shape reconstruction model called fast-PR3D is distilled through teacher–student learning. In texture reconstruction, we propose a texture extraction method based on face reenactment in StyleGAN2 style space, extracting texture from the source and reenacted face images to constitute a facial texture map. Extensive experiments have demonstrated the state-of-the-art performance of our method.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141187613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhigeng Pan, Hongyi Ren, Chang Liu, Ming Chen, Mithun Mukherjee, Wenzhen Yang
Within the field of human–computer interaction, data gloves play an essential role in connecting virtual and physical environments for the realization of digital humans. To enhance the credibility of human–virtual hand interactions, we develop a system built around an embedded data glove. Our proposed system collects a wide range of information (temperature, finger bending, and finger pressure) arising during natural interactions and then reproduces it within the virtual environment. Furthermore, we implement a novel traversal polling technique to streamline the aggregation of multi-channel sensor readings, which mitigates the hardware complexity of the embedded system. The experimental results indicate that the data glove acquires real-time hand interaction information with a high degree of precision and effectively displays hand posture in real time using Unity3D. The glove's lightweight and compact design facilitates its versatile use in virtual reality interactions.
{"title":"Design of a lightweight and easy-to-wear hand glove with multi-modal tactile perception for digital human","authors":"Zhigeng Pan, Hongyi Ren, Chang Liu, Ming Chen, Mithun Mukherjee, Wenzhen Yang","doi":"10.1002/cav.2258","DOIUrl":"https://doi.org/10.1002/cav.2258","url":null,"abstract":"<p>Within the field of human–computer interaction, data gloves play an essential role in establishing a connection between virtual and physical environments for the realization of digital human. To enhance the credibility of human-virtual hand interactions, we aim to develop a system incorporating a data glove-embedded technology. Our proposed system collects a wide range of information (temperature, bending, and pressure of fingers) that arise during natural interactions and afterwards reproduce them within the virtual environment. Furthermore, we implement a novel traversal polling technique to facilitate the streamlined aggregation of multi-channel sensors. This mitigates the hardware complexity of the embedded system. The experimental results indicate that the data glove demonstrates a high degree of precision in acquiring real-time hand interaction information, as well as effectively displaying hand posture in real-time using Unity3D. The data glove's lightweight and compact design facilitates its versatile utilization in virtual reality interactions.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141187615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongyu Li, Meng Yang, Chao Yang, Jianglang Kang, Xiang Suo, Weiliang Meng, Zhen Li, Lijuan Mao, Bin Sheng, Jun Qi
We propose a comprehensive soccer match video analysis pipeline tailored for broadcast footage, which encompasses three pivotal stages: soccer field localization, player tracking, and soccer ball detection. Firstly, we introduce sports camera calibration to seamlessly map soccer field images from match videos onto a standardized two-dimensional soccer field template. This addresses the challenge of consistent analysis across video frames amid continuous camera angle changes. Secondly, given challenges such as occlusions, high-speed movements, and dynamic camera perspectives, obtaining accurate position data for players and the soccer ball is non-trivial. To mitigate this, we curate a large-scale, high-precision soccer ball detection dataset and devise a robust detection model, which achieved the
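The field-localization stage described above can be illustrated with a short OpenCV sketch: estimate a homography from matched keypoints (e.g., line intersections) between a broadcast frame and a 2D field template, then map detected player positions onto the template. The point correspondences and template dimensions below are made-up assumptions, not values from the paper's calibration model.

```python
import cv2
import numpy as np

# Matched keypoints: pixel positions in the frame vs. metric positions on a
# 105 m x 68 m field template (illustrative values only).
frame_pts = np.array([[320, 180], [960, 160], [300, 600], [980, 620]], dtype=np.float32)
field_pts = np.array([[0, 0], [52.5, 0], [0, 34], [52.5, 34]], dtype=np.float32)

# Homography mapping frame pixels to field-template coordinates.
H, _ = cv2.findHomography(frame_pts, field_pts)

# Project a detected player's image position into field coordinates.
player_px = np.array([[[640.0, 400.0]]], dtype=np.float32)  # shape (1, 1, 2)
player_field = cv2.perspectiveTransform(player_px, H)
print(player_field)  # approximate (x, y) on the field template, in meters
```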