Creating realistic human avatars from monocular RGB videos is a long-standing and challenging problem. Existing implicit NeRF-based methods typically lack explicit geometric information in their feature representations. Although 3D Gaussian Splatting (3DGS) has recently emerged as an explicit point-cloud-based alternative, geometric details such as surface normals are still missing from this unstructured representation. In this paper, we present NEGS-Avatar, a novel approach to modeling animatable 3D human avatars from monocular videos using 3DGS. Our method incorporates normal information into the 3D Gaussians as a learnable property, constructing directed 3DGS to improve body appearance modeling. The normals, together with other properties such as positions, rotations, and scales, are predicted from the given body pose to model pose-dependent non-rigid deformation. The Gaussians are then transformed into the posed space using linear blend skinning to realize pose animation. In addition, we develop a locality-aware adaptive density control strategy that exploits the normal variance within local regions to enable effective Gaussian densification. Finally, we propose to separate the specular and diffuse components for color prediction, yielding a more accurate, interpretable, and controllable appearance model. Experimental results demonstrate that NEGS-Avatar achieves state-of-the-art performance both qualitatively and quantitatively, especially in the details of clothing surfaces. The code is available at https://github.com/Zheng-ZD/NEGS-Avatar.git.
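To make the animation step concrete, the sketch below shows how canonical 3D Gaussians can be carried into the posed space with linear blend skinning, as described in the abstract. This is a minimal illustration under assumptions, not the authors' implementation: the function name `lbs_transform_gaussians`, the tensor shapes, and the use of PyTorch are all hypothetical, and the paper's exact formulation (e.g., how blended rotations are re-orthogonalized) may differ.

```python
# Illustrative sketch (assumed, not the authors' code): moving canonical
# 3D Gaussians into the posed space via linear blend skinning (LBS).
import torch
import torch.nn.functional as F

def lbs_transform_gaussians(means_c, rots_c, skin_weights, bone_transforms):
    """
    means_c:         (N, 3)    canonical Gaussian centers
    rots_c:          (N, 3, 3) canonical Gaussian rotations as matrices
    skin_weights:    (N, J)    per-Gaussian skinning weights (rows sum to 1)
    bone_transforms: (J, 4, 4) rigid bone transforms for the target pose
    Returns posed centers (N, 3) and rotations (N, 3, 3).
    """
    # Blend the per-bone 4x4 transforms with the skinning weights.
    T = torch.einsum('nj,jab->nab', skin_weights, bone_transforms)  # (N, 4, 4)

    # Transform the Gaussian centers in homogeneous coordinates.
    means_h = F.pad(means_c, (0, 1), value=1.0)                     # (N, 4)
    means_p = torch.einsum('nab,nb->na', T, means_h)[:, :3]

    # Rotate the Gaussian orientations by the blended rotation block.
    # Learned per-Gaussian normals would be rotated the same way. Note
    # the blended block is only approximately orthogonal after LBS.
    R = T[:, :3, :3]
    rots_p = R @ rots_c
    return means_p, rots_p
```

In this sketch, the pose-dependent non-rigid deformation mentioned in the abstract would be applied to `means_c`, `rots_c`, and the learned normals before skinning, so LBS only accounts for the articulated, rigid part of the motion.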