2407.03204.md

Expressive Gaussian Human Avatars from Monocular RGB Video

Nuanced expressiveness, particularly through fine-grained hand and facial expressions, is pivotal for enhancing the realism and vitality of digital human representations. In this work, we investigate the expressiveness of human avatars learned from monocular RGB video, a setting that introduces new challenges in capturing and animating fine-grained details. To this end, we introduce EVA, a drivable human model that meticulously sculpts fine details based on 3D Gaussians and SMPL-X, an expressive parametric human model. Focused on enhancing expressiveness, our work makes three key contributions. First, we highlight the critical importance of aligning the SMPL-X model with RGB frames for effective avatar learning. Recognizing the limitations of current SMPL-X prediction methods for in-the-wild videos, we introduce a plug-and-play module that significantly ameliorates misalignment issues. Second, we propose a context-aware adaptive density control strategy, which adaptively adjusts the gradient thresholds to accommodate the varied granularity across body parts. Third, we develop a feedback mechanism that predicts per-pixel confidence to better guide the learning of 3D Gaussians. Extensive experiments on two benchmarks demonstrate the superiority of our framework both quantitatively and qualitatively, especially on fine-grained hand and facial details.
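The abstract does not specify how the second and third contributions are implemented, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes each Gaussian carries a body-part label and an accumulated gradient magnitude, and that densification is triggered when that magnitude exceeds a part-specific threshold. It also assumes the per-pixel confidence simply weights an L1 photometric error. The names `Gaussian`, `PART_GRAD_THRESHOLDS`, `select_for_densification`, and `confidence_weighted_l1`, as well as the threshold values, are hypothetical and are not taken from EVA.

```python
from dataclasses import dataclass

# Hypothetical per-part thresholds (illustrative values, not from the paper).
# A smaller threshold means Gaussians in that region are densified more readily.
PART_GRAD_THRESHOLDS = {"body": 2e-4, "hand": 1e-4, "face": 1e-4}


@dataclass
class Gaussian:
    part: str         # assumed body-part label: "body", "hand", or "face"
    grad_norm: float  # assumed accumulated positional gradient magnitude


def select_for_densification(gaussians, thresholds=PART_GRAD_THRESHOLDS):
    """Return indices of Gaussians whose gradient exceeds their part's threshold."""
    return [
        i for i, g in enumerate(gaussians)
        if g.grad_norm > thresholds[g.part]
    ]


def confidence_weighted_l1(pred, target, conf):
    """Mean per-pixel L1 error, with each pixel weighted by a confidence in [0, 1].

    This is one simple way to let confidence guide the loss; it is an
    assumption, not the paper's formulation.
    """
    if not (len(pred) == len(target) == len(conf)):
        raise ValueError("pred, target, and conf must have the same length")
    total = sum(c * abs(p - t) for p, t, c in zip(pred, target, conf))
    return total / len(pred)


if __name__ == "__main__":
    gaussians = [
        Gaussian("body", 1.5e-4),  # below the body threshold: kept as is
        Gaussian("hand", 1.5e-4),  # above the hand threshold: densified
        Gaussian("face", 0.5e-4),  # below the face threshold: kept as is
        Gaussian("body", 3.0e-4),  # above the body threshold: densified
    ]
    print(select_for_densification(gaussians))
    print(confidence_weighted_l1([1.0, 0.0], [0.0, 0.0], [0.5, 1.0]))
```

With the same gradient magnitude, the hand Gaussian is selected while the body Gaussian is not, which is the intended effect of part-dependent thresholds. Note that a plain confidence weighting like this can be minimized trivially by predicting zero confidence everywhere, so a practical system would presumably pair it with a regularization term.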

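In digital human representations, conveying subtle expressive changes through fine-grained hand and facial expressions is essential for realism and vividness. This paper examines the expressiveness of human avatars learned from monocular RGB video, a setting that poses new challenges for capturing and animating fine details. To this end, we introduce EVA, a drivable human model that carefully sculpts details on top of 3D Gaussians and SMPL-X, an expressive parametric human body model. Our work centers on enhancing expressiveness and makes three key contributions. First, we emphasize the importance of aligning the SMPL-X model with the RGB frames for effective avatar learning. Given the limitations of current SMPL-X prediction methods on in-the-wild videos, we introduce a plug-and-play module that markedly reduces misalignment. Second, we propose a context-aware adaptive density control strategy that adjusts the gradient thresholds according to the granularity of different body parts. Finally, we develop a feedback mechanism that predicts per-pixel confidence to better guide the learning of the 3D Gaussians. Extensive experiments on two benchmark datasets show that our framework is superior both quantitatively and qualitatively, especially on hand and facial details.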