DynamicAvatars: Accurate Dynamic Facial Avatars Reconstruction and Precise Editing with Diffusion Models
Generating and editing dynamic 3D head avatars are crucial tasks in virtual reality and film production. However, existing methods often suffer from facial distortions, inaccurate head movements, and limited fine-grained editing capabilities. To address these challenges, we present DynamicAvatars, a dynamic model that generates photorealistic, moving 3D head avatars from video clips and parameters associated with facial positions and expressions. Our approach enables precise editing through a novel prompt-based editing model, which integrates user-provided prompts with guiding parameters derived from large language models (LLMs). To achieve this, we propose a dual-tracking framework based on Gaussian Splatting and introduce a prompt preprocessing module to enhance editing stability. By incorporating a specialized GAN algorithm and connecting it to our control module, which generates precise guiding parameters from LLMs, we successfully address the limitations of existing methods. Additionally, we develop a dynamic editing strategy that selectively utilizes specific training datasets to improve the efficiency and adaptability of the model for dynamic editing tasks.
Experimental results show that DynamicAvatars achieves high accuracy and flexibility in generating and editing dynamic 3D head avatars, making it a strong tool for virtual reality and film production.