We embark on the age-old quest: unveiling the hidden dimensions of objects from mere glimpses of their visible parts. To address this, we present Vista3D, a framework that realizes swift and consistent 3D generation within a mere 5 minutes. At the heart of Vista3D lies a two-phase approach: a coarse phase and a fine phase. In the coarse phase, we rapidly generate initial geometry with Gaussian Splatting from a single image. In the fine phase, we extract a Signed Distance Function (SDF) directly from the learned Gaussian Splatting and optimize it with a differentiable isosurface representation. Furthermore, Vista3D elevates the quality of generation by using a disentangled representation with two independent implicit functions to capture both the visible and the obscured aspects of objects. Additionally, it harmonizes gradients from a 2D diffusion prior with those from 3D-aware diffusion priors through angular diffusion prior composition. Through extensive evaluation, we demonstrate that Vista3D effectively maintains a balance between the consistency and the diversity of the generated 3D objects.
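We set out to solve an age-old problem: revealing the hidden dimensions of an object when only part of it can be seen. To this end, we propose Vista3D, a framework that achieves fast and consistent 3D generation in just 5 minutes. The core of Vista3D is a two-stage method: a coarse stage and a fine stage. In the coarse stage, we quickly generate preliminary geometry from a single image using Gaussian Splatting. In the fine stage, we extract a Signed Distance Function (SDF) directly from the learned Gaussian Splatting and optimize it through a differentiable isosurface representation. In addition, Vista3D further improves generation quality by using two independent implicit functions to build a disentangled representation of the visible and hidden parts. It also combines gradients from a 2D diffusion model with 3D-aware diffusion priors through angular diffusion prior composition. Through extensive evaluation, we show that Vista3D effectively maintains a balance between the consistency and the diversity of the generated 3D objects.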
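To make the idea of composing two diffusion priors by viewing angle more concrete, here is a minimal sketch. It is not the formulation used in Vista3D, which the abstract does not spell out. It assumes a simple blend in which the 3D-aware prior dominates near the reference view and the 2D prior dominates toward the back of the object. The names `angular_weight` and `compose_gradients`, the cosine schedule, and the convention that azimuth 0 is the reference view are all hypothetical choices made for illustration.

```python
import math


def angular_weight(azimuth_deg):
    """Hypothetical weight for the 2D prior.

    Assumption: azimuth 0 is the reference (input) view. The weight is 0
    there and rises smoothly to 1 at the opposite view (180 degrees).
    """
    return 0.5 * (1.0 - math.cos(math.radians(azimuth_deg)))


def compose_gradients(grad_2d, grad_3d, azimuth_deg):
    """Blend two gradient vectors element-wise with an angle-dependent weight.

    grad_2d: gradient from the 2D diffusion prior (list of floats)
    grad_3d: gradient from the 3D-aware diffusion prior (list of floats)
    """
    w = angular_weight(azimuth_deg)
    return [w * g2 + (1.0 - w) * g3 for g2, g3 in zip(grad_2d, grad_3d)]


# Toy gradients: at a side view (90 degrees) both priors contribute equally.
blended = compose_gradients([1.0, 0.0], [0.0, 1.0], 90.0)
```

Under this assumed schedule, views close to the input image rely on the 3D-aware prior, which favors consistency, while views far from it lean on the 2D prior, which favors diversity. That mirrors the consistency and diversity trade-off the abstract describes, but the actual weighting in the paper may differ.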