Zero-1-to-3 (Image to 3D)
3D Reconstruction from a Single Image
We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an object given just a single RGB image.
Hugging Face Demo:
https://huggingface.co/spaces/ysharma/Zero123PlusDemo
To perform novel view synthesis in this under-constrained setting, we capitalize on the geometric priors that large-scale diffusion models learn about natural images.
Our conditional diffusion model is trained on a synthetic dataset to learn controls over the relative camera viewpoint, which allow new images of the same object to be generated under a specified camera transformation.
Even though it is trained on a synthetic dataset, our model retains a strong zero-shot generalization ability to out-of-distribution datasets as well as in-the-wild images, including impressionist paintings.
Our viewpoint-conditioned diffusion approach can further be used for the task of 3D reconstruction from a single image. Qualitative and quantitative experiments show that our method significantly outperforms state-of-the-art single-view 3D reconstruction and novel view synthesis models by leveraging Internet-scale pre-training.
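To make the idea of a "specified camera transformation" more concrete, the following is a minimal sketch of how a relative camera viewpoint could be computed and turned into a conditioning vector. It assumes the object sits at the origin and that each camera position is described in spherical coordinates (polar angle, azimuth, radius). The function names and the four-value encoding are hypothetical choices made for illustration; this page does not specify the exact parameterization the model uses.

```python
import math


def to_spherical(x, y, z):
    """Convert a camera position (object assumed at the origin)
    to (polar, azimuth, radius), with angles in radians."""
    radius = math.sqrt(x * x + y * y + z * z)
    if radius == 0.0:
        raise ValueError("camera cannot sit at the object center")
    polar = math.acos(z / radius)   # angle measured from the +z axis
    azimuth = math.atan2(y, x)      # angle in the x-y plane
    return polar, azimuth, radius


def relative_viewpoint(src, dst):
    """Relative transformation from the source camera to the target camera."""
    p1, a1, r1 = to_spherical(*src)
    p2, a2, r2 = to_spherical(*dst)
    raw = a2 - a1
    # Wrap the azimuth difference into (-pi, pi].
    d_azimuth = math.atan2(math.sin(raw), math.cos(raw))
    return p2 - p1, d_azimuth, r2 - r1


def encode_viewpoint(d_polar, d_azimuth, d_radius):
    """Illustrative conditioning vector (an assumption, not a documented API).
    Encoding the azimuth as sin/cos avoids a discontinuity at +/- pi."""
    return [d_polar, math.sin(d_azimuth), math.cos(d_azimuth), d_radius]


# Usage: rotate the camera a quarter turn around the object at fixed height.
delta = relative_viewpoint((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
conditioning = encode_viewpoint(*delta)
```

In a setup like this, the conditioning vector would be passed to the diffusion model together with the input image, so that the generated image corresponds to the target viewpoint.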