DO 2D GANS KNOW 3D SHAPE? UNSUPERVISED 3D SHAPE RECONSTRUCTION FROM 2D IMAGE GANS

Gunahnkr
2 min read · May 3, 2021


Figure 1: The first column shows images generated by off-the-shelf 2D GANs trained on RGB images only, while the rest show that our method can unsupervisedly reconstruct 3D shape (viewed in 3D mesh, surface normal, and texture) given a single 2D image by exploiting the geometric cues contained in GANs. The last two columns depict 3D-aware image manipulation effects (rotation and relighting) enabled by our framework. More results are provided in the Appendix.
Figure 2: Framework outline. Starting with an initial ellipsoid 3D shape (viewed in the surface normal), our approach renders various ‘pseudo samples’ with different viewpoints and lighting conditions. GAN inversion is applied to these samples to obtain the ‘projected samples’, which are used as the ground truth of the rendering process to refine the initial 3D shape. This process is repeated until more precise results are obtained.
Figure 3: Method overview. (a) Given a single image, Step 1 initializes the depth with an ellipsoid (viewed in surface normal) and optimizes the albedo network A. (b) Step 2 uses the depth and albedo to render ‘pseudo samples’ with various random viewpoints and lighting conditions, and applies GAN inversion to them to obtain the ‘projected samples’. (c) Step 3 refines the depth map by optimizing the (V, L, D, A) networks to reconstruct the projected samples. The refined depth and models are used as the new initialization to repeat the above steps.

StyleGAN2-ADA consists of two networks (a toy sketch follows the list):

  1. Mapping network (M): maps the latent vector z in the input space Z to an intermediate latent vector w in the space W.
  2. Synthesis network (G): maps w to the output image.
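As a rough illustration of this two-stage structure, here is a minimal PyTorch sketch. The layer sizes and the toy SynthesisNetwork are placeholder assumptions, not the real StyleGAN2-ADA architecture (which uses modulated convolutions with per-layer style injection).

```python
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """Maps a latent z in the input space Z to an intermediate latent w in W."""
    def __init__(self, z_dim=512, w_dim=512, num_layers=8):
        super().__init__()
        layers = []
        for i in range(num_layers):
            layers += [nn.Linear(z_dim if i == 0 else w_dim, w_dim),
                       nn.LeakyReLU(0.2)]
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        return self.net(z)

class SynthesisNetwork(nn.Module):
    """Toy stand-in for G: maps w to an RGB image."""
    def __init__(self, w_dim=512, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.to_rgb = nn.Linear(w_dim, 3 * img_size * img_size)

    def forward(self, w):
        x = self.to_rgb(w)
        return x.view(-1, 3, self.img_size, self.img_size)

mapping, synthesis = MappingNetwork(), SynthesisNetwork()
z = torch.randn(1, 512)   # latent sampled from Z
w = mapping(z)            # intermediate latent in W
image = synthesis(w)      # generated image, shape (1, 3, 64, 64)
```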

The framework follows the photo-geometric autoencoding design of Wu et al. (2020). Given an image I ∈ R^(3×H×W), it adopts four networks that predict four factors (d, a, v, l):

D: Depth map

A: Albedo image

V: Viewpoint

L: Light direction
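To make this decomposition concrete, here is a minimal sketch of the four-network interface, assuming toy PyTorch architectures; the layer shapes are illustrative placeholders, and the output dimensionalities for v and l (6 and 4) follow Wu et al.'s parameterization but are assumptions here, not details taken from this post.

```python
import torch
import torch.nn as nn

H = W = 64  # assumed working resolution

def factor_encoder(out_dim):
    """Tiny encoder regressing a low-dimensional factor (viewpoint / light)."""
    return nn.Sequential(
        nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, out_dim))

def factor_net(out_ch):
    """Tiny conv net regressing an image-shaped factor (depth / albedo)."""
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, out_ch, 3, padding=1))

D = factor_net(1)       # depth map d:    (B, 1, H, W)
A = factor_net(3)       # albedo image a: (B, 3, H, W)
V = factor_encoder(6)   # viewpoint v:    rotation + translation parameters
L = factor_encoder(4)   # lighting l:     direction + ambient/diffuse terms

I = torch.randn(1, 3, H, W)            # a single RGB image in R^(3xHxW)
d, a, v, l = D(I), A(I), V(I), L(I)    # the four predicted factors
# A differentiable renderer would then reconstruct the input image from the
# four factors: I_hat ≈ render(d, a, v, l)
```

In the actual method these networks are trained jointly through the renderer's reconstruction loss, not in isolation.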

Methodology

  1. Using a Weak Shape Prior: the method builds on the observation that many objects, including faces and cars, have a roughly convex shape, so the depth map is initialized with an ellipsoid.
  2. Sampling and Projecting to the GAN Image Manifold: “pseudo samples” are created by rendering the current shape under a number of random viewpoints and lighting directions, and each one is then reconstructed with the GAN generator via GAN inversion (GAN inversion inverts a given image back into the latent space of a pretrained GAN, so that the generator faithfully reconstructs the image from the inverted code); the reconstructions are the “projected samples”.
  3. Learning the 3D Shape: the projected samples serve as pseudo ground truth for the rendering process, and the (V, L, D, A) networks are optimized to reconstruct them, refining the depth map. The refined depth is then used as the new initialization and the steps are repeated; a toy sketch of the full loop follows this list.
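The sketch below is a self-contained toy version of this render → invert → refine loop, not the paper's implementation: a tiny linear generator stands in for the pretrained StyleGAN2, a crude shading function stands in for the differentiable renderer, viewpoint warping is omitted, and every name (G, init_ellipsoid_depth, invert) is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins: a tiny linear "generator" replaces the pretrained StyleGAN2,
# and a crude shading function replaces the differentiable renderer.
G = nn.Sequential(nn.Linear(128, 3 * 32 * 31), nn.Unflatten(1, (3, 32, 31)))
for p in G.parameters():
    p.requires_grad_(False)  # the GAN is a fixed prior; only latents move

def init_ellipsoid_depth(h, w):
    """Step 1: weak shape prior -- initialize depth as a convex ellipsoid."""
    ys = torch.linspace(-1, 1, h).view(-1, 1).expand(h, w)
    xs = torch.linspace(-1, 1, w).view(1, -1).expand(h, w)
    return torch.sqrt(torch.clamp(1 - 0.5 * (xs**2 + ys**2), min=1e-4))

def render(depth, albedo, light):
    """Toy Lambertian-style shading from depth slopes (no viewpoint warp)."""
    dzdx = depth[..., :, 1:] - depth[..., :, :-1]   # finite differences
    shading = torch.clamp(light[0] + light[1] * torch.tanh(dzdx), 0, 1)
    return albedo[..., :, 1:] * shading

def invert(target, steps=100, lr=0.05):
    """GAN inversion: optimize a latent code so G reproduces the target."""
    z = torch.zeros(1, 128, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        loss = F.mse_loss(G(z), target)
        opt.zero_grad(); loss.backward(); opt.step()
    return G(z).detach()                             # 'projected sample'

depth = init_ellipsoid_depth(32, 32)[None, None].requires_grad_(True)
albedo = torch.rand(1, 3, 32, 32)
opt = torch.optim.Adam([depth], lr=0.01)
for _ in range(2):                                   # outer refinement rounds
    batch = []
    for _ in range(4):                               # Step 2: random lights
        light = torch.rand(2)
        pseudo = render(depth, albedo, light)        # 'pseudo sample'
        batch.append((invert(pseudo.detach()), light))
    for target, light in batch:                      # Step 3: refine shape
        loss = F.mse_loss(render(depth, albedo, light), target)
        opt.zero_grad(); loss.backward(); opt.step()
```

Even in this toy form the key design point survives: the pseudo samples are detached before inversion, so the frozen GAN acts as a fixed image prior, while the gradients in Step 3 flow only into the shape being refined.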
Figure 4: Qualitative comparisons. (a) shows the reconstructed 3D mesh, surface normal, and textured mesh of our method. (b) shows the results of Unsup3d (Wu et al., 2020). We see that results in (a) are more accurate and realistic.
Figure 5: Qualitative comparison on buildings. The first row shows the input image and our relighting effects described in Sec. 4.2. The second row shows the recovered shape (viewed in surface normal and mesh) of our method, while the last row shows the results of Unsup3d.
Figure 6: Results without symmetry assumption.
Figure 7: 3D-aware image manipulation, including rotation and relighting. We show results obtained via both 3D mesh and GANs. The input of the first row is a real natural image. Our method achieves photo-realistic manipulation effects obeying the objects’ underlying 3D structures.
Figure 8: Qualitative comparison on face rotation. “Ours (GAN)” and “Ours (3D)” indicate results generated by the GAN and rendered from the 3D mesh, respectively. The face identities in the baseline methods tend to drift during rotation.
Figure 9: Qualitative results on the LSUN Horse dataset (Yu et al., 2015).
