Comparison between VAE and GAN

Similarity

  • Both aim at constructing a model that maps latent variables $z$ to the data distribution $p_{data}$. Specifically, the trained model is $\mathbf{X} = g(\mathbf{Z})$, where $\mathbf{Z}$ follows a simple prior such as the standard normal (Gaussian) distribution, and $\mathbf{X}$ represents the probability distribution of the training data. So both aim at distribution transformation.

  • Both face the same generative-modeling problem: it is difficult to obtain an explicit expression for either the generated distribution or the true distribution; only samples from the two distributions are available.

  • $KL$ divergence can only be applied to compute the difference between distributions when their complete density expressions are provided, so it is inapplicable in this scenario (a sketch of this constraint follows the list).
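A minimal sketch of this shared setup, assuming a hypothetical trained mapping (the names `g`, `W`, and `b` are illustrative, not from these notes): the prior is transformed into data space, and each side is available only as samples, so there is no density expression to plug into $KL$.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(z, W, b):
    """Hypothetical trained generator/decoder: maps latent z into data space."""
    return np.tanh(z @ W + b)

d_latent, d_data, n = 8, 2, 1000
W = rng.normal(size=(d_latent, d_data))         # stand-in for learned weights
b = rng.normal(size=d_data)

z = rng.standard_normal((n, d_latent))          # samples from the prior p(z)
x_generated = g(z, W, b)                        # samples from the model distribution
x_real = rng.normal(loc=1.0, size=(n, d_data))  # stand-in samples from p_data

# KL(p_model || p_data) cannot be evaluated here: there are no density
# expressions for either side, only the sample sets x_generated and x_real.
```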

Difference

  • Measurement method: VAE uses a hand-crafted measurement rule, while GAN's measurement rule is learned by a neural network (the discriminator).

  • GAN: proposed to leverage deep neural networks to measure the distribution difference, because no suitable hand-crafted measurement method exists.

  • VAE: adopts an indirect trick so that $KL$ divergence can still be leveraged.

VAE

Important points

  • Notes: for each sample, the encoder constructs a dedicated multivariate Gaussian distribution, and $\mathbf{z}$ is then obtained by sampling from this distribution:
    \begin{equation}
    \log q_{\phi}\left(\mathbf{z} \mid \mathbf{x}^{(i)}\right)=\log \mathcal{N}\left(\mathbf{z} ; \boldsymbol{\mu}^{(i)}, \boldsymbol{\sigma}^{2(i)} \mathbf{I}\right)
    \end{equation}

  • Notes: $\boldsymbol{\mu}^{(k)} = f_1(x_k)$ and $\log \boldsymbol{\sigma}^{2(k)} = f_2(x_k)$, both fitted by neural networks (see the sketch after this list).

  • The noise in the sampled $z$ is scaled by $\boldsymbol{\sigma}$, which the network could drive to zero so that the noise takes no effect; the model would then degenerate into a plain autoencoder, so the posteriors must be kept close to $\mathcal{N}(0, I)$.

  • Generative ability rests on the condition that every posterior $p(Z \mid X)$ is close to the standard Gaussian $\mathcal{N}(0, I)$, since then

\begin{equation}
p(Z)=\sum_{X} p(Z \mid X) p(X)=\sum_{X} \mathcal{N}(0, I) p(X)=\mathcal{N}(0, I) \sum_{X} p(X)=\mathcal{N}(0, I)
\end{equation}

so $p(Z)$ follows the standard normal distribution, which satisfies the prior.

  • How: by introducing extra losses that pull each posterior toward $\mathcal{N}(0, I)$.
    Direct method:
    \begin{equation}
    \mathcal{L}_{\mu}=\left\|f_{1}\left(X_{k}\right)\right\|^{2}
    \end{equation}

    \begin{equation}
    \mathcal{L}_{\sigma^{2}}=\left\|f_{2}\left(X_{k}\right)\right\|^{2}
    \end{equation}

    It is difficult to balance these two losses against each other, so it is preferable to introduce the $KL$ divergence between each posterior Gaussian and the standard Gaussian, $KL\left(\mathcal{N}\left(\boldsymbol{\mu}, \boldsymbol{\sigma}^{2}\right) \,\Vert\, \mathcal{N}(0, I)\right)$, which has the closed form:
    \begin{equation}
    \mathcal{L}_{\mu, \sigma^{2}}=\frac{1}{2} \sum_{i=1}^{d}\left(\mu_{(i)}^{2}+\sigma_{(i)}^{2}-\log \sigma_{(i)}^{2}-1\right)
    \end{equation}
    where $d$ is the dimension of $z$.
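The closed form follows directly from the definition of the divergence; per dimension, using $\mathbb{E}\left[(z-\mu)^{2}\right]=\sigma^{2}$ and $\mathbb{E}\left[z^{2}\right]=\mu^{2}+\sigma^{2}$:

\begin{equation}
KL\left(\mathcal{N}\left(\mu, \sigma^{2}\right) \,\Vert\, \mathcal{N}(0,1)\right)=\mathbb{E}_{z \sim \mathcal{N}\left(\mu, \sigma^{2}\right)}\left[-\frac{1}{2} \log \sigma^{2}-\frac{(z-\mu)^{2}}{2 \sigma^{2}}+\frac{z^{2}}{2}\right]=\frac{1}{2}\left(\mu^{2}+\sigma^{2}-\log \sigma^{2}-1\right)
\end{equation}

and summing over the $d$ independent dimensions gives $\mathcal{L}_{\mu, \sigma^{2}}$.

The sketch below is a minimal PyTorch assumption, not these notes' original code (the names `TinyVAE`, `f1`, `f2` and all sizes are illustrative): the two encoder heads fit $\boldsymbol{\mu}$ and $\log \boldsymbol{\sigma}^{2}$, $z$ is drawn via the reparameterization trick, and the KL term is exactly the closed-form $\mathcal{L}_{\mu, \sigma^{2}}$ above.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, d_x=784, d_h=200, d_z=20):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(d_x, d_h), nn.ReLU())
        self.f1 = nn.Linear(d_h, d_z)   # mu = f1(x)
        self.f2 = nn.Linear(d_h, d_z)   # log sigma^2 = f2(x)
        self.decoder = nn.Sequential(nn.Linear(d_z, d_h), nn.ReLU(),
                                     nn.Linear(d_h, d_x), nn.Sigmoid())

    def forward(self, x):
        h = self.backbone(x)
        mu, log_var = self.f1(h), self.f2(h)
        eps = torch.randn_like(mu)                # standard Gaussian noise
        z = mu + torch.exp(0.5 * log_var) * eps   # reparameterized z ~ N(mu, sigma^2 I)
        return self.decoder(z), mu, log_var

def vae_loss(x, x_rec, mu, log_var):
    # Reconstruction term: the decoder rebuilds x from the noisy z.
    rec = nn.functional.binary_cross_entropy(x_rec, x, reduction="sum")
    # KL term: 1/2 * sum(mu^2 + sigma^2 - log sigma^2 - 1) over dimensions.
    kl = 0.5 * torch.sum(mu.pow(2) + log_var.exp() - log_var - 1)
    return rec + kl

model = TinyVAE()
x = torch.rand(32, 784)                 # stand-in batch scaled to [0, 1]
x_rec, mu, log_var = model(x)
loss = vae_loss(x, x_rec, mu, log_var)
```

Without the KL term, the optimizer could push `log_var` toward $-\infty$ (i.e. $\sigma \to 0$), removing the noise entirely and collapsing the VAE into a plain autoencoder, which is exactly the failure mode noted above.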

Essence of VAE

  • Two encoders: $f_1$ fits $\boldsymbol{\mu}$ while $f_2$ fits $\log \boldsymbol{\sigma}^{2}$.

  • Reconstruction process: the decoder's loss assumes there is no noise; sampling process for $z$: the encoder's loss assumes there is Gaussian noise.

GAN

Important points

  • GAN is used to map a normal distribution $p(z)$ into a specific distribution $p(x)$, as in the sketch below.
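A minimal PyTorch sketch of this mapping, under the same illustrative assumptions as the VAE sketch above (architectures, sizes, and learning rates are placeholders): the generator $G$ plays the role of $g$, and the discriminator $D$ is the learned measurement of the difference between generated and real samples described in the Difference section.

```python
import torch
import torch.nn as nn

d_z, d_x = 20, 784
G = nn.Sequential(nn.Linear(d_z, 200), nn.ReLU(), nn.Linear(200, d_x), nn.Tanh())
D = nn.Sequential(nn.Linear(d_x, 200), nn.ReLU(), nn.Linear(200, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(x_real):
    n = x_real.size(0)
    # Discriminator step: learn to separate real samples from generated ones.
    x_fake = G(torch.randn(n, d_z)).detach()
    loss_d = bce(D(x_real), torch.ones(n, 1)) + bce(D(x_fake), torch.zeros(n, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: move the generated distribution toward the real one,
    # as judged by the current discriminator.
    loss_g = bce(D(G(torch.randn(n, d_z))), torch.ones(n, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

loss_d, loss_g = train_step(torch.randn(32, d_x))  # stand-in real batch
```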