
Latent Space #8
Open
ghost opened this issue Jul 18, 2019 · 5 comments
@ghost

ghost commented Jul 18, 2019

Hi,
I noticed that the paper says the trained latent space is a mixture of Gaussians with trainable means and variances:

In particular, we propose a reparameterization of the latent space as a Mixture-of-Gaussians model.

However, it seems that in the script the latent vectors are actually sampled from a fixed uniform distribution:
display_z = np.random.uniform(-1.0, 1.0, [batchsize, z_dim]).astype(np.float32)
I don't quite understand this inconsistency.
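
For context, sampling from a Mixture-of-Gaussians latent prior of the kind the paper describes might look like the following minimal sketch; the component count, means, and standard deviations here are hypothetical placeholders, not values from this repository:

    import numpy as np

    def sample_mog(batchsize, z_dim, means, stds, weights):
        # Pick a mixture component per sample, then draw from that Gaussian.
        comps = np.random.choice(len(weights), size=batchsize, p=weights)
        eps = np.random.normal(0.0, 1.0, [batchsize, z_dim]).astype(np.float32)
        return means[comps] + stds[comps] * eps

    # Hypothetical prior: 10 components over a 64-dimensional latent space.
    means = np.random.normal(0.0, 1.0, [10, 64]).astype(np.float32)
    stds = np.ones([10, 64], dtype=np.float32)
    weights = np.full(10, 0.1)
    batch_z = sample_mog(100, 64, means, stds, weights)

In the paper's formulation the means and standard deviations would be trainable parameters rather than the fixed arrays above.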

@swami1995

Hi Bob,

Thanks for pointing out the issue. The distribution used for training on MNIST is actually defined in the following line; it is indeed the standard normal distribution:

batch_z = np.random.normal(0, 1.0, [batchsize, z_dim]).astype(np.float32)

The line you pointed out is actually initializing the variable for evaluation, and that is most probably a bug in our code. I think the bug was a result of some experiments we were doing after submission, but the results in the paper correspond to the case where display_z was sampled from the standard normal distribution as well. We will correct this bug in the repository soon.
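
Concretely, the fix is a one-line change so that display_z mirrors batch_z above:

    display_z = np.random.normal(0, 1.0, [batchsize, z_dim]).astype(np.float32)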

However, the code for the other datasets (CIFAR-10 and sketches) doesn't have that bug. Feel free to use it as is.

Thanks for your interest in our paper.

@ghost

ghost commented Jul 22, 2019

Thank you very much for your reply. I've fixed it as you said, and it now works as well as the paper presents.
However, I have one more question concerning the optimization code. I noticed that two parameters, t1 and thres, are used to control the range of the generator loss, where t1 controls thres and thres directly controls the generator loss. It seems like a particularly delicate control method for a GAN, but I can't figure out how it was developed to fit the model. Could you please give me some intuition on this?

@swami1995

Hi Bob,

I essentially used those variables to provide a curriculum during training. thres was used to decide whether to update the generator or the discriminator, based on the generator loss. Simultaneously, the value of thres was increased/decreased after each generator/discriminator iteration to ensure that neither of them gets overtrained. t1 was just a heuristically chosen constant that provided a lower bound for thres.
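
A minimal sketch of that curriculum, with illustrative values and stand-in helpers (generator_loss, train_generator_step, and train_discriminator_step are hypothetical, and the exact direction of the thres updates in the repository may differ):

    import random

    def generator_loss():
        return random.random() * 2.0   # stand-in for the real generator loss

    def train_generator_step():
        pass                           # stand-in for one generator update

    def train_discriminator_step():
        pass                           # stand-in for one discriminator update

    t1 = 0.5       # heuristic lower bound on thres (illustrative value)
    thres = 1.0    # threshold on the generator loss (illustrative value)
    delta = 0.01   # per-iteration adjustment of thres (illustrative value)

    for it in range(10000):
        if generator_loss() > thres:
            # Generator is behind: train it, and raise thres so it is not
            # trained indefinitely.
            train_generator_step()
            thres += delta
        else:
            # Otherwise train the discriminator, and lower thres (bounded
            # below by t1) so the generator gets its turn again.
            train_discriminator_step()
            thres = max(t1, thres - delta)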

Hope that helps with some of the intuition. However, I would not recommend relying on these heuristics; you'd be better off using more modern GAN frameworks to stabilize training.

@ghost

ghost commented Jul 29, 2019

I think I've generally grasped your intuition. Thank you very much for helping me figure out what's happening here!

@TanmDL

TanmDL commented Sep 24, 2019

I am new to TensorFlow. While I was running the toy-dataset code, I got this error: "ValueError: Variable g_z already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?" How do I fix it?
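
For reference, this error means the g_z variable scope is being entered a second time without variable reuse enabled, and the message itself suggests the fix. A minimal TF 1.x sketch of the pattern (the variable created inside the scope is hypothetical):

    import tensorflow as tf  # TensorFlow 1.x API

    def generator(z):
        # reuse=tf.AUTO_REUSE creates the variables on the first call and
        # reuses them on subsequent calls, avoiding the
        # "Variable g_z already exists" error.
        with tf.variable_scope("g_z", reuse=tf.AUTO_REUSE):
            w = tf.get_variable("w", shape=[64, 784])  # hypothetical shape
            return tf.matmul(z, w)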
