Thank you for your code.

Following the original VAT paper, `consistency_func` in `hparams.py` should be `reverse_kl` for VAT, although it is set to `forward_kl` in your code.

The adversarial noise r in VAT is obtained by maximizing D_KL(p(y|x) || p(y|x+r)); however, the consistency loss D_KL(p(y|x+r) || p(y|x)) is what gets minimized when `consistency_func=forward_kl`. This matters because KL divergence is asymmetric, so the two directions give different losses and gradients.
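To make the direction concrete, here is a minimal PyTorch sketch of the two losses (the repo itself is TensorFlow; `kl_div` and the variable names here are illustrative, not the repo's API). In both variants the clean prediction is treated as a fixed target, as is standard in VAT:

```python
import torch
import torch.nn.functional as F

def kl_div(p_logits, q_logits):
    """Mean KL(p || q) between categorical distributions given as logits."""
    p = F.softmax(p_logits, dim=-1)
    return (p * (F.log_softmax(p_logits, dim=-1)
                 - F.log_softmax(q_logits, dim=-1))).sum(dim=-1).mean()

# Stand-ins for the model's outputs on the clean and perturbed inputs.
logits_clean = torch.randn(8, 10)                        # p(y|x)
logits_adv   = logits_clean + 0.1 * torch.randn(8, 10)   # p(y|x+r)

# forward_kl (what the repo currently uses): D_KL(p(y|x+r) || p(y|x))
loss_forward = kl_div(logits_adv, logits_clean.detach())

# reverse_kl (what the VAT paper prescribes): D_KL(p(y|x) || p(y|x+r)),
# with the clean prediction detached so it acts as a constant target
loss_reverse = kl_div(logits_clean.detach(), logits_adv)
```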
IMO, `consistency_func` is not a hyperparameter to be tuned; it is part of the VAT model itself. If you do consider `consistency_func` a hyperparameter, it should be noted in Table 4 of your NIPS'18 paper.
For instance, I compared VAT+EntMin under the following two settings in the CIFAR10-4000 scenario (a sketch of the overrides follows the list):

- Setting-A: `consistency_func=forward_kl` and `max_cons_multiplier=0.3` (original parameters)
- Setting-B: `consistency_func=reverse_kl` and `max_cons_multiplier=1.0` (modified parameters)
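Expressed as plain Python overrides, the comparison looks like this (a hypothetical sketch; the key names follow the issue text, and the actual override mechanism depends on how `hparams.py` constructs its hyperparameter object):

```python
# Hypothetical hparams overrides for the two settings compared above.
setting_a = {"consistency_func": "forward_kl", "max_cons_multiplier": 0.3}  # original
setting_b = {"consistency_func": "reverse_kl", "max_cons_multiplier": 1.0}  # modified
```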
As a result, VAT+EntMin with Setting-B outperformed Setting-A by about 2 percentage points in test error rate (11.7% vs. 13.7%). Of course, this is the result of a single run, so I do not claim that Setting-B outperforms Setting-A in general.