MUNIT-keras

A keras (tensorflow) reimplementation of MUNIT: Multimodal Unsupervised Image-to-Image Translation (https://arxiv.org/abs/1804.04732)

by Xun Huang, Ming-Yu Liu, Serge Belongie, Jan Kautz

Deviations from the official implementation

  1. Use group normalization instead of layer normalization in upscaling blocks (a sketch follows this list).
    • A model using group norm (groups=8) failed to reconstruct edge images of the edges2shoes dataset.
  2. Use the mixup technique for training (see the mixup sketch after this list).
  3. Input/output size defaults to 128x128.
  4. Use only 3 residual blocks (instead of 4) by default in the content encoder/decoder, in order to reduce training time.
    • However, I worry that this shrinks the receptive field and degrades output quality.
  5. Upscaling blocks use conv2d with kernel_size = 3 instead of 4 (also reflected in the sketch below).
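
A minimal sketch of what such an upscaling block could look like, combining deviations 1 and 5 (group norm plus a 3x3 conv). It assumes `tf.keras.layers.GroupNormalization`, which ships with recent TensorFlow; this repo predates that layer, so the names (`upscale_block`, `groups`) and defaults here are illustrative, not the repo's actual code.

```python
import tensorflow as tf
from tensorflow.keras import layers

def upscale_block(x, filters, groups):
    """Nearest-neighbor upsample -> 3x3 conv -> group norm -> ReLU."""
    x = layers.UpSampling2D(size=2, interpolation="nearest")(x)
    # kernel_size=3 rather than the official implementation's 4 (deviation 5).
    x = layers.Conv2D(filters, kernel_size=3, padding="same")(x)
    # Group norm in place of the layer norm used by the official MUNIT decoder
    # (deviation 1); the group count matters, since the note above reports
    # that groups=8 failed on edges2shoes edge images.
    x = layers.GroupNormalization(groups=groups)(x)
    return layers.ReLU()(x)
```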
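For deviation 2, here is a minimal sketch of mixup (Zhang et al., 2018) as it is commonly applied to GAN discriminator training: real and fake batches are blended by a Beta-sampled factor, and the discriminator target is blended by the same factor. The helper name, `alpha`, and the real/fake formulation are assumptions, not this repo's actual code.

```python
import numpy as np

def mixup_real_fake(real_batch, fake_batch, alpha=0.2):
    """Blend real and fake images; return mixed images and soft labels."""
    # One mixing factor per sample, broadcast over H, W, C.
    lam = np.random.beta(alpha, alpha, size=(len(real_batch), 1, 1, 1))
    mixed = lam * real_batch + (1.0 - lam) * fake_batch
    # Discriminator target: 1 for real, 0 for fake, blended by the same lam.
    labels = lam.reshape(-1, 1)
    return mixed.astype("float32"), labels.astype("float32")
```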

Environment

Results

  • Edges2shoes (config. 1)

    • Cyclic reconstruction loss weight = 1 for the first 80k iterations and 0.3 for the rest (see the schedule sketch after this list).
    • Input/Output size: 64x64.
    • Training iterations: ~130k.
    • Optimization: use the mixup technique for the first 80k iterations.
  • Edges2shoes (config. 2)

    • Cyclic reconstruction loss weight = 10.
    • Input/Output size: 64x64.
    • Training iterations: ~70k.
    • Optimization: use the mixup technique for the entire training process.
    • The model seemed to perform better on guided translation (more detail and clearer edges) when using a higher reconstruction loss weight, though this is only a tentative observation.
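
The loss-weight schedule in config 1 amounts to a simple step function over training iterations; a sketch of one way to express it (the function name and usage are assumptions, not taken from the repo):

```python
def cyclic_recon_weight(iteration, switch_iter=80_000):
    """Step schedule: weight 1.0 before switch_iter, 0.3 afterwards."""
    return 1.0 if iteration < switch_iter else 0.3
```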

Acknowledgement

Code heavily inspired by the official MUNIT PyTorch implementation. Also borrows code from eridgd and tjwei.
