Text-to-Image generation using Attention GAN

AttnGAN

PyTorch implementation for reproducing the AttnGAN results from the paper AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, and Xiaodong He. (This work was performed while Tao Xu was an intern at Microsoft Research.)

Dependencies

  • Python==2.7.12
  • torch==0.4.1
  • torchfile==0.1.0
  • torchvision==0.2.1
  • scikit-image==0.14.1
  • pandas==0.19.1
  • easydict==1.6
  • nltk==3.2.2

In addition, please add the project folder to PYTHONPATH
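For example, assuming pip is available, the listed packages can be installed with pip install torch==0.4.1 torchvision==0.2.1 torchfile==0.1.0 scikit-image==0.14.1 pandas==0.19.1 easydict==1.6 nltk==3.2.2, and the project folder can be added to the path with export PYTHONPATH=$PYTHONPATH:/path/to/AttnGAN, where /path/to/AttnGAN is a placeholder for wherever the repository was cloned.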

Data

Download our preprocessed metadata for the COCO dataset and extract the images to data/coco/.

Training

  • Pre-train DAMSM models:
    • For coco dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1
  • Train AttnGAN models:
    • For coco dataset: python main.py --cfg cfg/coco_attn2.yml --gpu 3
  • *.yml files are example configuration files for training and evaluating our models (a configuration-loading sketch follows this list).
  • Note: the --gpu flag simply enables GPU execution, and omitting it disables the GPU; the code runs in parallel by default.
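The *.yml configuration files are plain YAML. Below is a minimal sketch of loading such a file into an attribute-style dictionary, assuming PyYAML is available in addition to the easydict dependency listed above; the file name is one of the shipped examples, but the keys accessed are not guaranteed to match the actual schema.

    import yaml                    # PyYAML, assumed to be installed
    from easydict import EasyDict  # attribute-style dictionary for configs

    def load_cfg(path):
        """Read a YAML configuration file into an EasyDict."""
        with open(path, "r") as f:
            return EasyDict(yaml.safe_load(f))

    cfg = load_cfg("cfg/coco_attn2.yml")
    print(cfg)  # entries become attributes, e.g. cfg.TRAIN if that key exists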

Pretrained Model

Data Sampling

  • Configure the sampling.py script in the code folder to point to directories of your choice.
  • The script samples 10,000 examples each from the training, validation, and test data without replacement (a minimal sketch follows this list).
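Sampling without replacement can be done with Python's random.sample; the identifiers below are placeholders rather than the repository's actual file lists.

    import random

    def sample_without_replacement(items, n=10000, seed=0):
        """Draw up to n distinct items from a sequence (no replacement)."""
        rng = random.Random(seed)
        return rng.sample(items, min(n, len(items)))

    # Placeholder identifiers standing in for real caption/image filenames.
    splits = {
        "train": ["train_%06d" % i for i in range(80000)],
        "val":   ["val_%06d" % i for i in range(40000)],
        "test":  ["test_%06d" % i for i in range(40000)],
    }
    subsets = dict((name, sample_without_replacement(ids)) for name, ids in splits.items())
    for name, ids in sorted(subsets.items()):
        print("%s: %d samples" % (name, len(ids)))  # 10000 each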

Validation and Custom Image Generation

  • Validation Image Generation
    • Set the B_VALIDATION flag to True in eval_coco.yml.
    • Run python main.py --cfg cfg/eval_coco.yml --gpu 1 to generate examples from the captions in the files listed in "./data/birds/example_filenames.txt". Results are saved to models/.
  • Custom Image Generation
    • Add your own sentences to "./data/example_captions.txt" if you want to generate images from custom captions (illustrative captions are sketched after this list).
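The captions file is plain text with one sentence per line; the sentences below are only illustrative examples, not contents shipped with the repository.

    a herd of sheep grazing on a lush green field
    a man riding a surfboard on top of a wave
    a bowl of fruit sitting on a wooden table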

Evaluation

  • We use the Fréchet Inception Distance (FID) as an interpretable evaluation metric. FID was introduced in 'GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium' by Martin Heusel et al. (a sketch of the statistic follows).
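FID compares the mean and covariance of Inception-network activations computed on real and generated images. Below is a minimal sketch of the statistic itself, assuming NumPy and SciPy are available and that activations have already been extracted; the arrays are random placeholders (real Inception features are typically 2048-dimensional).

    import numpy as np
    from scipy import linalg

    def frechet_distance(act_real, act_fake):
        """FID between two activation sets; rows are images, columns are features."""
        mu_r, mu_f = act_real.mean(axis=0), act_fake.mean(axis=0)
        cov_r = np.cov(act_real, rowvar=False)
        cov_f = np.cov(act_fake, rowvar=False)
        covmean = linalg.sqrtm(cov_r.dot(cov_f))   # matrix square root of the product
        if np.iscomplexobj(covmean):
            covmean = covmean.real                 # drop tiny imaginary parts
        diff = mu_r - mu_f
        return diff.dot(diff) + np.trace(cov_r + cov_f - 2.0 * covmean)

    # Random placeholder activations (64-dimensional to keep the example fast).
    real = np.random.RandomState(0).randn(500, 64)
    fake = np.random.RandomState(1).randn(500, 64)
    print(frechet_distance(real, fake))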

Examples generated by AttnGAN [Blog]

(Images: bird example, coco example.)

Creating an API

Evaluation code wrapped in a callable, containerized API is included in the eval/ folder.
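Below is a minimal sketch of how such a containerized service might be called from Python, assuming the requests library is installed; the host, port, endpoint, and payload fields are hypothetical placeholders and should be replaced with whatever the service in eval/ actually exposes.

    import requests  # assumed to be available; not part of the dependency list above

    # Hypothetical endpoint and payload; adjust to the actual API in eval/.
    API_URL = "http://localhost:8080/generate"
    payload = {"caption": "a group of people standing on a beach"}

    response = requests.post(API_URL, json=payload, timeout=60)
    response.raise_for_status()
    print(response.status_code, len(response.content))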

Citing AttnGAN

If you find AttnGAN useful in your research, please consider citing:

@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  booktitle = {{CVPR}},
  year      = {2018}
}

Reference
