
Code and examples for Master Thesis

Implementations

Implementations of VQVAE and PixelSNAIL are found in the models directory.

The lit_x.py files contain these models ported to PyTorch Lightning modules for training.

VQVAE: Hierarchical Quantized Autoencoders [https://arxiv.org/abs/2002.08111]

PixelSNAIL: An Improved Autoregressive Generative Model [https://arxiv.org/abs/1712.09763]
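
A minimal sketch of what such a Lightning wrapper could look like, assuming a VQVAE whose forward pass returns a reconstruction and a quantization loss (the actual class names and signatures in the models directory may differ):

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl

class LitVQVAE(pl.LightningModule):
    """Hypothetical Lightning wrapper in the spirit of the lit_x.py modules."""

    def __init__(self, model, lr=3e-4):
        super().__init__()
        self.model = model  # plain PyTorch VQVAE from the models directory
        self.lr = lr

    def training_step(self, batch, batch_idx):
        x = batch
        recon, vq_loss = self.model(x)  # assumed forward signature
        loss = F.mse_loss(recon, x) + vq_loss
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)
```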

Examples

Reconstructions with Hierarchical VQVAE

16 frames of 256x256 resolution at 10 FPS, encoded and then decoded.

The codebook size is 512, so each code costs log2(512) = 9 bits. Assuming 8-bit color channels, the 3x16x256x256x8 = 25,165,824 input bits are encoded into (4x32x32 + 8x32x32) x log2(512) = 110,592 bits, a roughly 99.6% reduction.
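
A quick back-of-the-envelope check of that figure, using only the numbers above:

```python
import math

# Raw video: 3 color channels x 16 frames x 256x256 pixels x 8 bits each.
raw_bits = 3 * 16 * 256 * 256 * 8          # 25,165,824 bits

# Each latent code indexes a 512-entry codebook: log2(512) = 9 bits per code.
bits_per_code = math.log2(512)
top_bits = 4 * 32 * 32 * bits_per_code     # top encoding
bottom_bits = 8 * 32 * 32 * bits_per_code  # bottom encoding

reduction = 1 - (top_bits + bottom_bits) / raw_bits
print(f"{reduction:.1%}")                  # -> 99.6%
```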

Reconstruction using only top/bottom encoding

The lower-dimensional top encoding captures general, global features such as coloring, while the higher-dimensional bottom encoding captures finer details. Here, the input (left) is decoded using only the top encoding (middle), then using only the bottom encoding (right).
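
A minimal sketch of how such partial reconstructions could be produced, assuming hypothetical encode/decode helpers (the actual VQVAE API in the models directory may differ):

```python
import torch

@torch.no_grad()
def partial_reconstructions(model, x):
    # Encode the input into the two code grids (assumed encoder output).
    top, bottom = model.encode(x)
    # Decode from each level in isolation, then from both together.
    from_top = model.decode(top=top, bottom=None)        # global structure only
    from_bottom = model.decode(top=None, bottom=bottom)  # fine detail only
    full = model.decode(top=top, bottom=bottom)
    return from_top, from_bottom, full
```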

PixelSNAIL

Top PixelSNAIL

Example showing ancestral sampling conditioned on 8 frames; 8 new frames are generated. Left is decoded from the original encoding, middle is decoded from the generated top encoding, and right is decoded from the generated top encoding together with the (not generated!) bottom encoding.

This example shows a converged 34M-parameter model, compared to the 50M-parameter model above. Here, the generated frames look very static and stay similar to the last conditioning frame.
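
For context, ancestral sampling draws one code at a time, each conditioned on all previously sampled codes. A minimal sketch of that loop, assuming a hypothetical PixelSNAIL interface that maps a partially filled code grid plus conditioning encodings to per-position logits:

```python
import torch

@torch.no_grad()
def sample_codes(pixelsnail, cond, shape=(4, 32, 32), device="cpu"):
    """Ancestral sampling over a (T, H, W) grid of codebook indices.

    `pixelsnail(codes, condition=...)` returning logits of shape
    (batch, codebook, T, H, W) is an assumed interface.
    """
    T, H, W = shape
    codes = torch.zeros((1, T, H, W), dtype=torch.long, device=device)
    for t in range(T):
        for h in range(H):
            for w in range(W):
                logits = pixelsnail(codes, condition=cond)
                probs = logits[0, :, t, h, w].softmax(dim=0)
                codes[0, t, h, w] = torch.multinomial(probs, 1)
    return codes
```

One full forward pass per sampled position keeps the sketch simple; practical implementations usually cache intermediate activations to avoid the redundant computation.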

Bottom PixelSNAIL

Generated bottom encodings conditioned on generated top encodings.

There are no visible differences between conditioning on generated top encodings and on the matching ground-truth top encodings. The examples below show generated bottom encodings conditioned on, and decoded with, the matching top encoding.

Hierarchical PixelSNAIL

Examples from the Hierarchical PixelSNAIL.
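
Putting the pieces together, the full sampling pipeline could look roughly like this, reusing the sample_codes sketch above (all interfaces hypothetical):

```python
import torch

@torch.no_grad()
def generate_frames(vqvae, top_snail, bottom_snail, cond_frames):
    # 1. Encode the conditioning frames (only their top codes are needed here).
    cond_top, _ = vqvae.encode(cond_frames)
    # 2. Sample new top codes conditioned on the conditioning encodings.
    top = sample_codes(top_snail, cond_top, shape=(4, 32, 32))
    # 3. Sample bottom codes conditioned on the generated top codes.
    bottom = sample_codes(bottom_snail, top, shape=(8, 32, 32))
    # 4. Decode both code grids back to pixel space.
    return vqvae.decode(top=top, bottom=bottom)
```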
