In this lecture, we will cover two topics in deep learning: self-supervised learning (SSL) and generative models.
- Historical Review (AlexNet, DQN, Attention, Adam, GAN, ResNet, Transformer, Pretrained Model, SSL)
- Good Old Fashioned SSL (Jigsaw, BiGAN, RotNet, Auto-Encoding Transform, DeepCluster, Single Image SSL)
- Convnet-based SSL (DrLIM, Contrastive Predictive Coding, SimCLR, MoCo, BYOL, SimCLRv2, SwAV, Barlow Twins)
- Transformer-based SSL (Transformer, ViT, Swin Transformer, DINO, EsViT)
- Language-domain SSL (GPT, GPT-2, BERT, RoBERTa, ALBERT, GPT-3)
- Generative Model 1 (NADE, PixelRNN, PixelCNN)
- Generative Model 2 (VAE, WAE, GAN, PlanarFlow)
- Generative Model 3 (DDPM)
- Generative Model 4 (DDIM)
- Generative Model 5 (InfoGAN, VQ-VAE, VQ-VAE2)
- Generative Model 6 (ADM, CFG, GLIDE, DALL-E2)
References:

Jigsaw: "Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles," 2017
BiGAN: "Adversarial Feature Learning," 2017
RotNet: "Unsupervised Representation Learning by Predicting Image Rotations," 2018
Auto-Encoding Transform: "AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data," 2019
DeepCluster: "Deep Clustering for Unsupervised Learning of Visual Features," 2019
Single Image SSL: "A Critical Analysis of Self-Supervision, What We Can Learn from a Single Image," 2020
DrLIM: "Dimensionality Reduction by Learning an Invariant Mapping," 2006
Contrastive Predictive Coding: "Representation Learning with Contrastive Predictive Coding," 2019
SimCLR: "A Simple Framework for Contrastive Learning of Visual Representations," 2020
MoCo: "Momentum Contrast for Unsupervised Visual Representation Learning," 2020
BYOL: "Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning," 2020
SimCLRv2: "Big Self-Supervised Models are Strong Semi-Supervised Learners," 2020
SwAV: "Unsupervised Learning of Visual Features by Contrasting Cluster Assignments," 2021
Barlow Twins: "Barlow Twins: Self-Supervised Learning via Redundancy Reduction," 2021
Transformer: "Attention is All You Need," 2017
ViT: "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale," 2021
Swin Transformer: "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows," 2021
DINO: "Emerging Properties in Self-Supervised Vision Transformers," 2021
EsViT: "Efficient Self-supervised Vision Transformers for Representation Learning," 2021
GPT: "Improving Language Understanding by Generative Pre-Training," 2018
GPT-2: "Language Models are Unsupervised Multitask Learners," 2019
BERT: "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," 2019
RoBERTa: "RoBERTa: A Robustly Optimized BERT Pretraining Approach," 2019
ALBERT: "ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations," 2020
GPT-3: "Language Models are Few-Shot Learners," 2020
NADE: "Neural Autoregressive Distribution Estimation," 2016
PixelRNN: "Pixel Recurrent Neural Networks," 2016
PixelCNN: "Conditional Image Generation with PixelCNN Decoders," 2016
VAE: "Auto-Encoding Variational Bayes," 2013
WAE: "Wasserstein Auto-Encoders," 2017
GAN: "Generative Adversarial Networks," 2014
PlanarFlow: "Variational Inference with Normalizing Flows," 2016
DDPM: "Denoising Diffusion Probabilistic Models," 2020
InfoGAN: "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets," 2016
VQ-VAE: "Neural Discrete Representation Learning," 2018
VQ-VAE2: "Generating Diverse High-Fidelity Images with VQ-VAE-2," 2019
DDIM: "Denoising Diffusion Implicit Models," 2020
IDDPM: "Improved Denoising Diffusion Probabilistic Models," 2021
ADM: "Diffusion Models Beat GANs on Image Synthesis," 2021
CFG: "Classifier-Free Diffusion Guidance," 2021
ImageBART: "ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis," 2021
DiffusionGAN: "Tackling the Generative Learning Trilemma with Denoising Diffusion GANs," 2021
GLIDE: "GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models," 2022
DALL-E2: "Hierarchical Text-Conditional Image Generation with CLIP Latents," 2022