# Chapter 5: Modeling Objectives

Language models are generally trained using a log-likelihood/cross-entropy objective, but there are several other useful objectives for representation learning and generative modeling.
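For concreteness, here is a minimal sketch of that standard objective, next-token cross-entropy, in PyTorch. The `model` function and the tensor shapes are assumptions for illustration, not code from any of the readings:

```python
import torch.nn.functional as F

def lm_loss(model, tokens):
    # tokens: (batch, seq) integer token ids.
    # Predict each token from the tokens that precede it.
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # assumed shape: (batch, seq - 1, vocab_size)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten batch and time dims
        targets.reshape(-1),
    )
```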

This chapter is included for breadth, but it can be skipped if you want to focus on language models. It is not that detailed because I am not that familiar with these topics.

## Recommended reading

- Contrastive Representation Learning - Contrastive objectives are an important class of training objectives used for representation learning. The first section of this post, entitled "Contrastive Training Objectives", covers some popular contrastive objectives; a minimal sketch of one such objective is given below.
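As a taste of what these objectives look like, here is a minimal sketch of an InfoNCE-style contrastive loss. It assumes two batches of embeddings `z1` and `z2` computed from two augmented views of the same examples; the function name and setup are illustrative, not taken from the post:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    # z1, z2: (batch, dim) embeddings of two views of the same examples.
    # Matching rows are positive pairs; all other rows act as negatives.
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature  # (batch, batch) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```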

## Optional reading

- CLIP - A contrastive image-text model.
- Diffusion models - A training objective that works well for image generation, used by DALL-E 2.
- VAE tutorial - VAEs are a type of latent variable model with an encoder-decoder structure.
- VAE variants - An overview of some VAE variants, including VQ-VAE.
- GAN tutorial - Generative adversarial networks are an older objective that can also be used for image generation. This tutorial also has an accompanying video.

## Suggested exercise

Train a CNN on MNIST using a contrastive objective of your choice. Hold out a few examples from each class from the training set. Once your model has finished training, use the held-out examples to measure its classification accuracy: for each test example, compute its similarity to each held-out example and predict the class with the highest average similarity. A rough sketch of this evaluation step is given below.
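Here is one way the evaluation could look, assuming a trained `encoder` that maps MNIST images to embedding vectors. All names and shapes are illustrative, not a prescribed implementation:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def similarity_accuracy(encoder, held_out_x, held_out_y, test_x, test_y):
    # Embed and L2-normalize so that dot products are cosine similarities.
    ref = F.normalize(encoder(held_out_x), dim=-1)  # (n_held_out, dim)
    emb = F.normalize(encoder(test_x), dim=-1)      # (n_test, dim)
    sims = emb @ ref.T                              # (n_test, n_held_out)
    # Average similarity to the held-out examples of each class.
    n_classes = int(held_out_y.max()) + 1
    class_sims = torch.stack(
        [sims[:, held_out_y == c].mean(dim=1) for c in range(n_classes)],
        dim=1,
    )                                               # (n_test, n_classes)
    preds = class_sims.argmax(dim=1)
    return (preds == test_y).float().mean().item()
```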