Prompt: "Make an image with 10 people in the middle acting like explores and holding a red "Bristol" flag. Around them, I want a Llama, a Mamba, and a Transformer. Give it an Indiana Jones feel."
Some of the dates are probably incorrect because I still haven't learned how to read a calendar.
Tutorials, videos, courses and notes that were shared at various times during the reading group.
- Attention is all you need - Lukasz Kaiser Masterclass
- Berkeley Deep Neural Networks Course and related video
- Stable Diffusion
- The Creativity of Text-to-Image Generation
- Deep Learning Foundations to Stable Diffusion - Jeremy Howard video
- The Illustrated Stable Diffusion
- Cross Attention augmented U-net
- The Illustrated GPT-2
- On the Expressivity Role of LayerNorm in Transformers' Attention
- The Annotated S4
- AlphaFold Meets Flow Matching for Generating Protein Ensembles
- Repo - Formal Algorithms for Transformers
- But what is a GPT? Visual intro to transformers - 3B1B
- Attention in Transformers, visually explained - 3B1B
- Transformers
- An Introduction to Transformers by Turner is perhaps the best way to start with attention: it is simple, to the point, and gives you the "shape" of each array, which is helpful at first (see the sketch after this list).
- After Turner's paper, I would read A survey of transformers. It has some typos, but I found it helpful for a general overview: each paper implements transformers slightly differently, which I found very confusing at first.
- You will likely still be confused after this paper, so I would recommend Formal Algorithms for Transformers.
- I would then watch several of Andrej Karpathy's videos to consolidate understanding, such as this, this, and this.
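Turner's shape-first framing is easy to mirror in code. Below is a minimal sketch of single-head scaled dot-product attention with the shape of every array annotated in comments; the function name, variable names, and dimensions are illustrative choices, not taken from any of the linked resources.

```python
import numpy as np

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of N tokens.

    X:      (N, D)   input token embeddings
    Wq, Wk: (D, Dk)  query / key projections
    Wv:     (D, Dv)  value projection
    returns (N, Dv)  one output vector per token
    """
    Q = X @ Wq                                     # (N, Dk)
    K = X @ Wk                                     # (N, Dk)
    V = X @ Wv                                     # (N, Dv)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # (N, N) attention logits
    # Numerically stable softmax over the key axis, so each row sums to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # (N, N) attention weights
    return weights @ V                             # (N, Dv)

# Hypothetical shapes: 5 tokens, embedding dim 8, head dim 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = scaled_dot_product_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

Running it prints `(5, 4)`: one output vector per input token, in the value dimension, which is exactly the shape bookkeeping Turner's note walks you through.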
- Diffusion
- To do
You can suggest new papers here.