vivek-rd/tinystoriesGPT

Objective

The aim of this project is to familiarize myself with a few things:

  1. Training a GPT-style decoder-only model that generates grammatically correct sentences.
  2. Using a CUDA-enabled environment to run PyTorch models.
  3. Understanding the ins and outs of the transformer architecture.

TODO

  • Introduce wandb logging
  • Fix the attention masking bug in scaled dot-product attention
  • Add the space token to the tiny tokenizer
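
For reference while working on the masking item above, here is a minimal, framework-free sketch of what causal masking in scaled dot-product attention should do for a single head. It is not the code from this repository and does not claim to reproduce the bug; the function name `causal_attention` and the list-of-lists inputs are assumptions made for illustration. The idea is that position i may only attend to positions 0..i, which is done by setting the scores of future positions to -inf before the softmax (not after it).

```python
import math


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of T vectors (lists of floats). q and k share the
    same dimension d. Returns (outputs, weights), where weights is T x T.
    Hypothetical helper for illustration, not this repository's code.
    """
    T = len(q)
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for i in range(T):
        # Raw scores for query i; future keys (j > i) are masked with -inf
        # so that they receive exactly zero weight after the softmax.
        scores = [
            scale * sum(a * b for a, b in zip(q[i], k[j])) if j <= i
            else float("-inf")
            for j in range(T)
        ]
        # Numerically stable softmax; the max is finite because j == i
        # is never masked.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        row = [e / z for e in exps]
        weights.append(row)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(row[j] * v[j][c] for j in range(T)) for c in range(len(v[0]))]
        )
    return outputs, weights
```

A quick sanity check for any implementation, including a PyTorch one, is that changing a later token must leave the outputs at all earlier positions unchanged, and that the attention weights above the diagonal are exactly zero.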

About

Train a GPT-style model on the TinyStories dataset
