A project where we train a replica of the GPT-2 model on the FineWeb-Edu 10B dataset, using the hyperparameters from GPT-3 and the tokenizer from GPT-4. The model was trained on 8 A100 SXM4 (40 GB) GPUs on Lambda Labs for one epoch, taking approximately 3 hours.
This project is inspired by the following videos by Andrej Karpathy on YouTube:
- Let's build GPT: from scratch, in code, spelled out.
- Let's build the GPT Tokenizer
- Let's reproduce GPT-2 (124M)
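To make the setup above concrete, here is a minimal sketch of how the pieces could fit together: a GPT-2 (124M)-sized architecture, GPT-3 (Small)-style optimizer hyperparameters, and the GPT-4 tokenizer loaded via `tiktoken`. This is illustrative only and not the exact code of this repository; names such as `GPTConfig`, the padded vocabulary size, and the hyperparameter constants are assumptions.

```python
# Illustrative sketch (not the actual training script of this repo).
from dataclasses import dataclass

import tiktoken


@dataclass
class GPTConfig:
    # GPT-2 (124M) architecture: 12 layers, 12 attention heads, 768-dim embeddings
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    block_size: int = 1024       # context length
    vocab_size: int = 100352     # cl100k_base has 100,277 tokens; padded up to a
                                 # multiple of 128 for efficiency (an assumption)


# GPT-3 (Small) training hyperparameters from the GPT-3 paper
LEARNING_RATE = 6e-4             # peak learning rate, decayed with a cosine schedule
WEIGHT_DECAY = 0.1
ADAM_BETAS = (0.9, 0.95)
GRAD_CLIP = 1.0
TOTAL_BATCH_SIZE = 524_288       # ~0.5M tokens per optimizer step

# GPT-4 tokenizer (cl100k_base) via tiktoken
enc = tiktoken.get_encoding("cl100k_base")
print(enc.encode("Hello, FineWeb-Edu!"))
```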