This is my glorious attempt to understand the Mistral 7B model. Because the people from Mistral AI have open-sourced their model code, I tried to replicate a small version of the model. Like... really small. A whopping 8 million parameters. Needless to say, the model is useless for anything practical.
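For anyone curious what "really small" means in practice, here is a minimal sketch of how a tiny Mistral-style model can be configured with the `transformers` library. The sizes below are illustrative guesses to land in the single-digit-million parameter range, not my exact training config:

```python
# A minimal sketch (not the exact setup used here) of configuring a tiny
# Mistral-style model with Hugging Face transformers. All sizes are
# illustrative guesses, chosen to stay in the few-million-parameter range.
from transformers import MistralConfig, MistralForCausalLM

config = MistralConfig(
    vocab_size=32000,            # Mistral's default tokenizer vocab
    hidden_size=64,              # tiny embedding dimension
    intermediate_size=256,       # tiny MLP width
    num_hidden_layers=4,         # very shallow
    num_attention_heads=4,
    num_key_value_heads=2,       # grouped-query attention, as in Mistral 7B
    max_position_embeddings=2048,
)

model = MistralForCausalLM(config)
print(f"{model.num_parameters():,} parameters")  # a few million, mostly embeddings
```

Shrinking the hidden size and layer count while keeping grouped-query attention preserves the shape of the architecture at a tiny fraction of the parameter count.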
The model was trained on a handful of examples from the Cosmopedia dataset, an open-source synthetic dataset of high-quality, textbook-style text in a similar vein to the data used to train the Phi models.
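If you want to grab a similar handful of examples yourself, something like this should work with the `datasets` library (the `stories` subset and the number of examples are assumptions, not necessarily what I used):

```python
# A hedged sketch of streaming a handful of Cosmopedia examples.
# "stories" is one of several subsets of the dataset; adjust as needed.
from itertools import islice
from datasets import load_dataset

stream = load_dataset("HuggingFaceTB/cosmopedia", "stories", split="train", streaming=True)
examples = list(islice(stream, 100))  # just a handful of textbook-style texts
print(examples[0]["text"][:200])
```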
Check out the model here: https://huggingface.co/LeonardPuettmann/MiniMistral-8M
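Assuming the repo ships the weights and tokenizer in the standard `transformers` format, loading it is the usual snippet:

```python
# Loading the checkpoint from the Hub; a standard transformers snippet,
# assuming the repo contains weights and tokenizer in the usual format.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LeonardPuettmann/MiniMistral-8M")
model = AutoModelForCausalLM.from_pretrained("LeonardPuettmann/MiniMistral-8M")

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```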
The loss is pretty unspectacular, as I only trained for one epoch:
Should you use this model for anything? Please don't. You should probably use Mistral 7B instead: mistralai/Mistral-7B-v0.3. Or, if you are (very) GPU rich, you can try to train their model yourself: https://github.com/mistralai/mistral-inference
In the folder `inference`, you will actually find a small script that allows you to chat with the 7B-param model. All you need is a free Hugging Face API token.
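The script itself may be structured differently, but a minimal sketch of chatting with the 7B model through the Hugging Face Inference API could look like this (the instruct variant `mistralai/Mistral-7B-Instruct-v0.3` is my assumption here, since the base model is not tuned for chat):

```python
# A minimal sketch of a chat script using huggingface_hub's InferenceClient.
# The actual script in inference/ may differ; the instruct model id below
# is an assumption, since the base 7B model is not chat-tuned.
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_...")  # your free Hugging Face API token

messages = [{"role": "user", "content": "Explain grouped-query attention in one sentence."}]
response = client.chat_completion(
    messages,
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=200,
)
print(response.choices[0].message.content)
```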