A helper project for running perplexity tests against llama.cpp, a leading inference engine for LLMs (Large Language Models) such as Llama 2.
Perplexity is the most commonly used measure of a language model's performance on a given text corpus: it measures how well the model predicts the contents of a dataset. Lower perplexity scores are better.
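For reference, perplexity over a tokenised corpus is `exp` of the negative mean log-probability the model assigns to each token. A minimal illustration in Python (not part of this project's code):

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """token_logprobs: natural-log probability the model assigned to each token."""
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

# A model that assigns every token probability 0.25 has perplexity 4.
print(perplexity([math.log(0.25)] * 100))  # -> 4.0
```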
See the llama.cpp discussions for background on the needs and motives for this project here and here.
This Python app wraps the llama.cpp `./perplexity` executable and uploads perplexity scores and test results as JSON to an Amazon S3 bucket for analysis.
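Conceptually, the wrapper does something like the sketch below. The model path, output-parsing regex, and bucket name are placeholders rather than this project's actual values, and llama.cpp's exact output format varies between versions:

```python
import json
import re
import subprocess
import boto3

# Run llama.cpp's perplexity tool (paths and flags are placeholders).
result = subprocess.run(
    ["./perplexity", "-m", "models/llama-2-7b.Q4_K_M.gguf", "-f", "wiki.test.raw.406"],
    capture_output=True, text=True, check=True,
)

# Recent llama.cpp builds print a line like "Final estimate: PPL = 5.9693";
# older builds format the result differently.
match = re.search(r"PPL = ([0-9.]+)", result.stdout + result.stderr)
score = {"model": "llama-2-7b.Q4_K_M", "perplexity": float(match.group(1))}

# Upload the result as JSON (placeholder bucket and key).
boto3.client("s3").put_object(
    Bucket="my-perplexity-results",
    Key="scores/llama-2-7b.json",
    Body=json.dumps(score),
)
```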
The standard llama.cpp perplexity test uses `wiki.test.raw.406`, i.e. 406 lines taken from `wiki.test.raw`.
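If you need to produce the truncated file yourself, something like this works (a sketch; it assumes the 406 lines are the first 406 of the corpus):

```python
# Keep the first 406 lines of wiki.test.raw as wiki.test.raw.406.
with open("wiki.test.raw", encoding="utf-8") as src, \
     open("wiki.test.raw.406", "w", encoding="utf-8") as dst:
    for i, line in enumerate(src):
        if i >= 406:
            break
        dst.write(line)
```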
Install the dependencies:

```bash
pip install -r requirements.txt
```
Copy `.env.example` (typically to `.env`) and update the config variables to suit your system.
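The app is expected to read these variables at startup. A minimal sketch using `python-dotenv`, with hypothetical variable names (`MODEL_PATH`, `S3_BUCKET`); check `.env.example` for the real ones:

```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

# Hypothetical variable names for illustration only.
model_path = os.environ["MODEL_PATH"]
s3_bucket = os.environ["S3_BUCKET"]
```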
You can point the script at an existing `wiki.test.raw` if you have one; otherwise, the script will download the test corpus automatically.
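The download-if-missing logic might look like the sketch below; the URL and the archive's internal layout are assumptions based on the public wikitext-2-raw dataset, and the script's actual source may differ:

```python
import os
import urllib.request
import zipfile

CORPUS = "wiki.test.raw"
ARCHIVE = "wikitext-2-raw-v1.zip"
URL = "https://huggingface.co/datasets/ggml-org/ci/resolve/main/wikitext-2-raw-v1.zip"  # assumed mirror

if not os.path.exists(CORPUS):
    urllib.request.urlretrieve(URL, ARCHIVE)
    with zipfile.ZipFile(ARCHIVE) as zf:
        # The zip is assumed to contain wikitext-2-raw/wiki.test.raw.
        with zf.open("wikitext-2-raw/wiki.test.raw") as src, open(CORPUS, "wb") as dst:
            dst.write(src.read())
```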
Run the scorecard:

```bash
python3 perplexity_scorecard.py
```
Coming soon: a llama.cpp perplexity leaderboard, plus Jupyter (`.ipynb`) analysis and charting examples.
PRs are welcome 😀