USE KOBOLDCPP INSTEAD

openassistant.cpp

Run a Pythia-based Open Assistant model locally on your CPU. This project combines ggml with cformers's GPT-NeoX implementation, hacked together to provide a chat interface for Open Assistant in pure C++. This is not a serious project: it was hacked together and is based on an old ggml version. I created it because I wanted to test Open Assistant locally.

Running it:

Clone this repository, and run make.

Download the model: oasst-sft-1-pythia-12b.

(The model is already quantized to 4 bits, so no separate quantization step is needed.)
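
For readers unfamiliar with what 4-bit quantization means here, the following is a minimal sketch of block-wise 4-bit quantization in the general style used by ggml. It is an illustration under assumptions, not the code in this repo: the `Q4Block` layout, the block size of 32, and the function names `quantize_block` and `dequantize_block` are all hypothetical. Each block stores one float scale plus 32 weights packed two per byte.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block size chosen for illustration.
constexpr std::size_t kBlockSize = 32;

// Hypothetical layout: one scale and 32 four-bit codes (two per byte).
struct Q4Block {
    float scale;
    std::uint8_t packed[kBlockSize / 2];
};

// Quantize exactly kBlockSize floats into one block.
Q4Block quantize_block(const float* x) {
    // Largest magnitude in the block determines the scale.
    float amax = 0.0f;
    for (std::size_t i = 0; i < kBlockSize; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
    }

    Q4Block b{};
    b.scale = amax / 7.0f;  // map [-amax, amax] onto integer codes [-7, 7]
    const float inv = (b.scale != 0.0f) ? 1.0f / b.scale : 0.0f;

    for (std::size_t i = 0; i < kBlockSize; i += 2) {
        // Add 8 so each code is stored as an unsigned 4-bit value in [1, 15].
        const auto lo = static_cast<std::uint8_t>(std::lround(x[i] * inv) + 8);
        const auto hi = static_cast<std::uint8_t>(std::lround(x[i + 1] * inv) + 8);
        b.packed[i / 2] = static_cast<std::uint8_t>(lo | (hi << 4));
    }
    return b;
}

// Recover approximate floats from one block.
std::vector<float> dequantize_block(const Q4Block& b) {
    std::vector<float> out(kBlockSize);
    for (std::size_t i = 0; i < kBlockSize; i += 2) {
        const int lo = (b.packed[i / 2] & 0x0F) - 8;
        const int hi = (b.packed[i / 2] >> 4) - 8;
        out[i] = static_cast<float>(lo) * b.scale;
        out[i + 1] = static_cast<float>(hi) * b.scale;
    }
    return out;
}
```

In this sketch each weight costs 4 bits plus a share of the per-block scale, and the round-trip error per weight is at most half the block's scale. That is roughly the trade-off that lets a 12B-parameter model fit in ordinary CPU memory.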

Run it with: ./main -m PATH_TO_MODEL

Notes

You could probably run a GPT-NeoX-20B-based Open Assistant model with this, but I haven't tested it. With some simple modifications, you could also adapt it to run Dolly 2.0, StableLM, and other NeoX-like chat-tuned LLMs. If you are interested in any of this, though, I would recommend contributing to the main ggml repo.

If you are confused about how anything else in this repo works, refer to cformers, as most of the code is shared.

pikalover6/openassistant.cpp

Folders and files

Latest commit

History

Repository files navigation

USE KOBOLDCPP INSTEAD

openassistant.cpp

Running it:

Notes

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages