vegaluisjose/mlx-rag

MLX RAG

A simple example of using MLX for a retrieval-augmented generation (RAG) application that runs locally on your Apple Silicon device.

I previously converted the weights of the embedding model gte-large to MLX format; they are stored in the mlx-rag repository. The base model is NeuralBeagle14-7B-4bit-mlx.

Getting started

  • Install the requirements
    python3 -m pip install -r requirements.txt
  • Create a vector database from a PDF file
    python3 create_vdb.py --pdf flash_attention.pdf --vdb vdb.npz
  • Query the database built from the PDF file
    python3 query_vdb.py --question "what is flash attention?"
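
The contents of create_vdb.py and query_vdb.py are not shown here, so the following is only a minimal sketch of the general pattern the two commands suggest: split a document into chunks, embed each chunk, and at query time rank the chunks by similarity to the question before building a prompt for the base model. Every name in it (chunk_text, embed, build_index, query, build_prompt) is hypothetical, and the chunk sizes are arbitrary. The bag-of-words embed function is a toy stand-in for a real embedding model such as gte-large; the sketch does not use MLX, does not read PDF files, and does not write an .npz file.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (sizes are illustrative)."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Pair each chunk with its embedding (an in-memory 'vector database')."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def query(index, question, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble retrieved chunks and the question into one prompt string."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical usage with a tiny in-memory document.
document = (
    "Flash attention computes exact attention while reducing memory traffic. "
    "Bananas are rich in potassium. "
    "The weather was sunny all week."
)
index = build_index(chunk_text(document, chunk_size=8, overlap=0))
top = query(index, "what is flash attention?", k=1)
print(build_prompt("what is flash attention?", top))
```

In a real pipeline the prompt string would be passed to the language model for generation, and the word-count vectors would be replaced by dense vectors from the embedding model.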
