LLaMa-for-Mac

Meta's LLaMa ready to run on your Mac with M1/M2 Apple Silicon

Description

The original LLaMa release (facebookresearch/llama) requires CUDA.

This repo contains minimal modifications so that LLaMa runs on the GPU of Apple Silicon (M1/M2) Macs through PyTorch's MPS backend.

Have fun 🤩!
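As a quick sanity check before setup, the sketch below reports whether PyTorch can see the MPS backend mentioned above. The helper name pick_device is hypothetical and is not part of this repo; the example assumes only that a PyTorch build with MPS support may be installed, and it falls back to the CPU otherwise.

```python
def pick_device():
    """Return "mps" when PyTorch reports a usable Apple Silicon GPU, else "cpu".

    Hypothetical helper for illustration; not part of this repository.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed yet (for example, before `pip install`).
        return "cpu"
    # Older PyTorch builds have no `torch.backends.mps` attribute at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
```

If this prints cpu on an M1/M2 Mac, the installed PyTorch build most likely lacks MPS support.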

Setup

First download a) the weights and b) tokenizer.model:

  1. IPFS: here or a mirror (note this one does not have tokenizer.model)
  2. BitTorrent: magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA

After downloading, move the files from your Downloads folder to LLaMa-for-Mac/weights/sharded. That folder belongs to this repository, so do this once the repository has been cloned.

Next, clone this repository:

git clone https://github.com/yazanobeidi/LLaMa-for-Mac

Set up a virtualenv (optional) and install the Python requirements by running:

python3.11 -m venv ~/.LLaMa

source ~/.LLaMa/bin/activate

pip install -r requirements.txt

To run on a single node without torch.distributed, the sharded weights must first be unsharded. To do this, run the following command, where --model is the model version you downloaded. The path arguments don't need to be changed.

python3 convert_to_unsharded.py --model 30B --path-to-weights LLaMa-for-Mac/weights/sharded --output-path LLaMa-for-Mac/weights/unsharded/

This will take a minute or so to complete.
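For intuition, unsharding amounts to merging the pieces of each weight that were split across shards. The sketch below illustrates that idea on toy data; it is not the code in convert_to_unsharded.py, and the function names and the choice of split dimension per weight are assumptions made for illustration. Tensors are modelled as nested lists so the example needs no dependencies.

```python
def concat_rows(shards):
    """Merge shards that were split along dimension 0 (stack the rows)."""
    merged = []
    for shard in shards:
        merged.extend(shard)
    return merged


def concat_cols(shards):
    """Merge shards that were split along dimension 1 (join matching rows)."""
    return [sum(rows, []) for rows in zip(*shards)]


def unshard(shards, split_dim):
    """Merge one weight taken from every shard.

    split_dim is 0 or 1 for split weights, or None when every shard
    holds an identical (replicated) copy. Hypothetical helper.
    """
    if split_dim is None:
        return shards[0]
    if split_dim == 0:
        return concat_rows(shards)
    if split_dim == 1:
        return concat_cols(shards)
    raise ValueError(f"unsupported split_dim: {split_dim}")


# Two toy 2x2 shards of the same weight.
shard_a = [[1, 2], [3, 4]]
shard_b = [[5, 6], [7, 8]]

merged_dim0 = unshard([shard_a, shard_b], 0)  # 4 rows x 2 columns
merged_dim1 = unshard([shard_a, shard_b], 1)  # 2 rows x 4 columns
```

A real conversion would do the same with torch tensors loaded from each shard file, choosing the dimension per layer type.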

Usage

After following the Setup steps above, you can launch a webserver hosting LLaMa with a single command:

python server.py --path-to-weights weights/unsharded/ --max-seq-len 128 --max-gen-len 128 --model 30B

Now you can make requests to the /generate endpoint with your prompt as payload, for example:

curl -X GET http://localhost:3000/generate -H "Content-Type: application/json" -d '{"prompt": "Hello world"}'
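The same request can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl command above: the URL, the GET method, and the JSON payload come from that command, while the function names, the timeout value, and returning the raw response text are assumptions, since the response format is not documented here.

```python
import json
import urllib.request

# Taken from the curl example: the server listens on localhost:3000.
GENERATE_URL = "http://localhost:3000/generate"


def build_generate_request(prompt, url=GENERATE_URL):
    """Build a GET request carrying the prompt as a JSON body."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="GET",
    )


def generate(prompt, timeout=600):
    """Send the prompt to a running server and return the raw response text.

    The timeout is an assumed value; generation on a large model can be slow.
    """
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not contact the server.
request = build_generate_request("Hello world")
```

With server.py running, calling generate("Hello world") sends the request and returns whatever the server responds with.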

License

This is a modified version of facebookresearch/llama, which was originally licensed under GPLv3. This work therefore retains the GPLv3 license. See LICENSE.md.
