⚡️ mlx-benchmark ⚡️

A comprehensive benchmark of MLX ops.

This repo aims to benchmark Apple's MLX operations and layers across Apple Silicon chips, along with some CUDA GPUs.

Contributions: Everyone can contribute to the benchmark! If you own a device that is missing from the list, or if you want to add a missing layer/operation, please read the contribution guidelines.

Current M chips: M1, M1 Pro, M1 Max, M2, M2 Pro, M2 Max, M2 Ultra, M3 Pro, M3 Max.

Current CUDA GPUs: RTX4090, Tesla V100, A100.

Missing devices: M1 Ultra, M3, and other CUDA GPUs.

Note

You can submit your benchmark even for a device that is already listed, provided you use a newer version of MLX. Simply submit a PR that overwrites the old benchmark table. Also, most of the existing benchmarks do not include the mx.compile feature, which was only recently added to mlx-benchmark.

Benchmarks 🧪

Benchmarks are generated by measuring the runtime of every MLX operation on GPU and CPU, along with its PyTorch equivalent on the mps, cpu and cuda backends. On MLX with GPU, operations compiled with mx.compile are included in the benchmark by default. To skip the compiled functions, set --compile=False.
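As a rough illustration of what "measuring the runtime" of an operation can look like, here is a minimal timing sketch using only the standard library. The helper name time_op, the warmup/repeat counts, and the dummy_op workload are assumptions made for this example and are not taken from mlx-benchmark. When timing real MLX or CUDA code, the result typically has to be forced inside the timed region (for instance with mx.eval on MLX, which evaluates lazily, or torch.cuda.synchronize on CUDA), otherwise only the kernel launch is measured.

```python
import statistics
import time


def time_op(fn, *args, warmup=3, repeats=10):
    """Return the median wall-clock runtime of fn(*args) in milliseconds.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # Warm-up runs absorb one-off costs such as caching or lazy initialisation.
    for _ in range(warmup):
        fn(*args)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def dummy_op(n):
    # Stand-in workload; a real benchmark would call an MLX or torch op here.
    return sum(i * i for i in range(n))


runtime_ms = time_op(dummy_op, 10_000)
print(f"dummy_op: {runtime_ms:.3f} ms")
```

The median is used here because it is less sensitive to occasional slow runs than the mean; that is a design choice of this sketch, not a statement about how the project aggregates its timings.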

For each operation, we measure the runtime of multiple experiments. We provide two benchmarks based on these experiments:

Detailed benchmark: provides the runtime of each experiment.
Average runtime benchmark: computes the mean of experiments. Easier to navigate, with fewer details.
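The relationship between the two views can be sketched as follows. The operation names, experiment labels, and runtimes are made-up placeholder values (not measurements from any device), and the dictionary layout is an assumption made for this example rather than the project's actual data format.

```python
from statistics import mean

# Detailed view: one runtime (ms) per experiment, grouped by operation.
# All values below are placeholders for illustration only.
detailed = {
    "concat": {"dim=0": 2.0, "dim=1": 4.0},
    "softmax": {"dim=-1": 1.5, "dim=0": 2.5, "dim=1": 5.0},
}


def average_runtimes(detailed_results):
    """Collapse the per-experiment runtimes into one mean per operation."""
    return {op: mean(runs.values()) for op, runs in detailed_results.items()}


averaged = average_runtimes(detailed)
print(averaged)
```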

Installation 💻

Installation on Mac devices

Running the benchmark locally is straightforward. Create a new env with osx-arm64 architecture and install the dependencies.

CONDA_SUBDIR=osx-arm64 conda create -n mlx_benchmark python=3.10 numpy pytorch torchvision scipy requests -c conda-forge

pip install -r requirements.txt

Installation on other devices

Operating systems other than macOS can only run the torch experiments, on CPU or with a CUDA device. Create a new env without the CONDA_SUBDIR=osx-arm64 prefix and install the torch package that matches your CUDA version. Then install every requirement listed in requirements.txt except mlx.
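Before running anything, it can help to check which torch backends the new environment actually exposes. The snippet below is a small sketch for that purpose; the function name available_torch_backends is hypothetical and not part of mlx-benchmark. It relies on torch.cuda.is_available() and torch.backends.mps.is_available(), and returns an empty list when torch is not installed.

```python
import importlib.util


def available_torch_backends():
    """List the torch backends usable in the current environment.

    Hypothetical helper for illustration; returns [] if torch is missing.
    """
    if importlib.util.find_spec("torch") is None:
        return []

    import torch

    backends = ["cpu"]  # CPU is always usable once torch is importable
    if torch.cuda.is_available():
        backends.append("cuda")
    # Older torch builds may not ship the mps module, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        backends.append("mps")
    return backends


backends = available_torch_backends()
print(backends)
```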

Finally, open the config.py file and set:

USE_MLX = False

to avoid importing the mlx package, which cannot be installed on non-Mac devices.
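A flag like this is usually consumed through a guarded import, so that the mlx package is never touched when it is disabled. The sketch below shows the general pattern only; apart from the USE_MLX name, everything here (the mx fallback and the mlx_available helper) is an assumption for illustration and may differ from how the project wires it up.

```python
# Mirrors the setting from config.py; in the real project it would be imported.
USE_MLX = False

if USE_MLX:
    # Only attempted on machines where mlx is installed.
    import mlx.core as mx
else:
    # Placeholder so later code can test for availability without importing mlx.
    mx = None


def mlx_available():
    """Hypothetical helper: True only when the mlx import was performed."""
    return mx is not None


print("MLX enabled:", mlx_available())
```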

Run the benchmark 🧑‍💻

Run on Mac

To run the benchmark on mps, mlx and CPU:

python run_benchmark.py --include_mps=True --include_mlx=True --include_cpu=True

Run on other devices

To run the torch benchmark on CUDA and CPU:

python run_benchmark.py --include_mps=False --include_mlx=False --include_cuda=True --include_cpu=True

Run only compiled functions

To compare only MLX operations against their counterparts compiled with mx.compile, exclude the other backends:

python run_benchmark.py --include_mps=False --include_cpu=False
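The commands above pass booleans in the form --flag=True or --flag=False. How run_benchmark.py parses them is not shown here; the sketch below is one common way to handle such flags with argparse, using a small string-to-bool converter because type=bool would treat any non-empty string (including "False") as true. The str2bool and build_parser names are hypothetical, and the default values are placeholders rather than the script's real defaults.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False'-style strings into real booleans."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser exposing the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Illustrative flag parsing")
    flags = ("include_mps", "include_mlx", "include_cuda", "include_cpu", "compile")
    for name in flags:
        # Placeholder default; the real script may use different defaults.
        parser.add_argument(f"--{name}", type=str2bool, default=False)
    return parser


args = build_parser().parse_args(
    ["--include_mps=False", "--include_mlx=False",
     "--include_cuda=True", "--include_cpu=True"]
)
print(vars(args))
```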

Contributing 🚀

If you have a device not yet featured in the benchmark, especially one of the missing devices listed above, your PR is welcome to broaden the scope and accuracy of this project.

Dave2011/mlx-benchmark
