Initial release
This first release integrates the following fixes since the last uploaded benchmarks:
- 🐛 CUDA benchmarks are now fixed
- ⚡️ `x @ y + z` as the linear operation in MLX is replaced by `mx.addmm`
- the overall benchmark now runs faster by spawning fewer processes
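
To illustrate the MLX change listed above, here is a minimal sketch comparing the unfused `x @ y + z` form with `mx.addmm`. The shapes, the random inputs, and the NumPy reference are assumptions made for illustration and are not taken from the benchmark code. The sketch relies on `mx.addmm(c, a, b)` computing `a @ b + c` with its default scaling factors, and it skips the MLX part when MLX is not installed.

```python
import numpy as np

try:
    import mlx.core as mx  # only present where MLX is installed
except ImportError:
    mx = None

# Hypothetical shapes: a batch of 4 inputs, 8 features in, 3 features out.
rng = np.random.default_rng(0)
x_np = rng.standard_normal((4, 8)).astype(np.float32)
y_np = rng.standard_normal((8, 3)).astype(np.float32)
z_np = rng.standard_normal((3,)).astype(np.float32)

# Unfused linear operation, computed with NumPy as a reference.
reference = x_np @ y_np + z_np

if mx is not None:
    x, y, z = mx.array(x_np), mx.array(y_np), mx.array(z_np)
    # Fused form: one call instead of a matmul followed by an add.
    fused = mx.addmm(z, x, y)
    max_diff = float(np.abs(np.array(fused) - reference).max())
    print("max abs difference vs. unfused reference:", max_diff)
else:
    max_diff = None
    print("MLX not available; only the NumPy reference was computed.")
```

Both forms should agree up to floating-point rounding; any speed difference depends on the device and the tensor sizes.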