vllm: init at v0.3.1 with rocm support #283842
Conversation
Flask/Werkzeug can't go into master and needs a lot more work on reverse dependencies.
vllm as-is still needed some work and wasn't working properly with CUDA. If you've got it working for ROCm, we could merge an early version with that only. (If you ever want to submit a PR for vllm, you don't need to credit me at all; you can just copy-paste what you find useful and add me as a maintainer.)
Managed to get it working on CUDA with this PR:
Currently re-testing the CUDA build after rebase on master; for ROCm it requires

    gpuTargets
  else
    # vllm supports fewer GPU targets than rocm clr; the supported target list is taken from ROCM_SUPPORTED_ARCHS in setup.py
    lib.lists.intersectLists rocmPackages.clr.gpuTargets ["gfx90a" "gfx908" "gfx906" "gfx1030" "gfx1100"]
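For context, `lib.lists.intersectLists` keeps only the elements common to both lists, so the branch above restricts clr's full target set to the ones vllm can build for. A minimal sketch (the `clrTargets` list here is an illustrative stand-in, not clr's real target list):

```nix
let
  lib = (import <nixpkgs> { }).lib;
  # Hypothetical stand-in for rocmPackages.clr.gpuTargets:
  clrTargets = [ "gfx803" "gfx906" "gfx90a" "gfx1030" ];
  # Targets vllm supports (ROCM_SUPPORTED_ARCHS from its setup.py):
  vllmTargets = [ "gfx90a" "gfx908" "gfx906" "gfx1030" "gfx1100" ];
in
# Keeps only targets present in both lists:
lib.lists.intersectLists clrTargets vllmTargets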
Looks like vllm's setup.py calculates the intersection by itself now, so this needs to be removed.
Tested with CUDA, it works. Edit: retested with CUDA; I was testing on the base branch without #285249. I have patched dependencies for torch 2.2.0.
Works on ROCm.
@CertainLach thank you very much for this!
Description of changes
Depends on:
werkzeug 3.x support in httpbin (psf/httpbin#36), as the fix needed for the new werkzeug version is not released. vllm doesn't depend directly on this, but httpbin is still in the dependency graph and needs to be built. Fix backported to staging.

I am not very familiar with python packaging in nixos (though, anything is better than requirements.txt).
A couple of changes were made to fix the build due to conflicting dependencies. I think the only workable way here is to use nixpkgs.config.rocmSupport / nixpkgs.config.cudaSupport, as overriding the torch dependency doesn't work very well in this case. It also builds with CUDA for me, though I haven't properly tested it at runtime.
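As a reference, this is roughly how those global flags would be used when instantiating nixpkgs; the `vllm` attribute path below is a hypothetical one for this PR's package, and exact names may differ:

```nix
# Build an instance of nixpkgs with ROCm (or CUDA) enabled globally,
# so torch and everything depending on it agree on the same GPU stack.
let
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true; # required for CUDA, and for some ROCm firmware
      rocmSupport = true; # or: cudaSupport = true; for NVIDIA GPUs
    };
  };
in
pkgs.python3Packages.vllm # hypothetical attribute path for this PR's package
```

Flipping the flag at import time, rather than overriding torch per-package, avoids ending up with two differently-built copies of torch in the closure.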
Things done
- Tested using sandboxing (nix.conf sandbox = relaxed or sandbox = true; see Nix manual)
- Ran nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed; also see nixpkgs-review usage.
- Tested basic functionality of all binary files (usually in ./result/bin/)