[RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo #271
Conversation
LGTM, thanks a lot!
Another approach we could follow in the future would be to use the docker images from PyTorch, which come with a number of things already packed in.
That being said, the PyTorch team will at some point provide a set of helper functions so that dealing with all of this becomes simpler, so we will probably just use those when they become available.
Oh, that would be great for the docker image! I had a quick look when penning this one and it seemed that we had to set up the actual image hosting on top, so I skipped that for now, but if there's an existing image hosted by PyTorch that would be perfect 😍
This is great, thanks! :)
command: |
  source $BASH_ENV
  cd docs
  python3 -m pip install -r requirements.txt
should this be $CONDA_PYTHON also?
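For context, a sketch of what the suggested change might look like, assuming `$CONDA_PYTHON` is a variable exported via `$BASH_ENV` that points at the conda-installed interpreter (both names are taken from this thread; the exact variable name and setup are assumptions, not confirmed by the PR):

```yaml
# Hypothetical CircleCI step: drive pip through the conda-installed python
# instead of the system python3, so the docs requirements land in the same
# environment as the conda-installed pytorch + cuda.
command: |
  source $BASH_ENV          # these shells have no .bashrc, so load paths explicitly
  cd docs
  $CONDA_PYTHON -m pip install -r requirements.txt
```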
…attention mask or not (#266) check the assert
* Move to Triton 2 (author: Kashif Rasul, co-authored-by: Benjamin Lefaudeux) — tentatively fixing layernorm, faster all around, bugfix; better take on sparse tensors, put layout on the correct device; update the pip packages, minor cleanup
* Catering for triton blocksparse being probably more reliable in fp16
* Faster layernorm
* Minor blocksparse refactoring, update block size restrictions, relax power of two constraint (#277) — relax device size restrictions; refactor device creation and run all tests; linting (co-authored-by: Cole Hawkins)
* Code review, thanks @fmassa! (co-authored-by: Kashif Rasul, Cole Hawkins)
Codecov Report
@@ Coverage Diff @@
## main #271 +/- ##
==========================================
- Coverage 92.81% 92.69% -0.12%
==========================================
Files 61 61
Lines 3368 3397 +29
==========================================
+ Hits 3126 3149 +23
- Misses 242 248 +6
What does this PR do?
Not super pretty, suggestions welcome. The crux of the matter is that CircleCI supports images with cuda 11.1 (soon to be deprecated) and 11.4, while the pytorch nightlies are built for everything but cuda 11.4.
A (note the "a": unicity is not proven) solution is to rely on conda instead, which handles both pytorch and the matching cuda in a pinch. The handling of shells in CircleCI is a semi-mystery to me, but in short most of the conda mechanics do not work (the shells have no .bashrc, for instance), so I changed the scripts to point to the conda-installed python instead (after many more elegant attempts which did not work).
A Triton unit test segfaults now, but this is with an old version of Triton; I think spending time on that is ill-advised if we can agree on the next-next PR (Triton 2) and land all of them instead.
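As a rough illustration of the conda route described above (the channel, package names, and version pins below are placeholders for illustration, not the exact ones used in this PR):

```shell
# Sketch: let conda resolve pytorch together with a matching cudatoolkit,
# then call the conda-installed python explicitly, since the CI shells
# never source .bashrc and so never pick up `conda init` hooks.
conda install -y -c pytorch pytorch cudatoolkit=11.3   # illustrative pin
CONDA_PYTHON="$(conda info --base)/bin/python"         # hypothetical helper variable
$CONDA_PYTHON -c "import torch; print(torch.version.cuda)"
```

The point of the explicit `$CONDA_PYTHON` path is to sidestep shell activation entirely, which matches the workaround described in the PR description.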
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.