[RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo #271

blefaudeux · 2022-04-13T05:00:51Z

What does this PR do?

Not super pretty, suggestions welcome. The crux of the matter was that circleCI supports images with cuda 11.1 (soon to be deprecated) and 11.4, while pytorch nightlies are built for everything but cuda 11.4..

a (note the a, unicity is not proven) solution is to rely on conda instead, which handles both pytorch and the matching cuda in a pinch. The handling of shells in circleci is a semi-mistery to me, but in short most of the conda mechanics do not work (the shells have no .bashrc for instance), and I changed the scripts to point to the conda installed python instead (after many more elegant tests which did not work)

A Triton unit test segfaults now, but this is with an old version of triton, I think that spending time on that is ill-advised if we can agree on the next-next-PR (triton2) and land all of them instead

Before submitting

Did you have fun?
- Make sure you had fun coding 🙃
Did you read the contributor guideline?
Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
Did you make sure to update the docs?
Did you write any new necessary tests?
Did you update the changelog? (if needed)

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

blefaudeux · 2022-04-19T21:25:02Z

@fmassa @dianaml0 this goes with the triton2 PR, the plan was to land all three in one go

fmassa

LGTM, thanks a lot!

Another approach we could follow in the future would be to use the docker images from PyTorch, which comes with a number of things already packed in

That being said, the PyTorch team will at some point in the future provide a set of helper functions so that dealing with all of this would be simpler, so we will probably just use that when they get available.

blefaudeux · 2022-04-20T18:28:07Z

LGTM, thanks a lot!

Another approach we could follow in the future would be to use the docker images from PyTorch, which comes with a number of things already packed in

That being said, the PyTorch team will at some point in the future provide a set of helper functions so that dealing with all of this would be simpler, so we will probably just use that when they get available.

Oh that would be great for the docker image ! I had a quick look when penning this one and it seemed that we had to setup the actual image hosting on top, so I skipped that for now, but if there's an existing image hosted by pytorch that would be perfect 😍

dianaml0

This is great, thanks! :)

dianaml0 · 2022-04-20T18:16:11Z

.circleci/config.yml

+      command: |
+        source $BASH_ENV
+        cd docs
+        python3 -m pip install -r requirements.txt


should this be $CONDA_PYTHON also?

…attention mask or not (#266) check the assert

@fmassa

* parent be72b26 author Kashif Rasul <kashif.rasul@gmail.com> 1648069860 +0100 committer Benjamin Lefaudeux <benjamin.lefaudeux@pm.me> 1650256563 -0700 Move to Triton 2 Author: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Benjamin Lefaudeux <benjamin.lefaudeux@pm.me> Tentatively fixing layernorm - faster all around - bugfix better take on sparse tensors, put layout on the correct device update the pip packages, minor cleanup * catering for triton blocksparse being probably more reliable in fp16 * faster layernorm * Minor blocksparse refactoring, update block size restrictions, relax power of two constraint (#277) * Relax device size restrictions * Refactor device creation and run all tests * linting Co-authored-by: Cole Hawkins <colehawk@amazon.com> * code review, thanks @fmassa ! Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: colepshawkins <31542048+colehawkins@users.noreply.github.com> Co-authored-by: Cole Hawkins <colehawk@amazon.com>

codecov-commenter · 2022-04-21T04:02:45Z

Codecov Report

Merging #271 (4ecbec1) into main (549bd42) will decrease coverage by 0.11%.
The diff coverage is 91.30%.

@@            Coverage Diff             @@
##             main     #271      +/-   ##
==========================================
- Coverage   92.81%   92.69%   -0.12%     
==========================================
  Files          61       61              
  Lines        3368     3397      +29     
==========================================
+ Hits         3126     3149      +23     
- Misses        242      248       +6

Flag	Coverage Δ
Python	`92.69% <91.30%> (-0.12%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
xformers/triton/softmax.py	`92.00% <ø> (-0.60%)`	⬇️
xformers/utils.py	`78.43% <ø> (ø)`
xformers/components/multi_head_dispatch.py	`97.87% <77.77%> (-2.13%)`	⬇️
xformers/components/attention/blocksparse.py	`92.18% <80.00%> (-2.02%)`	⬇️
xformers/triton/layer_norm.py	`87.71% <80.00%> (-0.42%)`	⬇️
xformers/components/attention/base.py	`97.36% <100.00%> (+0.14%)`	⬆️
xformers/components/attention/compositional.py	`100.00% <100.00%> (ø)`
xformers/components/attention/favor.py	`100.00% <100.00%> (ø)`
xformers/components/attention/fourier_mix.py	`100.00% <100.00%> (ø)`
xformers/components/attention/global_tokens.py	`92.50% <100.00%> (-7.50%)`	⬇️
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 549bd42...4ecbec1. Read the comment docs.

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 13, 2022

blefaudeux marked this pull request as draft April 13, 2022 05:00

blefaudeux changed the title ~~[WIP] Moving the CI to conda, picking a more modern cuda + pytorch combo~~ [WIP][RFC] Moving the CI to conda, picking a more modern cuda + pytorch combo Apr 13, 2022

blefaudeux changed the title ~~[WIP][RFC] Moving the CI to conda, picking a more modern cuda + pytorch combo~~ [WIP][RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo Apr 13, 2022

testing using conda to get the pytorch nightlies and matching cuda

9cad6bb

blefaudeux force-pushed the conda_ci branch from 77fa95e to 9cad6bb Compare April 18, 2022 04:34

blefaudeux marked this pull request as ready for review April 19, 2022 21:24

fmassa approved these changes Apr 20, 2022

View reviewed changes

blefaudeux changed the title ~~[WIP][RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo~~ [RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo Apr 20, 2022

dianaml0 approved these changes Apr 20, 2022

View reviewed changes

blefaudeux and others added 2 commits April 20, 2022 23:23

[fix] Making it explicit whether the attention mechanism supports an …

e3b57de

…attention mask or not (#266) check the assert

blefaudeux merged commit 498e009 into main Apr 21, 2022

blefaudeux deleted the conda_ci branch April 21, 2022 16:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo #271

[RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo #271

blefaudeux commented Apr 13, 2022

blefaudeux commented Apr 19, 2022 •

edited

Loading

fmassa left a comment

blefaudeux commented Apr 20, 2022 •

edited

Loading

dianaml0 left a comment

dianaml0 Apr 20, 2022

codecov-commenter commented Apr 21, 2022

[RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo #271

[RFC] 1/3 Moving the CI to conda, picking a more modern cuda + pytorch combo #271

Conversation

blefaudeux commented Apr 13, 2022

What does this PR do?

Before submitting

PR review

blefaudeux commented Apr 19, 2022 • edited Loading

fmassa left a comment

Choose a reason for hiding this comment

blefaudeux commented Apr 20, 2022 • edited Loading

dianaml0 left a comment

Choose a reason for hiding this comment

dianaml0 Apr 20, 2022

Choose a reason for hiding this comment

codecov-commenter commented Apr 21, 2022

Codecov Report

blefaudeux commented Apr 19, 2022 •

edited

Loading

blefaudeux commented Apr 20, 2022 •

edited

Loading