Update JAX and fix aarch64-darwin build #219778
There are a few TODOs before this can be merged:
|
I should add that although we may drop support for from-source |
This PR should be ready to go now. The JAX team was kind enough to hook us up with a new snappy release in order to fix the build (s/o @hawkinsp! 🚀). I've tested that jaxlib builds with and without CUDA support. I'm running a nixpkgs-review now... |
cc @NixOS/cuda-maintainers @uri-canva @ndl |
Unfortunately I now do not have access to |
You can run the build on ofborg with fake shas to get the hash of the fetch derivations. |
If we don't have anyone with access to a machine to test it out conveniently I vote that we drop support from the source build and x86_64-darwin users can use the jaxlib-bin package instead. Otherwise the maintenance load is just too high IMHO. |
Feels reasonable. For the sake of documentation, I suggest that if one later wants to reintroduce x86_64-darwin support they also set up a GitHub Actions workflow that runs over a matrix of platforms and tries to update the derivation (version and hashes) via something like nix-update. And I don't expect nix-update to just work here. The whole business of conditional hashes is too burdensome |
We could also fix it so the fetch hashes are not system dependent, though I'm not sure how hard it would be. I'll take a look. Are there any other reasons to keep jax both built from source and as a binary? |
jaxlib-bin comes in handy since the builds are so simple and fast. So eg that makes it a lot easier to support more architectures and lets users more easily override to get different versions of jaxlib if necessary. The full jaxlib source builds are just so looooooong |
snappy is fairly popular, so this actually results in a few hundred rebuilds. I was hoping to run a nixpkgs-review and be done with it, but I actually ran out of disk space on my 200 GB drive. Would someone else be able to nixpkgs-review it? If not, I could create a snappy-only PR targeting staging. |
I hit another error in snappy on
|
Yeah we should probably do the snappy upgrade separately first. |
I can try to nixpkgs-review the snappy PR once you put it up. |
Are there things the source build supports that the binary doesn't? I'm confused why we need both, I thought we had both as a workaround for some issue that prevented us from building from source, but it doesn't sound like it's a transitional thing. Do we still need both? |
I worked around the snappy break with:
diff --git a/pkgs/development/libraries/snappy/default.nix b/pkgs/development/libraries/snappy/default.nix
index 24c3e7bb7df..d33e5755084 100644
--- a/pkgs/development/libraries/snappy/default.nix
+++ b/pkgs/development/libraries/snappy/default.nix
@@ -29,6 +29,8 @@ stdenv.mkDerivation rec {
   nativeBuildInputs = [ cmake ];
+  env.NIX_CFLAGS_COMPILE = "-Wno-sign-compare";
+
   cmakeFlags = [
     "-DBUILD_SHARED_LIBS=${if static then "OFF" else "ON"}"
     "-DSNAPPY_BUILD_TESTS=OFF"
but I'm having trouble getting the diff of the fetch derivation, I'll have to come back to it another time. |
Spun off the snappy change into #221215. @uri-canva could you ptal? We should be able to rebase and merge this once that change lands on master. |
wdym? it built fine for me on aarch64-darwin
Are you proposing removing the source build or removing the binary build? i think the main selling point for the source build is just that nixpkgs generally prefers source builds 🤷 |
Yeah but I want to build it on a different system too so I can diff them. Usually if different systems need different shas it means the deps have system specific files in them, which we can delete and rebuild during the build. Ideally the output of the fetch only has system independent files that were downloaded.
I'm proposing keeping only one of the two, so we don't duplicate the maintenance effort. Ideally it would be the source one, that's why I'm trying to figure out what is causing the diff in the deps shas, so we can develop the derivation for all systems more easily. |
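One way to track down what makes the fetch hashes differ between systems is to compare the two fetched dependency trees file by file. The following is only a minimal sketch in Python, assuming both fetch outputs have already been copied into local directories; `tree_digests` and `diff_trees` are hypothetical helper names, not part of nixpkgs or any tool mentioned in this thread.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        # Skip directories and symlinks; only hash real file contents.
        if path.is_file() and not path.is_symlink():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_trees(a: Path, b: Path) -> dict[str, list[str]]:
    """Report files present on one side only, and files whose contents differ."""
    da, db = tree_digests(a), tree_digests(b)
    return {
        "only_in_a": sorted(da.keys() - db.keys()),
        "only_in_b": sorted(db.keys() - da.keys()),
        "changed": sorted(p for p in da.keys() & db.keys() if da[p] != db[p]),
    }
```

Files that show up under `only_in_a`, `only_in_b`, or `changed` would be the candidates for system-specific content to delete from the fetch output and regenerate during the build.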
Ohhhh this is in regards to
There are still benefits to having the binary build around. It's a lot easier to hack on, and users (like myself) occasionally need to override things in order to upgrade/downgrade versions, play around with settings, etc. Luckily the maintenance burden of the binary build is much lower. |
Right then maybe we should keep only the binary build? |
I would be sympathetic to such a change, but IIUC there are a number of people who really like to have the source build around |
Ok 😢 . |
Note to self: We should prob merge #221390 before this |
Thank you for this updating effort ! I think that we might need to change the
|
@@ -50,21 +50,21 @@ let
  cpuSrcs = {
    "x86_64-linux" = fetchurl {
      url = "https://storage.googleapis.com/jax-releases/nocuda/jaxlib-${version}-cp310-cp310-manylinux2014_x86_64.whl";
cp310-cp310
Should we also account for pythonVersion?
mm not a bad idea....
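To illustrate the suggestion, the wheel's CPython tag could be derived from the interpreter version instead of being hard-coded as cp310. This is a rough sketch in Python, used purely for illustration; `cpython_tag` and `wheel_url` are hypothetical names, and the URL pattern is simply copied from the diff line under review.

```python
def cpython_tag(python_version: str) -> str:
    """Turn a version such as "3.10" (or "3.10.9") into a wheel tag such as "cp310"."""
    major, minor = python_version.split(".")[:2]
    return f"cp{major}{minor}"


def wheel_url(jaxlib_version: str, python_version: str) -> str:
    """Build the nocuda x86_64-linux wheel URL following the pattern in the diff."""
    tag = cpython_tag(python_version)
    return (
        "https://storage.googleapis.com/jax-releases/nocuda/"
        f"jaxlib-{jaxlib_version}-{tag}-{tag}-manylinux2014_x86_64.whl"
    )
```

Each wheel is a different file, so a derivation written along these lines would still need a separate hash for every Python version and platform combination.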
Every time I try to run nixpkgs-review it hangs, apparently on the cvxpy test suite. Not sure why as the cvxpy build does not seem to hang either on this branch or on master. |
I've been continuously suffering from cvxpy tests when running parallel builds, and I haven't quite figured out why. Upstream GitHub Actions runs show that their tests take at most two minutes |
Could you try to update this branch ? I can give it a shot on my machine if you want. |
@samuela if you get bored, I think it's OK to
The failures we're looking for are likely unrelated to cvxpy... |
Glad to hear I'm not alone here!
Brilliant! I'll see how far I can get with that. I really wish nixpkgs-review had an option to skip packages... |
@samuela When trying to build jaxlib 0.4.7 from source I faced a build failure at the It seems to come from this upstream commit. |
Result of 43 packages failed to build:
45 packages built:
|
Failed derivations
|
Doesn't look like any of these failures are JAX version related AFAICT. I propose we go ahead and merge this and then address the python version-specific downloads and 0.4.7/8 upgrade in separate PR(s). |
We're still at |
You mean we're not yet branching on the python version? Or something else? |
No, the ofborg thing? |
Ohhh, I totally forgot about that! Should be addressed in 3fa9f1f. |
Looks like all checks are green now! proceeding to merge |
Description of changes
This PR does a few things:
rev that JAX was building on which was messed up by 0d6a071 which was merged without approval.
jaxlib source build still doesn't work on this platform but that doesn't mean we can use the jaxlib-bin package in its place, esp. considering it's only a checkPhase dependency of jax.
Things done
sandbox = true set in nix.conf? (See Nix manual)
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
./result/bin/)