buildBazelPackage's fetch derivation is error prone #224446

Open
uri-canva opened this issue Apr 3, 2023 · 0 comments
Labels
0.kind: bug Something is broken

uri-canva commented Apr 3, 2023

buildBazelPackage has two derivations in it: a fixed output derivation that fetches the dependencies of the bazel target, and a regular derivation that builds the bazel target. The build derivation is not allowed to download anything, so the fetch derivation must have downloaded everything the build derivation will need. Unfortunately, bazel has no well-defined concept of an entity that only downloads files: repository rules both download files and generate files based on configuration, mixing downloading and configuring in one step. This causes the following issues:

  1. Depending on whether you configure the fetch in the same way as the build, you'll either overfetch, downloading and configuring more than necessary, or you'll require a different fixed output hash for each permutation of configuration options. This is especially tricky given that configuration of repository rules can depend on which other repository rules have been configured already, which means that the configuration options don't compose.
  2. Authors need to remove non-deterministic files from the output of the fetch derivation. Sometimes these files are the result of configuration, so they still need to be regenerated in the build derivation.
  3. The fetch can contain configuration results that differ depending on OS or cpu architecture. This requires either removing them or specifying different fixed output hashes for each supported OS/cpu combination.
  4. Some rules mix downloading and configuring by making what they download depend on the configuration. That makes it much more difficult to avoid having multiple fixed output hashes.
  5. Having multiple fixed output hashes makes them very hard to update: not everyone has ready access to environments in which to build the fixed output derivations and derive those hashes, especially for OSes that are hard to virtualise (macOS) and for non-native architectures (the fetch derivation is still quite CPU-intensive, and emulating x86 on arm or vice versa is quite slow).
  6. All these points apply to both direct and transitive dependencies, which makes it all even harder, as the authors of the derivations are likely familiar with the source of the target they're trying to write the derivation for, but each level of dependency indirection decreases the likely familiarity.
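
To make the hash explosion in points 1, 3 and 5 concrete, here is a minimal sketch that enumerates the combinations that could each need their own fixed output hash. The system list and the configuration option names are hypothetical, and it assumes the worst case in which the fetch output depends on every axis:

```python
from itertools import product

# Hypothetical axes that can change the contents of the fetch output.
SYSTEMS = ["x86_64-linux", "aarch64-linux", "x86_64-darwin", "aarch64-darwin"]
CONFIG_FLAGS = ["with_cuda", "with_tls", "with_python"]  # made-up option names


def hash_variants(systems, flags):
    """List every (system, enabled flags) pair that could need its own hash.

    Worst case: assumes each system and each on/off flag state changes
    what the repository rules download or generate.
    """
    flag_states = product([False, True], repeat=len(flags))
    return [
        (system, tuple(flag for flag, on in zip(flags, state) if on))
        for system, state in product(systems, flag_states)
    ]


variants = hash_variants(SYSTEMS, CONFIG_FLAGS)
print(len(variants))  # 4 systems * 2**3 flag states
```

Each extra boolean option doubles the number of hashes someone has to be able to rebuild, on every supported system.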

Possible alternatives:

  1. --distdir. This is what we do in bazel's own derivation. We generate a JSON file that captures all the files downloaded during the bazel build, then we use fetchurl to download those files and provide them to bazel via --distdir. This separates the configuration from the downloading, with nix doing all the downloading, and bazel doing all the configuration and building in a single invocation. It's harder to support though: our current implementation isn't meant to be generic, and it cannot support repository rules that do their own downloading, for example rules_nodejs's yarn_install and rules_python's pip_parse.
  2. --override_repository. This is what we do when we need to patch the repositories. We define the repository in nix, as a derivation, and then pass it into bazel, replacing the download / configuration logic in bazel with our own. Depending on the implementation of the rule we're overriding, this can avoid the download / configuration happening at all, but even when it can't, we can easily delete the files. This would require a lot of effort, as it would require creating a bazel repository package set like we have for python and haskell packages. See https://discourse.nixos.org/t/nixpkgss-current-development-workflow-is-not-sustainable/18741 for some issues that arise from doing that.
  3. Something something bzlmod?
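
As a rough illustration of the first two alternatives, the sketch below shows the shape of the data flow rather than the nixpkgs implementation. The manifest format (`url`, `sha256`, `data`) and both helper names are hypothetical; in nix the archive contents would come from fetchurl outputs rather than in-memory bytes. It relies on bazel looking up `--distdir` entries by the base name of the download URL, and on `--override_repository` taking a `name=path` pair:

```python
import hashlib
from pathlib import Path
from urllib.parse import urlparse


def populate_distdir(distdir, archives):
    """Copy pre-fetched archives into distdir, named by URL base name.

    `archives` is a hypothetical manifest: a list of dicts with the keys
    "url", "sha256" (hex digest) and "data" (the archive bytes).
    """
    distdir = Path(distdir)
    distdir.mkdir(parents=True, exist_ok=True)
    for entry in archives:
        # Refuse anything whose content does not match the recorded hash.
        digest = hashlib.sha256(entry["data"]).hexdigest()
        if digest != entry["sha256"]:
            raise ValueError(f"hash mismatch for {entry['url']}")
        name = Path(urlparse(entry["url"]).path).name
        (distdir / name).write_bytes(entry["data"])
    return distdir


def bazel_flags(distdir=None, overrides=None):
    """Build the flags that hand externally provided inputs to bazel.

    `overrides` maps a repository name to a directory that replaces it.
    """
    flags = []
    if distdir is not None:
        flags.append(f"--distdir={distdir}")
    for repo, path in sorted((overrides or {}).items()):
        flags.append(f"--override_repository={repo}={path}")
    return flags
```

For example, `bazel_flags("/path/to/distdir", {"rules_foo": "/path/to/rules_foo"})` returns one `--distdir` flag followed by one `--override_repository` flag; the repository name and paths here are placeholders.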

Notify maintainers

@NixOS/bazel
