allow repository_rules to run on stale state #23720

dgoldstein0 · 2024-09-23T19:38:57Z

Description of the feature request:

repository_rules are a way to run arbitrary imperative operations during bazel's loading phase. The default is that bazel clears their output folder before each execution of the repository rule - a good default, but it makes some use cases challenging. I propose adding a parameter to repository_rule: clear_output_folder_before_fetch which should be True by default (maintaining current behavior), but could be set to False for use cases that trust their rules to properly handle stale state.

Which category does this issue belong to?

Rules API

What underlying problem are you trying to solve with this feature?

I've written a repository rule using yarn to install node_modules/ which we've integrated with our custom js library rules. Yarn itself has a ton of logic to handle incremental updating of the generated node_modules/, so that yarn install after a small change to the yarn.lock can take a few seconds rather than a minute or more. Since bazel clears the repository rule output folder, either we end up with quite bad performance, or we need quite a bit of extra complexity to put the outputs in a different folder and symlink them to bazel's output folder for the repository rule, so that we only have to create and rewrite symlinks on each repeat invocation and let yarn take care of the rest, as opposed to starting from scratch; (the symlink hacks cost us a few seconds of execution time - and also have caused us to encounter bugs in yarn - so aren't free either, but much better than no workaround)

Likely other "use a package manager or other build system in a repository rule" use cases have this problem as well.

Which operating system are you running Bazel on?

linux - ubuntu 22.04

What is the output of `bazel info release`?

release 6.1.0-1c2df03733503215490047c398d25fe5b5553a0e

If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.

No response

What's the output of `git remote get-url origin; git rev-parse HEAD` ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

The text was updated successfully, but these errors were encountered:

mwoehlke-kitware · 2024-10-09T20:50:23Z

Likely other "use a package manager or other build system in a repository rule" use cases have this problem as well.

Can confirm; ran into this same problem with a repository that creates a Python virtual environment.

Also, repository rules sometimes run even when there has been no change. Why? (In particular, bazel shutdown triggers a re-run.)

Wyverald · 2024-10-09T20:53:57Z

Also, repository rules sometimes run even when there has been no change. Why? (In particular, bazel shutdown triggers a re-run.)

This sounds like a bug. If you could pin it down in a minimal repro case and file an issue, we'd appreciate it!

jwnimmer-tri · 2024-10-09T20:56:21Z

In @mwoehlke-kitware's case we were purposefully setting local = True for other reasons, in which case I believe re-running after a shutdown is the desired / specified behavior and there is no bug.

mwoehlke-kitware · 2024-10-09T20:59:35Z

In @mwoehlke-kitware's case we were purposefully setting local = True ...

Ah, I missed that note in the docs. Regardless, something like the proposed feature would still be useful for us!

I propose adding a parameter to repository_rule: clear_output_folder_before_fetch ...

I'd like to propose an alternative: allow specific files and/or trees to be marked such that they aren't purged when a refetch needs to occur. (Naturally, bazel clean or at least bazel clean --expunge should still nuke these.) This would be more flexible than the original proposal, but marking the whole output tree thusly should be equivalent to the original proposal.

meteorcloudy · 2024-10-29T16:13:09Z

Sounds like interesting idea, @dgoldstein0 if you want to contribute, a simple design doc would be nice!

dgoldstein0 added type: feature request untriaged labels Sep 23, 2024

dgoldstein0 assigned iancha1992, satyanandak and sgowroji Sep 23, 2024

github-actions bot added the team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts label Sep 23, 2024

iancha1992 unassigned sgowroji, iancha1992 and satyanandak Sep 24, 2024

mwoehlke-kitware mentioned this issue Oct 9, 2024

WIP: Use venv for Python dependencies (macOS) RobotLocomotion/drake#21844

Closed

Wyverald added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. and removed team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts labels Oct 9, 2024

meteorcloudy added P2 We'll consider working on this in future. (Assignee optional) help wanted Someone outside the Bazel team could own this and removed untriaged labels Oct 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

allow repository_rules to run on stale state #23720

allow repository_rules to run on stale state #23720

dgoldstein0 commented Sep 23, 2024 •

edited

Loading

mwoehlke-kitware commented Oct 9, 2024

Wyverald commented Oct 9, 2024

jwnimmer-tri commented Oct 9, 2024

mwoehlke-kitware commented Oct 9, 2024

meteorcloudy commented Oct 29, 2024

allow repository_rules to run on stale state #23720

allow repository_rules to run on stale state #23720

Comments

dgoldstein0 commented Sep 23, 2024 • edited Loading

Description of the feature request:

Which category does this issue belong to?

What underlying problem are you trying to solve with this feature?

Which operating system are you running Bazel on?

What is the output of bazel info release?

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

What's the output of git remote get-url origin; git rev-parse HEAD ?

Have you found anything relevant by searching the web?

Any other information, logs, or outputs that you want to share?

mwoehlke-kitware commented Oct 9, 2024

Wyverald commented Oct 9, 2024

jwnimmer-tri commented Oct 9, 2024

mwoehlke-kitware commented Oct 9, 2024

meteorcloudy commented Oct 29, 2024

dgoldstein0 commented Sep 23, 2024 •

edited

Loading

What is the output of `bazel info release`?

If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.

What's the output of `git remote get-url origin; git rev-parse HEAD` ?