Dynamic spawn scheduler prevents remote cache from being populated #7328
Labels
not stale
Issues or PRs that are inactive but not considered stale
P3
We're not considering working on this, but happy to review a PR. (No assignee)
team-Remote-Exec
Issues and PRs for the Execution (Remote) team
type: feature request
When a local action racing a remote action wins, the dynamic spawn scheduler cancels the remote action. As a result, we never populate the remote cache with the results of this cancelled action, and subsequent builds from the same or a different user cannot benefit from reuse.
One option would be to make the dynamic spawn scheduler not cancel remote actions, so that they eventually populate the cache and become reusable by other users and builds. But this would be inefficient, because we'd leave remote actions running even though we know we no longer need their results.
A better solution, suggested by @philwo, would be to delegate this "cancellation" to the remote execution service. Instead of forcibly canceling the action, the dynamic spawn scheduler should tell the remote service that it no longer cares about it. The remote service would then be in charge of deciding whether to keep running the action if resources permit, drop it under pressure, or apply some other policy.
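To make the proposed split of responsibilities concrete, here is a minimal sketch of one possible service-side policy. Everything in it is hypothetical: `RemoteService`, `release_interest`, and the capacity-based eviction rule are illustrative names and choices, not part of Bazel or of any existing remote execution protocol.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteService:
    """Toy stand-in for a remote execution service (hypothetical, not a real API)."""

    capacity: int  # maximum number of actions running at once
    running: dict = field(default_factory=dict)  # action_id -> still wanted by a client?
    cache: set = field(default_factory=set)  # action_ids whose results are cached

    def execute(self, action_id: str) -> None:
        """Start an action, evicting abandoned ones first if we are full."""
        self._make_room()
        self.running[action_id] = True

    def release_interest(self, action_id: str) -> None:
        """Client signal: 'I no longer need this result.' Nothing is cancelled here."""
        if action_id in self.running:
            self.running[action_id] = False

    def finish(self, action_id: str) -> None:
        """An action completed; cache its result even if no client is waiting."""
        if self.running.pop(action_id, None) is not None:
            self.cache.add(action_id)

    def _make_room(self) -> None:
        # Assumed policy: under pressure, drop abandoned actions before refusing work.
        while len(self.running) >= self.capacity:
            abandoned = [a for a, wanted in self.running.items() if not wanted]
            if not abandoned:
                raise RuntimeError("service at capacity")
            del self.running[abandoned[0]]


svc = RemoteService(capacity=2)
svc.execute("a")
svc.release_interest("a")  # the local branch won the race
svc.finish("a")            # spare capacity, so "a" ran to completion and was cached
```

With this policy, an abandoned action still populates the cache when the service has spare capacity, and is the first thing dropped when new work arrives and the service is full.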
Filing under Bazel because we need to propagate this signal to the remote execution engine, which requires code changes and possibly some changes to the protocol. CC @buchgr