
Optimize the Merkle tree computation #21379

Open
tjgq opened this issue Feb 16, 2024 · 0 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

tjgq commented Feb 16, 2024

The input Merkle tree computation often shows up in profiles holding up the critical path. The --experimental_remote_merkle_tree_cache optimization, which was introduced to alleviate this problem, doesn't work reliably (for reasons explained at #21378).

The current implementation creates an intermediate Java object representation for each node, which must then be converted into a Java protobuf, and then into wire format for hashing and upload. There might be gains to be had in avoiding all of the copying and thrashing involved in these conversions.

To do better, we'd want to iterate the input list in post-order (children before parents) so that each node can be constructed directly as its final wire representation. Then we never need to copy things around, and the only state to keep is the node currently under construction and a (path -> node) map over previously built nodes.
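The bottom-up construction described above can be sketched as follows. This is a minimal illustration, not Bazel's actual code: the serialization format, the `"file:"`/`"dir:"` tags, and the digest encoding are all made up for the example; the real implementation would emit the remote execution API's `Directory` protos in wire format. The key property it demonstrates is that each node is serialized and hashed exactly once, with the only state being a (path -> digest) map over already-built nodes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.HashMap;
import java.util.TreeMap;

public class PostOrderMerkle {

    static String hash(String data) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(data.getBytes(StandardCharsets.UTF_8))) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static String parent(String path) {
        int i = path.lastIndexOf('/');
        return i < 0 ? "" : path.substring(0, i);
    }

    // inputs: relative path (with '/' separators) -> file digest.
    // Returns the digest of the root directory node.
    static String buildRootDigest(Map<String, String> inputs) throws Exception {
        // Directory path -> sorted (child name -> child digest) entries.
        TreeMap<String, TreeMap<String, String>> dirs = new TreeMap<>();
        dirs.put("", new TreeMap<>());
        for (Map.Entry<String, String> e : inputs.entrySet()) {
            String p = e.getKey();
            // Make sure all ancestor directories exist.
            for (String d = parent(p); !d.isEmpty(); d = parent(d)) {
                dirs.putIfAbsent(d, new TreeMap<>());
            }
            dirs.get(parent(p))
                .put(p.substring(p.lastIndexOf('/') + 1), "file:" + e.getValue());
        }
        // Post-order: visit deepest directories first. Reverse lexicographic
        // order works because a child directory's path has its parent's path
        // as a proper prefix and therefore sorts after it.
        Map<String, String> dirDigest = new HashMap<>();
        for (String dir : dirs.descendingKeySet()) {
            // Serialize the node directly from its children's digests.
            StringBuilder node = new StringBuilder();
            for (Map.Entry<String, String> c : dirs.get(dir).entrySet()) {
                node.append(c.getKey()).append('=').append(c.getValue()).append('\n');
            }
            String digest = hash(node.toString());
            dirDigest.put(dir, digest);
            if (!dir.isEmpty()) {
                // Record this directory as an entry in its parent.
                dirs.get(parent(dir))
                    .put(dir.substring(dir.lastIndexOf('/') + 1), "dir:" + digest);
            }
        }
        return dirDigest.get("");
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> inputs = new TreeMap<>();
        inputs.put("a/b/file1", "d1");
        inputs.put("a/file2", "d2");
        System.out.println(buildRootDigest(inputs));
    }
}
```

Note the reverse-lexicographic trick: a topological (children-before-parents) order falls out of the path ordering for free, which is the property the comment hopes to get directly from `SpawnInputExpander#getInputMapping`.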

Ideally, we'd get the input list in the right order from SpawnInputExpander#getInputMapping. But this has other consequences, since the list is cached in memory for reuse, and other callers likely expect it to be sorted alphabetically instead of topologically. So we might end up sorting it twice, undoing some of the gains.

An additional angle to explore is to try to make the computation parallel, perhaps by expressing it as a graph of ForkJoinTasks, one per node; but this requires some pre-work to set up the graph, which is time not spent actually computing things (and the computation of an individual node isn't particularly expensive, so we might need to split the computation at a coarser level).
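The ForkJoinTask idea could look roughly like the sketch below: one `RecursiveTask` per directory node, forking a subtask for each subdirectory and joining their digests before serializing the parent. Everything here is illustrative: the in-memory `Dir` tree, the node encoding, and the `hashCode`-based placeholder digest are stand-ins, not Bazel types. It mainly shows the structural overhead the comment warns about, since the task graph mirrors the directory tree and must be set up before any hashing happens.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ParallelMerkle {

    // Toy in-memory tree: a directory maps child names to either a String
    // (file digest) or a Dir (subdirectory).
    static class Dir {
        final TreeMap<String, Object> children = new TreeMap<>();
    }

    // One task per directory node; subdirectory digests come from subtasks.
    static class DirTask extends RecursiveTask<String> {
        private final Dir dir;

        DirTask(Dir dir) { this.dir = dir; }

        @Override
        protected String compute() {
            // Fork one subtask per subdirectory so siblings hash in parallel.
            Map<String, DirTask> subtasks = new TreeMap<>();
            for (Map.Entry<String, Object> e : dir.children.entrySet()) {
                if (e.getValue() instanceof Dir) {
                    DirTask t = new DirTask((Dir) e.getValue());
                    t.fork();
                    subtasks.put(e.getKey(), t);
                }
            }
            // Serialize this node, joining subdirectory digests as needed.
            StringBuilder node = new StringBuilder();
            for (Map.Entry<String, Object> e : dir.children.entrySet()) {
                String digest = (e.getValue() instanceof Dir)
                    ? subtasks.get(e.getKey()).join()
                    : (String) e.getValue();
                node.append(e.getKey()).append('=').append(digest).append('\n');
            }
            // Placeholder digest for illustration only.
            return Integer.toHexString(node.toString().hashCode());
        }
    }

    static String computeRoot(Dir root) {
        return ForkJoinPool.commonPool().invoke(new DirTask(root));
    }

    public static void main(String[] args) {
        Dir root = new Dir();
        Dir sub = new Dir();
        sub.children.put("file1", "d1");
        root.children.put("a", sub);
        root.children.put("file2", "d2");
        System.out.println(computeRoot(root));
    }
}
```

As the comment notes, per-directory tasks may be too fine-grained to pay for the forking overhead; a coarser split (e.g. one task per top-level subtree) would amortize it better.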

@tjgq tjgq added untriaged team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug labels Feb 16, 2024
@tjgq tjgq changed the title Optimize the Merkle cache computation Optimize the Merkle tree computation Feb 16, 2024
@joeleba joeleba added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Feb 27, 2024
Projects
None yet
Development

No branches or pull requests

2 participants