Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor GraphTraversal to avoid non-determinism #4098

Merged

Conversation

alexkirsz
Copy link
Contributor

@alexkirsz alexkirsz commented Mar 7, 2023

Description

The current design of the Visit trait makes it easy to introduce non-determinism by having map_children return different results depending on the order in which it's called.

For instance, if map_children tries to de-duplicate children (as is the case in ChunkContentVisit):

Order 1:
1. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA, NodeB])
2. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeC])

Order 2:
1. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeB, NodeC])
2. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA])

These two calling orders will result in different generated graphs, as the first order will forget about the edge Node2 -> NodeB, while the second will forget about the edge Node1 -> NodeB.

Instead, this PR makes it so map_children is called after inserting all nodes into the graph. This ensures that a different ordering can't affect the final shape of the graph.

It also refactors the GraphTraversal::visit method and the Visit trait to make it more consistent with graph terminology and (hopefully) easier to understand.

Testing Instructions

This solves an issue on the branch sokra/repro-for-alex.

@vercel
Copy link

vercel bot commented Mar 7, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated
examples-nonmonorepo 🔄 Building (Inspect) Mar 7, 2023 at 1:44PM (UTC)
9 Ignored Deployments
Name Status Preview Comments Updated
examples-basic-web ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
examples-cra-web ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
examples-designsystem-docs ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
examples-kitchensink-blog ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
examples-native-web ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
examples-svelte-web ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
examples-tailwind-web ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
examples-vite-web ⬜️ Ignored (Inspect) Mar 7, 2023 at 1:44PM (UTC)
turbo-site ⬜️ Ignored (Inspect) Visit Preview Mar 7, 2023 at 1:44PM (UTC)

self.chunk_items_count += 1;
}
map_children.push(child);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "else" branch here was the issue: ChunkContentVisit::map_children could return different children depending on whether an asset was already processed or not.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 7, 2023

🟢 CI successful 🟢

Thanks

@alexkirsz alexkirsz requested a review from sokra March 7, 2023 14:42
@alexkirsz alexkirsz marked this pull request as ready for review March 7, 2023 14:42
@alexkirsz alexkirsz requested a review from a team as a code owner March 7, 2023 14:42
@github-actions
Copy link
Contributor

github-actions bot commented Mar 7, 2023

Benchmark for 348ee46

Test Base PR % Significant %
bench_startup/Turbopack SSR/1000 modules 2089.36ms ± 3.17ms 2042.42ms ± 2.67ms -2.25% -1.69%
Click to view full benchmark
Test Base PR % Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules 9023.20µs ± 35.72µs 9084.38µs ± 61.10µs +0.68%
bench_hmr_to_commit/Turbopack RCC/1000 modules 12.48ms ± 0.11ms 12.16ms ± 0.22ms -2.59%
bench_hmr_to_commit/Turbopack RSC/1000 modules 501.43ms ± 0.93ms 506.93ms ± 1.97ms +1.10%
bench_hmr_to_commit/Turbopack SSR/1000 modules 9222.10µs ± 58.94µs 9180.18µs ± 45.63µs -0.45%
bench_hmr_to_eval/Turbopack CSR/1000 modules 7970.77µs ± 38.42µs 7996.23µs ± 29.65µs +0.32%
bench_hmr_to_eval/Turbopack RCC/1000 modules 10.96ms ± 0.24ms 10.55ms ± 0.20ms -3.76%
bench_hmr_to_eval/Turbopack SSR/1000 modules 8258.43µs ± 53.56µs 8185.49µs ± 40.35µs -0.88%
bench_hydration/Turbopack RCC/1000 modules 3562.60ms ± 8.18ms 3544.62ms ± 7.59ms -0.50%
bench_hydration/Turbopack RSC/1000 modules 3199.54ms ± 9.42ms 3226.28ms ± 8.30ms +0.84%
bench_hydration/Turbopack SSR/1000 modules 3301.83ms ± 7.26ms 3288.53ms ± 6.10ms -0.40%
bench_startup/Turbopack CSR/1000 modules 2577.52ms ± 6.03ms 2561.49ms ± 8.81ms -0.62%
bench_startup/Turbopack RCC/1000 modules 2145.08ms ± 5.91ms 2160.01ms ± 4.45ms +0.70%
bench_startup/Turbopack RSC/1000 modules 2086.53ms ± 9.12ms 2112.46ms ± 9.89ms +1.24%
bench_startup/Turbopack SSR/1000 modules 2089.36ms ± 3.17ms 2042.42ms ± 2.67ms -2.25% -1.69%

@alexkirsz alexkirsz merged commit 3f9de96 into main Mar 7, 2023
@alexkirsz alexkirsz deleted the alexkirsz/web-690-chunk_content_internal_parallel-graph branch March 7, 2023 15:20
mehulkar pushed a commit that referenced this pull request Mar 7, 2023
### Description

The current design of the `Visit` trait makes it easy to introduce
non-determinism by having `map_children` return different results
depending on the order in which it's called.

For instance, if `map_children` tries to de-duplicate children (as is
the case in `ChunkContentVisit`):

```
Order 1:
1. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA, NodeB])
2. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeC])

Order 2:
1. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeB, NodeC])
2. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA])
```

These two calling orders will result in different generated graphs, as
the first order will forget about the edge Node2 -> NodeB, while the
second will forget about the edge Node1 -> NodeB.

Instead, this PR makes it so `map_children` is called *after* inserting
all nodes into the graph. This ensures that a different ordering can't
affect the final shape of the graph.

It also refactors the `GraphTraversal::visit` method and the `Visit`
trait to make it more consistent with graph terminology and (hopefully)
easier to understand.

### Testing Instructions

This solves an issue on the branch sokra/repro-for-alex.
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 25, 2024
### Description

The current design of the `Visit` trait makes it easy to introduce
non-determinism by having `map_children` return different results
depending on the order in which it's called.

For instance, if `map_children` tries to de-duplicate children (as is
the case in `ChunkContentVisit`):

```
Order 1:
1. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA, NodeB])
2. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeC])

Order 2:
1. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeB, NodeC])
2. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA])
```

These two calling orders will result in different generated graphs, as
the first order will forget about the edge Node2 -> NodeB, while the
second will forget about the edge Node1 -> NodeB.

Instead, this PR makes it so `map_children` is called *after* inserting
all nodes into the graph. This ensures that a different ordering can't
affect the final shape of the graph.

It also refactors the `GraphTraversal::visit` method and the `Visit`
trait to make it more consistent with graph terminology and (hopefully)
easier to understand.

### Testing Instructions

This solves an issue on the branch sokra/repro-for-alex.
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 29, 2024
### Description

The current design of the `Visit` trait makes it easy to introduce
non-determinism by having `map_children` return different results
depending on the order in which it's called.

For instance, if `map_children` tries to de-duplicate children (as is
the case in `ChunkContentVisit`):

```
Order 1:
1. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA, NodeB])
2. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeC])

Order 2:
1. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeB, NodeC])
2. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA])
```

These two calling orders will result in different generated graphs, as
the first order will forget about the edge Node2 -> NodeB, while the
second will forget about the edge Node1 -> NodeB.

Instead, this PR makes it so `map_children` is called *after* inserting
all nodes into the graph. This ensures that a different ordering can't
affect the final shape of the graph.

It also refactors the `GraphTraversal::visit` method and the `Visit`
trait to make it more consistent with graph terminology and (hopefully)
easier to understand.

### Testing Instructions

This solves an issue on the branch sokra/repro-for-alex.
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Aug 1, 2024
### Description

The current design of the `Visit` trait makes it easy to introduce
non-determinism by having `map_children` return different results
depending on the order in which it's called.

For instance, if `map_children` tries to de-duplicate children (as is
the case in `ChunkContentVisit`):

```
Order 1:
1. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA, NodeB])
2. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeC])

Order 2:
1. (from parent Node2) map_children([NodeB, NodeC]) -> GraphTraversalControlFlow::Continue([NodeB, NodeC])
2. (from parent Node1) map_children([NodeA, NodeB]) -> GraphTraversalControlFlow::Continue([NodeA])
```

These two calling orders will result in different generated graphs, as
the first order will forget about the edge Node2 -> NodeB, while the
second will forget about the edge Node1 -> NodeB.

Instead, this PR makes it so `map_children` is called *after* inserting
all nodes into the graph. This ensures that a different ordering can't
affect the final shape of the graph.

It also refactors the `GraphTraversal::visit` method and the `Visit`
trait to make it more consistent with graph terminology and (hopefully)
easier to understand.

### Testing Instructions

This solves an issue on the branch sokra/repro-for-alex.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants