
storage controller: use proper ScheduleContext when evacuating a node #9908

Merged: 4 commits merged into main from jcsp/storcon-context-iterator on Nov 29, 2024

Conversation

@jcsp (Collaborator) commented Nov 27, 2024

Problem

When picking locations for a shard, we should use a ScheduleContext that includes all the other shards in the tenant, so that we apply proper anti-affinity between shards. If we don't do this, then it can lead to unstable scheduling, where we place a shard somewhere that the optimizer will then immediately move it away from.

We didn't always do this, because it was a bit awkward to accumulate the context for a tenant rather than just walking tenants.

This was a TODO in handle_node_availability_transition:

                        // TODO: populate a ScheduleContext including all shards in the same tenant_id (only matters
                        // for tenants without secondary locations: if they have a secondary location, then this
                        // schedule() call is just promoting an existing secondary)

This is a precursor to #8264, where the current imperfect scheduling during node evacuation hampers testing.
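To make the anti-affinity concern concrete, here is a minimal, hypothetical sketch (simplified types invented for illustration, not the storage controller's actual ScheduleContext API) of why the scheduler needs the whole tenant's placements in view when choosing a node:

```rust
use std::collections::HashMap;

// NOTE: hypothetical, simplified types for illustration only; the real
// storage controller has its own NodeId and ScheduleContext definitions.
type NodeId = u64;

/// Counts how many shards of the current tenant are already placed on each
/// node, so the scheduler can prefer nodes hosting the fewest siblings
/// (anti-affinity between shards of one tenant).
#[derive(Default)]
struct ScheduleContext {
    shards_per_node: HashMap<NodeId, usize>,
}

impl ScheduleContext {
    fn note_shard_on(&mut self, node: NodeId) {
        *self.shards_per_node.entry(node).or_insert(0) += 1;
    }

    /// Pick the candidate node with the fewest sibling shards already on it.
    fn pick(&self, candidates: &[NodeId]) -> Option<NodeId> {
        candidates
            .iter()
            .copied()
            .min_by_key(|n| self.shards_per_node.get(n).copied().unwrap_or(0))
    }
}

fn main() {
    // Sibling shards of the same tenant already live on nodes 1 and 2.
    let mut ctx = ScheduleContext::default();
    ctx.note_shard_on(1);
    ctx.note_shard_on(2);

    // With the full tenant context, node 3 wins. With an empty context every
    // candidate looks equally good, so we might pick node 1, only for the
    // optimizer to immediately move the shard away again.
    assert_eq!(ctx.pick(&[1, 2, 3]), Some(3));
}
```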

Summary of changes

  • Add an iterator type that yields each shard along with a ScheduleContext that includes all the other shards from the same tenant (a rough sketch follows the list below)
  • Use the iterator to replace hand-crafted logic in optimize_all_plan (functionally identical)
  • Use the iterator in handle_node_availability_transition to apply proper anti-affinity during node evacuation.
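The first item is the core of the change. A rough sketch of the grouping idea, using made-up simplified types (the real code adds an iterator over the controller's own TenantShardId / TenantShard / ScheduleContext types; whether the context includes the shard itself is an implementation detail not shown here):

```rust
use std::collections::BTreeMap;

// NOTE: hypothetical, simplified types for illustration only.
type TenantId = u32;
type ShardNumber = u8;
type NodeId = u64;

struct TenantShard {
    attached: Option<NodeId>,
}

#[derive(Default, Clone)]
struct ScheduleContext {
    /// Nodes already used by this tenant's shards; the scheduler treats them
    /// as less attractive when placing another shard of the same tenant.
    attached_nodes: Vec<NodeId>,
}

/// Pair every shard with a ScheduleContext built from all shards of the same
/// tenant. The map keys are sorted, so each tenant's shards are contiguous;
/// the real change implements this as an iterator rather than two passes.
fn shards_with_context(
    shards: &BTreeMap<(TenantId, ShardNumber), TenantShard>,
) -> Vec<((TenantId, ShardNumber), ScheduleContext)> {
    // First pass: accumulate one context per tenant.
    let mut per_tenant: BTreeMap<TenantId, ScheduleContext> = BTreeMap::new();
    for ((tenant, _), shard) in shards {
        let ctx = per_tenant.entry(*tenant).or_default();
        if let Some(node) = shard.attached {
            ctx.attached_nodes.push(node);
        }
    }

    // Second pass: hand each shard its tenant-wide context.
    shards
        .keys()
        .map(|id| (*id, per_tenant[&id.0].clone()))
        .collect()
}

fn main() {
    let mut shards = BTreeMap::new();
    shards.insert((7, 0), TenantShard { attached: Some(1) });
    shards.insert((7, 1), TenantShard { attached: Some(2) });
    shards.insert((9, 0), TenantShard { attached: None });

    for (id, ctx) in shards_with_context(&shards) {
        println!("{id:?} -> avoid {:?}", ctx.attached_nodes);
    }
}
```

With something like this in place, both optimize_all_plan and the evacuation path in handle_node_availability_transition can call schedule() for each shard with its tenant-wide context instead of an empty one, which is what the TODO above asked for.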

@jcsp jcsp requested a review from VladLazar November 27, 2024 13:09
@jcsp jcsp force-pushed the jcsp/storcon-context-iterator branch from 66f7e23 to d70d741 on November 27, 2024 13:09
@VladLazar (Contributor) left a comment

Nice!

Review threads (resolved): storage_controller/src/service.rs, storage_controller/src/scheduler.rs

github-actions bot commented Nov 27, 2024

6967 tests run: 6659 passed, 0 failed, 308 skipped (full report)


Flaky tests (4)

Postgres 17

Code coverage* (full report)

  • functions: 30.6% (7988 of 26068 functions)
  • lines: 48.6% (63486 of 130635 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
252d6b5 at 2024-11-29T13:40:59.579Z

@jcsp jcsp marked this pull request as ready for review November 28, 2024 10:21
@jcsp jcsp requested a review from a team as a code owner November 28, 2024 10:21
@jcsp jcsp requested a review from arpad-m November 28, 2024 10:21
@jcsp jcsp enabled auto-merge November 28, 2024 10:22
@jcsp jcsp removed the request for review from arpad-m November 28, 2024 10:22
@jcsp jcsp added this pull request to the merge queue Nov 29, 2024
Merged via the queue into main with commit ea3798e Nov 29, 2024
81 checks passed
@jcsp jcsp deleted the jcsp/storcon-context-iterator branch November 29, 2024 13:28
awarus pushed a commit that referenced this pull request Dec 5, 2024