
storage controller: use proper ScheduleContext when evacuating a node #9908

Merged: 4 commits merged into main from jcsp/storcon-context-iterator on Nov 29, 2024

Conversation

@jcsp (Collaborator) commented Nov 27, 2024

Problem

When picking locations for a shard, we should use a ScheduleContext that includes all the other shards in the tenant, so that we apply proper anti-affinity between shards. If we don't do this, then it can lead to unstable scheduling, where we place a shard somewhere that the optimizer will then immediately move it away from.

We didn't always do this, because it was a bit awkward to accumulate the context for a tenant rather than just walking tenants.

This was a TODO in handle_node_availability_transition:

                        // TODO: populate a ScheduleContext including all shards in the same tenant_id (only matters
                        // for tenants without secondary locations: if they have a secondary location, then this
                        // schedule() call is just promoting an existing secondary)

This is a precursor to #8264, where the current imperfect scheduling during node evacuation hampers testing.
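To make the anti-affinity concern concrete, here is a minimal, hypothetical sketch (simplified types invented for illustration, not the storage controller's actual ScheduleContext API) of why the scheduler needs the whole tenant's placements in view when choosing a node:

```rust
use std::collections::HashMap;

// NOTE: hypothetical, simplified types for illustration only; the real
// storage controller has its own NodeId and ScheduleContext definitions.
type NodeId = u64;

/// Counts how many shards of the current tenant are already placed on each
/// node, so the scheduler can prefer nodes hosting the fewest siblings
/// (anti-affinity between shards of one tenant).
#[derive(Default)]
struct ScheduleContext {
    shards_per_node: HashMap<NodeId, usize>,
}

impl ScheduleContext {
    fn note_shard_on(&mut self, node: NodeId) {
        *self.shards_per_node.entry(node).or_insert(0) += 1;
    }

    /// Pick the candidate node with the fewest sibling shards already on it.
    fn pick(&self, candidates: &[NodeId]) -> Option<NodeId> {
        candidates
            .iter()
            .copied()
            .min_by_key(|n| self.shards_per_node.get(n).copied().unwrap_or(0))
    }
}

fn main() {
    // Sibling shards of the same tenant already live on nodes 1 and 2.
    let mut ctx = ScheduleContext::default();
    ctx.note_shard_on(1);
    ctx.note_shard_on(2);

    // With the full tenant context, node 3 wins. With an empty context every
    // candidate looks equally good, so we might pick node 1, only for the
    // optimizer to immediately move the shard away again.
    assert_eq!(ctx.pick(&[1, 2, 3]), Some(3));
}
```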

Summary of changes

  • Add an iterator type that yields each shard along with a ScheduleContext that includes all the other shards from the same tenant (a rough sketch follows the list below)
  • Use the iterator to replace hand-crafted logic in optimize_all_plan (functionally identical)
  • Use the iterator in handle_node_availability_transition to apply proper anti-affinity during node evacuation.
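The first item is the core of the change. A rough sketch of the grouping idea, using made-up simplified types (the real code adds an iterator over the controller's own TenantShardId / TenantShard / ScheduleContext types; whether the context includes the shard itself is an implementation detail not shown here):

```rust
use std::collections::BTreeMap;

// NOTE: hypothetical, simplified types for illustration only.
type TenantId = u32;
type ShardNumber = u8;
type NodeId = u64;

struct TenantShard {
    attached: Option<NodeId>,
}

#[derive(Default, Clone)]
struct ScheduleContext {
    /// Nodes already used by this tenant's shards; the scheduler treats them
    /// as less attractive when placing another shard of the same tenant.
    attached_nodes: Vec<NodeId>,
}

/// Pair every shard with a ScheduleContext built from all shards of the same
/// tenant. The map keys are sorted, so each tenant's shards are contiguous;
/// the real change implements this as an iterator rather than two passes.
fn shards_with_context(
    shards: &BTreeMap<(TenantId, ShardNumber), TenantShard>,
) -> Vec<((TenantId, ShardNumber), ScheduleContext)> {
    // First pass: accumulate one context per tenant.
    let mut per_tenant: BTreeMap<TenantId, ScheduleContext> = BTreeMap::new();
    for ((tenant, _), shard) in shards {
        let ctx = per_tenant.entry(*tenant).or_default();
        if let Some(node) = shard.attached {
            ctx.attached_nodes.push(node);
        }
    }

    // Second pass: hand each shard its tenant-wide context.
    shards
        .keys()
        .map(|id| (*id, per_tenant[&id.0].clone()))
        .collect()
}

fn main() {
    let mut shards = BTreeMap::new();
    shards.insert((7, 0), TenantShard { attached: Some(1) });
    shards.insert((7, 1), TenantShard { attached: Some(2) });
    shards.insert((9, 0), TenantShard { attached: None });

    for (id, ctx) in shards_with_context(&shards) {
        println!("{id:?} -> avoid {:?}", ctx.attached_nodes);
    }
}
```

With something like this in place, both optimize_all_plan and the evacuation path in handle_node_availability_transition can call schedule() for each shard with its tenant-wide context instead of an empty one, which is what the TODO above asked for.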

@jcsp jcsp requested a review from VladLazar November 27, 2024 13:09
@jcsp jcsp force-pushed the jcsp/storcon-context-iterator branch from 66f7e23 to d70d741 on November 27, 2024 13:09
@VladLazar (Contributor) left a comment

Nice!

Review threads (resolved): storage_controller/src/service.rs, storage_controller/src/scheduler.rs

github-actions bot commented Nov 27, 2024

6967 tests run: 6659 passed, 0 failed, 308 skipped (full report)


Flaky tests (4)

Postgres 17

Code coverage* (full report)

  • functions: 30.6% (7988 of 26068 functions)
  • lines: 48.6% (63486 of 130635 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
252d6b5 at 2024-11-29T13:40:59.579Z

@jcsp jcsp marked this pull request as ready for review November 28, 2024 10:21
@jcsp jcsp requested a review from a team as a code owner November 28, 2024 10:21
@jcsp jcsp requested a review from arpad-m November 28, 2024 10:21
@jcsp jcsp enabled auto-merge November 28, 2024 10:22
@jcsp jcsp removed the request for review from arpad-m November 28, 2024 10:22
@jcsp jcsp added this pull request to the merge queue Nov 29, 2024
Merged via the queue into main with commit ea3798e Nov 29, 2024
81 checks passed
@jcsp jcsp deleted the jcsp/storcon-context-iterator branch November 29, 2024 13:28
awarus pushed a commit that referenced this pull request Dec 5, 2024