
Prewarm compute nodes #4828

Merged: 3 commits into main on Jul 31, 2023

Conversation

@bojanserafimov (Contributor) commented on Jul 27, 2023

Problem

After deploying the sync safekeepers hot path, total startup improved, but postgres startup p90 in us-east-2 got worse. I'm not sure, but it seems to affect only VMs: that's the only region that has VMs, they take about 10% of starts, and only p90 starts are impacted. Grepping logs also confirms it's mostly VMs.

Now the question is: why does postgres start faster on VMs when we run walproposer.c before it? It's possible that walproposer warms up the VM (a sketch of both mechanisms follows this list):

  • qemu allocates RAM lazily, so walproposer breaks the ice for postgres
  • our binaries are large, and walproposer pulls them into the OS page cache
  • ???
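
To make those two hypotheses concrete, here's a minimal sketch (not from this PR; the 256 MiB size and the binary path are assumptions) of what the warm-up effect amounts to: writing to every page of a fresh allocation forces qemu to actually back it with host RAM, and reading a binary end-to-end pulls it into the OS page cache so a later exec doesn't have to fault it in from disk.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Touch every page of a fresh allocation so the hypervisor backs it with
/// real host RAM (qemu hands guest memory out lazily, on first write).
fn fault_in_pages(bytes: usize) {
    const PAGE: usize = 4096;
    let mut buf = vec![0u8; bytes];
    for i in (0..buf.len()).step_by(PAGE) {
        buf[i] = 1; // one write per page is enough to trigger allocation
    }
    std::hint::black_box(&buf); // keep the writes from being optimized away
}

/// Read a binary end-to-end so a later exec() hits the OS page cache.
fn warm_file_cache(path: &str) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut chunk = [0u8; 1 << 16];
    let mut total = 0u64;
    loop {
        let n = file.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        total += n as u64;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    fault_in_pages(256 * 1024 * 1024); // assumption: enough to matter for p90
    let n = warm_file_cache("/usr/local/bin/postgres")?; // hypothetical path
    println!("pulled {n} bytes of the postgres binary into the page cache");
    Ok(())
}
```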

Summary of changes

When we start compute_ctl in pool mode, I run initdb and postgres once to warm the node up. I'm not sure this will have an effect, but it's easy to test on staging, and it's easy to revert if it doesn't work.
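
Roughly, the prewarm step amounts to something like the sketch below. This is an illustration of the idea, not the PR's actual code; the scratch data directory, the port, and the two-second sleep are all assumptions.

```rust
use std::process::Command;
use std::{fs, thread, time::Duration};

/// Run initdb and a short-lived postgres against a throwaway data directory,
/// purely to fault binaries, shared libraries, and memory into the VM.
fn prewarm_postgres() -> std::io::Result<()> {
    let datadir = "/tmp/prewarm-pgdata"; // scratch dir, discarded afterwards

    // initdb populates the scratch cluster (and drags initdb itself,
    // postgres, and their shared libraries through the OS page cache).
    let status = Command::new("initdb").args(["-D", datadir]).status()?;
    assert!(status.success(), "initdb failed");

    // Start postgres on the scratch cluster, give it a moment to finish
    // startup, then kill it. SIGKILL is fine: the cluster is disposable.
    let mut pg = Command::new("postgres")
        .args(["-D", datadir, "-p", "54329"]) // unused port, assumption
        .spawn()?;
    thread::sleep(Duration::from_secs(2));
    pg.kill()?;
    pg.wait()?;

    fs::remove_dir_all(datadir)?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    prewarm_postgres()
}
```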

@github-actions bot commented on Jul 27, 2023

1240 tests run: 1190 passed, 0 failed, 50 skipped (full report)


@bojanserafimov (Contributor, Author):

Another mystery: why does postgres start time depend so much on the requester? cc @ololobus in case it's obvious to you. Maybe it's a correlation, but idk what it would correlate with.

@ololobus (Member) left a comment:

Wow, that's a peculiar theory :) I'd love to see more proof / investigation, but it should also be fine to just try it.

Review threads: compute_tools/src/bin/compute_ctl.rs (resolved), compute_tools/src/compute.rs (outdated, resolved)
@ololobus (Member):

> Another mystery: why does postgres start time depend so much on the [requester]? cc @ololobus in case it's obvious to you. Maybe it's a correlation, but idk what it would correlate with.

Several correlations you should be aware of:

  • endpoint_api — an explicit API call to start an endpoint, or to create a new endpoint on an existing branch without one (or to add a read-only replica). I'm pretty sure 99% of these calls are ours and come from e2e tests.
  • create_branch — a start on some new timeline. It could be that on start Postgres does some get-page requests, and maybe Pageserver performs better when it's a fresh timeline? (I was expecting the opposite, though.)
  • proxy — since this is staging, these are the only starts that could come from wake-ups for the periodic perf tests we run there. Alexander, Lassi, and Artur all have their own fleets of computes, so an older compute may start slower? (Again, only if Postgres does enough get-page requests.)

Just random thoughts; I can't give any good explanation.

@bojanserafimov (Contributor, Author) commented on Jul 28, 2023

> Wow, that's a peculiar theory :) I'd love to see more proof / investigation, but it should also be fine to just try it.

One more data point: I found out that on staging, walproposer full sync actually runs on endpoint_api and create_branch requests. And that's exactly the group of postgres starts that didn't regress :)

Off topic: for this reason I also switched the "inside pod breakdown" panel to show only proxy requests. IMO it makes sense to de-prioritize create_branch, as it's in the same category as create_project.

@bojanserafimov bojanserafimov marked this pull request as ready for review July 28, 2023 14:33
@bojanserafimov bojanserafimov requested a review from a team as a code owner July 28, 2023 14:33
@ololobus (Member) left a comment:

I don't see any immediate issues with it, if you want to run experiments on staging :)

Review thread: compute_tools/src/compute.rs (resolved)
@bojanserafimov bojanserafimov merged commit ddbe170 into main Jul 31, 2023
@bojanserafimov bojanserafimov deleted the prewarm-compute branch July 31, 2023 18:13
@bojanserafimov (Contributor, Author):

Now that we know this works, let's reopen the question of avoiding binding to VMs that are busy prewarming. Do you think I should add a new ComputeStatus variant for the "prewarming" case, or maybe avoid taking HTTP requests until we're done prewarming? Or just let cplane figure it out (prefer older VMs to newer ones)?

@ololobus (Member) commented on Aug 1, 2023

> Do you think I should add a new ComputeStatus variant for the "prewarming" case

If there is a state before Empty, this will work.
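
For illustration, a hypothetical shape of that state machine — only Empty is taken from the discussion; Prewarming and the other variant names here are assumptions, not the real ComputeStatus definition:

```rust
/// Hypothetical sketch: a state that precedes `Empty` while the throwaway
/// initdb/postgres cycle is still running (the variant set is an assumption).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ComputeStatus {
    Prewarming, // new: node is busy warming up, not bindable yet
    Empty,      // prewarm done, waiting for a spec from the control plane
    Running,
    Failed,
}

/// The control plane should only bind a spec once prewarming is over.
fn can_accept_spec(status: ComputeStatus) -> bool {
    matches!(status, ComputeStatus::Empty)
}

fn main() {
    assert!(!can_accept_spec(ComputeStatus::Prewarming));
    assert!(can_accept_spec(ComputeStatus::Empty));
}
```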

> or maybe avoid taking HTTP requests until we're done prewarming?

That's the easiest one, I guess: just don't start the HTTP server until prewarming is finished.
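
A minimal sketch of that option (the function names and the port are placeholders, not the actual compute_ctl code): the listener doesn't bind until prewarming has finished, so the control plane simply can't reach the node before then.

```rust
use std::net::TcpListener;

/// Placeholder for the prewarm routine sketched under "Summary of changes".
fn prewarm_postgres() -> std::io::Result<()> {
    Ok(())
}

/// Bind and serve only after prewarming: until then, status polls from the
/// control plane fail to connect, so nothing gets bound to this node.
fn serve_http(addr: &str) -> std::io::Result<()> {
    let listener = TcpListener::bind(addr)?;
    for stream in listener.incoming() {
        let _conn = stream?;
        // handle status / spec requests here
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    prewarm_postgres()?; // blocks until the warm-up cycle is done
    serve_http("0.0.0.0:3080") // port is an assumption
}
```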
