Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pageserver: revert flush backpressure (#8550) #10135

Merged
merged 5 commits into from
Dec 15, 2024
Merged

Conversation

erikgrinaker
Copy link
Contributor

@erikgrinaker erikgrinaker commented Dec 13, 2024

Problem

In #8550, we made the flush loop wait for uploads after every layer. This was to avoid unbounded buildup of uploads, and to reduce compaction debt. However, the approach has several problems:

  • It prevents upload parallelism.
  • It prevents flush and upload pipelining.
  • It slows down ingestion even when there is no need to backpressure.
  • It does not directly backpressure WAL ingestion (only via disk_consistent_lsn), and will build up in-memory layers.
  • It does not directly backpressure based on compaction debt and read amplification.

An alternative solution to these problems is proposed in #8390.

In the meanwhile, we revert the change to reduce the impact on ingest throughput. This does reintroduce some risk of unbounded upload/compaction buildup. Until #8390, this can be addressed in other ways:

  • Use max_replication_apply_lag (aka remote_consistent_lsn), which will more directly limit upload debt.
  • Shard the tenant, which will spread the flush/upload work across more Pageservers and move the bottleneck to Safekeeper.

Touches #10095.

Summary of changes

Remove waiting on the upload queue in the flush loop.

@erikgrinaker erikgrinaker requested a review from arpad-m December 13, 2024 09:29
@erikgrinaker erikgrinaker requested a review from a team as a code owner December 13, 2024 09:29
Copy link

github-actions bot commented Dec 13, 2024

7110 tests run: 6813 passed, 0 failed, 297 skipped (full report)


Flaky tests (7)

Postgres 17

Postgres 16

  • test_physical_replication_config_mismatch_too_many_known_xids: release-arm64

Postgres 15

  • test_pgdata_import_smoke[None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS]: release-arm64
  • test_lr_with_slow_safekeeper: release-x86-64

Code coverage* (full report)

  • functions: 31.4% (8399 of 26729 functions)
  • lines: 48.0% (66605 of 138626 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
9f3f060 at 2024-12-13T14:32:24.415Z :recycle:

@arpad-m arpad-m requested a review from problame December 13, 2024 11:46
Copy link
Member

@arpad-m arpad-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine by me but I want to see if Christian is okay with this (I think he is the one who originally suggested it)

Copy link
Member

@arpad-m arpad-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a few tests that fail, they need fixing.

@erikgrinaker
Copy link
Contributor Author

There is a few tests that fail, they need fixing.

I know, working my way through them.

@erikgrinaker
Copy link
Contributor Author

In #10144, I add parallel layer uploads by only scheduling index uploads (which act as upload queue barriers) every 8 layers.

If we want to derisk this, a halfway solution might be to keep the flush backpressure for now, but only trigger it every time we perform an index upload (every 8 layers) -- this would provide some parallelism and pipelining without building up an unbounded upload queue.

Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving under the condition that this time we actually implement compact debt based backpressure.

Is this in Q1 planning?

I like your plan in #10095

@erikgrinaker
Copy link
Contributor Author

Merging this to get some benchmark runtime.

Is this in Q1 planning?

I took this to be part of "Write path throttling", which is right on the edge of Q1 inclusion. I agree that it's worth bumping this, certainly more so than S3 throttling. cc @jcsp

I like your plan in #10095

Sharding complicates #10095 and can lead to deadlock. I have an updated proposal in #8390.

@erikgrinaker erikgrinaker added this pull request to the merge queue Dec 15, 2024
Merged via the queue into main with commit f3ecd5d Dec 15, 2024
83 checks passed
@erikgrinaker erikgrinaker deleted the erik/revert-flush-wait branch December 15, 2024 09:47
erikgrinaker added a commit that referenced this pull request Jan 3, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jan 3, 2025
This reverts commit f3ecd5d.

It is
[suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759)
to have caused significant read amplification in the [ingest
benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs)
(specifically during index creation).

We will revisit an intermediate improvement here to unblock [upload
parallelism](#10096) before
properly addressing [compaction
backpressure](#8390).
erikgrinaker added a commit that referenced this pull request Jan 3, 2025
This reverts commit f3ecd5d.

It is
[suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759)
to have caused significant read amplification in the [ingest
benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs)
(specifically during index creation).

We will revisit an intermediate improvement here to unblock [upload
parallelism](#10096) before
properly addressing [compaction
backpressure](#8390).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants