Epic: revamp pageserver backpressure #8390
Labels: `a/performance` (Area: relates to performance of the system), `c/storage/pageserver` (Component: storage: pageserver), `t/bug` (Issue Type: Bug), `triaged` (bugs that were already triaged)
Comments
skyzh added the `t/bug` and `c/storage/pageserver` labels on Jul 15, 2024.
Let's look over our existing backpressure-related issues and make a plan.
Plan:
Our existing mitigation for L0 compaction (only compact 10 at once) makes us safe.
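For context, a minimal sketch of what "only compact 10 at once" could look like; the names and the `Vec`-based shape are hypothetical, not the actual pageserver compaction code:

```rust
/// Hypothetical cap on how many L0 delta layers a single compaction pass consumes.
const MAX_L0_LAYERS_PER_COMPACTION: usize = 10;

/// Take at most the 10 oldest L0 layers per pass, bounding the size of any one
/// compaction job (the "only compact 10 at once" mitigation mentioned above).
fn pick_l0_compaction_batch<T: Clone>(l0_layers: &[T]) -> Vec<T> {
    l0_layers
        .iter()
        .take(MAX_L0_LAYERS_PER_COMPACTION)
        .cloned()
        .collect()
}
```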
erikgrinaker changed the title from "Epic: pageserver backpressure" to "Epic: revamp pageserver backpressure" on Dec 12, 2024.
github-merge-queue bot pushed a commit that referenced this issue on Dec 15, 2024:
## Problem

In #8550, we made the flush loop wait for uploads after every layer. This was to avoid unbounded buildup of uploads, and to reduce compaction debt. However, the approach has several problems:

* It prevents upload parallelism.
* It prevents flush and upload pipelining.
* It slows down ingestion even when there is no need to backpressure.
* It does not directly backpressure WAL ingestion (only via `disk_consistent_lsn`), and will build up in-memory layers.
* It does not directly backpressure based on compaction debt and read amplification.

An alternative solution to these problems is proposed in #8390. In the meantime, we revert the change to reduce the impact on ingest throughput. This does reintroduce some risk of unbounded upload/compaction buildup. Until #8390, this can be addressed in other ways:

* Use `max_replication_apply_lag` (aka `remote_consistent_lsn`), which will more directly limit upload debt.
* Shard the tenant, which will spread the flush/upload work across more Pageservers and move the bottleneck to Safekeeper.

Touches #10095.

## Summary of changes

Remove waiting on the upload queue in the flush loop.
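For context on the behavior being reverted, a minimal sketch of a flush-loop iteration that blocks on the upload queue after every layer; all names here are hypothetical rather than the actual pageserver API:

```rust
struct UploadQueue;

impl UploadQueue {
    fn schedule_layer_upload(&self, _layer_path: &str) { /* enqueue the layer file */ }
    fn schedule_index_upload(&self) { /* enqueue the index_part.json update */ }
    async fn wait_completion(&self) { /* resolve once the queue has drained */ }
}

/// One iteration of the (now reverted) flush loop: flush a layer, schedule its upload,
/// then wait for the entire upload queue to drain before flushing the next layer.
async fn flush_one_layer(uploads: &UploadQueue, layer_path: &str) {
    // 1. Write the frozen in-memory layer to a delta layer on disk (elided here).
    // 2. Schedule the layer and index uploads.
    uploads.schedule_layer_upload(layer_path);
    uploads.schedule_index_upload();
    // 3. The step that was reverted: waiting after *every* layer serializes flushing
    //    and uploading, prevents upload parallelism, and throttles ingest even when
    //    no backpressure is needed.
    uploads.wait_completion().await;
}
```

Removing the final wait restores flush/upload pipelining, at the cost of reintroducing some risk of unbounded upload and compaction buildup, as described above.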
github-merge-queue bot pushed a commit that referenced this issue on Jan 3, 2025:
This reverts commit f3ecd5d. It is [suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759) to have caused significant read amplification in the [ingest benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs) (specifically during index creation). We will revisit an intermediate improvement here to unblock [upload parallelism](#10096) before properly addressing [compaction backpressure](#8390).
erikgrinaker added a commit that referenced this issue on Jan 3, 2025.
github-merge-queue bot pushed a commit that referenced this issue on Jan 14, 2025:
## Problem

The upload queue currently sees significant head-of-line blocking. For example, index uploads act as upload barriers, and for every layer flush we schedule a layer and index upload, which effectively serializes layer uploads.

Resolves #10096.

## Summary of changes

Allow upload queue operations to bypass the queue if they don't conflict with preceding operations, increasing parallelism.

NB: the upload queue currently schedules an explicit barrier after every layer flush as well (see #8550). This must be removed to enable parallelism. This will require a better mechanism for compaction backpressure, see e.g. #8390 or #5415.
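To illustrate the bypass idea, a minimal sketch with hypothetical operation types and a deliberately simplified conflict rule (not the actual upload queue implementation):

```rust
/// Hypothetical, simplified upload-queue operations.
#[derive(Clone, PartialEq)]
enum UploadOp {
    UploadLayer(String), // upload a layer file by name
    UploadIndex,         // upload index_part.json (references all layers)
    DeleteLayer(String), // delete a layer file by name
    Barrier,             // explicit ordering barrier
}

/// Two operations conflict if they cannot safely run in parallel: index uploads must
/// observe preceding layer uploads, deletes must not race uploads or deletes of the
/// same layer, and barriers order everything. Independent layer uploads don't conflict.
fn conflicts(a: &UploadOp, b: &UploadOp) -> bool {
    use UploadOp::*;
    match (a, b) {
        (Barrier, _) | (_, Barrier) => true,
        (UploadIndex, _) | (_, UploadIndex) => true,
        (UploadLayer(x), DeleteLayer(y)) | (DeleteLayer(x), UploadLayer(y)) => x == y,
        (DeleteLayer(x), DeleteLayer(y)) => x == y,
        (UploadLayer(_), UploadLayer(_)) => false,
    }
}

/// An operation may bypass the queue and start immediately if it conflicts with
/// nothing already in flight or queued ahead of it.
fn can_bypass(op: &UploadOp, ahead: &[UploadOp]) -> bool {
    ahead.iter().all(|prev| !conflicts(op, prev))
}
```

Under this rule, independent layer uploads proceed in parallel while index uploads and barriers still impose ordering.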
Follow-up on https://neondb.slack.com/archives/C03F5SM1N02/p1721058880447979 and #10095.
Updated proposal 2024-12-12 by @erikgrinaker:
Recall the current backpressure mechanism, based on these compute knobs:
- `max_replication_write_lag`: 500 MB (based on Pageserver `last_received_lsn`).
- `max_replication_flush_lag`: 10 GB (based on Pageserver `disk_consistent_lsn`).
- `max_replication_apply_lag`: disabled (based on Pageserver `remote_consistent_lsn`).

If the compute WAL leads by the given thresholds, the compute will inject a 10 ms sleep after every WAL record.
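As a concrete illustration of these thresholds, here is a minimal sketch of the lag checks and the per-record sleep; the types and names are hypothetical, and the real logic lives on the compute side:

```rust
use std::time::Duration;

/// Hypothetical view of the feedback LSNs the compute receives from the Pageserver.
struct PageserverFeedback {
    last_received_lsn: u64,
    disk_consistent_lsn: u64,
    remote_consistent_lsn: u64,
}

/// Hypothetical mirror of the compute knobs described above.
struct BackpressureConfig {
    max_replication_write_lag: u64, // e.g. 500 MB
    max_replication_flush_lag: u64, // e.g. 10 GB
    max_replication_apply_lag: u64, // 0 = disabled (an assumption for this sketch)
}

/// Decide whether to throttle after the current WAL record.
fn backpressure_delay(
    current_wal_lsn: u64,
    fb: &PageserverFeedback,
    cfg: &BackpressureConfig,
) -> Option<Duration> {
    let exceeds = |lsn: u64, limit: u64| limit > 0 && current_wal_lsn.saturating_sub(lsn) > limit;

    if exceeds(fb.last_received_lsn, cfg.max_replication_write_lag)
        || exceeds(fb.disk_consistent_lsn, cfg.max_replication_flush_lag)
        || exceeds(fb.remote_consistent_lsn, cfg.max_replication_apply_lag)
    {
        // While any threshold is exceeded, the compute sleeps 10 ms after every WAL record.
        Some(Duration::from_millis(10))
    } else {
        None
    }
}
```

Each knob compares the compute's current WAL position against a different Pageserver feedback LSN, so each one throttles on a different stage of the ingest pipeline.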
There are three aspects we don't backpressure on, but should:
- `remote_consistent_lsn` is misleading (#10095 comment).
- With sharding, `disk_consistent_lsn` or `remote_consistent_lsn` are misleading, because they don't scale with shard count. 8 shards lagging by 1 GB LSN is very different from 1 shard lagging by 1 GB LSN -- we should bound the outstanding amount of work per shard, not the total outstanding work.

Additionally, the current backpressure protocol has a few issues:
Sketch for a new backpressure protocol: