backend: add interpreted record replication message type #32

Merged 3 commits on Nov 25, 2024

VladLazar

Add a new replication message type to support the interpreted safekeeper <-> pageserver protocol.
See also neondatabase/neon#9746

This is to be used by the safekeeper -> pageserver protocol.
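
For readers who have not opened the diff, here is a rough sketch of what adding a replication message type can look like on the parsing side. It is not the code from this PR. The `w` and `k` tags and their layouts follow the PostgreSQL streaming replication protocol; the `b'0'` tag, the `InterpretedRecords` variant and its `streaming_lsn` / `commit_lsn` / `data` layout are assumptions made up for illustration.

```rust
// Tags 'w' and 'k' are defined by the PostgreSQL streaming replication protocol.
const XLOG_DATA_TAG: u8 = b'w';
const PRIMARY_KEEPALIVE_TAG: u8 = b'k';
// HYPOTHETICAL tag for the new message type; check the PR diff for the real one.
const INTERPRETED_RECORDS_TAG: u8 = b'0';

#[derive(Debug, PartialEq)]
enum ReplicationMessage {
    XLogData { wal_start: u64, wal_end: u64, timestamp: i64, data: Vec<u8> },
    PrimaryKeepAlive { wal_end: u64, timestamp: i64, reply_requested: bool },
    // HYPOTHETICAL layout: two LSNs followed by an opaque, already-interpreted payload.
    InterpretedRecords { streaming_lsn: u64, commit_lsn: u64, data: Vec<u8> },
}

#[derive(Debug, PartialEq)]
enum ParseError {
    Truncated,
    UnknownTag(u8),
}

/// Reads a big-endian (network order) u64 and advances the cursor.
fn take_u64(buf: &mut &[u8]) -> Result<u64, ParseError> {
    if buf.len() < 8 {
        return Err(ParseError::Truncated);
    }
    let (head, tail) = buf.split_at(8);
    *buf = tail;
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(head);
    Ok(u64::from_be_bytes(bytes))
}

/// Dispatches on the leading tag byte; everything after the fixed header is payload.
fn parse(mut buf: &[u8]) -> Result<ReplicationMessage, ParseError> {
    let (&tag, rest) = buf.split_first().ok_or(ParseError::Truncated)?;
    buf = rest;
    match tag {
        XLOG_DATA_TAG => Ok(ReplicationMessage::XLogData {
            wal_start: take_u64(&mut buf)?,
            wal_end: take_u64(&mut buf)?,
            timestamp: take_u64(&mut buf)? as i64,
            data: buf.to_vec(),
        }),
        PRIMARY_KEEPALIVE_TAG => Ok(ReplicationMessage::PrimaryKeepAlive {
            wal_end: take_u64(&mut buf)?,
            timestamp: take_u64(&mut buf)? as i64,
            reply_requested: *buf.first().ok_or(ParseError::Truncated)? != 0,
        }),
        INTERPRETED_RECORDS_TAG => Ok(ReplicationMessage::InterpretedRecords {
            streaming_lsn: take_u64(&mut buf)?,
            commit_lsn: take_u64(&mut buf)?,
            data: buf.to_vec(),
        }),
        other => Err(ParseError::UnknownTag(other)),
    }
}
```

The point of the sketch is that existing message types are untouched: a receiver that never asks for the new protocol never sees the new tag.
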
@VladLazar VladLazar merged commit 2a2a7c5 into neon Nov 25, 2024
6 checks passed
@VladLazar VladLazar deleted the vlad/interpreted-wal-record-replication-support branch November 25, 2024 14:38
github-merge-queue bot pushed a commit to neondatabase/neon that referenced this pull request Nov 25, 2024
…#9746)

## Problem

For any given tenant shard, pageservers receive all of the tenant's WAL
from the safekeeper.
This soft-blocks us from using larger shard counts, due to bandwidth
concerns and the CPU overhead of filtering out the records.

## Summary of changes

This PR lifts the decoding and interpretation of WAL from the pageserver
into the safekeeper.

A customised PG replication protocol is used where, instead of sending
raw WAL, the safekeeper sends filtered, interpreted records. The receiver
drives the protocol selection, so on the pageserver side, usage of the
new protocol is gated by a new pageserver config:
`wal_receiver_protocol`.
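
As a rough illustration of receiver-driven selection (a sketch, not the code in this PR): the pageserver could parse `wal_receiver_protocol` into an enum that defaults to the existing raw-WAL behaviour, and only attach extra arguments when the new protocol is chosen. The value strings `"vanilla"` and `"interpreted"`, the argument keys, and the `ShardIdentity` fields below are all assumed names.

```rust
use std::str::FromStr;

/// HYPOTHETICAL representation of the `wal_receiver_protocol` config value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WalReceiverProtocol {
    /// Raw WAL, i.e. the existing behaviour.
    Vanilla,
    /// Filtered, interpreted records sent by the safekeeper.
    Interpreted,
}

impl Default for WalReceiverProtocol {
    // The new protocol is disabled by default, so raw WAL is the default.
    fn default() -> Self {
        WalReceiverProtocol::Vanilla
    }
}

impl FromStr for WalReceiverProtocol {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "vanilla" => Ok(WalReceiverProtocol::Vanilla),
            "interpreted" => Ok(WalReceiverProtocol::Interpreted),
            other => Err(format!("unknown wal_receiver_protocol: {}", other)),
        }
    }
}

/// HYPOTHETICAL shard identity: which shard this is, out of how many.
struct ShardIdentity {
    number: u8,
    count: u8,
}

/// Builds the extra arguments the receiver attaches when starting replication.
/// Nothing is injected for raw WAL, so an old safekeeper sees an unchanged request.
fn replication_args(
    protocol: WalReceiverProtocol,
    shard: &ShardIdentity,
) -> Vec<(String, String)> {
    match protocol {
        WalReceiverProtocol::Vanilla => Vec::new(),
        WalReceiverProtocol::Interpreted => vec![
            ("protocol".to_string(), "interpreted".to_string()),
            ("shard_number".to_string(), shard.number.to_string()),
            ("shard_count".to_string(), shard.count.to_string()),
        ],
    }
}
```

Keeping the default on the raw-WAL variant is what makes a gradual rollout possible: the new path is only exercised where the config opts in.
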

In more detail, the changes are:
1. Optionally inject the protocol and shard identity into the arguments
used for starting replication.
2. On the safekeeper side, implement a new WAL sending primitive which
decodes and interprets records before sending them over.
3. On the pageserver side, implement the ingestion of this new
replication message type. It's very similar to what we already have for
raw WAL (minus decoding and interpreting).
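
Steps 2 and 3 can be pictured with the toy sketch below. It is an illustration only, not the PR's implementation: `DecodedRecord`, the modulo-based `owns` rule, and the `ingest` store are hypothetical stand-ins for the real decoding, shard mapping, and ingestion code.

```rust
/// HYPOTHETICAL decoded record: an LSN, the key it touches, and its payload.
#[derive(Debug, Clone, PartialEq)]
struct DecodedRecord {
    lsn: u64,
    key: u64,
    payload: Vec<u8>,
}

/// HYPOTHETICAL shard identity; `count` is assumed to be at least 1.
#[derive(Debug, Clone, Copy)]
struct ShardIdentity {
    number: u8,
    count: u8,
}

impl ShardIdentity {
    /// Toy ownership rule (plain modulo); the real key-to-shard mapping differs.
    fn owns(&self, key: u64) -> bool {
        key % (self.count as u64) == self.number as u64
    }
}

/// Safekeeper side: keep only the records the requesting shard owns,
/// so records for other shards never cross the network.
fn interpret_for_shard(records: &[DecodedRecord], shard: ShardIdentity) -> Vec<DecodedRecord> {
    records
        .iter()
        .filter(|record| shard.owns(record.key))
        .cloned()
        .collect()
}

/// Pageserver side: the batch is already decoded and filtered, so ingestion
/// only stores it. Returns the LSN of the last record ingested, if any.
fn ingest(batch: &[DecodedRecord], store: &mut Vec<(u64, Vec<u8>)>) -> Option<u64> {
    for record in batch {
        store.push((record.lsn, record.payload.clone()));
    }
    batch.last().map(|record| record.lsn)
}
```

With four shards, shard 0 in this toy model would receive only the records whose key is a multiple of four, which is where the bandwidth and CPU savings described under "Problem" come from.
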
 
## Notes
 
* This PR currently uses my [branch of
rust-postgres](https://github.com/neondatabase/rust-postgres/tree/vlad/interpreted-wal-record-replication-support)
which includes the deserialization logic for the new replication message
type. PR for that is open
[here](neondatabase/rust-postgres#32).
* This PR contains changes for both pageservers and safekeepers. It's
safe to merge because the new protocol is disabled by default on the
pageserver side. We can gradually start enabling it in subsequent
releases.
* CI tests are running on #9747
 
## Links

Related: #9336
Epic: #9329
conradludgate pushed a commit that referenced this pull request Nov 28, 2024
This is to be used by the safekeeper -> pageserver protocol.
conradludgate pushed a commit that referenced this pull request Dec 4, 2024
This is to be used by the safekeeper -> pageserver protocol.
conradludgate pushed a commit that referenced this pull request Dec 4, 2024
This is to be used by the safekeeper -> pageserver protocol.