Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce emission strategy into CockroachDB Watch API #2120

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

vroldanbet
Copy link
Contributor

@vroldanbet vroldanbet commented Nov 5, 2024

depends on #2122
introduces WatchOptions.EmissionStrategy to handle limitations of CockroachDB changefeeds.

In CRDB, checkpointing for a span only occurs once its catch-up scan completes. As a result, resolved timestamps for the changefeed won't be emitted until all spans have been checkpointed and moved past the initial high-water mark.
Unfortunately, no workarounds exist to make the changefeed emit resolved timestamps during a catch-up scan. This is a known issue on CRL's radar.

This is important because, without checkpoints, it means changes are accumulated in memory, and depending on the use case can lead to increased memory pressure and out-of-memory errors

This commit introduces a new change emission strategy option into WatchOptions:

  • EmitWhenCheckpointedStrategy is the original strategy, which accumulates until a checkpoint is emitted.
  • EmitImmediatelyStrategy is the new strategy that streams changes right away. It's important to note that makes the client responsible of:
    • deduplicating changes
    • ordering revisions
    • accumulate changes until a checkpoint is emitted

Essentially the client is responsible for doing all the work that used to be done with the Change struct helper.

For every other datastore, calling Watch API with EmitImmediatelyStrategy will return an error. We don't consider there is an evident need at the moment for other data stores. Once we've proven this strategy is viable in production workloads, we will study implementing a Watch API strategy that buffers to disk to address out-of-memory issues, and potentially deprecate the emit immediately strategy in favor of one that does all the heavy lifting for the client.

@github-actions github-actions bot added the area/datastore Affects the storage system label Nov 5, 2024
@vroldanbet vroldanbet force-pushed the crdb-emission-strategy branch 4 times, most recently from 24c72be to 1862861 Compare November 6, 2024 13:50
@github-actions github-actions bot added the area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) label Nov 6, 2024
@vroldanbet vroldanbet force-pushed the crdb-emission-strategy branch 4 times, most recently from 72c6109 to de69cc2 Compare November 6, 2024 14:22
@vroldanbet vroldanbet self-assigned this Nov 6, 2024
@vroldanbet vroldanbet force-pushed the crdb-emission-strategy branch from de69cc2 to 384c649 Compare November 6, 2024 16:58
@vroldanbet vroldanbet force-pushed the crdb-emission-strategy branch 2 times, most recently from 611ba5a to d9300de Compare November 6, 2024 19:38
@vroldanbet vroldanbet marked this pull request as ready for review November 6, 2024 19:38
@vroldanbet vroldanbet requested a review from a team November 6, 2024 19:38
@github-actions github-actions bot removed the area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) label Nov 6, 2024
@vroldanbet vroldanbet force-pushed the crdb-emission-strategy branch from d9300de to e51d42d Compare November 6, 2024 20:00
josephschorr
josephschorr previously approved these changes Nov 6, 2024
Copy link
Member

@josephschorr josephschorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

…ockroachDB changefeeds.

In CRDB, checkpointing for a span only occurs once its catch-up scan completes. As a result,
resolved timestamps for the changefeed won't be emitted until all spans have been checkpointed
and moved past the initial high-water mark.
Unfortunately, no workarounds exist to make the changefeed emit resolved timestamps during
a catch-up scan. This is a known issue on CRL's radar.

This is important because, without checkpoints, it means changes are accumulated in
memory, and depending on the use case can lead to increased memory pressure and out-of-memory errors

This commit introduces a new change emission strategy option into WatchOptions:
- `EmitWhenCheckpointedStrategy` is the original strategy, which accumulates until a checkpoint is emitted.
- `EmitImmediatelyStrategy` is the new strategy that streams changes right away.
  It's important to note that makes the client responsible of:
  - deduplicating changes
  - ordering revisions
  - accumulate changes until a checkpoint is emitted

Essentially the client is responsible for doing all the work that used to be done with the `Change` struct helper.

For every other datastore, calling Watch API with `EmitImmediatelyStrategy` will return an error.
We don't consider there is an evident need at the moment for other data stores. Once we've proven
this strategy is viable in production workloads, we will study implementing a Watch API strategy that
buffers to disk to address out-of-memory issues, and potentially deprecate the emit immediately
strategy in favor of one that does all the heavy lifting for the client.
@vroldanbet vroldanbet force-pushed the crdb-emission-strategy branch from cbac7b9 to 6c4ceb2 Compare November 7, 2024 10:32
@vroldanbet vroldanbet enabled auto-merge November 7, 2024 10:32
@vroldanbet vroldanbet added this pull request to the merge queue Nov 7, 2024
Merged via the queue into main with commit ef6e183 Nov 7, 2024
22 checks passed
@vroldanbet vroldanbet deleted the crdb-emission-strategy branch November 7, 2024 11:20
@github-actions github-actions bot locked and limited conversation to collaborators Nov 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/datastore Affects the storage system
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants