Pageserver: billing events sent to Vector and S3 should use the same idempotency key #8605

Closed
Daniel-ef opened this issue Aug 5, 2024 · 3 comments · Fixed by #8876
Labels
c/storage/pageserver (Component: storage: pageserver), t/bug (Issue Type: Bug), triaged (bugs that were already triaged)

Comments


Daniel-ef commented Aug 5, 2024

Problem

The idempotency key ensures that an event is recorded only once, even though it may be submitted several times to make sure it is not lost.

We should calculate the idempotency key beforehand and send the same events to S3 and Vector.

Relates (internal issue): https://github.com/neondatabase/cloud/issues/9824

Detail

Consumption metrics for billing are written over a socket to an external service (Vector), and also written to S3 for posterity.

In consumption_metrics.rs, we call two output methods with the same vector of metric values:

  • upload::upload_metrics_bucket
  • upload::upload_metrics_http

Each of these ultimately uses RawMetric::as_event on each metric to add an "idempotency key" to the entry: this enables the billing system to receive delta metrics (e.g. data written since last sample) without risking double-counting on retries.
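To illustrate why a shared key prevents double-counting, here is a minimal sketch of a receiver that deduplicates delta events by idempotency key. The `Ledger` type and its `record` method are hypothetical names made up for this example. They are not part of the pageserver or of the billing system, and the real consumer may deduplicate differently.

```rust
use std::collections::HashSet;

/// Hypothetical receiver that sums delta events and ignores any
/// idempotency key it has already seen. Illustration only.
#[derive(Default)]
struct Ledger {
    seen: HashSet<String>,
    total_bytes: u64,
}

impl Ledger {
    /// Returns true if the event was counted, false if it was a duplicate.
    fn record(&mut self, idempotency_key: &str, delta_bytes: u64) -> bool {
        // `HashSet::insert` returns false when the value was already present.
        if !self.seen.insert(idempotency_key.to_owned()) {
            return false;
        }
        self.total_bytes += delta_bytes;
        true
    }
}
```

With a receiver like this, a retried event carrying the same key is dropped, while the same delta sent under a different key is counted a second time. That second case is what happens if each output computes its own key.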

To ensure the S3 output and the Vector output have the same idempotency key, we need to pull the calculation of the keys up into collect_metrics, and pass those with the RawMetrics into each upload function, so that the uploads aren't independently calculating different keys.
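A rough sketch of the proposed shape, assuming simplified stand-in types: keys are attached once at collection time, and both upload paths only serialize the events they are handed. Every name here (`RawMetric` as a plain struct, `Event`, `make_key`, `assign_keys`, `upload_to_bucket`, `upload_over_http`, the key format, and the metric names in the usage note) is hypothetical and chosen for illustration. None of it reflects the actual signatures in consumption_metrics.rs.

```rust
/// Hypothetical stand-in for a collected metric value (not the real `RawMetric`).
#[derive(Debug, Clone, PartialEq)]
struct RawMetric {
    name: &'static str,
    value: u64,
}

/// A metric paired with the key that every sink must reuse.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    idempotency_key: String,
    name: &'static str,
    value: u64,
}

/// Hypothetical key format: batch timestamp, node id, position in the batch.
fn make_key(node_id: u32, batch_ts: u64, seq: usize) -> String {
    format!("{batch_ts}-{node_id}-{seq:04}")
}

/// Compute the keys once, before any upload happens.
fn assign_keys(metrics: &[RawMetric], node_id: u32, batch_ts: u64) -> Vec<Event> {
    metrics
        .iter()
        .enumerate()
        .map(|(seq, m)| Event {
            idempotency_key: make_key(node_id, batch_ts, seq),
            name: m.name,
            value: m.value,
        })
        .collect()
}

/// Shared serialization used by both stand-in sinks.
fn render(e: &Event) -> String {
    format!("{} {}={}", e.idempotency_key, e.name, e.value)
}

/// Stand-in for the S3 path: serializes the events, computes no keys.
fn upload_to_bucket(events: &[Event]) -> Vec<String> {
    events.iter().map(render).collect()
}

/// Stand-in for the Vector (HTTP) path: same input, same keys.
fn upload_over_http(events: &[Event]) -> Vec<String> {
    events.iter().map(render).collect()
}
```

For example, calling `assign_keys` on a batch containing a `written_bytes_delta` and a `synthetic_size` metric and passing the resulting slice to both functions yields identical keys on both outputs, because neither function generates a key of its own.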

@Daniel-ef Daniel-ef added the c/storage/pageserver Component: storage: pageserver label Aug 5, 2024
@jcsp jcsp added the t/bug Issue Type: Bug label Aug 5, 2024
@jcsp jcsp added the triaged bugs that were already triaged label Aug 15, 2024
Collaborator

jcsp commented Aug 15, 2024

Note: If we can eliminate any cases that rely on counter metrics (bytes written), then we can fix this by just setting a constant idempotency key when sending synthetic size.

Collaborator

jcsp commented Aug 15, 2024

Checking internally if we can get rid of the counter metric that is the motivation for having idempotency keys to begin with https://neondb.slack.com/archives/C061CPK7UQL/p1723737069420979

Collaborator

jcsp commented Aug 29, 2024

Looks like we need the counter metric for the foreseeable future.
