feat(logger): rate limit based on peer address #1351
Conversation
cargo-shuttle/src/lib.rs (outdated)

Err(err) => {
    debug!(error = %err, "failed to parse logs");

    let message = if line.contains("rate limit") {
I'm not too happy about how I handle this case (string matching), but it should be rare (it will only happen when the application is being rate limited and the user tries to request logs, and it won't always fail). Suggestions on how to improve the error handling here are most welcome.
What about having the error be actual JSON as well? You could deserialize this as an enum (it's either a log line from the app or a control message), and match on that instead.
Thanks for the suggestion! I don't think we can make an enum that matches on either a log line or a control message, since that would break the logs stream for all users who don't upgrade. But just sending a JSON error works: new clients will be able to deserialize the error, and older clients will receive an error that suggests upgrading their CLI.
Here is the error they get in the rare case that the logger client is rate limited + their logs request got rate limited:
Error: 429 Too Many Requests
Message: your application is producing too many logs. Interactions with the shuttle logger service will be rate limited
Left some notes 😄
deployer/src/handlers/error.rs (outdated)

@@ -26,6 +26,8 @@ pub enum Error {
    Internal(#[from] anyhow::Error),
    #[error("Missing header: {0}")]
    MissingHeader(String),
    #[error("{0}. Retry the command in a few minutes")]
Why is rate limiting telling the user to retry a command?
Currently, this is returned if the deployer's logger client is being rate limited and the user calls the get_logs or get_logs_stream endpoints. Since this API is also consumed from the console, instructing the user to run a command is a bit misleading; I can rename it to `Retry the request...`.
        logs.into_inner()
            .log_items
            .into_iter()
            .map(|l| l.to_log_item_with_id(deployment_id))
            .collect(),
    ))
} else {
    Err(Error::NotFound("deployment not found".to_string()))
Is the not found case no longer relevant?
No, I don't think it ever was. The https://github.com/shuttle-hq/shuttle/blob/main/logger/src/lib.rs#L89-L100 call can fail in the following ways:
- It can't extract the claim from the extensions. This shouldn't happen, since we check the claim in the deployer's scoped layers. This would return an internal error.
- The claim is missing the logs scope. This shouldn't happen either, because we check it in the deployer's scoped layer. This would return permission denied. I can add a match for this in the deployer handler; even though it is currently impossible, future changes in the deployer may make it reachable.
- The db query fails. This would return an internal error.
Hm, after thinking about this some more: if the caller has access to this endpoint but the logger client doesn't have access to get_logs, I think it makes sense to keep this as is, returning an internal error to the user from the deployer.
logger/tests/integration_tests.rs (outdated)

let governor_config = GovernorConfigBuilder::default()
    .per_millisecond(500)
    .burst_size(6)
    .use_headers()
    .key_extractor(TonicPeerIpKeyExtractor)
    .finish()
    .unwrap();
In theory we could make changes to the main code (rate limiter) and not catch it in the test because of this duplication. Maybe there should be a helper function in the main code that returns the governor? Or something that adds it to the server?
Ah, yes, I tried to extract the service builder into a utility function, but I couldn't get it to compile because of the complicated types. I tried again now to extract just the config creation into a util, but that was also difficult due to some private types. I will add constants for these values; it's not the cleanest solution, but it should do the trick.
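The constants approach could look roughly like this (the constant names are hypothetical; the builder calls mirror the ones used elsewhere in this PR):

```rust
// Shared in the logger crate so the server and the integration tests
// construct the limiter from the same values.
pub const RATE_LIMIT_REPLENISH_MS: u64 = 500;
pub const RATE_LIMIT_BURST_SIZE: u32 = 6;

// At both call sites:
let governor_config = GovernorConfigBuilder::default()
    .per_millisecond(RATE_LIMIT_REPLENISH_MS)
    .burst_size(RATE_LIMIT_BURST_SIZE)
    .use_headers()
    .key_extractor(TonicPeerIpKeyExtractor)
    .finish()
    .unwrap();
```

If the limits ever change, the test then follows the main code automatically.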
LGTM
LGTM
Description of change
Add rate limiting by peer address to the logger server. This sets the sustained-load LPS limit for projects to 512, while allowing bursts of up to 256 * 6. The rate limiter regenerates a slot every 0.5s.
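As a rough illustration of the burst/replenish math, here is a simplified token-bucket model (a sketch only, not the governor crate's actual implementation) with a burst of 6 slots and one slot regenerated every 500 ms:

```rust
// Simplified token-bucket model: `max` slots can be consumed in a burst,
// and one slot is regenerated every `replenish_ms` milliseconds.
struct Bucket {
    slots: u32,
    max: u32,
    replenish_ms: u64,
    last_ms: u64,
}

impl Bucket {
    fn new(max: u32, replenish_ms: u64) -> Self {
        Bucket { slots: max, max, replenish_ms, last_ms: 0 }
    }

    // `now_ms` is a monotonically increasing timestamp in milliseconds.
    // Returns true if the request is allowed, false if it is rate limited.
    fn try_acquire(&mut self, now_ms: u64) -> bool {
        let refilled = ((now_ms - self.last_ms) / self.replenish_ms) as u32;
        if refilled > 0 {
            self.slots = (self.slots + refilled).min(self.max);
            self.last_ms = now_ms;
        }
        if self.slots > 0 {
            self.slots -= 1;
            true
        } else {
            false
        }
    }
}
```

With these settings, six requests from one peer pass immediately, the seventh is limited, and one slot comes back each half second.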
How has this been tested? (if applicable)
Tested by running projects locally that went over the limit. I also created an integration test to verify the limits work as expected, and the returned tonic status and headers are as expected when rate limiting is applied.