Add rate-rejected connection metric #22803

travisdowns

Add connections_rejected_rate_limit, which counts connections rejected
due to the rate limit, analogously to the existing metric which counts
connections rejected due to hitting the open connection limit.

Also includes some additional renaming and clarification of these two connection-related limits.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

Improvements

  • Added the vectorized_kafka_rpc_connections_rejected_rate_limit metric, which counts incoming Kafka connections rejected due to the connection rate limit (if set), analogously to the existing vectorized_kafka_rpc_connections_rejected metric, which counts connections rejected due to hitting the open connection limit.
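
As an illustration of how the two counters could be read side by side, here is a minimal Python sketch that sums them from Prometheus-style text output. Only the two metric names come from this PR; the function name, the `shard` label, and the sample values are made up for the example, and the parser assumes sample lines carry no trailing timestamp.

```python
def rejected_connection_totals(metrics_text):
    """Sum the two rejection counters across all label sets.

    Hypothetical helper: assumes Prometheus text exposition lines of the
    form `name{labels} value` with no trailing timestamp.
    """
    names = {
        "vectorized_kafka_rpc_connections_rejected": "open_limit",
        "vectorized_kafka_rpc_connections_rejected_rate_limit": "rate_limit",
    }
    totals = {"open_limit": 0.0, "rate_limit": 0.0}
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the text after the last space; labels are optional.
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        if name in names:
            totals[names[name]] += float(value)
    return totals


# Made-up sample input; the label and the numbers are illustrative only.
sample = """\
# TYPE vectorized_kafka_rpc_connections_rejected counter
vectorized_kafka_rpc_connections_rejected{shard="0"} 3
vectorized_kafka_rpc_connections_rejected{shard="1"} 1
vectorized_kafka_rpc_connections_rejected_rate_limit{shard="0"} 7
"""
totals = rejected_connection_totals(sample)
```

Keeping the totals separate is the point: a rising rate-limit count and a rising open-limit count call for different tuning.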

Adjust the existing metric text to clarify that it counts connections
rejected for hitting the connection count cap, not the connection
rate limit cap (as we are shortly introducing a metric for the
latter).

Ref CORE-6827.
We can reject connections because we hit two types of limits: the
open connection limit or the rate limit (in the rate limit case
we do try to wait a short period to see if accepting the connection
after the wait would meet the rate, but if not we reject).
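
The two rejection paths described above can be sketched roughly as follows. This is a simplified Python model, not the Redpanda C++ implementation: the class, field names, token-bucket accounting, and the order in which the two limits are checked are all assumptions made for illustration.

```python
import enum
from dataclasses import dataclass


class Verdict(enum.Enum):
    ACCEPTED = "accepted"
    REJECTED_OPEN_LIMIT = "rejected: open limit"
    REJECTED_RATE_LIMIT = "rejected: rate limit"


@dataclass
class ConnectionGate:
    """Hypothetical model of the two limits (names are not from the PR)."""
    max_open: int        # cap on concurrently open connections
    max_rate: float      # allowed new connections per second
    max_wait: float      # longest wait for the rate to be met, in seconds
    open_count: int = 0
    tokens: float = 0.0
    last_refill: float = 0.0
    rejected_open_limit: int = 0   # counter for the open-limit case
    rejected_rate_limit: int = 0   # counter for the rate-limit case

    def _refill(self, now):
        # Accrue tokens for elapsed time, capped at one second's worth
        # (and never below a single token).
        elapsed = max(0.0, now - self.last_refill)
        burst = max(1.0, self.max_rate)
        self.tokens = min(burst, self.tokens + elapsed * self.max_rate)
        self.last_refill = max(self.last_refill, now)

    def try_accept(self, now):
        # Assumed order: check the open connection limit first.
        if self.open_count >= self.max_open:
            self.rejected_open_limit += 1
            return Verdict.REJECTED_OPEN_LIMIT
        self._refill(now)
        if self.tokens < 1.0:
            # How long until accepting would meet the rate?
            wait = (1.0 - self.tokens) / self.max_rate
            if wait > self.max_wait:
                self.rejected_rate_limit += 1
                return Verdict.REJECTED_RATE_LIMIT
            # Model the short wait by advancing the bucket's clock.
            self.last_refill = now + wait
            self.tokens = 1.0
        self.tokens -= 1.0
        self.open_count += 1
        return Verdict.ACCEPTED
```

With `ConnectionGate(max_open=2, max_rate=1.0, max_wait=0.5, tokens=1.0)`, calls at times 0.0, 0.1, 0.7, and 5.0 are accepted, rejected for the rate limit, accepted after a short wait, and rejected for the open limit, so each counter ends at 1.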

The existing naming of variables, log lines, etc. was inconsistent: it
used "declined" for the second case and generally did not distinguish
the two cases.

Bring this into line by using "rejected" as the standard terminology,
with "open limit" and "rate limit" as the two reasons.

Log the address the same way in both cases, including the port.
Add connections_rejected_rate_limit, which counts connections rejected
due to the rate limit, analogously to the existing metric which counts
connections rejected due to hitting the open connection limit.

Fixes CORE-6827.
@travisdowns travisdowns changed the title Td core 6827 rate rejected connections metric Add rate-rejected connection metric Aug 8, 2024
@travisdowns

@mmaslankaprv put you on review here as Vadim did the original changes. Based on the PR it looks like the intent was to add this metric originally (all the plumbing was there) but the metric was never registered. So this just fixes that.

Backport to 24.2, since this is something we sort of need for existing clusters over the next few months, as it is currently a blind spot.

@travisdowns travisdowns merged commit 0495334 into redpanda-data:dev Aug 9, 2024
21 checks passed
@vbotbuildovich

/backport v24.2.x
