Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Made client id parsing vcluster aware #18464

Merged
merged 6 commits into from
May 16, 2024

Conversation

mmaslankaprv
Copy link
Member

@mmaslankaprv mmaslankaprv commented May 14, 2024

Previously Redpanda virtualized Kafka connections based on the full
client_id string and the string was used as a client id in all
downstream processing.

Change the parsing logic to add context to the parsed client id.
The client id format expected by Redpanda has the following structure:

[vcluster_id][connection_id][actual client id]

where:

vcluster_id - string encoded XID representing virtual cluster
(20 characters)

connection_id - hex encoded 32 bit integer representing virtual
connection id (8 characters)

client_id - standard protocol defined client id

If Redpanda fails to parse the client id while working with virtualized
connections the whole connection is closed.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

Signed-off-by: Michał Maślanka <michal@redpanda.com>
When using mpx protocol extension actual client id is only a part of the
whole client id buffer sent by MPX to Redpanda. Added a method allowing
overriding client id.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
Copy link
Contributor

@michael-redpanda michael-redpanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, have some questions about input validation

Comment on lines 80 to 98
vcluster_connection_id parse_vcluster_connection_id(std::string_view str) {
vcluster_connection_id cid;
std::stringstream sstream(str.data());
sstream >> std::hex >> cid;
return cid;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be some validation to ensure that the vcluster_connection_id contains only valid hex characters. The streaming operation won't throw any exception if an invalid character is encountered and will simply stop processing.

src/v/kafka/server/connection_context.cc Outdated Show resolved Hide resolved
tests/rptest/tests/connection_virtualizing_test.py Outdated Show resolved Hide resolved
Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple questions but generally lgtm. I agree with @michael-redpanda - some DT tests for junk input might be nice.

src/v/kafka/server/connection_context.cc Outdated Show resolved Hide resolved
src/v/kafka/server/connection_context.h Outdated Show resolved Hide resolved
Previously Redpanda virtualized Kafka connections based on the full
client_id string and the string was used as a client id in all
downstream processing.

Change the parsing logic to add context to the parsed client id.
The client id format expected by Redpanda has the following structure:

```
[vcluster_id][connection_id][actual client id]
```
where:

`vcluster_id` - string encoded XID representing virtual cluster
	        (20 characters)

`connection_id` - hex encoded 32 bit integer representing virtual
                  connection id (8 characters)

`client_id` - standard protocol defined client id

If Redpanda fails to parse the client id while working with virtualized
connections the whole connection is closed.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
Signed-off-by: Michał Maślanka <michal@redpanda.com>
Signed-off-by: Michał Maślanka <michal@redpanda.com>
Replaced `node_hash_map` keeping state of virtualized connections with
`chunked_hash_map`. The change will allow us to avoid large allocations
when dealing with large virtual connection number.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
@vbotbuildovich
Copy link
Collaborator

Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mmaslankaprv mmaslankaprv merged commit 076ddb8 into redpanda-data:dev May 16, 2024
18 checks passed
@mmaslankaprv mmaslankaprv deleted the mpx-client-parsing branch May 16, 2024 06:27
@vbotbuildovich
Copy link
Collaborator

/backport v24.1.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants