
[v24.1.x] [CORE-7803] Audit Log Manager: Refactoring - use and reduce retries. #23868

Conversation

@oleiman (Member) commented Oct 21, 2024

Backport of PR #23775

Closes #23863

Useful in audit_log_manager as well as transform logging

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
(cherry picked from commit 090f41b)
Conflicts:
  No bazel anywhere
And adds make_batch_of_one

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
(cherry picked from commit 11b8a23)
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
(cherry picked from commit 37414a7)
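For context, here is a minimal, standard-library-only sketch of what a `make_batch_of_one` helper can look like: wrap a single key/value record in a one-element batch so call sites that log one audit or transform record at a time don't build a batch by hand. The `record` and `record_batch` types below are illustrative stand-ins, not Redpanda's actual model types.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-ins for the real record/batch types.
struct record {
    std::optional<std::string> key;
    std::string value;
};

struct record_batch {
    std::vector<record> records;
};

// Build a batch containing exactly one record.
record_batch make_batch_of_one(std::optional<std::string> key, std::string value) {
    record_batch b;
    b.records.push_back(record{std::move(key), std::move(value)});
    return b;
}
```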
@oleiman oleiman added this to the v24.1.x-next milestone Oct 21, 2024
@oleiman oleiman added the kind/backport (PRs targeting a stable branch) label Oct 21, 2024
@github-actions github-actions bot added the area/redpanda and area/wasm (WASM Data Transforms) labels Oct 21, 2024
@oleiman oleiman self-assigned this Oct 21, 2024
@oleiman oleiman marked this pull request as ready for review October 21, 2024 21:35
We need these for sorting out which partitions are locally led

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
(cherry picked from commit 5c7ed51)
Conflicts:
  audit_log_manager.cc: kafka/server/handlers/topics/types.h
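The kind of filtering this enables, sketched with an invented `partition_info` type rather than Redpanda's metadata API: keep only the partitions whose current leader is the local node, so the batcher can bias its round-robin toward them.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Illustrative per-partition leadership info (not Redpanda's metadata types).
struct partition_info {
    int32_t partition_id;
    std::optional<int32_t> leader_node; // nullopt if leadership is unknown
};

// Collect the partitions led by the local node.
std::vector<int32_t> locally_led_partitions(
  const std::vector<partition_info>& partitions, int32_t self_node) {
    std::vector<int32_t> local;
    for (const auto& p : partitions) {
        if (p.leader_node.has_value() && *p.leader_node == self_node) {
            local.push_back(p.partition_id);
        }
    }
    return local;
}
```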
The previous implementation used a very high retry count on the
internal kafka client, which prevented the client from recovering
from certain types of errors.

Instead, we batch up drained records on the manager side, allowing
us to hold a copy of each batch in memory and retry failed produce
calls from "scratch".

This also allows us to be _much_ more aggressive about batching.
The internal kafka client calculates a destination partition for
each record, round-robin style across the topic's partitions.
In the new scheme, we shoot for a maximally sized batch first, then
select a destination, still round-robin style, but biased heavily
toward locally led partitions. In this way, given the default audit
per-shard queue limit and the default max batch size (both 1MiB), the
most common drain operation should result in exactly one produce
request.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
(cherry picked from commit e5fd326)
Conflicts:
  No bazel anywhere
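A standalone sketch of the drain scheme described in this commit message, with invented names throughout (`audit_record`, `pack_batches`, `next_partition`, `drain`, `produce_fn`); it is not the audit_log_manager implementation. Drained records are packed into batches up to a size limit and kept in memory, the destination partition is chosen round-robin with a preference for locally led partitions (a simplification of the heavy bias described above), and a failed produce is retried from scratch with the retained copy instead of relying on deep retries inside the kafka client.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A single drained audit record; payload stands in for the serialized event.
struct audit_record {
    std::string payload;
};

// An in-memory batch retained so a failed produce can be retried from scratch.
struct batch {
    std::vector<audit_record> records;
    size_t bytes = 0;
};

// Pack queued records into batches no larger than max_batch_bytes
// (mirroring the 1MiB default mentioned above).
std::vector<batch>
pack_batches(std::deque<audit_record>& queue, size_t max_batch_bytes) {
    std::vector<batch> out;
    batch current;
    while (!queue.empty()) {
        audit_record rec = std::move(queue.front());
        queue.pop_front();
        if (!current.records.empty()
            && current.bytes + rec.payload.size() > max_batch_bytes) {
            out.push_back(std::move(current));
            current = batch{};
        }
        current.bytes += rec.payload.size();
        current.records.push_back(std::move(rec));
    }
    if (!current.records.empty()) {
        out.push_back(std::move(current));
    }
    return out;
}

// Pick the next destination partition: prefer a locally led partition when
// one is known, otherwise fall back to plain round-robin over all partitions
// (assumes all_partitions is non-empty).
int32_t next_partition(
  const std::vector<int32_t>& all,
  const std::vector<int32_t>& local,
  size_t& rr_cursor) {
    const auto& pool = local.empty() ? all : local;
    return pool[rr_cursor++ % pool.size()];
}

// Caller-supplied produce call; returns true on success.
using produce_fn = std::function<bool(int32_t partition, const batch&)>;

// Drain the queue: one produce call per batch, retrying a failed produce
// from scratch with the retained copy of the batch.
bool drain(
  std::deque<audit_record>& queue,
  const std::vector<int32_t>& all_partitions,
  const std::vector<int32_t>& local_partitions,
  size_t& rr_cursor,
  const produce_fn& produce,
  size_t max_batch_bytes,
  int max_attempts) {
    for (const auto& b : pack_batches(queue, max_batch_bytes)) {
        bool ok = false;
        for (int attempt = 0; attempt < max_attempts && !ok; ++attempt) {
            int32_t p = next_partition(all_partitions, local_partitions, rr_cursor);
            ok = produce(p, b);
        }
        if (!ok) {
            return false; // caller decides what to do with unsent batches
        }
    }
    return true;
}
```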
@oleiman oleiman force-pushed the vbotbuildovich/backport-23775-v24.1.x-221 branch from 30c139c to 7e4fef8 on October 21, 2024 22:22
@vbotbuildovich (Collaborator) commented Oct 22, 2024

@oleiman oleiman merged commit 0ee9a52 into redpanda-data:v24.1.x Oct 22, 2024
17 checks passed
@BenPope BenPope modified the milestones: v24.1.x-next, v24.1.18 Nov 18, 2024