storage/index_state: use chunked_vector #22962
In cases of lots of small indexes the overhead of many of these fragmented_vectors can be quite high. Reduce the overhead by using chunked_vector so the first chunk isn't full length.
Nice. There are other places in which we have [...]. Are these sites also good candidates for a swap to [...]? Follow-up question: is there clear (internal or not) messaging to Redpanda devs anywhere about how the use of [...]
This specific change is less about vector scaling than about the number of vectors. The vectors you linked scale with the number of log_impl instances (which is the number of partitions, right?). The difference between fragmented_vector and chunked_vector is in redpanda/src/v/container/fragmented_vector.h, lines 564 to 572 in 10eb41e.
Generally I recommend chunked_vector as a better default data structure than fragmented_vector, because you don't have to trade off overhead for small vectors against larger chunks for performance at scale.
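To make the tradeoff concrete, here is a minimal toy sketch in C++. It is not Redpanda's implementation: `toy_chunk_list`, `toy_fragmented`, `toy_chunked`, and the 1024-element chunk size are all hypothetical names and values chosen for illustration. The only behavior taken from this thread is that one variant allocates its first chunk at full length, while the other lets the first chunk grow and only uses full-length chunks after that.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy illustration only; NOT Redpanda's fragmented_vector or chunked_vector.
// EagerFirstChunk = true  : the first chunk is reserved at full length.
// EagerFirstChunk = false : the first chunk grows by doubling, capped at
//                           MaxChunkElems; later chunks are full length.
template <typename T, std::size_t MaxChunkElems, bool EagerFirstChunk>
class toy_chunk_list {
    static_assert(MaxChunkElems > 0, "chunk size must be positive");

public:
    void push_back(const T& value) {
        // Start a new chunk when there is none or the last one is full.
        if (_chunks.empty() || _chunks.back().size() == MaxChunkElems) {
            _chunks.emplace_back();
            if (EagerFirstChunk || _chunks.size() > 1) {
                _chunks.back().reserve(MaxChunkElems);
            }
        }
        std::vector<T>& last = _chunks.back();
        if (last.size() == last.capacity()) {
            // Only reachable for a lazily grown first chunk.
            const std::size_t grown
              = last.capacity() == 0 ? 1 : last.capacity() * 2;
            last.reserve(std::min(grown, MaxChunkElems));
        }
        last.push_back(value);
        ++_size;
    }

    const T& operator[](std::size_t i) const {
        assert(i < _size);
        // Every chunk except the last holds exactly MaxChunkElems elements.
        return _chunks[i / MaxChunkElems][i % MaxChunkElems];
    }

    std::size_t size() const { return _size; }

    // Total element slots currently allocated across all chunks.
    std::size_t allocated_elems() const {
        std::size_t total = 0;
        for (const auto& chunk : _chunks) {
            total += chunk.capacity();
        }
        return total;
    }

private:
    std::vector<std::vector<T>> _chunks;
    std::size_t _size = 0;
};

// Hypothetical aliases mirroring the two behaviors discussed in this PR.
template <typename T>
using toy_fragmented = toy_chunk_list<T, 1024, true>;
template <typename T>
using toy_chunked = toy_chunk_list<T, 1024, false>;
```

With three elements pushed into each, the eager variant already holds at least 1024 slots while the lazy one holds only a handful. Multiplied across many small indexes, that difference is the overhead this change targets.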
Again, I recommend chunked_vector everywhere as the default vector type in Redpanda. Vector is only safe if we have a hard limit (with validation) that the length will not grow to our oversized allocation limit (even then you need to make sure that the vector doubling allocation strategy doesn't bite you). We have internal documentation from the perf team here: https://redpandadata.atlassian.net/wiki/spaces/CORE/pages/318275653/Memory+Management+in+Redpanda
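The doubling caveat can be sketched with a small C++ helper. This is an assumption-laden illustration, not Redpanda code: `doubling_capacity` and `contiguous_alloc_within_limit` are hypothetical names, the limit is passed in as a parameter rather than taken from any real configuration, and the model assumes capacity starts at 1 and doubles (the real growth factor of `std::vector` is implementation-defined).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: the capacity a vector would reach after growing to
// hold n elements, assuming it starts at 1 and doubles each time it fills.
constexpr std::size_t doubling_capacity(std::size_t n) {
    if (n == 0) {
        return 0;
    }
    std::size_t cap = 1;
    while (cap < n) {
        cap *= 2;
    }
    return cap;
}

// Checks the single contiguous buffer implied by the doubling model, not
// just max_elems * elem_size, against a caller-supplied byte limit.
constexpr bool contiguous_alloc_within_limit(
  std::size_t max_elems, std::size_t elem_size, std::size_t limit_bytes) {
    return doubling_capacity(max_elems) * elem_size <= limit_bytes;
}
```

Under this model, a hard limit of 1025 eight-byte elements needs only 8200 bytes of payload, but the buffer after the last doubling is 2048 * 8 = 16384 bytes, so validating the element count alone can understate the largest allocation.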
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53228#01917184-c845-4f68-93cc-1d66bd49a973
Thank you, @rockwotj!
/backport v24.2.x
/backport v24.1.x
Backports Required
Release Notes
Improvements