
perf: tune async batch iterator #3358

Merged 1 commit into main from perf/tune-async-batch-iterator on Jun 18, 2024
Conversation

@kolesnikovae (Collaborator) commented on Jun 17, 2024

It has been observed that the async batch iterator we use for fetching Parquet rows may consume too much memory. Note the query.CloneParquetValues call under iter.(*AsyncBatchIterator[...]).fillBuffer in the flame graph:

[flame graph]

The problem manifests when the query hits downsampled (aggregated) profiles: a row may contain thousands of values. Another factor is misalignment between the query split interval and the block duration: each sub-range is processed independently, with its own iterator, multiplying the memory requirement.

In practice, a large buffer is not required here: it only serves to avoid waiting on fetches from individual columns by reading their data ahead of time. In turn, each column has its own read-ahead buffer, which should minimize blocking of the top-level iterator.

One way to solve the problem is to make the iterator account for size in bytes, giving it a predictable memory footprint. In this PR, I reduce the default buffer size and change the allocation strategy to use the new slices.Grow function.
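For illustration, here is a minimal sketch of the allocation pattern, assuming a reusable batch buffer (the `fillBatch` helper and its types are hypothetical, not the actual PR code). `slices.Grow` (Go 1.21+ standard library) grows capacity only as needed, so the buffer's footprint follows actual batch sizes rather than a fixed worst-case preallocation:

```go
package main

import (
	"fmt"
	"slices"
)

// fillBatch reuses buf across calls: it resets the length to zero,
// then grows capacity just enough for the next n values. Hypothetical
// helper for illustration only.
func fillBatch(buf, src []int64, n int) []int64 {
	buf = slices.Grow(buf[:0], n) // guarantee room for n more appends
	for i := 0; i < n && i < len(src); i++ {
		buf = append(buf, src[i])
	}
	return buf
}

func main() {
	var buf []int64
	buf = fillBatch(buf, []int64{10, 20, 30, 40}, 2)
	fmt.Println(buf, cap(buf) >= 2) // [10 20] true
}
```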

@kolesnikovae requested a review from a team as a code owner on June 17, 2024 08:50
@kolesnikovae force-pushed the perf/tune-async-batch-iterator branch from 5a279f5 to a5d3cdc on June 17, 2024 08:56
@kolesnikovae merged commit 96c3860 into main on Jun 18, 2024 (16 checks passed)
@kolesnikovae deleted the perf/tune-async-batch-iterator branch on June 18, 2024 03:35
@kolesnikovae (Collaborator, Author) commented:

The change has helped to reduce the pressure; however, I still think the memory usage is too wasteful. I'm considering removing the buffer altogether: my experiments have shown no significant impact on performance.

[image]

There are more spots that need to be optimized:

  1. The map that accumulates samples. We could replace it with a slice and use direct indexing (see the sketch after this list).
  2. The intermediate tree (before truncation). This one is tricky: we need to find a way to trim stack traces before building the tree.
  3. In-memory symdb partitions. We should disable chunking for stack traces.
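As a sketch of item 1, assuming sample keys are dense small integers such as stack trace IDs (the `sampleAccumulator` type and `Add` method are hypothetical names, not from the codebase), direct slice indexing can replace the map and avoid hashing and per-entry overhead:

```go
// sampleAccumulator is an illustrative replacement for a
// map[uint32]uint64 that accumulates samples: values[id] holds the
// running total for stack trace id. Assumes IDs are dense and small.
type sampleAccumulator struct {
	values []uint64
}

// Add accumulates v for the given id, growing the backing slice on
// demand (amortized cost, similar to append).
func (a *sampleAccumulator) Add(id uint32, v uint64) {
	if int(id) >= len(a.values) {
		grown := make([]uint64, id+1)
		copy(grown, a.values)
		a.values = grown
	}
	a.values[id] += v
}
```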
