Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase default dynamic filter collection limits #17130

Merged
merged 4 commits into from
Apr 26, 2023

Conversation

raunaqmorarka
Copy link
Member

Description

Increase default dynamic filter collection limits

Additional context and related issues

Potentially alternative to #16875

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % assuming coordinator is not becoming a bottleneck now. Might be worth running concurrency benchmarks.

@raunaqmorarka raunaqmorarka force-pushed the large-df branch 9 times, most recently from aff6be3 to bd487e9 Compare April 22, 2023 13:05
@raunaqmorarka raunaqmorarka requested a review from sopel39 April 22, 2023 17:12
@raunaqmorarka
Copy link
Member Author

lgtm % assuming coordinator is not becoming a bottleneck now. Might be worth running concurrency benchmarks.

I've added a couple of small tweaks and made the increases smaller based on benchmarks. PTAL again

@@ -341,9 +341,12 @@ public boolean isAwaitable()
@Override
public TupleDomain<ColumnHandle> getCurrentPredicate()
{
Set<DynamicFilterId> completedDynamicFilters = dynamicFilters.stream()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the trick here? Avoiding stream?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each invocation to getDynamicFilterSummaries was creating a new map of completed dynamic filters and then we would lookup filter in that map.
I just simplified that to lookup in existing map directly.

@@ -418,7 +418,7 @@ private ChannelFilter(
blockTypeOperators.getEqualOperator(type),
blockTypeOperators.getHashCodeOperator(type),
blockBuilder,
Math.min(maxDistinctValues, 8192) * 2,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we switch to higher default, the sizing factor on maxDistinctValues doesn't need to be as aggressive.
The previous value of 8192 also turned out to be good in microbenchmarking but a bit over eager in using memory for actual queries. Not big enough to notice any change in peak memory usage but I decided to make it more conservative anyway.

Avoid constructing RangeView
Resizing blockbuilder is not as expensive as rehashing
or a high fill ratio in TypedSet. So we can be more conservative
in it's sizing to stay under size limits
@raunaqmorarka raunaqmorarka merged commit 1dd8893 into trinodb:master Apr 26, 2023
@raunaqmorarka raunaqmorarka deleted the large-df branch April 26, 2023 03:18
@mosabua
Copy link
Member

mosabua commented Apr 26, 2023

Do we need docs for any of these? Or updates to existing docs?

@raunaqmorarka
Copy link
Member Author

Do we need docs for any of these? Or updates to existing docs?

Some of these properties are mentioned in dynamic filtering docs, we want to avoid advertising these low level knobs as we want users to stick to only using enable-large-dynamic-filters if they need to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants