
Add table stats cache #13047

Merged: findepi merged 5 commits into trinodb:master from findepi/lxynov/caching-tablestats on Jul 21, 2022
Conversation

@findepi (Member) commented Jun 30, 2022

#12196 with most comments applied

Already reviewed at #12196

cc @lxynov @alexjo2144 @sopel39 @losipiuk

@cla-bot added the cla-signed label on Jun 30, 2022
@findepi (Member, Author) commented Jun 30, 2022

#12196 with most comments applied

This one I didn't apply: #12196 (comment)

Already reviewed at #12196

Review isn't required here.

@findepi mentioned this pull request on Jun 30, 2022
lxynov added 2 commits on June 30, 2022 at 12:37:

- Before the change, the planner cached table stats within an `IterativeOptimizer` run (as part of `Memo`). After the change, there is another cache that spans the whole optimization process.
- The overloads were introduced only temporarily.
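A minimal, self-contained sketch of that lifecycle, with simplified stand-in types (the real Trino classes and the `PlanOptimizer` signatures carry more arguments than shown): one query-scoped caching provider is created up front and reused across every optimizer pass, so repeated stats requests for the same table reach the connector only once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Stand-ins for the io.trino.metadata / io.trino.cost types; names are illustrative only.
final class QueryScopedStatsCacheSketch
{
    record TableStatistics(double rowCount) {}

    static final class CachingTableStatsProvider
    {
        // Stand-in for metadata.getTableStatistics(session, tableHandle)
        private final Function<String, TableStatistics> metadataLookup;
        private final Map<String, TableStatistics> cache = new HashMap<>();

        CachingTableStatsProvider(Function<String, TableStatistics> metadataLookup)
        {
            this.metadataLookup = metadataLookup;
        }

        TableStatistics getTableStatistics(String table)
        {
            // Query-scoped cache: later optimizer passes reuse earlier lookups
            return cache.computeIfAbsent(table, metadataLookup);
        }
    }

    public static void main(String[] args)
    {
        AtomicInteger connectorCalls = new AtomicInteger();
        CachingTableStatsProvider tableStatsProvider = new CachingTableStatsProvider(table -> {
            connectorCalls.incrementAndGet();
            return new TableStatistics(1_000);
        });

        // Before this change, each IterativeOptimizer run kept its own Memo-scoped cache;
        // here the same provider is shared by every "pass" over the plan for one query.
        for (int pass = 0; pass < 3; pass++) {
            tableStatsProvider.getTableStatistics("orders");
        }
        System.out.println("connector stats calls: " + connectorCalls.get()); // prints 1
    }
}
```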
@findepi force-pushed the findepi/lxynov/caching-tablestats branch from cbe6f26 to 1b94f09 on June 30, 2022
@findepi (Member, Author) commented Jun 30, 2022

This one I didn't apply: #12196 (comment)

Per my comment in the original PR (dc8228a#r911207504), I decided to keep the current code as-is.

@findepi (Member, Author) commented Jul 1, 2022

test (plugin/trino-iceberg)

Error:  java.lang.OutOfMemoryError: Java heap space
Error:  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "VM Pause Meter"
Error:  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "http-worker-8680"
Error:  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "VM Pause Meter"
Error:  Terminating due to java.lang.OutOfMemoryError: Java heap space

 PlanNodeStatsEstimate calculateStats(
         PlanNode node,
         StatsProvider sourceStats,
         Lookup lookup,
         Session session,
-        TypeProvider types);
+        TypeProvider types,
+        TableStatsProvider tableStatsProvider);
@sopel39 (Member) commented Jul 18, 2022

Why does TableStatsProvider need to be a first-class citizen in StatsCalculator? It seems it should be local to TableScanStatsRule.

@findepi (Member, Author) replied

TableScanStatsRule is effectively a singleton. CachingTableStatsProvider's lifecycle is "during plan optimization", so it's query-scoped. We need to pass the TableStatsProvider instance created in the optimizer back to the TableScanStatsRule rule.

The other option would be to have stats rules query-scoped and create a new StatsCalculator for each query, but this is IMO generally not a good idea.
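A compact illustration of that constraint, using hypothetical stand-in types rather than the actual Trino classes: because the rule instance is shared by all queries, the query-scoped provider has to arrive as a `doCalculate` argument instead of being stored in a field of the rule.

```java
import java.util.Optional;

final class SingletonStatsRuleSketch
{
    record TableHandle(String name) {}
    record TableStatistics(double rowCount) {}

    interface TableStatsProvider
    {
        TableStatistics getTableStatistics(TableHandle table);
    }

    // Effectively a singleton: it must not hold any query-scoped state in fields.
    static final class TableScanStatsRule
    {
        Optional<TableStatistics> doCalculate(TableHandle table, TableStatsProvider tableStatsProvider)
        {
            // The provider was created by the optimizer for the current query and is passed in per call.
            return Optional.ofNullable(tableStatsProvider.getTableStatistics(table));
        }
    }
}
```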

@sopel39 (Member) commented Jul 19, 2022

> The other option would be to have stats rules query-scoped and create a new StatsCalculator for each query, but this is IMO generally not a good idea.

Why? Maybe we can just preserve cached stats across planner/optimizer rules?

The root cause here is really that we have a lot of iterative optimizer instances. If we just had one, then there would be no caching problem.

Anyway, I'm more in favor of preserving cached stats (maybe with weak references) across optimizer executions rather than adding artificial components to a clean interface. This is a similar issue to caching expression optimizer/analyzer results.

@findepi (Member, Author) replied

> Why? Maybe we can just preserve cached stats across planner/optimizer rules?

That's what the PR is doing.

> The root cause here is really that we have a lot of iterative optimizer instances. If we just had one, then there would be no caching problem.

I agree. Memo would do all of the work for now. We can remove this new concept once we get to that stage.

> Anyway, I'm more in favor of preserving cached stats (maybe with weak references) across optimizer executions rather than adding artificial components to a clean interface. This is a similar issue to caching expression optimizer/analyzer results.

I am not following. @sopel39, what are you suggesting?

@sopel39 (Member) replied

> I am not following. @sopel39, what are you suggesting?

I suggest introducing some kind of context to doCalculate rather than adding another explicit parameter. I think @gaurav8297 has a similar proposal for expression optimizer/analyzer results.

@gaurav8297 (Member) commented Jul 19, 2022

Yes, I've implemented it as part of this PR: #12016

Basically, like we have io.trino.sql.planner.iterative.Rule.Context for iterative rules, we could have the same thing for StatsRule.
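A hypothetical sketch of that alternative (not what this PR implements): bundle the per-query inputs into a single context object, analogous to `io.trino.sql.planner.iterative.Rule.Context`, so that adding a new input such as the table stats provider does not require another `doCalculate` parameter. All names below are illustrative.

```java
// Empty marker interfaces standing in for the real Trino types.
interface StatsProvider {}
interface Lookup {}
interface Session {}
interface TypeProvider {}
interface TableStatsProvider {}

// Hypothetical context for stats rules, mirroring the shape of Rule.Context.
interface StatsCalculationContext
{
    StatsProvider sourceStats();
    Lookup lookup();
    Session session();
    TypeProvider types();
    TableStatsProvider tableStatsProvider();
}

// A rule would then be called as doCalculate(node, context) instead of receiving
// six separate arguments, and future inputs extend the context rather than the signature.
```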

@findepi (Member, Author) replied

> I suggest introducing some kind of context to doCalculate rather than adding another explicit parameter.

We still need to pass a parameter.

I agree we could introduce a more generic context as well. I don't see a need for it here yet; it can be a follow-up.

@@ -55,13 +51,13 @@ public Pattern<TableScanNode> getPattern()
     }

     @Override
-    protected Optional<PlanNodeStatsEstimate> doCalculate(TableScanNode node, StatsProvider sourceStats, Lookup lookup, Session session, TypeProvider types)
+    protected Optional<PlanNodeStatsEstimate> doCalculate(TableScanNode node, StatsProvider sourceStats, Lookup lookup, Session session, TypeProvider types, TableStatsProvider tableStatsProvider)
A reviewer (Member) commented

Why can't StatsProvider handle caching? It does already, right? Something doesn't add up here.

@findepi (Member, Author) replied

That was my initial idea as well. See #12196 (comment).

@findepi requested a review from sopel39 on July 19, 2022
private final Metadata metadata;
private final Session session;

private final Map<TableHandle, TableStatistics> cache = new WeakHashMap<>();
A reviewer (Member) commented

Nit: why do you need WeakHashMap if CTSP is created per query? Do you expect that TableHandle objects will be GCed often while the query is running? It is possible for sure, but I am not sure it happens often in practice.

@findepi (Member, Author) replied

During optimizations we can create a large number of TableHandle objects (e.g. subsequent pushdowns), and old TableHandle objects may not be reachable in the plan. I don't think "large number" would be a common situation (there isn't an unlimited number of pushdown opportunities for a given query), but I still don't think we should be using strong references here.
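A minimal sketch consistent with the fields shown above and with this rationale, using simplified stand-in types (Session is omitted, and the handle type is a hypothetical identity-based class): statistics are keyed by TableHandle in a WeakHashMap, so a handle that was superseded during pushdown and is no longer referenced from the plan can be garbage-collected along with its cached entry.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class WeakTableStatsCacheSketch
{
    // Identity-based stand-in for io.trino.metadata.TableHandle (no equals/hashCode override).
    static final class TableHandle
    {
        final String connectorHandle;

        TableHandle(String connectorHandle)
        {
            this.connectorHandle = connectorHandle;
        }
    }

    record TableStatistics(double rowCount) {}

    // Stand-in for the relevant slice of io.trino.metadata.Metadata.
    interface Metadata
    {
        TableStatistics getTableStatistics(TableHandle handle);
    }

    static final class CachingTableStatsProvider
    {
        private final Metadata metadata;
        // Weak keys: an entry survives only while something else (e.g. the current plan)
        // still holds a strong reference to its TableHandle.
        private final Map<TableHandle, TableStatistics> cache = new WeakHashMap<>();

        CachingTableStatsProvider(Metadata metadata)
        {
            this.metadata = metadata;
        }

        TableStatistics getTableStatistics(TableHandle handle)
        {
            TableStatistics stats = cache.get(handle);
            if (stats == null) {
                stats = metadata.getTableStatistics(handle);
                cache.put(handle, stats);
            }
            return stats;
        }
    }
}
```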

@losipiuk (Member) left a comment

LGTM. Why not merge commits together?

@findepi (Member, Author) commented Jul 21, 2022

> LGTM. Why not merge commits together?

The commits were separated per my request in #12196; this is to separate interesting logical changes from mechanical ones.

@findepi (Member, Author) commented Jul 21, 2022

This is an important change (see #11708 and #13198 for some rationale), and the idea to introduce a context seems optional to me; it can definitely be taken care of as a follow-up if needed. Let me merge this as is; hopefully no one feels bad about that.

@findepi merged commit 53bd064 into trinodb:master on Jul 21, 2022
@findepi deleted the findepi/lxynov/caching-tablestats branch on Jul 21, 2022
@findepi mentioned this pull request on Jul 21, 2022
@github-actions added this to the 391 milestone on Jul 21, 2022