Performance overhead on HiveConnector::createDataSource if split preloading kicked in #10173
Comments
Can you put the bloom filter in dynamic filters instead? It does not sound right to put a bloom filter in remaining filter. Where is the bloom filter generated? |
What's the ideal use case for the remaining filter? I am not very familiar with this part of the design in Velox.
It's generated by Spark query planner before creating Velox task. |
Remaining filter is part of the where clause that cannot be converted to tuple domains (key value filters). In your case it is probably easier to wrap the bloom filter in a |
A second way is to inherit compiled expression in |
Or make
|
@zhli1142015 If it is the same in all data sources, recompiling them is a waste of CPU |
I see, makes sense. Then how about storing the |
Reusing them will be tricky. The most straightforward way would be to wrap the bloom filter in a filter object and push it into dynamic filters instead of modeling it as an expression (that seems more logically right as well). Otherwise, keeping it as an expression would require a fairly large change in the connector to factor out the remaining filter compilation (the connector interface might be hard to change to accommodate this). We are not seeing compilation cost anywhere else, so it's a question of whether it is worth doing just for the Spark bloom filter.
Bug description
It's observed in Gluten's use case that HiveConnector::createDataSource slows down data scan when split preload is turned on. In this case a hotspot appeared in filter expression compilation (namely SimpleExpressionEvaluator::compile). The filter expression contains bloom filters, so compilation took much longer than usual, since compiling a bloom filter can be slower than compiling other types of expressions.
When split preloading is turned off, the scan time is shortened by ~6x (~30s with preloading vs ~5s without). The estimated total number of splits is ~200K.
Related code:
velox/velox/exec/TableScan.cpp
Lines 290 to 329 in 3eb9f01
To solve the issue, perhaps the split-preloading procedure could adopt some kind of reuse logic to avoid compiling the expressions every time a split is preloaded.
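As a rough illustration of the reuse idea only, here is a minimal compile-once cache keyed by the expression's source text. Everything in it is hypothetical: CompiledExpr, CompiledExprCache, and getOrCompile are stand-ins invented for this sketch and are not Velox APIs. It also assumes a compiled expression can safely be shared across data sources, which may not hold in Velox as-is.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a compiled filter expression (not a Velox type).
struct CompiledExpr {
  std::string source;
};

// Hypothetical cache: compiles each distinct expression once and hands the
// same compiled object to every caller that asks for it afterwards.
class CompiledExprCache {
 public:
  using Compiler =
      std::function<std::shared_ptr<const CompiledExpr>(const std::string&)>;

  explicit CompiledExprCache(Compiler compiler)
      : compiler_(std::move(compiler)) {}

  // Returns the cached result if present; otherwise compiles and stores it.
  // The lock is held during compilation, so concurrent callers wait instead
  // of compiling the same expression twice.
  std::shared_ptr<const CompiledExpr> getOrCompile(const std::string& source) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = cache_.find(source);
    if (it != cache_.end()) {
      return it->second;
    }
    auto compiled = compiler_(source);
    cache_.emplace(source, compiled);
    return compiled;
  }

  // Number of distinct expressions compiled so far.
  std::size_t size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return cache_.size();
  }

 private:
  Compiler compiler_;
  mutable std::mutex mutex_;
  std::unordered_map<std::string, std::shared_ptr<const CompiledExpr>> cache_;
};
```

With ~200K splits sharing one filter, a cache like this would pay the compilation cost once rather than once per preloaded split, provided the sharing assumption above is valid.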
System information