-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[native] Add caching of parsed Types #21325
Conversation
2d8fc42
to
17b39fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinwilfong nice catch. Thanks for the optimization!
@@ -61,6 +67,7 @@ class VeloxExprConverter { | |||
const protocol::CallExpression& pexpr) const; | |||
|
|||
velox::memory::MemoryPool* pool_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark pool_ and typeParser_ as consts? Thanks!
@@ -35,7 +37,7 @@ class VeloxQueryPlanConverterBase { | |||
explicit VeloxQueryPlanConverterBase( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NYC: drop explicit as the ctor takes more than one input? Thanks!
@@ -218,6 +220,7 @@ class VeloxQueryPlanConverterBase { | |||
velox::memory::MemoryPool* pool_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NYC: mark poo_ and queryCtx_ as consts?
velox::memory::MemoryPool* const pool_;
velox::core::QueryCtx* const queryCtx_;
17b39fa
to
a239894
Compare
We've seen cases of queries that spend a large amount of time just parsing types when converting the Presto Plan to Velox. This seems to be because it parses the same large Row Types that are used across many field accesses. Adding caching within a request shows a substantial decrease in the amount of time it takes to do the conversion. Notably, this helps with timeouts we're seeing making calls from the coordinator to create tasks on the Workers.
a239894
to
20e0ac1
Compare
@kevinwilfong I am adding Presto type parser support using Flex/Bison in Velox. facebookincubator/velox#7568 |
velox::TypePtr parse(const std::string& text) const; | ||
|
||
private: | ||
mutable std::unordered_map<std::string, velox::TypePtr> cache_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use the SimpleLRUCache
from Velox?
We use that to cache file handles
https://github.com/facebookincubator/velox/blob/main/velox/connectors/hive/FileHandle.h#L62
I am worried that without a bound, the cache might grow too big in a production system. |
We've seen cases of queries that spend a large amount of time just parsing types when converting the Presto Plan to Velox. This seems to be because it parses the same large Row Types that are used across many field accesses.
Adding caching within a request shows a substantial decrease in the amount of time it takes to do the conversion.
Notably, this helps with timeouts we're seeing making calls from the coordinator to create tasks on the Workers.