Commit
Presto builds the hashtable for a MapBlock eagerly when constructing the MapBlock, even if the query never needs it. Building the hashtable can take up to 40% of the CPU cost of scanning a map column. This commit defers the hashtable build until it is needed in SeekKey().

Note that we only do this for the MapBlock, not the MapBlockBuilder, to avoid complex synchronization problems. The MapBlockBuilder will always build the hashtable. As a result, MergingPageOutput and PartitionOutputOperator will still rebuild the hashtables when needed. Measurements show that MergingPageOutput has to build the hashtables for fewer than 10% of pages. A separate PR will improve PartitionOutput to avoid rebuilding the pages, and with it the hashtable rebuilding.

Simple select checksum queries show over 40% CPU gain:

Test | After | Before | Improvement
select 2 map columns checksum | 11.69d | 20.06d | 42%
select 1 map column checksum | 9.67d | 17.73d | 45%
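The deferred build described above can be illustrated with a minimal sketch. This is not the code from this commit: LazyMapBlockSketch, its fields, and its methods are hypothetical names, keys are simplified to long values, and the synchronized, double-checked build is an assumption about one way to publish the table safely. The only point it shows is that the constructor does no hashing and the first seekKey() call pays the build cost once.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-in for a map block; not Presto's MapBlock. */
final class LazyMapBlockSketch {
    private static final int EMPTY = -1;

    private final long[] keys;
    private final long[] values;
    // Stays null until the first lookup; volatile so a finished table is safely published.
    private volatile int[] hashTable;
    private int buildCount; // demonstration only: how many times the table was built

    LazyMapBlockSketch(long[] keys, long[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        this.keys = keys.clone();
        this.values = values.clone();
        // Deliberately no hash table construction here.
    }

    boolean isHashTableBuilt() {
        return hashTable != null;
    }

    synchronized int getBuildCount() {
        return buildCount;
    }

    long valueAt(int position) {
        return values[position];
    }

    /** Returns the position of the key, or -1 if it is absent. */
    int seekKey(long key) {
        int[] table = hashTable;
        if (table == null) {
            table = ensureHashTable(); // first lookup triggers the build
        }
        int mask = table.length - 1;
        int slot = hash(key) & mask;
        while (table[slot] != EMPTY) {
            if (keys[table[slot]] == key) {
                return table[slot];
            }
            slot = (slot + 1) & mask; // linear probing
        }
        return -1;
    }

    // Assumption: a synchronized double-check so concurrent readers build at most once.
    private synchronized int[] ensureHashTable() {
        int[] table = hashTable;
        if (table == null) {
            table = buildHashTable(keys);
            buildCount++;
            hashTable = table;
        }
        return table;
    }

    private static int[] buildHashTable(long[] keys) {
        int size = 1;
        while (size < keys.length * 2) {
            size <<= 1; // power of two, at most half full, so probing always finds EMPTY
        }
        int[] table = new int[size];
        Arrays.fill(table, EMPTY);
        int mask = size - 1;
        for (int position = 0; position < keys.length; position++) {
            int slot = hash(keys[position]) & mask;
            while (table[slot] != EMPTY) {
                slot = (slot + 1) & mask;
            }
            table[slot] = position;
        }
        return table;
    }

    private static int hash(long key) {
        int h = Long.hashCode(key);
        return h ^ (h >>> 16);
    }
}
```

In this sketch a block that is only scanned or copied never allocates the table, which corresponds to the case the checksum queries above exercise. A builder that is still being appended to would need extra coordination to invalidate or extend the table, which is consistent with the commit leaving MapBlockBuilder eager.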
Showing 7 changed files with 296 additions and 130 deletions.