Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[optimization] avoid extra memory copy while build hash table #5301

Merged
merged 3 commits into from
Jan 30, 2021

Conversation

stdpain
Copy link
Contributor

@stdpain stdpain commented Jan 26, 2021

Proposed changes

avoid extra memory copy while build hash table

reference to #5300

Types of changes

  • [] Bugfix (non-breaking change which fixes an issue)
  • [] New feature (non-breaking change which adds functionality)
  • [] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [] Documentation Update (if none of the other choices apply)
  • Code refactor (Modify the code structure, format the code, etc...)

Checklist

  • I have created an issue on (Fix #ISSUE) and described the bug/feature there in detail
  • Compiling and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged

Further comments

This modification may have a performance improvement of more than 25% during the hash table construction phase.

@stdpain stdpain force-pushed the optimize_hash_table_build branch from 7b86ef7 to 4c102ea Compare January 26, 2021 06:00
be/src/exec/hash_table.cpp Outdated Show resolved Hide resolved
be/src/exec/hash_table.hpp Outdated Show resolved Hide resolved
// TODO: integrate with mem pools
uint8_t* _nodes;
// Buffer to store node data.
uint8_t* _current_nodes;
// number of nodes stored (i.e. size of hash table)
int64_t _num_nodes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int64_t _num_nodes;
int64_t _total_used;

Easier to understand

stdpain and others added 2 commits January 28, 2021 10:54
Co-authored-by: wangbo <506340561@qq.com>
Copy link
Member

@yangzhg yangzhg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wangbo wangbo added the area/sql/execution Issues or PRs related to the execution engine label Jan 28, 2021
Copy link
Contributor

@wangbo wangbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@morningman morningman merged commit bf0cb78 into apache:master Jan 30, 2021
@yangzhg yangzhg mentioned this pull request Feb 9, 2021
@stdpain stdpain deleted the optimize_hash_table_build branch February 21, 2021 17:01
EmmyMiao87 pushed a commit to EmmyMiao87/incubator-doris that referenced this pull request Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/sql/execution Issues or PRs related to the execution engine
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants