Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: implement the execution part of the outer hash join #12882

Merged
merged 23 commits into from
Nov 12, 2019

Conversation

fzhedu
Copy link
Contributor

@fzhedu fzhedu commented Oct 22, 2019

What problem does this PR solve?

implement the execution part of the outer hash join. Related to #6868

What is changed and how it works?

update the hash join for outer join, after reverse the inner and the outer for all outer joins when building hash joins if the outer size is smaller.

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change

Side effects

  • Increased code complexity

Related changes

  • Need to cherry-pick to the release branch

@fzhedu fzhedu added the sig/execution SIG execution label Oct 22, 2019
@fzhedu fzhedu requested review from SunRunAway and XuHuaiyu and removed request for SunRunAway October 22, 2019 12:20
@fzhedu fzhedu changed the title implement the execution part of the outer hash join executor: implement the execution part of the outer hash join Oct 22, 2019
executor/benchmark_test.go Outdated Show resolved Hide resolved
executor/hash_table.go Outdated Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
executor/builder.go Outdated Show resolved Hide resolved
executor/builder.go Outdated Show resolved Hide resolved
executor/builder.go Outdated Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
util/bitmap/concurrent.go Show resolved Hide resolved
@fzhedu fzhedu requested review from XuHuaiyu and qw4990 October 29, 2019 06:43
@XuHuaiyu XuHuaiyu removed their request for review October 29, 2019 07:02
@XuHuaiyu
Copy link
Contributor

Please address the comments

@fzhedu fzhedu requested a review from XuHuaiyu October 29, 2019 13:47
executor/join.go Outdated Show resolved Hide resolved
util/bitmap/concurrent.go Outdated Show resolved Hide resolved
util/bitmap/concurrent_test.go Outdated Show resolved Hide resolved
util/bitmap/concurrent_test.go Outdated Show resolved Hide resolved
executor/hash_table.go Outdated Show resolved Hide resolved
}
// reverse the inner and the outer
if e.outerHashJoin {
v.InnerChildIdx = 1 - v.InnerChildIdx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we'd better move these lines to the plan building phase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be not.
The plan building phase just make choose whether to adopt the outer hash join, and the executing phase reverses the inner and the outer internally. This way does not need to change a lot of code in the plan build phase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execute phase better not change the physical plan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other members may not think so according to previous daily meeting when discussing the solutions.
On the other hand, the current way is implemented and tested. If taking to way to change physical plans at builing plan phase, it needs to rewrite a lot of code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need to rewrite a lot of code?
Only the *LogicalJoin.getHashJoin and the PhysicalHashJoin.GetCost function of HashJoin may be affected?

Copy link
Contributor Author

@fzhedu fzhedu Nov 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, buildHashJoin here and explain()
In other words, it needs to rewrite the PR #12883 , as well as the function here.

executor/builder.go Outdated Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
util/bitmap/concurrent.go Outdated Show resolved Hide resolved
util/bitmap/concurrent.go Outdated Show resolved Hide resolved
util/bitmap/concurrent.go Show resolved Hide resolved
util/bitmap/concurrent.go Show resolved Hide resolved
util/bitmap/concurrent.go Outdated Show resolved Hide resolved
@XuHuaiyu
Copy link
Contributor

Please merge the master and resolve the conflicts.

@codecov
Copy link

codecov bot commented Oct 31, 2019

Codecov Report

Merging #12882 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #12882   +/-   ##
===========================================
  Coverage   80.5351%   80.5351%           
===========================================
  Files           471        471           
  Lines        114437     114437           
===========================================
  Hits          92162      92162           
  Misses        15237      15237           
  Partials       7038       7038

@fzhedu fzhedu requested a review from XuHuaiyu October 31, 2019 08:15
executor/builder.go Outdated Show resolved Hide resolved
executor/hash_table.go Outdated Show resolved Hide resolved
util/bitmap/concurrent.go Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
executor/join.go Outdated Show resolved Hide resolved
@XuHuaiyu
Copy link
Contributor

XuHuaiyu commented Nov 4, 2019

PTAL @SunRunAway @qw4990

@fzhedu fzhedu requested a review from XuHuaiyu November 4, 2019 07:02
wg.Wait()
// If clearCounter is too big, it means setter concurrency of this test is not enough.
c.Assert(clearCounter < loopCount, Equals, true)
c.Assert(setterCounter, Equals, clearCounter+1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this always be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If some errors in bm.Set() at Line:76 occur, these checks may be false.

@fzhedu fzhedu requested a review from XuHuaiyu November 11, 2019 03:33
executor/join.go Outdated
} else { // Sequentially handling unmatched rows from the hash table that is from disk
numChks := e.rowContainer.recordsInDisk.NumChunks()
for i := 0; i < numChks; i++ {
numOfRows := e.rowContainer.recordsInDisk.NumRowsOfChunk(i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It a little ugly here...
I think we can do the following things in an individual PR:

  1. rename chunk.List to chunk.ListInMemory
  2. define an interface
type List interface{
    Add(chk *Chunk)
    NumChunks()
    GetRow(chunk.RowPtr)
}
  1. remove the useless duplicate code for this function

@fzhedu @SunRunAway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change effects too much code, so regect this suggestion.

@fzhedu fzhedu force-pushed the OuterHashJoin-execution branch from f03af87 to 53b3a60 Compare November 11, 2019 09:55
XuHuaiyu
XuHuaiyu previously approved these changes Nov 11, 2019
Copy link
Contributor

@XuHuaiyu XuHuaiyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@XuHuaiyu XuHuaiyu added status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Nov 11, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Nov 11, 2019

Your auto merge job has been accepted, waiting for 13230, 13235, 13231

@sre-bot
Copy link
Contributor

sre-bot commented Nov 11, 2019

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Nov 11, 2019

@fzhedu merge failed.

@sykp241095
Copy link
Member

/rebuild

@fzhedu fzhedu requested a review from SunRunAway November 11, 2019 11:36
@qw4990 qw4990 removed their request for review November 12, 2019 08:04
@SunRunAway
Copy link
Contributor

/merge

@SunRunAway SunRunAway added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Nov 12, 2019
@SunRunAway SunRunAway added this to the v4.0.0 milestone Nov 12, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Nov 12, 2019

/run-all-tests

@sre-bot sre-bot merged commit b84c5a7 into pingcap:master Nov 12, 2019
XiaTianliang pushed a commit to XiaTianliang/tidb that referenced this pull request Dec 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/execution SIG execution status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/new-feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants