executor: implement the execution part of the outer hash join #12882
Please address the comments.
executor/builder.go
	}
	// reverse the inner and the outer
	if e.outerHashJoin {
		v.InnerChildIdx = 1 - v.InnerChildIdx
Actually, we'd better move these lines to the plan building phase?
Maybe not.
The plan-building phase only chooses whether to adopt the outer hash join, and the execution phase swaps the inner and the outer internally. This way we do not need to change much code in the plan-building phase.
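To make the index swap under discussion concrete, here is a minimal Go sketch. Only the field name `InnerChildIdx` and the expression `1 - InnerChildIdx` come from the diff above; the `hashJoinPlan` type, the `UseOuterBuild` flag, and the `buildSideIdx` helper are hypothetical names for illustration. The sketch computes the build side without mutating the plan, which is one possible way to keep the swap inside the executor; it is not what the PR does.

```go
package main

import "fmt"

// hashJoinPlan is a hypothetical stand-in for the physical hash join plan.
// Only the field name InnerChildIdx is taken from the quoted diff.
type hashJoinPlan struct {
	InnerChildIdx int  // 0 or 1: the child the planner marked as inner
	UseOuterBuild bool // hypothetical flag: planner chose the outer hash join
}

// buildSideIdx returns the index of the child the executor would build the
// hash table from. The plan itself is passed by value and never modified.
func buildSideIdx(p hashJoinPlan) int {
	if p.UseOuterBuild {
		// A join has exactly two children, so 1-idx flips 0 <-> 1.
		return 1 - p.InnerChildIdx
	}
	return p.InnerChildIdx
}

func main() {
	p := hashJoinPlan{InnerChildIdx: 1, UseOuterBuild: true}
	// Prints the build side chosen by the executor and the untouched plan field.
	fmt.Println(buildSideIdx(p), p.InnerChildIdx)
}
```

Because the helper takes the plan by value, anything that later reads the plan (for example an explain routine) still sees the planner's original choice.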
The execution phase had better not change the physical plan.
Other members may not think so, judging from a previous daily meeting where we discussed the solutions.
On the other hand, the current approach is already implemented and tested. Switching to changing the physical plan in the plan-building phase would require rewriting a lot of code.
Why does it need to rewrite a lot of code?
Only the *LogicalJoin.getHashJoin and the PhysicalHashJoin.GetCost functions of HashJoin may be affected?
Besides those, buildHashJoin here and explain() would also be affected.
In other words, it would require rewriting PR #12883 as well as the function here.
Please merge master and resolve the conflicts.
Codecov Report
@@             Coverage Diff             @@
##             master     #12882   +/-   ##
===========================================
  Coverage   80.5351%   80.5351%
===========================================
  Files           471        471
  Lines        114437     114437
===========================================
  Hits          92162      92162
  Misses        15237      15237
  Partials       7038       7038
PTAL @SunRunAway @qw4990
	wg.Wait()
	// If clearCounter is too big, it means setter concurrency of this test is not enough.
	c.Assert(clearCounter < loopCount, Equals, true)
	c.Assert(setterCounter, Equals, clearCounter+1)
Will this always be true?
If bm.Set() at line 76 returns an error, these checks may fail.
executor/join.go
	} else { // Sequentially handling unmatched rows from the hash table that is from disk
		numChks := e.rowContainer.recordsInDisk.NumChunks()
		for i := 0; i < numChks; i++ {
			numOfRows := e.rowContainer.recordsInDisk.NumRowsOfChunk(i)
It's a little ugly here...
I think we can do the following in a separate PR:
- rename chunk.List to chunk.ListInMemory
- define an interface:
    type List interface {
        Add(chk *Chunk)
        NumChunks() int
        GetRow(ptr RowPtr) Row
    }
- remove the duplicated code in this function
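The suggestion above can be sketched in Go as follows. This is an illustration under stated assumptions, not the chunk package itself: `Row`, `Chunk`, and `RowPtr` are simplified stand-ins, `NumRowsOfChunk` is added to the interface only because the diff calls it on the disk-backed container, and `countRows` is a hypothetical caller showing how one loop could serve both an in-memory and an on-disk implementation.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the chunk package types.
type Row []int
type Chunk struct{ rows []Row }
type RowPtr struct{ ChkIdx, RowIdx uint32 }

// List is the interface proposed in the review comment, with return types
// filled in and NumRowsOfChunk added as an assumption.
type List interface {
	Add(chk *Chunk)
	NumChunks() int
	NumRowsOfChunk(chkIdx int) int
	GetRow(ptr RowPtr) Row
}

// ListInMemory keeps every chunk in a slice. A disk-backed type would
// implement the same four methods by reading from a file instead.
type ListInMemory struct{ chunks []*Chunk }

func (l *ListInMemory) Add(chk *Chunk)           { l.chunks = append(l.chunks, chk) }
func (l *ListInMemory) NumChunks() int           { return len(l.chunks) }
func (l *ListInMemory) NumRowsOfChunk(i int) int { return len(l.chunks[i].rows) }
func (l *ListInMemory) GetRow(p RowPtr) Row      { return l.chunks[p.ChkIdx].rows[p.RowIdx] }

// countRows iterates over any List, so callers need no memory/disk branch.
func countRows(l List) int {
	total := 0
	for i := 0; i < l.NumChunks(); i++ {
		total += l.NumRowsOfChunk(i)
	}
	return total
}

func main() {
	var l List = &ListInMemory{}
	l.Add(&Chunk{rows: []Row{{1}, {2}}})
	l.Add(&Chunk{rows: []Row{{3}}})
	fmt.Println(countRows(l), l.GetRow(RowPtr{ChkIdx: 1, RowIdx: 0}))
}
```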
This change affects too much code, so I am rejecting this suggestion.
f03af87 to 53b3a60
LGTM
Your auto merge job has been accepted, waiting for 13230, 13235, 13231
/run-all-tests
@fzhedu merge failed.
/rebuild
…shJoin-execution
1b1af78 to 70ca7fe
/merge
/run-all-tests
What problem does this PR solve?
Implement the execution part of the outer hash join. Related to #6868.
What is changed and how it works?
Update the hash join executor to support building on the outer side. When building a hash join for any outer join, the inner and the outer are swapped if the outer side is smaller, so the hash table is built from the outer side.
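The general technique can be sketched in Go as follows. This is a simplified illustration, not the PR's code: it assumes a left outer join on an integer key, and the `row`, `buildEntry`, and `outerJoinBuildOuter` names are hypothetical. The hash table is built from the outer side, each entry carries a matched flag that the probe phase sets, and a final pass emits the outer rows that were never matched, padded with NULL.

```go
package main

import "fmt"

// row is a hypothetical two-column row: a join key and a payload.
type row struct {
	key int
	val string
}

// buildEntry wraps an outer row with a flag recording whether any
// inner row matched it during the probe phase.
type buildEntry struct {
	r       row
	matched bool
}

// outerJoinBuildOuter performs a left outer join while building the hash
// table from the outer side and probing it with the inner side.
func outerJoinBuildOuter(outer, inner []row) []string {
	table := make(map[int][]*buildEntry)
	entries := make([]*buildEntry, 0, len(outer)) // keeps insertion order
	for _, r := range outer {
		e := &buildEntry{r: r}
		entries = append(entries, e)
		table[r.key] = append(table[r.key], e)
	}

	var out []string
	// Probe phase: every match is emitted and the outer entry is marked.
	for _, in := range inner {
		for _, e := range table[in.key] {
			e.matched = true
			out = append(out, e.r.val+"|"+in.val)
		}
	}
	// Final pass: outer rows that never matched are emitted with NULL.
	for _, e := range entries {
		if !e.matched {
			out = append(out, e.r.val+"|NULL")
		}
	}
	return out
}

func main() {
	outer := []row{{1, "a"}, {2, "b"}}
	inner := []row{{1, "x"}, {1, "y"}, {3, "z"}}
	fmt.Println(outerJoinBuildOuter(outer, inner))
}
```

The final pass is the extra work compared with building on the inner side: unmatched outer rows are only known after the whole probe side has been consumed, which is why the diff above walks the hash table's rows again, including those spilled to disk.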
Check List
Tests
Code changes
Side effects
Related changes