feat: Implement single inequality joins for join_where #18727
CodSpeed Performance Report
Merging #18727 will degrade performances by 25.3%
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18727      +/-   ##
==========================================
- Coverage   79.88%   79.86%   -0.02%
==========================================
  Files        1513     1513
  Lines      203466   203631     +165
  Branches     2892     2892
==========================================
+ Hits       162546   162640      +94
- Misses      40372    40443      +71
  Partials      548      548
Nice one @adamreeve. Great that we can reuse the parallelism logic. Makes a lot of sense.
All this work can also greatly be re-used in the streaming engine.
This implements the "piecewise merge join" algorithm (described in this DuckDB article) to handle join_where with a single inequality, without using a cross join and filter.

I've reused the IEJoin join type internally due to the similarities in how these two join types are handled. Technically this isn't using the IEJoin algorithm but it is another type of inequality join, so this seems OK, but it could be pulled out into its own join type if needed.
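
For readers unfamiliar with the idea, here is a rough Python sketch of why a single inequality does not need a cross join. It is not the code in this PR and is much simpler than a real piecewise merge join: it only sorts the right side and binary-searches it for each left key, it handles just the `<` predicate, and it ignores nulls and parallelism. The function names `join_less_than` and `cross_join_filter` are made up for illustration.

```python
from bisect import bisect_right


def join_less_than(left_keys, right_keys):
    """Return (left_idx, right_idx) pairs where left_keys[i] < right_keys[j]."""
    # Sort right-side row indices by key, so that for any left key the
    # matching right rows form a contiguous suffix of the sorted order.
    right_order = sorted(range(len(right_keys)), key=lambda j: right_keys[j])
    sorted_right = [right_keys[j] for j in right_order]

    pairs = []
    for i, key in enumerate(left_keys):
        # Position of the first right key strictly greater than `key`.
        start = bisect_right(sorted_right, key)
        # Every right row from `start` onwards satisfies key < right key.
        pairs.extend((i, j) for j in right_order[start:])
    return pairs


def cross_join_filter(left_keys, right_keys):
    """Reference result: compare every left row with every right row."""
    return [
        (i, j)
        for i, lk in enumerate(left_keys)
        for j, rk in enumerate(right_keys)
        if lk < rk
    ]


# Hypothetical example data.
left = [5, 1, 3]
right = [2, 4, 4, 0]
result = join_less_than(left, right)
```

Both functions produce the same set of row-index pairs, but the sorted version only touches right rows that actually match (plus a binary search per left row), whereas the cross join always does `len(left) * len(right)` comparisons before filtering.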