-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: Add initial support for window functions #8928
sql: Add initial support for window functions #8928
Conversation
Review status: 0 of 23 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. sql/window.go, line 465 [r1] (raw file):
populated -> populates sql/parser/sql.y, line 3740 [r1] (raw file):
Unnecessary semicolon Comments from Reviewable |
A couple general remarks:
Reviewed 23 of 23 files at r1. sql/window.go, line 30 [r1] (raw file):
Please explain in a comment here your terminology, in particular the difference between:
sql/window.go, line 111 [r1] (raw file):
Explain why / when overriding is necessary. sql/window.go, line 114 [r1] (raw file):
I think for readability this switch and the test on refName below could be moved to their own function. sql/window.go, line 118 [r1] (raw file):
Why not sqlbase.NormalizeName()? sql/window.go, line 123 [r1] (raw file):
ditto sql/window.go, line 277 [r1] (raw file):
Maybe move this assignment to below the checks? That's what you did in group.go already. Unless the two methods use it. sql/window.go, line 355 [r1] (raw file):
Disclaimer: I do not feel qualified to review the correctness of this function. And probably I'll need a second round reading this, it's really a lot of new SQL stuff for me to page into. sql/parser/aggregate_builtins.go, line 55 [r1] (raw file):
Can you clarify what is meant here. If the node can contain aggregate functions, then I would expect that IsAggregate() should return true if it actually does. sql/parser/aggregate_builtins.go, line 270 [r1] (raw file):
Please move this to a different commit (at the least) or PR (best) sql/parser/aggregate_builtins.go, line 661 [r1] (raw file):
ditto sql/parser/select.go, line 503 [r1] (raw file):
What's wrong with doing sql/parser/sql.y, line 3623 [r1] (raw file):
Bleah! Make a helper function at the start of this file and use it. I want to read sql/parser/window.go, line 23 [r1] (raw file):
I think you can un-export the field since it's only reachable by the methods below. sql/parser/window.go, line 44 [r1] (raw file):
I don't like this method (see my other review comments) and if it is there to stay the code comment needs to make a better job at convincing the reader why it's needed. sql/parser/window.go, line 51 [r1] (raw file):
Wait, no. Allocating a defer here is not warranted. Just do:
Also I do not fully grasp why you need to reset the field to false here; a comment is warranted. sql/parser/window.go, line 67 [r1] (raw file):
Same here. sql/parser/window.go, line 69 [r1] (raw file):
Either reuse sql/testdata/window, line 1 [r1] (raw file):
Disclaimer: I can't review these tests; I am not (yet) well enough versed with WINDOW functions to check the expected results are all correct. I assume you checked/produced them with sqlite or pg? sql/testdata/window, line 333 [r1] (raw file):
Please add some tests either here or in Comments from Reviewable |
Wow you have been busy! :) I will need to make multiple passes but the overall structure looks good. As far as memory usage goes, I don't think this is any different than grouping right? (so wrt #8691 we should do whatever we do for grouping) Review status: all files reviewed at latest revision, 25 unresolved discussions, some commit checks failed. sql/window.go, line 30 [r1] (raw file):
|
@RaduBerinde it's different from grouping because with grouping we're not storing all the rows in each group (the grouped rows are streamed through each group's aggregator and then discarded; the only memory budget is for the small aggregator struct itself, one per group). With WINDOW we keep all the rows in memory for each partition. That can grow huge. Review status: all files reviewed at latest revision, 25 unresolved discussions, some commit checks failed. sql/window.go, line 72 [r1] (raw file):
|
Ah, I see. Then s/grouping/sorting :) Review status: all files reviewed at latest revision, 25 unresolved discussions, some commit checks failed. Comments from Reviewable |
5b6577a
to
53f5793
Compare
TFTRs! I'll wait for #8691 to land and rebase ontop of it so as not to step on @knz's toes. With regard to the the the memory usage, I also dont see why this would be a big issue. Like @RaduBerinde said, this should be comparable to sorting with a linear memory usage with respect to row count, unlike the quadratic memory usage blowup of joins. Review status: all files reviewed at latest revision, 25 unresolved discussions, some commit checks failed. sql/window.go, line 30 [r1] (raw file):
|
This looks good. I'll have a last review round after I finish reading the PG docs. Hopefully before Monday :) Reviewed 7 of 7 files at r2. sql/window.go, line 35 [r2] (raw file):
Either define "window" here or give a link to some external documentation that explains what windows are. sql/parser/aggregate_builtins.go, line 55 [r1] (raw file):
|
Ok now I understand what windows do and what the code does. Please file an issue to optimize the case where the data source is already ordered by the partition expression, in which case we only need to hold the rows for 1 partition at a time in memory. Also if the data source is already ordered by both the partition expression and the partition's order by clause we only need to hold 1 row at a time. Review status: all files reviewed at latest revision, 16 unresolved discussions, some commit checks failed. sql/window.go, line 355 [r1] (raw file):
|
This change adds support for a subset of all window function functionality. Window functions can be run over either pre-existing aggregate builtins or a set of window function-only builtins. The change only supports running window functions over aggregation functions for now. It fully supports the PARTITION BY and ORDER BY clauses, but does not yet support frame clause for window definitions. Future related work: - Support window frame clause - Add built-in window functions - Improve partitioning/sorting/aggregating efficiency (see referenced papers in TODOs) - Improve inter- and intra-partition parallelism - Avoid adding unnecessary extra renders to selectNode - Support WITHIN GROUP and FILTER clauses
Review status: all files reviewed at latest revision, 15 unresolved discussions, some commit checks failed. sql/window.go, line 35 [r2] (raw file):
|
53f5793
to
3df4959
Compare
Reviewed 9 of 9 files at r3. sql/window.go, line 245 [r2] (raw file):
|
Closes #2475.
This change adds support for a subset of all window function
functionality. Window functions can be run over either pre-existing
aggregate builtins or a set of window function-only builtins. The change
only supports running window functions over aggregation functions for now.
It fully supports the
PARTITION BY
andORDER BY
clauses, but does notyet support frame clause for window definitions.
Future related work:
papers in TODOs)
WITHIN GROUP
andFILTER
clausesThis change is