Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec: add templating for RANK and ROW_NUMBER window functions #37282

Merged
merged 1 commit into from
Jun 3, 2019

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented May 2, 2019

Templates out rankOp into two operators (for dense and non-dense
case) and templates out support for PARTITION BY clause for both
rankOp and rowNumberOp. The code is undertested and is lacking
benchmarks, but that will be addressed soon.

@yuzefovich yuzefovich added the do-not-merge bors won't merge a PR with this label. label May 2, 2019
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should add some lightweight benchmarks to these. Also add the .eg.go files to .gitignore.

Reviewed 5 of 6 files at r1, 5 of 5 files at r2, 5 of 5 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @jordanlewis, @solongordon, and @yuzefovich)


pkg/sql/exec/vecbuiltins/rank_tmpl.go, line 52 at r4 (raw file):

// {{range .}}

type rankDense__DENSE_HasPartition__PARTITION_Op struct {

You could make an embedded struct that has all of the data that's repeated in these, to avoid the nasty initialization above.


pkg/sql/exec/vecbuiltins/rank_tmpl.go, line 103 at r4 (raw file):

	rankCol := r.batch.ColVec(r.outputColIdx).Int64()
	sel := r.batch.Selection()
	if sel != nil {

I would probably try to template sel vs nonsel here as well since there's so much code in the body.


pkg/sql/exec/vecbuiltins/row_number.go, line 59 at r2 (raw file):

		// TODO(yuzefovich): I couldn't figure out how to pass the batch as an
		// argument, so I embedded it into the struct. Is it possible to pass it as
		// an argument?

You can pass it as an argument to this function, yes. The template "function" can see everything in its surrounding scope - so what I usually do is give the template function an argument of the same name as the equivalent in the outer scope, so it looks like ordinary Go code.


pkg/sql/exec/vecbuiltins/row_number_tmpl.go, line 29 at r2 (raw file):

// {{/*
func _NEXT_ROW_NUMBER(hasPartition bool) { // */}}

So I'd give this guy a batch coldata.Batch argument, and also give nextBodyWithPartition that same named argument, and things should just work.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left TODOs for the benchmarks and for sel vs non-sel cases in rankOp. I want to get it in as is (undertested with no benchmarks) since it's already a pretty big PR.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @jordanlewis, and @solongordon)


pkg/sql/exec/vecbuiltins/rank_tmpl.go, line 52 at r4 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

You could make an embedded struct that has all of the data that's repeated in these, to avoid the nasty initialization above.

Looks better, thanks!


pkg/sql/exec/vecbuiltins/rank_tmpl.go, line 103 at r4 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I would probably try to template sel vs nonsel here as well since there's so much code in the body.

Also left a TODO for this.


pkg/sql/exec/vecbuiltins/row_number.go, line 59 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

You can pass it as an argument to this function, yes. The template "function" can see everything in its surrounding scope - so what I usually do is give the template function an argument of the same name as the equivalent in the outer scope, so it looks like ordinary Go code.

Indeed, that worked, thanks!


pkg/sql/exec/vecbuiltins/row_number_tmpl.go, line 29 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

So I'd give this guy a batch coldata.Batch argument, and also give nextBodyWithPartition that same named argument, and things should just work.

Done, thanks!

@yuzefovich yuzefovich removed the do-not-merge bors won't merge a PR with this label. label May 17, 2019
@yuzefovich
Copy link
Member Author

RFAL guys.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto, @jordanlewis, @solongordon, and @yuzefovich)


pkg/sql/exec/execgen/cmd/execgen/rank_gen.go, line 31 at r5 (raw file):

}

func (r rankTmplInfo) UpdateRank() string {

Comments on these would be nice. In general I worry that this code will be confusing as it is far removed from where it is used, but that's kind of a limitation of the model in general. I think comments are the best we can do for now.

@yuzefovich yuzefovich force-pushed the exec_template branch 5 times, most recently from e094502 to b95d2f3 Compare May 22, 2019 00:42
@yuzefovich yuzefovich force-pushed the exec_template branch 3 times, most recently from 7979fab to e339160 Compare June 3, 2019 15:45
Templates out rankOp into two operators (for dense and non-dense
case) and templates out support for PARTITION BY clause for both
rankOp and rowNumberOp. The code is undertested and is lacking
benchmarks, but that will be addressed soon.

Release note: None
@yuzefovich
Copy link
Member Author

TFTR!

bors r+

craig bot pushed a commit that referenced this pull request Jun 3, 2019
36101: roachprod: Minor readme updates r=bobvawter a=bdarnell

Use new convenience commands and demonstrate running a workload

Release note: None

37282: exec: add templating for RANK and ROW_NUMBER window functions r=yuzefovich a=yuzefovich

Templates out rankOp into two operators (for dense and non-dense
case) and templates out support for PARTITION BY clause for both
rankOp and rowNumberOp. The code is undertested and is lacking
benchmarks, but that will be addressed soon.

37412: server/heapprofiler: don't consider "goIdle" memory r=andreimatei a=andreimatei

Before this patch we were taking a heap profile once RSS got above 85%
of system memory and then we wouldn't take another profile until we go
below it and then again above it. The problem with the RSS is that it
includes "goIdle" memory (memory allocated from the OS that's not
actually in use) ; I've seen it be up to several gigs.
With this patch, we now take a profile every time a new high-water mark
is reached. The recorded high-water mark is reset once an hour, this way
ensuring that one big measurement doesn't stop us forever from taking
another profile. It's simpler and it has a chance of catching some
further heap increases.

The rationale behind the patch is that we want profiles when the heap is
large more than when the RSS is large. I'm looking at a case where we
took a heap profile when the heap was 4.5 gigs with 2 gigs idle and then
never took one again because of how the heuristic works. And then we
OOMed when the heap was larger and the idle space was lower, but the RSS
was about the same. With this patch, we would have taken a profile at a
more interesting time.

Release note: None

Co-authored-by: Ben Darnell <ben@bendarnell.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Jun 3, 2019

Build succeeded

@craig craig bot merged commit 02332ed into cockroachdb:master Jun 3, 2019
@yuzefovich yuzefovich deleted the exec_template branch June 13, 2019 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants