Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: attempt to account for memory in crdb_internal.jobs #45758

Merged
merged 1 commit into from
Mar 6, 2020

Conversation

spaskob
Copy link
Contributor

@spaskob spaskob commented Mar 5, 2020

This is a stop-gap PR to do some memory accounting during the
execution of crdb_internal.jobs. It turns out that with hundreds of
databases and hours backups per database, one can easily get to 100k jobs
which are roughly 2-3Kb each. Given the total lack of memory monitoring here,
this situation can lead to OOMs.

Ideally we'd track the memory usage by the internal executor. That's a
bigger change left for later. Also ideally we'd paginate the output.

Touches #44166.

Release note: None

@spaskob spaskob requested a review from ajwerner March 5, 2020 18:02
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)


pkg/sql/crdb_internal.go, line 490 at r1 (raw file):

			for field, d := range r {
				// NB: Add the datum size twice to count the fact that we already allocated
				// it once and we're about to

This comment seems to end before it should


pkg/sql/crdb_internal.go, line 493 at r1 (raw file):

				datumSize := int64(d.Size())
				if field == payloadFieldN || field == progressFieldN {
					datumSize *= 2

I don't understand this... why? Also if you insist on doing this, can we just do it after the loop rather than this comparison stuff.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: mod Jordan's comments. Thanks for writing the test.

Reviewed 1 of 2 files at r1, 1 of 1 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/crdb_internal.go, line 490 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This comment seems to end before it should

... unmarshal it and keep that around.?


pkg/sql/crdb_internal.go, line 493 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I don't understand this... why? Also if you insist on doing this, can we just do it after the loop rather than this comparison stuff.

I did this. I did it because we unmarshal these fields and keep both the marshaled and unmarshaled bits alive. If we zero'd the row datums after unmarshaling then I'd feel better about not counting it but that seems like too much effort. I'm cool with just doubling the row count.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)


pkg/sql/crdb_internal.go, line 493 at r1 (raw file):

Previously, ajwerner wrote…

I did this. I did it because we unmarshal these fields and keep both the marshaled and unmarshaled bits alive. If we zero'd the row datums after unmarshaling then I'd feel better about not counting it but that seems like too much effort. I'm cool with just doubling the row count.

Oh I see. Yeah I'd just get rid of the "precise" stuff here with the column integer index constants and double the whole size.

This is a stop-gap PR to do _some_ memory accounting during the
execution of `crdb_internal.jobs`. It turns out that with hundreds of
databases and hours backups per database, one can easily get to 100k jobs
which are roughly 2-3Kb each. Given the total lack of memory monitoring here,
this situation can lead to OOMs.

Ideally we'd track the memory usage by the internal executor. That's a
bigger change left for later. Also ideally we'd paginate the output.

Touches cockroachdb#44166.

Release note: None
Copy link
Contributor Author

@spaskob spaskob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/crdb_internal.go, line 490 at r1 (raw file):

Previously, ajwerner wrote…

... unmarshal it and keep that around.?

Done.


pkg/sql/crdb_internal.go, line 493 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Oh I see. Yeah I'd just get rid of the "precise" stuff here with the column integer index constants and double the whole size.

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants