
server/heapprofiler: don't consider "goIdle" memory #37412

Merged (2 commits) on Jun 3, 2019

Conversation

@andreimatei andreimatei commented May 8, 2019

Before this patch, we were taking a heap profile once RSS got above 85%
of system memory, and then we wouldn't take another profile until RSS
went below that threshold and rose above it again. The problem with RSS
is that it includes "goIdle" memory (memory allocated from the OS that's
not actually in use); I've seen it be up to several gigs.
With this patch, we take a profile every time a new high-water mark is
reached. The recorded high-water mark is reset once an hour, ensuring
that one big measurement doesn't stop us forever from taking another
profile. This is simpler, and it has a chance of catching further heap
increases.

The rationale behind the patch is that we want profiles when the heap is
large, more than when the RSS is large. I'm looking at a case where we
took a heap profile when the heap was 4.5 gigs with 2 gigs idle, and then
never took one again because of how the heuristic worked. We then OOMed
when the heap was larger and the idle space was smaller, but the RSS was
about the same. With this patch, we would have taken a profile at a more
interesting time.

Release note: None
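
[Editor's note] As a concrete illustration of the heuristic described above, here is a minimal, self-contained Go sketch (the type and method names are hypothetical, not the PR's actual code): take a profile whenever the heap exceeds the highest value seen since the last reset, and reset the recorded maximum once an hour.

	package heapprofiler

	import "time"

	// highWaterMarkHeuristic fires whenever the current heap size exceeds
	// the largest value seen since the last reset. Resetting the recorded
	// maximum periodically ensures that one big spike doesn't suppress all
	// future profiles.
	type highWaterMarkHeuristic struct {
		resetInterval time.Duration // e.g. time.Hour
		lastReset     time.Time
		highWaterMark uint64 // bytes
	}

	// shouldProfile reports whether a heap profile should be taken now,
	// given the current heap size in bytes.
	func (h *highWaterMarkHeuristic) shouldProfile(now time.Time, heapBytes uint64) bool {
		if now.Sub(h.lastReset) >= h.resetInterval {
			h.highWaterMark = 0
			h.lastReset = now
		}
		if heapBytes > h.highWaterMark {
			h.highWaterMark = heapBytes
			return true
		}
		return false
	}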

@andreimatei andreimatei requested review from ajwerner, thoszhang, tbg, 2nishantg and a team on May 8, 2019 22:38

@tbg tbg left a comment


This sounds reasonable, but I'd like to suck you into a little cleanup before this lands (I wrote most of the code in the comment below to make this easier to swallow - hope it works 😄)

The RSS size and the Go stats are taken separately because, apparently, at some point we had problems with the duration of the call to retrieve the Go stats:

cockroach/pkg/server/server.go

Lines 1916 to 1920 in 4850713

// We run two separate sampling loops, one for memory stats (via
// ReadMemStats) and one for all other runtime stats. This is necessary
// because as of go1.11, runtime.ReadMemStats() "stops the world" and
// requires waiting for any current GC run to finish. With a large heap, a
// single GC run may take longer than the default sampling period (10s).

The result is kind of crappy. We get the RSS in one loop, and the Go stats in the other, so they could be, at worst, 10s out of sync (or more if these stalls still happen).

What I think we should do is the following:

  1. unify the stats collection, i.e. move
    mem := gosigar.ProcMem{}
    into SampleEnvironment, and store both that and the Go stats in a type MemStats struct { gosigar.ProcMem; runtime.MemStats } that has convenience functions to get "good numbers" out of it. Note that it makes sense to call gosigar last (i.e. after getting the Go mem stats), so that it's taken very soon after the Go stats computation returns.
  2. try to sync up these loops somewhat. I'm thinking the following
	// As of go1.12, runtime.ReadMemStats() "stops the world" and requires
	// waiting for any current GC run to finish. With a large heap, a single GC
	// run may take longer than the default sampling period (10s). The code below
	// is resilient to this sort of thing and will continue to run roughly at the
	// defined frequency, using the last stats that were in fact collected.
	s.stopper.RunWorker(ctx, func(ctx context.Context) {
		timer := timeutil.NewTimer()
		defer timer.Stop()
		timer.Reset(frequency)

		var memStats atomic.Value // *base.MemStats
		var collecting int32      // atomic, 1 when stats call is ongoing
		memStats.Store(&base.MemStats{})
		for {
			select {
			case <-timer.C:
				timer.Read = true
				timer.Reset(frequency)

				statsCollected := make(chan struct{})
				if atomic.CompareAndSwapInt32(&collecting, 0, 1) {
					if err := s.stopper.RunAsyncTask(ctx, "go-mem-stats", func(ctx context.Context) {
						curStats := s.runtime.SampleMemStats(ctx)
						memStats.Store(&curStats)
						close(statsCollected)
						atomic.StoreInt32(&collecting, 0)
					}); err != nil {
						// Reset the flag so a later tick can retry the collection.
						atomic.StoreInt32(&collecting, 0)
						close(statsCollected)
					}
				} else {
					close(statsCollected)
				}

				select {
				case <-time.After(time.Second):
					// Maybe log a warning, but continue to use the old stats in
					// this iteration.
				case <-statsCollected:
					// Common case, we have updated stats now.
				}

				curStats := memStats.Load().(*base.MemStats)
				s.runtime.SampleEnvironment(ctx, curStats)

				if goroutineDumper != nil {
					goroutineDumper.MaybeDump(ctx, s.ClusterSettings(), s.runtime.Goroutines.Value())
				}
				if heapProfiler != nil {
					heapProfiler.MaybeTakeProfile(ctx, s.ClusterSettings(), curStats)
				}
			case <-s.stopper.ShouldStop():
				return
			}
		}
	})

When we have that, we can think about what the meaningful thing for the heap profiler to look at is. I believe RSS is always somewhat inflated (at least on Linux) due to the use of madvise. Also, the heap profiler never tells us anything about allocs on the non-Go side, so it doesn't make a whole lot of sense to use RSS for the heap profiler in the first place; the only thing that really matters is some combination of GoAllocated and GoIdle (but note that jemalloc can also dump its heap, an extension which might make sense in the future, though not if we switch to Pebble).

HeapIdle has this comment:

	// HeapIdle minus HeapReleased estimates the amount of memory
	// that could be returned to the OS, but is being retained by
	// the runtime so it can grow the heap without requesting more
	// memory from the OS. If this difference is significantly
	// larger than the heap size, it indicates there was a recent
	// transient spike in live heap size.
	HeapIdle uint64

I agree that it's better to limit the trigger to the non-idle Go memory. Lots of idle memory tells us that something interesting happened and we missed it, though maybe we did actually catch it earlier.

  3. remove fractionSystemMemoryHeuristic: we've established that we can't really reason about what a good threshold here is, because we always expect sizeable chunks of memory on the C++ side
  4. introduce a high-water mark heuristic: dump every time we see a new max for the non-idle Go heap size (or perhaps every time we see more than 1.3x the previous max, or something like that). Heap dumps are fairly cheap because they just write out info that is collected during GC.

Reviewed 5 of 5 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, @Nishant9, and @tbg)


pkg/base/mem_stats.go, line 17 at r2 (raw file):

package base

// MemStats represents information from the Go allocator.

This seems like sort of a random struct that's kind of useful, but not really useful, and also not aptly named. See the suggestion on the main thread.

@andreimatei

Please see now.


@tbg tbg left a comment


Reviewed 5 of 5 files at r3, 5 of 5 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, and @Nishant9)


pkg/base/mem_stats.go, line 1 at r4 (raw file):

// Copyright 2018 The Cockroach Authors.

19


pkg/server/server.go, line 1952 at r4 (raw file):
I'd clarify this to say

With a large heap and under extreme conditions, a single GC run may take longer than the default sampling period of 10s. Under normal operations and with more recent versions of Go, this hasn't been observed to be a problem.


pkg/server/heapprofiler/heapprofiler.go, line 94 at r4 (raw file):

}

// fractionSystemMemoryHeuristic is true if latest (RSS - goIdle) is more than

Thinking about this more, it still strikes me as silly that we're comparing apples (RSS) to oranges (goIdle) here. How about we always take a profile if GoActualHeapBytes() is larger than the max seen thus far? We don't have to worry about the min profile interval since we've resolved that taking a heap profile is cheap and because the profiler is already triggered periodically at a low enough frequency (though if you feel that we should retain some protection here, that's fine by me too but I would lower it to 10s).


pkg/server/heapprofiler/heapprofiler_test.go, line 52 at r4 (raw file):

		assert.Nil(t, err)
		if len(expectedScores) <= numProfiles {
			t.Fatalf("unexected profile: %d", numProfiles)

unexpected


pkg/server/status/runtime.go, line 502 at r4 (raw file):

		humanize.IBytes(mem.Resident), numGoroutine,
		humanize.IBytes(ms.GoAllocated), humanize.IBytes(ms.GoIdle), humanize.IBytes(ms.GoTotal),
		staleMsg,

Nice touch.


pkg/server/status/runtime.go, line 550 at r4 (raw file):

	goAllocated := ms.Alloc
	goTotal := ms.Sys - ms.HeapReleased
	rsr.GoAllocBytes.Update(int64(goAllocated))

Move this to SampleEnvironment, then remove this method completely (we just call runtime.ReadMemStats at the caller).


pkg/server/status/runtime.go, line 556 at r4 (raw file):

	return base.GoMemStats{
		GoAllocated: goAllocated,
		GoIdle:      ms.HeapIdle - ms.HeapReleased,

Please just use runtime.MemStats here (and use it in base.MemStats) and do all the math later with an accessor. That way it'll be much easier to navigate later.


@andreimatei andreimatei left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, @Nishant9, and @tbg)


pkg/server/heapprofiler/heapprofiler.go, line 94 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Thinking about this more, it still strikes me as silly that we're comparing apples (RSS) to oranges (goIdle) here. How about we always take a profile if GoActualHeapBytes() is larger than the max seen thus far? We don't have to worry about the min profile interval since we've resolved that taking a heap profile is cheap and because the profiler is already triggered periodically at a low enough frequency (though if you feel that we should retain some protection here, that's fine by me too but I would lower it to 10s).

Well, I thought it's silly to take profiles when the memory usage is "low", regardless of whether we've reached a high-water mark or not. And we don't really know what the target for all the Go memory is, since so much memory is taken by CGo. So I want something that says "if there's plenty of free memory on the system, don't take a profile".
And also I don't know about only taking profiles on high-water marks... I think while memory is "high" it's better to take more profiles periodically.

So, I dunno... I'd still go with what I have here. And hopefully we'll switch to Pebble, and then we can talk about just Go memory.


@tbg tbg left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, and @Nishant9)


pkg/server/heapprofiler/heapprofiler.go, line 94 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Well, I thought it's silly to take profiles when the memory usage is "low", regardless of whether we've reached a high-water mark or not. And we don't really know what the target for all the Go memory is, since so much memory is taken by CGo. So I want something that says "if there's plenty of free memory on the system, don't take a profile".
And also I don't know about only taking profiles on high-water marks... I think while memory is "high" it's better to take more profiles periodically.

So, I dunno... I'd still go with what I have here. And hopefully we'll switch to Pebble, and then we can talk about just Go memory.

You're right that the first couple of profiles won't be useful, but it's not silly because it eliminates needing to know what a "reasonable" threshold for Go memory usage is, which is what has been the real silliness in this code thus far. Trying to fix that is nontrivial and we don't have much to win by even trying. Even when there's no C++ involved I'd prefer the robustness of the simple approach (proposed above). I'll also add that I originally asked for these complications to be introduced, which I regret now. It would've saved us a number of eng hours to just take a heap profile whenever a new high watermark is reached.

I can take this change over if you'd rather spend your time somewhere else.

@andreimatei andreimatei changed the title from 'server/heapprofiler: don't conside "goIdle" memory' to 'server/heapprofiler: don't consider "goIdle" memory' on May 20, 2019

@andreimatei andreimatei left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, @Nishant9, and @tbg)


pkg/server/heapprofiler/heapprofiler.go, line 94 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

You're right that the first couple of profiles won't be useful, but it's not silly because it eliminates needing to know what a "reasonable" threshold for Go memory usage is, which is what has been the real silliness in this code thus far. Trying to fix that is nontrivial and we don't have much to win by even trying. Even when there's no C++ involved I'd prefer the robustness of the simple approach (proposed above). I'll also add that I originally asked for these complications to be introduced, which I regret now. It would've saved us a number of eng hours to just take a heap profile whenever a new high watermark is reached.

I can take this change over if you'd rather spend your time somewhere else.

OK, no more RSS. High-water mark + periodic reset.


@tbg tbg left a comment


Did you see the other comments I had before your last push regarding GoMemStats? I feel rather strongly that they're worth addressing. I just spent 15 minutes trying to figure out what numbers we actually use now, which I wouldn't have had to do with my suggestion addressed. (I also didn't manage to convince myself, so let's address the comment and then I'll try again.)

Reviewed 5 of 5 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, @Nishant9, and @tbg)


pkg/base/mem_stats.go, line 21 at r6 (raw file):

// GoMemStats represents information from the Go allocator.
// All units are bytes.
type GoMemStats struct {

I hate to bring this up again, but I find it hard to justify this over just embedding a runtime.MemStats in base.MemStats. What's the thinking there? Things have names that don't match runtime.MemStats too which adds extra friction. I really think this just ought to be runtime.MemStats so that I can just navigate to the explanation from the stdlib instead of trusting that the numbers get propagated to the right fields and that the names match my expectations.

Since you're also holding on to Collected you'll probably want

type GoMemStats struct {
  runtime.MemStats
  Collected time.Time
}

with an accessor NotIdle() which boils down to whatever the final formula for it is (I tried piecing it together from this PR, but it's really not that straightforward and I also wasn't super confident that the result was the right thing, but let's discuss that when it's written down in one place here).
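
[Editor's note] To make the suggestion concrete, here is a hedged sketch of what that wrapper and accessor could look like; the accessor name is the reviewer's suggestion, not final code. Assuming the definitions used elsewhere in this PR (GoTotal = Sys - HeapReleased, GoIdle = HeapIdle - HeapReleased), the non-idle quantity GoTotal - GoIdle simplifies to Sys - HeapIdle:

	package base

	import (
		"runtime"
		"time"
	)

	// GoMemStats embeds runtime.MemStats so field meanings can be looked up
	// directly in the stdlib docs, and records when the sample was taken.
	type GoMemStats struct {
		runtime.MemStats
		Collected time.Time
	}

	// NotIdle estimates the Go memory actually in use:
	// GoTotal - GoIdle = (Sys - HeapReleased) - (HeapIdle - HeapReleased)
	//                  = Sys - HeapIdle.
	func (ms *GoMemStats) NotIdle() uint64 {
		return ms.Sys - ms.HeapIdle
	}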


pkg/server/server.go, line 1952 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I'd clarify this to say

With a large heap and under extreme conditions, a single GC run may take longer than the default sampling period of 10s. Under normal operations and with more recent versions of Go, this hasn't been observed to be a problem.

Ping


pkg/server/server.go, line 1947 at r6 (raw file):

				statsCollected := make(chan struct{})
				if atomic.CompareAndSwapInt32(&collectingMemStats, 0, 1) {
					if err := s.stopper.RunTask(ctx, "mem-stats-sample", func(ctx context.Context) {

RunAsyncTask, or the whole dance is useless :-)


pkg/server/heapprofiler/heapprofiler.go, line 47 at r6 (raw file):

)

type stats struct {

?


pkg/server/heapprofiler/heapprofiler.go, line 101 at r6 (raw file):

	}

	curMemUse := memStats.Go.GoTotal - memStats.Go.GoIdle

Put this in a method on memStats.Go, for example memStats.Go.HeapNotIdle().


pkg/server/status/runtime.go, line 550 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Move this to SampleEnvironment, then remove this method completely (we just call runtime.ReadMemStats at the caller).

Ping


@andreimatei andreimatei left a comment


PTAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, @Nishant9, and @tbg)


pkg/base/mem_stats.go, line 1 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

19

Done.


pkg/base/mem_stats.go, line 21 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I hate to bring this up again, but I find it hard to justify this over just embedding a runtime.MemStats in base.MemStats. What's the thinking there? Things have names that don't match runtime.MemStats too which adds extra friction. I really think this just ought to be runtime.MemStats so that I can just navigate to the explanation from the stdlib instead of trusting that the numbers get propagated to the right fields and that the names match my expectations.

Since you're also holding on to Collected you'll probably want

type GoMemStats struct {
  runtime.MemStats
  Collected time.Time
}

with an accessor NotIdle() which boils down to whatever the final formula for it is (I tried piecing it together from this PR, but it's really not that straightforward and I also wasn't super confident that the result was the right thing, but let's discuss that when it's written down in one place here).

ok, see now


pkg/server/server.go, line 1952 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Ping

done


pkg/server/server.go, line 1947 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

RunAsyncTask or the whole dance is useless :-)

indeed


pkg/server/heapprofiler/heapprofiler.go, line 47 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

?

removed


pkg/server/heapprofiler/heapprofiler.go, line 101 at r6 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Put this in a method on memStats.Go, for example memStats.Go.HeapNotIdle(). You

Done.


pkg/server/heapprofiler/heapprofiler_test.go, line 52 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

unexpected

n/a


pkg/server/status/runtime.go, line 502 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Nice touch.

Done.


pkg/server/status/runtime.go, line 550 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Ping

done


pkg/server/status/runtime.go, line 556 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Please just use runtime.MemStats here (and use it in base.MemStats) and do all the math later with an accessor. That way it'll be much easier to navigate later.

Done.


@tbg tbg left a comment


Thanks for sticking it out! Please add a release note about the removed cluster setting. One more substantial comment that's easy to address, one nit, and then we're good to go 🙏

Reviewed 2 of 7 files at r6, 5 of 5 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, @Nishant9, and @tbg)


pkg/base/mem_stats.go, line 25 at r7 (raw file):

// All units are bytes.
type GoMemStats struct {
	// MemStats is a pointer because it's a large struct. We want GoMemStats to be

This isn't anywhere close to any hot path and never will be, so this looks like premature optimization. Even if we cared, wouldn't you just embed the value anyway and deal in *GoMemStats elsewhere? You can runtime.ReadMemStats(&goMemStats.MemStats).


pkg/base/mem_stats.go, line 43 at r7 (raw file):

// is counted as "in use" by this method.
func (ms *GoMemStats) GoInUse() uint64 {
	// NOTE(andrei): It may seem weird that we're only subtracting HeapReleased

I think we're quite likely to shoot ourselves in the foot here in some way. HeapReleased is a counter (i.e. only ever goes up, or am I reading that wrong?) but Sys isn't. And even if they're actually comparable quantities, I don't think we can win anything here by trying to account for "as much Go memory as we can" here because the profile will show only the heap, and because the heuristic isn't properly exercised at scale (i.e. there's no test that puts a cluster under heap pressure at scale and verifies that we get sensible profile scores over extended periods of time and I don't think we're planning to write one). Let's use HeapAlloc which is certifiably the thing we care about (and it shouldn't include HeapIdle which was the original motivation for this PR):

        // HeapAlloc is bytes of allocated heap objects.
        //
        // "Allocated" heap objects include all reachable objects, as
        // well as unreachable objects that the garbage collector has
        // not yet freed. Specifically, HeapAlloc increases as heap
        // objects are allocated and decreases as the heap is swept
        // and unreachable objects are freed. Sweeping occurs
        // incrementally between GC cycles, so these two processes
        // occur simultaneously, and as a result HeapAlloc tends to
        // change smoothly (in contrast with the sawtooth that is
        // typical of stop-the-world garbage collectors).
        HeapAlloc uint64

There's a chance that we'll sometimes see crashes in which the heap remains bounded but some other Go memory (stacks when goroutine leaking perhaps?) explodes, but these are rarer and I expect them to show other signs that we can interpret. Either way, a heap profile in that case wouldn't show us much except that the heap wasn't huge. This information is better conveyed by logging the heap size in the runtime stats, which we do already:

goAllocated := ms.Alloc


pkg/server/server.go, line 1968 at r7 (raw file):

					// Good; we managed to read the Go memory stats quickly enough.
				case <-time.After(time.Second):
					log.Infof(ctx, "get-mem-stats hasn't finished in one second; "+

Now that I'm looking at this, is this logging necessary? The runtime stats will print (stale), right? Your call.


pkg/server/heapprofiler/heapprofiler.go, line 56 at r7 (raw file):

//
// MaybeTakeProfile() is supposed to be called periodically. A profile is taken
// every time the current Go memory usage is the high-water mark. The recorded

s/memory/heap/ (in accordance with my comment about HeapAlloc)


pkg/server/status/runtime.go, line 389 at r7 (raw file):

// to keep runtime statistics current.
//
// SampleEnvironment takes GoMemStats as input because that it collected

s/it/is/


@andreimatei andreimatei left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @lucy-zhang, @Nishant9, and @tbg)


pkg/base/mem_stats.go, line 25 at r7 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This isn't anywhere close to any hot path and never will be, so this looks like premature optimization. Even if we cared, wouldn't you just embed the value anyway and deal in *GoMemStats elsewhere? You can runtime.ReadMemStats(&goMemStats.MemStats).

Done.


pkg/base/mem_stats.go, line 43 at r7 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I think we're quite likely to shoot ourselves in the foot here in some way. HeapReleased is a counter (i.e. only ever goes up, or am I reading that wrong?) but Sys isn't. And even if they're actually comparable quantities, I don't think we can win anything here by trying to account for "as much Go memory as we can" here because the profile will show only the heap, and because the heuristic isn't properly exercised at scale (i.e. there's no test that puts a cluster under heap pressure at scale and verifies that we get sensible profile scores over extended periods of time and I don't think we're planning to write one). Let's use HeapAlloc which is certifiably the thing we care about (and it shouldn't include HeapIdle which was the original motivation for this PR):

        // HeapAlloc is bytes of allocated heap objects.
        //
        // "Allocated" heap objects include all reachable objects, as
        // well as unreachable objects that the garbage collector has
        // not yet freed. Specifically, HeapAlloc increases as heap
        // objects are allocated and decreases as the heap is swept
        // and unreachable objects are freed. Sweeping occurs
        // incrementally between GC cycles, so these two processes
        // occur simultaneously, and as a result HeapAlloc tends to
        // change smoothly (in contrast with the sawtooth that is
        // typical of stop-the-world garbage collectors).
        HeapAlloc uint64

There's a chance that we'll sometimes see crashes in which the heap remains bounded but some other Go memory (stacks when goroutine leaking perhaps?) explodes, but these are rarer and I expect them to show other signs that we can interpret. Either way, a heap profile in that case wouldn't show us much except that the heap wasn't huge. This information is better conveyed by logging the heap size in the runtime stats, which we do already:

goAllocated := ms.Alloc

Sys is also a counter. That's spelled out in HeapSys, for example, which is one of its constituents; its doc says it includes "virtual address space for which the physical memory has been returned to the OS after it became unused".

The reason I had written what I wrote was that I thought the stacks are actually reflected somehow in the profile - I kinda remember seeing something about allocating stacks, but maybe I'm thinking of CPU profiles, not memory profiles. In any case, I went with HeapAlloc.
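
[Editor's note] For reference, reading the value the profiler now keys on is just this (a minimal sketch; note that runtime.ReadMemStats briefly stops the world, which is why the PR samples it on a dedicated loop):

	package main

	import (
		"fmt"
		"runtime"
	)

	func main() {
		var ms runtime.MemStats
		runtime.ReadMemStats(&ms) // stops the world briefly
		// HeapAlloc: bytes of allocated heap objects, excluding HeapIdle --
		// the quantity compared against the high-water mark.
		fmt.Printf("HeapAlloc: %d bytes\n", ms.HeapAlloc)
	}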


pkg/server/server.go, line 1968 at r7 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Now that I'm looking at this, is this logging necessary? The runtime stats will print (stale), right? Your call.

removed


pkg/server/heapprofiler/heapprofiler.go, line 56 at r7 (raw file):

Previously, tbg (Tobias Grieger) wrote…

s/memory/heap/ (in accordance with my comment about HeapAlloc)

Done.


pkg/server/status/runtime.go, line 389 at r7 (raw file):

Previously, tbg (Tobias Grieger) wrote…

s/it/is/

Done.


@tbg tbg left a comment


:lgtm: CI complains; I think it's because various tests now create errant profiles and we check for stray files at the end. We probably have to avoid writing profiles if the profiler is set up with an empty dir.

Reviewed 5 of 5 files at r8.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @andreimatei, @lucy-zhang, and @Nishant9)


pkg/server/heapprofiler/heapprofiler.go, line 56 at r8 (raw file):

Go heap allocated bytes exceeds the previous


pkg/server/heapprofiler/heapprofiler_test.go, line 38 at r8 (raw file):

		{0, 30, true},    // we always take the first profile
		{10, 40, true},   // new high-water mark
		{20, 30, false},  // below high-water mark; no profile

Add a few here to test GC? Or is there a dedicated test elsewhere?


@andreimatei andreimatei left a comment


Indeed. Now the profiler is disabled in unit tests. Curious if I need to disable it in... any other tests. We'll see.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @lucy-zhang, @Nishant9, and @tbg)


pkg/server/heapprofiler/heapprofiler.go, line 56 at r8 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Go heap allocated bytes exceeds the previous

Done.


pkg/server/heapprofiler/heapprofiler_test.go, line 38 at r8 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Add a few here to test GC? Or is there a dedicated test elsewhere?

Well, testing GC would need some more mocking. The profiler used in this test is configured to not actually write any files.
I'll leave it...
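
[Editor's note] For illustration, a self-contained table-driven sketch in the spirit of the quoted test, exercising the hourly reset against the hypothetical heuristic sketched earlier (this is not the PR's actual test, which mocks out profile writing):

	package heapprofiler

	import (
		"testing"
		"time"
	)

	func TestHighWaterMarkHeuristic(t *testing.T) {
		h := highWaterMarkHeuristic{resetInterval: time.Hour}
		start := time.Now()
		cases := []struct {
			secs    int    // seconds since start
			heap    uint64 // current heap size, bytes
			profile bool   // expect a profile?
		}{
			{0, 30, true},    // first sample is always a new maximum
			{10, 40, true},   // new high-water mark
			{20, 30, false},  // below high-water mark; no profile
			{4000, 10, true}, // more than an hour elapsed: mark was reset
		}
		for _, c := range cases {
			now := start.Add(time.Duration(c.secs) * time.Second)
			if got := h.shouldProfile(now, c.heap); got != c.profile {
				t.Errorf("at %ds, heap=%d: got %t, want %t", c.secs, c.heap, got, c.profile)
			}
		}
	}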


@tbg tbg left a comment


Reviewed 6 of 6 files at r9, 8 of 8 files at r10.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @lucy-zhang, and @Nishant9)


@andreimatei andreimatei left a comment


bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @lucy-zhang, @Nishant9, and @tbg)


@andreimatei andreimatei left a comment


bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @lucy-zhang, @Nishant9, and @tbg)


tbg commented May 31, 2019

bors pls
bors r+


@andreimatei andreimatei left a comment


borsy boy
bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @lucy-zhang, @Nishant9, and @tbg)


@andreimatei andreimatei left a comment


bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @lucy-zhang, @Nishant9, and @tbg)

craig bot pushed a commit that referenced this pull request Jun 3, 2019
36101: roachprod: Minor readme updates r=bobvawter a=bdarnell

Use new convenience commands and demonstrate running a workload

Release note: None

37282: exec: add templating for RANK and ROW_NUMBER window functions r=yuzefovich a=yuzefovich

Templates out rankOp into two operators (for dense and non-dense
case) and templates out support for PARTITION BY clause for both
rankOp and rowNumberOp. The code is undertested and is lacking
benchmarks, but that will be addressed soon.

37412: server/heapprofiler: don't consider "goIdle" memory r=andreimatei a=andreimatei


Co-authored-by: Ben Darnell <ben@bendarnell.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>

craig bot commented Jun 3, 2019

Build succeeded
