runtime: goroutines may allocate beyond the hard heap goal #40460

Open
mknyszek opened this issue Jul 28, 2020 · 2 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.

@mknyszek

Today, it's possible (though unlikely) for goroutines to allocate beyond the hard heap goal.

Consider the following scenario:

  1. A goroutine (G1) spins, making many large allocations in a loop.
  2. The heap is small, or there isn't actually much mark work, so the mark phase is paced to start fairly late (let's say at 95 MiB for a 100 MiB heap goal).
  3. G1 is forced to assist the GC, and in doing so accumulates a lot of credit relative to the heap size (let's say 30 MiB for a 100 MiB heap goal because it over-assists -- not unheard of).
  4. Through background work and assists, nearly all the work in the GC is done, right on schedule.
  5. Background GC goroutines try to terminate the mark phase, but something prevents them from doing so many times over (e.g. #40459, "runtime: ReadMemStats called in a loop may prevent GC", or more work keeps being found); perhaps we just get unlucky.
  6. G1, having accumulated a bunch of credit, continues allocating in parallel during this period, unimpeded.
  7. Very little time passes, but G1 successfully uses 16 MiB of its credit to generate 16 MiB of allocations, leading to a heap of 111 MiB before sweep.

We've now exceeded the hard heap goal, which is currently 1.1 * the heap goal (so 110 MiB in this example).
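The arithmetic in this scenario can be checked with a small model. This is only a sketch of the numbers above, not runtime code: `hardGoal` and `heapAfterCreditSpend` are hypothetical names, and the model assumes G1 is the only allocator and is never blocked while it still has credit.

```go
package main

import "fmt"

const MiB = 1 << 20

// hardGoal returns the hard heap goal as described in this issue:
// 1.1 times the heap goal. Integer math avoids float rounding.
func hardGoal(heapGoal uint64) uint64 {
	return heapGoal * 11 / 10
}

// heapAfterCreditSpend models a goroutine that allocates `want` bytes
// while holding `credit` bytes of assist credit. As long as credit
// remains, nothing blocks it, so the heap grows by min(want, credit).
// (Hypothetical helper, not part of the Go runtime.)
func heapAfterCreditSpend(heapLive, credit, want uint64) uint64 {
	spend := want
	if spend > credit {
		spend = credit
	}
	return heapLive + spend
}

func main() {
	goal := uint64(100 * MiB)
	// Mark starts at 95 MiB; G1 holds 30 MiB of credit and spends 16 MiB.
	heap := heapAfterCreditSpend(95*MiB, 30*MiB, 16*MiB)
	fmt.Printf("heap=%d MiB hardGoal=%d MiB exceeded=%v\n",
		heap/MiB, hardGoal(goal)/MiB, heap > hardGoal(goal))
	// prints: heap=111 MiB hardGoal=110 MiB exceeded=true
}
```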

The fundamental problem here is that nothing pushes back on G1 while the GC is nearly done, or after we've exceeded the soft goal. The assist credit system is the sole mechanism through which a goroutine may be blocked and forced to assist; a goroutine that still has credit skips assisting entirely. This credit mechanism is not necessarily wrong, since it's what allows us to keep the assist cost down by amortizing it. But once we're in the regime where the most important thing is to finish the GC, or where we're in danger of exceeding the hard goal, it stands to reason that the goroutine should not be allowed to allocate. Blocking allocation outright may have significant negative consequences, though, so a real solution here needs more thought.

With larger heaps this is even less likely to happen (because allocating fast enough is hard to do), but still possible.

@mknyszek

CC @aclements @dr2chase

@mknyszek mknyszek added this to the Backlog milestone Jul 28, 2020
@mknyszek mknyszek added the NeedsFix The path to resolution is known, but the work has not been done. label Jul 28, 2020
@mknyszek

I have a proposed fix: #37331 (comment)

Copied here just so it's one less click away:

Here's a quick fix idea: what if a goroutine could never have more credit than the amount of heap runway left (basically the difference between the heap goal and the heap size at the point the assist finishes)? Then by construction a goroutine could never allocate past the heap goal without going into assist first and finishing off the GC cycle. The downside is that many goroutines could try to end GC at the same time (that is, enter gcMarkDone and start trying to acquire markDoneSema), stalling the whole program a little bit as everything waits for GC to actually end. This situation should be exceedingly rare, though, and only potentially common when GOMAXPROCS=1, in which case there will only ever be 1 goroutine in gcMarkDone.
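A minimal sketch of that capping rule, under stated assumptions: `capCredit` is a hypothetical helper, not a runtime function, and the model looks at a single goroutine at the moment its assist finishes.

```go
package main

import "fmt"

const MiB = 1 << 20

// capCredit limits a goroutine's assist credit to the heap runway,
// i.e. the distance between the heap goal and the current heap size
// at the point the assist finishes. If the heap is already at or past
// the goal, the runway (and so the credit) is zero.
// (Hypothetical helper illustrating the proposal.)
func capCredit(credit, heapGoal, heapLive int64) int64 {
	runway := heapGoal - heapLive
	if runway < 0 {
		runway = 0
	}
	if credit > runway {
		return runway
	}
	return credit
}

func main() {
	// Numbers from the scenario: 30 MiB of credit earned, 100 MiB goal,
	// heap at 95 MiB when the assist finishes.
	capped := capCredit(30*MiB, 100*MiB, 95*MiB)
	fmt.Printf("capped credit: %d MiB\n", capped/MiB)
	// prints: capped credit: 5 MiB
	fmt.Printf("max heap before next assist: %d MiB\n", (95*MiB+capped)/MiB)
	// prints: max heap before next assist: 100 MiB
}
```

With the cap in place, G1 in the scenario above would run out of credit at the 100 MiB heap goal and have to assist again, rather than reaching 111 MiB.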

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@mknyszek mknyszek moved this to Triage Backlog in Go Compiler / Runtime Jul 15, 2022