Page based heap size heuristics (#50144)
This PR implements GC heuristics based on the number of pages allocated
instead of on live objects, as was done before.
The heuristic for the new heap target is based on
https://dl.acm.org/doi/10.1145/3563323 (in summary, it argues that the
heap target should have square root behaviour).
In my testing, this fixes #49545 and #49761.
oscardssmith authored Jul 23, 2023
2 parents d1be33d + 9f3ca7c commit 32aa29f
Showing 9 changed files with 203 additions and 112 deletions.
1 change: 1 addition & 0 deletions NEWS.md
@@ -9,6 +9,7 @@ Language changes

Compiler/Runtime improvements
-----------------------------
* Updated GC heuristics to count allocated pages instead of individual objects ([#50144]).

Command-line option changes
---------------------------
12 changes: 9 additions & 3 deletions doc/src/devdocs/gc.md
@@ -67,6 +67,12 @@ This scheme eliminates the need of explicitly keeping a flag to indicate a full
## Heuristics

GC heuristics tune the GC by changing the size of the allocation interval between garbage collections.
If a GC was unproductive, then we increase the size of the allocation interval to allow objects more time to die.
If a GC returns a lot of space we can shrink the interval. The goal is to find a steady state where we are
allocating just about the same amount as we are collecting.

The GC heuristics measure the heap size after a collection and schedule the next
collection according to the algorithm described in https://dl.acm.org/doi/10.1145/3563323.
In summary, the paper argues that the heap target should have a square root relationship with the live heap, and that it should also be scaled by how fast the GC is freeing objects and how fast the mutators are allocating.
The heuristics measure the heap size by counting the pages that are in use plus the objects allocated with malloc. Previously we measured the heap size by counting
live objects, but that ignores fragmentation, which could lead to bad decisions. It also meant that we used thread-local information (allocations) to make
a process-wide decision (when to GC). Measuring pages makes the decision global.
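
To make the square root relationship concrete, here is a minimal sketch of a heap target of the form `live + sqrt(live * alloc_rate / gc_rate)`. This is an illustration only, not the code in this commit: the names `sketch_isqrt` and `sketch_heap_target` are hypothetical, the rates are assumed to be measured in bytes per second, and any tuning constant is left out. An integer square root is used only to keep the sketch free of library dependencies.

```c
#include <assert.h>
#include <stdint.h>

// Integer square root (floor), via Newton's method starting above the root.
uint64_t sketch_isqrt(uint64_t n)
{
    if (n < 2)
        return n;
    uint64_t x = n / 2 + 1;          // always >= sqrt(n) for n >= 2
    uint64_t y = (x + n / x) / 2;
    while (y < x) {
        x = y;
        y = (x + n / x) / 2;
    }
    return x;
}

// Hypothetical heap target: the live heap plus headroom that grows with the
// square root of the live heap, scaled up when mutators allocate quickly and
// scaled down when the collector frees memory quickly.
uint64_t sketch_heap_target(uint64_t live_bytes, uint64_t alloc_rate, uint64_t gc_rate)
{
    assert(gc_rate > 0);
    // Computed in double to avoid overflowing a 64-bit product.
    double scaled = (double)live_bytes * ((double)alloc_rate / (double)gc_rate);
    if (scaled > 1e18)
        scaled = 1e18;               // clamp so the cast below stays in range
    return live_bytes + sketch_isqrt((uint64_t)scaled);
}
```

With these assumptions, quadrupling the live heap only doubles the headroom, which is the behaviour the paper argues for.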

The GC will do full collections when the heap size reaches 80% of the maximum allowed size.
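
The 80% rule can be sketched as a simple threshold check. The function name and parameters below are hypothetical and are not taken from the Julia sources; integer arithmetic is used to avoid floating-point rounding at the boundary.

```c
#include <assert.h>
#include <stdint.h>

// Returns 1 when the heap has reached 80% of the maximum allowed size,
// which is the point at which a full collection would be requested.
int sketch_needs_full_collection(uint64_t heap_size, uint64_t max_heap_size)
{
    assert(max_heap_size > 0);
    // Divide before multiplying so the threshold cannot overflow.
    uint64_t threshold = max_heap_size / 5 * 4;
    return heap_size >= threshold;
}
```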
17 changes: 15 additions & 2 deletions src/gc-debug.c
@@ -1,7 +1,10 @@
// This file is a part of Julia. License is MIT: https://julialang.org/license

#include "gc.h"
#include "julia.h"
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// re-include assert.h without NDEBUG,
@@ -1217,15 +1220,25 @@ JL_DLLEXPORT void jl_enable_gc_logging(int enable) {
gc_logging_enabled = enable;
}

-void _report_gc_finished(uint64_t pause, uint64_t freed, int full, int recollect) JL_NOTSAFEPOINT {
+void _report_gc_finished(uint64_t pause, uint64_t freed, int full, int recollect, int64_t live_bytes) JL_NOTSAFEPOINT {
if (!gc_logging_enabled) {
return;
}
jl_safe_printf("GC: pause %.2fms. collected %fMB. %s %s\n",
-pause/1e6, freed/1e6,
+pause/1e6, freed/(double)(1<<20),
full ? "full" : "incr",
recollect ? "recollect" : ""
);

jl_safe_printf("Heap stats: bytes_mapped %.2f MB, bytes_resident %.2f MB, heap_size %.2f MB, heap_target %.2f MB, live_bytes %.2f MB, Fragmentation %.3f\n",
jl_atomic_load_relaxed(&gc_heap_stats.bytes_mapped)/(double)(1<<20),
jl_atomic_load_relaxed(&gc_heap_stats.bytes_resident)/(double)(1<<20),
jl_atomic_load_relaxed(&gc_heap_stats.heap_size)/(double)(1<<20),
jl_atomic_load_relaxed(&gc_heap_stats.heap_target)/(double)(1<<20),
live_bytes/(double)(1<<20),
(double)live_bytes/(double)jl_atomic_load_relaxed(&gc_heap_stats.heap_size)
);
// Should fragmentation use bytes_resident instead of heap_size?
}

#ifdef __cplusplus
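
For readers following the logging change, the arithmetic behind the new log line can be sketched in isolation. The helper names below are hypothetical. Note that the value printed as "Fragmentation" is the ratio of live bytes to heap size, so a value close to 1 means little wasted space. The second parameter could equally be a resident-bytes count, which is the alternative raised in the code comment.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIB ((double)(1 << 20))

// Convert a byte count to mebibytes, as the log line does with (double)(1<<20).
double sketch_to_mib(uint64_t bytes)
{
    return bytes / SKETCH_MIB;
}

// Ratio of live bytes to the measured heap size. Returns 0 for an empty
// heap instead of dividing by zero (a guard the log line itself does not have).
double sketch_live_ratio(int64_t live_bytes, uint64_t heap_size)
{
    assert(live_bytes >= 0);
    if (heap_size == 0)
        return 0.0;
    return (double)live_bytes / (double)heap_size;
}
```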
4 changes: 4 additions & 0 deletions src/gc-pages.c
@@ -52,6 +52,8 @@ char *jl_gc_try_alloc_pages_(int pg_cnt) JL_NOTSAFEPOINT
// round data pointer up to the nearest gc_page_data-aligned
// boundary if mmap didn't already do so.
mem = (char*)gc_page_data(mem + GC_PAGE_SZ - 1);
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_mapped, pages_sz);
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_resident, pages_sz);
return mem;
}

@@ -115,6 +117,7 @@ NOINLINE jl_gc_pagemeta_t *jl_gc_alloc_page(void) JL_NOTSAFEPOINT
// try to get page from `pool_freed`
meta = pop_lf_page_metadata_back(&global_page_pool_freed);
if (meta != NULL) {
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_resident, GC_PAGE_SZ);
gc_alloc_map_set(meta->data, GC_PAGE_ALLOCATED);
goto exit;
}
@@ -188,6 +191,7 @@ void jl_gc_free_page(jl_gc_pagemeta_t *pg) JL_NOTSAFEPOINT
madvise(p, decommit_size, MADV_DONTNEED);
#endif
msan_unpoison(p, decommit_size);
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_resident, -decommit_size);
}

#ifdef __cplusplus
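
The bookkeeping pattern in these hunks (mapped bytes grow when pages are mapped, resident bytes grow on mapping or reuse and shrink on decommit) can be sketched with C11 atomics. This is a standalone illustration, not the Julia implementation: the struct, the function names, and the 16 KiB page size are all assumptions made for the example, and signed counters with `atomic_fetch_sub_explicit` are used here instead of adding a negated unsigned value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_PAGE_SZ 16384  // hypothetical page size for this sketch

typedef struct {
    _Atomic(int64_t) bytes_mapped;    // total bytes obtained from the OS
    _Atomic(int64_t) bytes_resident;  // bytes currently backed by memory
} sketch_heap_stats_t;

// Fresh pages from the OS count as both mapped and resident.
void sketch_map_pages(sketch_heap_stats_t *s, int64_t page_count)
{
    assert(page_count > 0);
    int64_t sz = page_count * SKETCH_PAGE_SZ;
    atomic_fetch_add_explicit(&s->bytes_mapped, sz, memory_order_relaxed);
    atomic_fetch_add_explicit(&s->bytes_resident, sz, memory_order_relaxed);
}

// Reusing a previously freed page makes it resident again; it was already mapped.
void sketch_reuse_page(sketch_heap_stats_t *s)
{
    atomic_fetch_add_explicit(&s->bytes_resident, SKETCH_PAGE_SZ, memory_order_relaxed);
}

// Decommitting returns memory to the OS but keeps the address range mapped.
void sketch_decommit(sketch_heap_stats_t *s, int64_t bytes)
{
    assert(bytes >= 0);
    atomic_fetch_sub_explicit(&s->bytes_resident, bytes, memory_order_relaxed);
}
```

Relaxed ordering is enough in a sketch like this because the counters are statistics only and are not used to synchronize access to other data.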

2 comments on commit 32aa29f

@vtjnash
Member

@nanosoldier runbenchmarks(ALL, isdaily = true)

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
