Concurrent page sweeping #48969

d-netto · 2023-03-11T05:08:38Z

Extends #48600 by making sweeping of object pools concurrent.

vchuravy · 2023-04-20T17:52:43Z

Can you rebase on top of #48600?

oscardssmith · 2023-05-10T19:45:56Z

Is this intentionally on top of #49644?

vchuravy

Can you disentangle this from #49644 so that we can study
#48969 (comment) independently?

d-netto · 2023-05-15T20:15:12Z

Should be independent of #49644 now.

src/gc.c

d-netto · 2023-05-16T01:58:26Z

Latest commit should allow part of sweeping of object pools to run concurrently with mutator threads independently of whether we have GC threads or not (e.g. a program running with --threads=4 --gcthreads=1 could in theory benefit from it).

The cost is, of course, more contention on gc_perm_lock. This needs more careful analysis to confirm that we are not making GC pauses shorter at the cost of throughput.

gbaraldi · 2023-06-06T15:10:07Z

I think the solution is to do away with that perm_lock. It doesn't seem too complicated to do that and switch to doing Compare and Swap.
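As a rough illustration of the compare-and-swap idea, here is a minimal sketch of a lock-free bump allocator using C11 atomics. The names `arena`, `arena_offset`, and `perm_alloc_cas` are hypothetical and do not reflect Julia's actual permanent allocator; the sketch also assumes a single fixed-size arena and alignments of at most 16 bytes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a permanent-allocation arena (not Julia's layout). */
#define ARENA_SIZE 4096
static _Alignas(16) unsigned char arena[ARENA_SIZE];
static _Atomic size_t arena_offset = 0;

/* Reserve `sz` bytes aligned to `align` (a power of two, at most 16 here)
 * without taking a lock. Returns NULL when the arena is exhausted. */
static void *perm_alloc_cas(size_t sz, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t old = atomic_load_explicit(&arena_offset, memory_order_relaxed);
    for (;;) {
        /* Round the current offset up to the requested alignment. */
        size_t start = (old + align - 1) & ~(align - 1);
        if (start > ARENA_SIZE || sz > ARENA_SIZE - start)
            return NULL;
        /* Publish the new offset only if no other thread moved it first.
         * On failure, `old` is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak_explicit(&arena_offset, &old, start + sz,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed))
            return arena + start;
    }
}
```

Under contention, threads retry the loop instead of blocking on a lock, so a concurrent sweeper and allocating mutators would not serialize on a mutex. Whether this helps in practice would still need to be measured.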

PallHaraldsson · 2023-06-25T16:29:07Z

it seems like this PR can be detrimental to throughput even though it reduces GC pauses in a few cases.

Both are good properties, so if there's (necessarily) a trade-off, can it still be merged with it off by default, and an ENV var to enable it for low GC pauses? While you want to avoid allocations, and the GC entirely, for real-time, that's hard to do fully, and shorter pauses are very valuable for soft real-time. It's just a question of what to call this ENV var: CONCURRENT_SWEEP_GC (or e.g. SOFT_REAL-TIME_GC)?

vchuravy

This PR brings real and tangible benefits for multi-threaded code that is allocating, by significantly shortening the STW phase, therefore improving scalability (Amdahl's law says hi).

I believe we should add an environment and runtime flag for this feature.

On systems vulnerable to Meltdown and Spectre, KPTI can cause iTLB flushes. With concurrent page sweeping, instead of paying this cost "once", we will concurrently invalidate the iTLB, leading to a loss of runtime performance.

In particular, for the GCBenchmark tree_multable I saw an increase in CPU time spent in __madvise and in asm_sysvec_call_function on the threads that are not running concurrent GC.

@kpamnany also voiced discomfort with the system being oversubscribed.

I also found it counter-intuitive that --gcthreads=1 would disable concurrent page sweeping.

In the long term, the open questions for me are:

Could we implement this with the tasking system, e.g. schedule a task that will do some cleanup work?
We could try out io_uring for batching the madvise calls, but that would be significant work.
If concurrent sweeping is disabled, we could run this after the STW phase has ended, but before the finalizers. This would alleviate some of @kpamnany's oversubscription concerns, while still moving the cost out of the STW phase.
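The last option above (releasing pages after the STW phase but before finalizers) can be sketched roughly as follows. This is only an illustration under stated assumptions: `deferred_pages_t`, `defer_page`, `release_deferred`, and the fixed queue capacity are hypothetical, and the release callback stands in for whatever the runtime would really call (for example madvise).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 64 /* arbitrary capacity chosen for this sketch */

/* Stand-in for the real page release; a runtime might call madvise here. */
typedef void (*release_fn)(void *page, size_t len);

typedef struct {
    void *pages[MAX_DEFERRED];
    size_t count;
} deferred_pages_t;

/* Called while the world is stopped: only record the page, no system calls.
 * Returns 0 when the queue is full so the caller can release immediately. */
static int defer_page(deferred_pages_t *q, void *page)
{
    if (q->count == MAX_DEFERRED)
        return 0;
    q->pages[q->count++] = page;
    return 1;
}

/* Called after mutators resume but before finalizers run.
 * Returns the number of pages handed to the release callback. */
static size_t release_deferred(deferred_pages_t *q, size_t page_size, release_fn release)
{
    size_t n = q->count;
    for (size_t i = 0; i < n; i++)
        release(q->pages[i], page_size);
    q->count = 0;
    return n;
}

/* Demonstration callback that only counts bytes instead of unmapping them. */
static size_t released_bytes = 0;
static void count_release(void *page, size_t len)
{
    (void)page;
    released_bytes += len;
}
```

The point of the split is that the stop-the-world section only pays for an array store per empty page, while the expensive release happens on the collecting thread once the other threads are running again, without needing an extra sweeper thread.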

d-netto · 2023-06-25T19:14:57Z

To keep things consistent with --threads, I was thinking about something like --gcthreads=X,Y, with X being the number of threads that may run parallel marking and Y being the number of threads that may run concurrent page sweeping (0 or 1, chosen to be 0 by default).

Open to suggestions on that.
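To make the proposed `--gcthreads=X,Y` shape concrete, here is a minimal sketch of how such a value could be parsed. The function `parse_gcthreads` is hypothetical and is not the code in src/jloptions.c; it assumes X must be at least 1, Y is optional and defaults to 0, and Y may only be 0 or 1, as described above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical parser for the value part of "--gcthreads=X,Y".
 * Returns 0 on success and -1 on malformed input. */
static int parse_gcthreads(const char *arg, int *nmark, int *nsweep)
{
    char *end;
    errno = 0;
    long x = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || x < 1 || x > INT_MAX)
        return -1;
    long y = 0; /* concurrent sweeping is off unless requested */
    if (*end == ',') {
        const char *rest = end + 1;
        errno = 0;
        y = strtol(rest, &end, 10);
        if (errno != 0 || end == rest || (y != 0 && y != 1))
            return -1;
    }
    if (*end != '\0') /* reject trailing characters */
        return -1;
    *nmark = (int)x;
    *nsweep = (int)y;
    return 0;
}
```

With this shape, `4,1` would mean four marking threads plus one concurrent sweeping thread, and a bare `4` would leave concurrent sweeping disabled.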

d-netto · 2023-06-27T03:15:57Z

Bump.

src/jloptions.c

Implements concurrent sweeping of fully empty pages. Concurrent sweeping is disabled by default and may be enabled through the --gcthreads flag. Co-authored-by: Valentin Churavy <v.churavy@gmail.com>

d-netto added the GC (Garbage collector) label Mar 11, 2023

d-netto marked this pull request as draft March 11, 2023 23:38

d-netto force-pushed the dcn/psweep branch 4 times, most recently from c2c1855 to 1b22c5d Compare May 9, 2023 02:53

d-netto marked this pull request as ready for review May 10, 2023 19:04

d-netto requested review from gbaraldi, vchuravy and vtjnash May 10, 2023 19:04

d-netto changed the title ~~WIP: Parallel sweeping~~ Parallel/concurrent sweeping May 10, 2023

d-netto force-pushed the dcn/psweep branch 9 times, most recently from cf8d9fb to a9ad110 Compare May 14, 2023 15:33

vchuravy previously requested changes May 15, 2023

d-netto force-pushed the dcn/psweep branch from a9ad110 to 74a2c11 Compare May 15, 2023 20:13

vchuravy reviewed May 15, 2023

vchuravy self-requested a review June 6, 2023 15:27

d-netto force-pushed the dcn/psweep branch 2 times, most recently from e060962 to 2f9f0ff Compare June 24, 2023 07:25

vchuravy force-pushed the dcn/psweep branch from 2f9f0ff to f29fd67 Compare June 25, 2023 09:05

d-netto force-pushed the dcn/psweep branch from f29fd67 to a4592d5 Compare June 25, 2023 14:00

vchuravy self-requested a review June 25, 2023 16:50

vchuravy changed the title ~~Concurrent sweeping~~ Concurrent page sweeping Jun 25, 2023

vchuravy requested changes Jun 25, 2023

d-netto force-pushed the dcn/psweep branch 3 times, most recently from cee0701 to 4fedfda Compare June 26, 2023 20:27

vchuravy approved these changes Jun 27, 2023

vchuravy reviewed Jun 27, 2023

vchuravy added the "merge me" label (PR is reviewed. Merge when all tests are passing) Jun 27, 2023

d-netto force-pushed the dcn/psweep branch from 7d336e5 to 03fe8c9 Compare June 27, 2023 15:14

implement concurrent sweeping

9eb079d

d-netto force-pushed the dcn/psweep branch from 03fe8c9 to 9eb079d Compare June 27, 2023 20:54

d-netto merged commit 9dc2991 into JuliaLang:master Jun 28, 2023

d-netto removed the "merge me" label (PR is reviewed. Merge when all tests are passing) Jun 28, 2023

vchuravy mentioned this pull request Jun 28, 2023

Do page free'ing outside STW #50320

Open

oscardssmith mentioned this pull request Jun 28, 2023

Building nightly with multiple Julia GC threads causes segfault #50327

Closed

d-netto mentioned this pull request Jun 28, 2023

initialize jl_n_markthreads and jl_n_sweepthreads to be consistent with no parallel GC on bootstrap #50332

Merged

d-netto mentioned this pull request Aug 7, 2023

Backport concurrent page sweeping RelationalAI/julia#30

Closed

brenhinkeller mentioned this pull request Aug 8, 2023

"All-threads allocating garbage" Multithreading Benchmark shows significant slowdown as nthreads increases #33033

Closed

d-netto mentioned this pull request Sep 6, 2023

Backport concurrent sweeping RelationalAI/julia#38

Closed
