[PROF-10588] Support GVL profiling preview on Ruby 3.2
**What does this PR do?**

This PR builds atop #3929, which added support for GVL profiling on
Ruby 3.3+, and makes GVL profiling also work on Ruby 3.2.

Supporting GVL profiling on Ruby 3.2 needed additional work.
That's because while in Ruby 3.2 we have the GVL instrumentation API
giving us the events we need to profile the GVL, we're missing:

1. Getting the Ruby thread `VALUE` as an argument in GVL instrumentation
   API events
2. The `rb_internal_thread_specific` API that allows us to attach
   data to Ruby thread objects in a thread-safe way

Both 1 and 2 were only introduced in Ruby 3.3, and our implementation
of GVL profiling relies/relied on them.

This PR... reimplements 1 & 2 in an alternative way, allowing us to
keep our existing design for 3.3+, while also supporting the older
Ruby version.

I've split it into two commits:
 i. Abstracting access and management of 1 & 2 into a new set of files
    (`gvl_profiling_helper.c`/`gvl_profiling_helper.h`). These new files
    are zero-overhead abstractions for most situations.
ii. Implementing 1 & 2 for Ruby 3.2.
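
To make the shape of the first commit more concrete, here is a minimal sketch of the pattern, not the contents of `gvl_profiling_helper.h`: a one-field handle struct plus `static inline` accessors, with the backend picked at compile time so callers never see which one is active. Everything prefixed `demo_`/`DEMO_` is hypothetical, and both backends are simulated with plain arrays; no Ruby API is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_THREADS 4

// DEMO_USE_WORKAROUND is a hypothetical flag playing the role of USE_GVL_PROFILING_3_2_WORKAROUNDS
#ifdef DEMO_USE_WORKAROUND
  // "Workaround" backend: the handle wraps an opaque pointer to per-thread storage
  typedef struct { void *thread; } demo_thread;
  static intptr_t demo_slots[DEMO_MAX_THREADS];

  static inline demo_thread demo_thread_from_id(size_t id) {
    return (demo_thread) {.thread = id < DEMO_MAX_THREADS ? (void *) &demo_slots[id] : NULL};
  }
  static inline intptr_t demo_state_get(demo_thread t) {
    return t.thread == NULL ? 0 : *(intptr_t *) t.thread;
  }
  static inline void demo_state_set(demo_thread t, intptr_t value) {
    if (t.thread != NULL) *(intptr_t *) t.thread = value;
  }
#else
  // "Native API" backend: the handle wraps an id that is looked up in a table
  typedef struct { size_t id; } demo_thread;
  static intptr_t demo_table[DEMO_MAX_THREADS];

  static inline demo_thread demo_thread_from_id(size_t id) {
    return (demo_thread) {.id = id};
  }
  static inline intptr_t demo_state_get(demo_thread t) {
    return t.id < DEMO_MAX_THREADS ? demo_table[t.id] : 0;
  }
  static inline void demo_state_set(demo_thread t, intptr_t value) {
    if (t.id < DEMO_MAX_THREADS) demo_table[t.id] = value;
  }
#endif

// Shared helpers, written once against the abstract handle
static inline intptr_t demo_state_for_id_get(size_t id) {
  return demo_state_get(demo_thread_from_id(id));
}
static inline void demo_state_for_id_set(size_t id, intptr_t value) {
  demo_state_set(demo_thread_from_id(id), value);
}

// Usage example: the same calling code works with either backend
static inline void demo_self_check(void) {
  demo_state_for_id_set(2, 42);
  assert(demo_state_for_id_get(2) == 42);
  assert(demo_state_for_id_get(99) == 0); // unknown handles read as "no state"
}
```

Because the handle is a single-field struct passed by value and the accessors are `static inline`, a compiler will typically reduce each call to the backend's own load or store, which is what keeps this kind of wrapper cheap.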

**Motivation:**

We believe GVL profiling is quite an important observability feature
for Ruby, and thus we want to support it on all Ruby versions that
provide the GVL instrumentation API.

**Additional Notes:**

To solve 1, we're using native level thread-locals (GCC's
`__thread`) to keep a pointer to the underlying Ruby `rb_thread_t`
structure.

This is more complex than "just keep it on a thread-local"
because:

a) Ruby reuses native threads. When a Ruby thread dies, Ruby keeps the
   underlying native thread around for a bit, and if another Ruby
   thread is born very quickly after the previous one, Ruby
   will reuse the native thread and attach it to the new Ruby thread.

   To avoid incorrectly reusing the thread-locals, we install an event
   hook on Ruby thread start, and make sure to clean any native
   thread-locals when a new thread starts.

b) Some of the GVL instrumentation API events are emitted while the
   thread does not have the GVL, so we need to be careful about when
   we can and cannot read VM information.

   Thus, we only initialize the thread-local during the
   `RUBY_INTERNAL_THREAD_EVENT_RESUMED` event, which is emitted while
   the thread owns the GVL.

c) Since we don't get the current thread in events, we need to get a
   bit... creative. During `RUBY_INTERNAL_THREAD_EVENT_RESUMED` we know
   the current thread MUST own the GVL, so we read the GVL owner from
   the internal Ruby VM state and use that as the current thread.

With a + b + c together we are able to keep a pointer to the underlying
`rb_thread_t` up-to-date in a native thread local, thus replacing the
need to get a `VALUE thread` as an argument.
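
The interplay of a + b + c can be simulated without a Ruby VM. The sketch below is a simplified model under stated assumptions, not the PR's code: `demo_ruby_thread` stands in for `rb_thread_t`, `demo_gvl_owner` for the VM's record of the GVL owner, and the `demo_on_thread_*` functions for the event callbacks. Running two simulated Ruby threads back to back on one native thread reproduces the reuse problem from (a).

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for Ruby's rb_thread_t; only its identity matters here
typedef struct { int id; } demo_ruby_thread;

// Simulated VM state: which Ruby thread currently owns the GVL
static demo_ruby_thread *demo_gvl_owner = NULL;

// Native thread-local cache of the Ruby thread running on this native thread
static __thread demo_ruby_thread *demo_current_thread = NULL;

// (a) Thread start hook: drop whatever a previous Ruby thread left behind
static void demo_on_thread_started(void) {
  demo_current_thread = NULL;
}

// (b) + (c) "Resumed" callback: the caller holds the GVL, so the GVL owner is the current thread
static demo_ruby_thread *demo_on_thread_resumed(void) {
  if (demo_current_thread == NULL) {
    assert(demo_gvl_owner != NULL); // only meaningful while this thread owns the GVL
    demo_current_thread = demo_gvl_owner;
  }
  return demo_current_thread;
}

// Callbacks that may run without the GVL only read the cache and never touch VM state
static demo_ruby_thread *demo_on_thread_ready(void) {
  return demo_current_thread;
}

// Simulates a Ruby thread's first run on this native thread.
// clear_on_start toggles whether the thread start hook from (a) is installed.
static demo_ruby_thread *demo_run_thread(demo_ruby_thread *thread, int clear_on_start) {
  if (clear_on_start) demo_on_thread_started();
  demo_gvl_owner = thread; // the VM hands the GVL to this thread
  return demo_on_thread_resumed();
}
```

In this model, calling `demo_run_thread(&b, 0)` right after thread `a` ran on the same native thread returns `&a`, the stale pointer; with `clear_on_start` set it returns `&b`. Before the first "resumed" event the cache may still be `NULL`, so readers need to treat `NULL` as "thread not known yet".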

To solve 2, we rely on an important observation: there's a
`VALUE stat_insn_usage` field inside `rb_thread_t` that's unused and
seems to have effectively been forgotten about.

There's nowhere in the VM code that's writing or reading it (other
than marking it for GC), and not even git history reveals a time
where this field was used. I could not find any other references to
this field anywhere else. Thus, we make use of this field to store
the information we need, as a replacement for
`rb_internal_thread_specific`.

Since Ruby 3.2 will presumably never see this field removed or used
during its remaining maintenance release period, this should work fine,
and we have a nice clean solution for 3.3+.
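
The storage trick can be sketched in isolation. This is a simulation under stated assumptions rather than the PR's code: `demo_value` stands in for `VALUE`, `demo_thread_struct.spare_slot` for `stat_insn_usage`, `DEMO_NIL` is an arbitrary even sentinel for an unset slot, and `demo_long2fix`/`demo_fix2long` imitate Fixnum tagging (shift left by one, set the low bit). Presumably the value is stored as a Fixnum because the GC marks this field, and a tagged immediate is safe to mark where a raw integer could be mistaken for an object pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t demo_value;          // stand-in for Ruby's VALUE
#define DEMO_NIL ((demo_value) 0x04)   // hypothetical "unset" sentinel; even, so never a tagged integer

// Fixnum-style tagging: shift left by one and set the low bit
static demo_value demo_long2fix(intptr_t n) {
  return (((demo_value) n) << 1) | 1;
}

static intptr_t demo_fix2long(demo_value v) {
  assert((v & 1) == 1); // must be a tagged integer
  // Relies on arithmetic right shift for negative values, which GCC and Clang provide
  return ((intptr_t) v) >> 1;
}

typedef struct {
  int other_fields;      // placeholder for the rest of the thread struct
  demo_value spare_slot; // plays the role of the unused stat_insn_usage field
} demo_thread_struct;

// Reads the per-thread state; an unknown thread or an unset slot reads as 0
static intptr_t demo_slot_get(demo_thread_struct *thread) {
  if (thread == NULL) return 0;
  return thread->spare_slot == DEMO_NIL ? 0 : demo_fix2long(thread->spare_slot);
}

// Writes the per-thread state as a tagged integer; silently ignores unknown threads
static void demo_slot_set(demo_thread_struct *thread, intptr_t value) {
  if (thread == NULL) return;
  thread->spare_slot = demo_long2fix(value);
}
```

One consequence of the tagging is that the stored value loses one bit of range compared to a plain `intptr_t`, so callers need to stay within the Fixnum range.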

**How to test the change?**

Happily, with the changes on this PR, the existing test coverage
we added for GVL profiling on 3.3 is also green on 3.2! :)
ivoanjo committed Sep 23, 2024
1 parent 568db8e commit 98f968b
Showing 10 changed files with 179 additions and 46 deletions.
@@ -804,6 +804,10 @@ static VALUE release_gvl_and_run_sampling_trigger_loop(VALUE instance) {

   if (state->gvl_profiling_enabled) {
     #ifndef NO_GVL_INSTRUMENTATION
+      #ifdef USE_GVL_PROFILING_3_2_WORKAROUNDS
+        gvl_profiling_state_thread_tracking_workaround();
+      #endif
+
       state->gvl_profiling_hook = rb_internal_thread_add_event_hook(
         on_gvl_event,
         (
@@ -1315,11 +1319,19 @@ static VALUE _native_resume_signals(DDTRACE_UNUSED VALUE self) {
     // it tags threads it's tracking, so if a thread is tagged then by definition we know that thread belongs to the main
     // Ractor. Thus, if we really really wanted to access the state, we could do it after making sure we're on the correct Ractor.

+    #ifdef USE_GVL_PROFILING_3_2_WORKAROUNDS
+      target_thread = gvl_profiling_state_maybe_initialize();
+    #endif
+
     bool should_sample = thread_context_collector_on_gvl_running(target_thread);

     if (should_sample) {
       // should_sample is only true if a thread belongs to the main Ractor, so we're good to go
-      rb_postponed_job_trigger(after_gvl_running_from_postponed_job_handle);
+      #ifndef NO_POSTPONED_TRIGGER
+        rb_postponed_job_trigger(after_gvl_running_from_postponed_job_handle);
+      #else
+        rb_postponed_job_register_one(0, after_gvl_running_from_postponed_job, NULL);
+      #endif
     }
   } else {
     // This is a very delicate time and it's hard for us to raise an exception so let's at least complain to stderr
10 changes: 6 additions & 4 deletions ext/datadog_profiling_native_extension/extconf.rb
@@ -131,10 +131,6 @@ def skip_building_extension!(reason)

 have_func "malloc_stats"

-# On older Rubies, there was no GVL instrumentation API and APIs created to support it
-# TODO: We can probably support Ruby 3.2 as well here, but we haven't done that work yet
-$defs << "-DNO_GVL_INSTRUMENTATION" if RUBY_VERSION < "3.3"
-
 # On older Rubies, rb_postponed_job_preregister/rb_postponed_job_trigger did not exist
 $defs << "-DNO_POSTPONED_TRIGGER" if RUBY_VERSION < "3.3"

@@ -147,6 +143,12 @@ def skip_building_extension!(reason)
 # On older Rubies, some of the Ractor internal APIs were directly accessible
 $defs << "-DUSE_RACTOR_INTERNAL_APIS_DIRECTLY" if RUBY_VERSION < "3.3"

+# On older Rubies, there was no GVL instrumentation API and APIs created to support it
+$defs << "-DNO_GVL_INSTRUMENTATION" if RUBY_VERSION < "3.2"
+
+# Supporting GVL instrumentation on 3.2 needs some workarounds
+$defs << "-DUSE_GVL_PROFILING_3_2_WORKAROUNDS" if RUBY_VERSION.start_with?("3.2")
+
 # On older Rubies, there was no struct rb_native_thread. See private_vm_api_acccess.c for details.
 $defs << "-DNO_RB_NATIVE_THREAD" if RUBY_VERSION < "3.2"

48 changes: 42 additions & 6 deletions ext/datadog_profiling_native_extension/gvl_profiling_helper.c
@@ -2,13 +2,49 @@
 #include <ruby/thread.h>
 #include "gvl_profiling_helper.h"

-#if !defined(NO_GVL_INSTRUMENTATION) // Ruby 3.3+
+#if !defined(NO_GVL_INSTRUMENTATION) && !defined(USE_GVL_PROFILING_3_2_WORKAROUNDS) // Ruby 3.3+
+  rb_internal_thread_specific_key_t gvl_waiting_tls_key;

-rb_internal_thread_specific_key_t gvl_waiting_tls_key;
-
-void gvl_profiling_init(void) {
-  gvl_waiting_tls_key = rb_internal_thread_specific_key_create();
-}
+  void gvl_profiling_init(void) {
+    gvl_waiting_tls_key = rb_internal_thread_specific_key_create();
+  }

 #endif

+#ifdef USE_GVL_PROFILING_3_2_WORKAROUNDS // Ruby 3.2
+  __thread gvl_profiling_thread gvl_waiting_tls;
+  static bool gvl_profiling_state_thread_tracking_workaround_installed = false;
+
+  static void on_thread_start(
+    DDTRACE_UNUSED rb_event_flag_t _unused1,
+    DDTRACE_UNUSED const rb_internal_thread_event_data_t *_unused2,
+    DDTRACE_UNUSED void *_unused3
+  ) {
+    gvl_waiting_tls = (gvl_profiling_thread) {.thread = NULL};
+  }
+
+  // Hack: We're using the gvl_waiting_tls native thread-local to store per-thread information. Unfortunately, Ruby puts a big hole
+  // in our plan because it reuses native threads -- specifically, in Ruby 3.2, native threads are still 1:1 to Ruby
+  // threads (M:N wasn't a thing yet) BUT once a Ruby thread dies, the VM will keep the native thread around for a
+  // bit, and if another Ruby thread starts right after, Ruby will reuse the native thread, rather than create a new one.
+  //
+  // This will mean that the new Ruby thread will still have the same native thread-local data that we set on the
+  // old thread. For the purposes of our tracking, where we're keeping a pointer to the current thread object in
+  // thread-local storage **this is disastrous** since it means we'll be pointing at the wrong thread (and its
+  // memory may have been freed or reused since!)
+  //
+  // To work around this issue, once GVL profiling is enabled, we install an event hook on thread start
+  // events that clears the thread-local data. This guarantees that there will be no stale data -- any existing
+  // data will be cleared at thread start.
+  //
+  // Note that once installed, this event hook becomes permanent -- stopping the profiler does not stop this event
+  // hook, unlike all others. This is because we can't afford to miss any thread start events while the
+  // profiler is stopped (e.g. during reconfiguration) as that would mean stale data once the profiler starts again.
+  void gvl_profiling_state_thread_tracking_workaround(void) {
+    if (gvl_profiling_state_thread_tracking_workaround_installed) return;
+
+    rb_internal_thread_add_event_hook(on_thread_start, RUBY_INTERNAL_THREAD_EVENT_STARTED, NULL);
+
+    gvl_profiling_state_thread_tracking_workaround_installed = true;
+  }
+#endif
82 changes: 57 additions & 25 deletions ext/datadog_profiling_native_extension/gvl_profiling_helper.h
@@ -6,38 +6,70 @@

 #include "extconf.h"

-#if !defined(NO_GVL_INSTRUMENTATION) // Ruby 3.3+
+#if !defined(NO_GVL_INSTRUMENTATION) && !defined(USE_GVL_PROFILING_3_2_WORKAROUNDS) // Ruby 3.3+
+  #include <ruby.h>
+  #include <ruby/thread.h>
+  #include "datadog_ruby_common.h"

-#include <ruby.h>
-#include <ruby/thread.h>
-#include "datadog_ruby_common.h"
+  typedef struct { VALUE thread; } gvl_profiling_thread;
+  extern rb_internal_thread_specific_key_t gvl_waiting_tls_key;

-typedef struct { VALUE thread; } gvl_profiling_thread;
-extern rb_internal_thread_specific_key_t gvl_waiting_tls_key;
+  void gvl_profiling_init(void);

-void gvl_profiling_init(void);
+  static inline gvl_profiling_thread thread_from_thread_object(VALUE thread) {
+    return (gvl_profiling_thread) {.thread = thread};
+  }

-static inline gvl_profiling_thread thread_from_thread_object(VALUE thread) {
-  return (gvl_profiling_thread) {.thread = thread};
-}
+  static inline gvl_profiling_thread thread_from_event(const rb_internal_thread_event_data_t *event_data) {
+    return thread_from_thread_object(event_data->thread);
+  }

-static inline gvl_profiling_thread thread_from_event(const rb_internal_thread_event_data_t *event_data) {
-  return thread_from_thread_object(event_data->thread);
-}
+  static inline intptr_t gvl_profiling_state_get(gvl_profiling_thread thread) {
+    return (intptr_t) rb_internal_thread_specific_get(thread.thread, gvl_waiting_tls_key);
+  }

-static inline intptr_t gvl_profiling_state_get(gvl_profiling_thread thread) {
-  return (intptr_t) rb_internal_thread_specific_get(thread.thread, gvl_waiting_tls_key);
-}
+  static inline void gvl_profiling_state_set(gvl_profiling_thread thread, intptr_t value) {
+    rb_internal_thread_specific_set(thread.thread, gvl_waiting_tls_key, (void *) value);
+  }
+#endif

+#ifdef USE_GVL_PROFILING_3_2_WORKAROUNDS // Ruby 3.2
+  typedef struct { void *thread; } gvl_profiling_thread;
+  extern __thread gvl_profiling_thread gvl_waiting_tls;
+
+  static inline void gvl_profiling_init(void) { }
+
+  // This header gets included in private_vm_access.c which can't include datadog_ruby_common.h so we replicate this
+  // helper here
+  #ifdef __GNUC__
+    #define DDTRACE_UNUSED __attribute__((unused))
+  #else
+    #define DDTRACE_UNUSED
+  #endif
+
-static inline void gvl_profiling_state_set(gvl_profiling_thread thread, intptr_t value) {
-  rb_internal_thread_specific_set(thread.thread, gvl_waiting_tls_key, (void *) value);
-}
+  // NOTE: This is a hack that relies on the knowledge that on Ruby 3.2 the
+  // RUBY_INTERNAL_THREAD_EVENT_READY and RUBY_INTERNAL_THREAD_EVENT_RESUMED events always get called on the thread they
+  // are about. Thus, we can use our thread local storage hack to get this data, even though the event doesn't include it.
+  static inline gvl_profiling_thread thread_from_event(DDTRACE_UNUSED const void *event_data) {
+    return gvl_waiting_tls;
+  }
+
+  void gvl_profiling_state_thread_tracking_workaround(void);
+  gvl_profiling_thread gvl_profiling_state_maybe_initialize(void);
+
+  // Implementing these on Ruby 3.2 requires access to private VM things, so the following methods are
+  // implemented in `private_vm_api_access.c`
+  gvl_profiling_thread thread_from_thread_object(VALUE thread);
+  intptr_t gvl_profiling_state_get(gvl_profiling_thread thread);
+  void gvl_profiling_state_set(gvl_profiling_thread thread, intptr_t value);
+#endif

-static inline intptr_t gvl_profiling_state_thread_object_get(VALUE thread) {
-  return gvl_profiling_state_get(thread_from_thread_object(thread));
-}
+#ifndef NO_GVL_INSTRUMENTATION // For all Rubies supporting GVL profiling (3.2+)
+  static inline intptr_t gvl_profiling_state_thread_object_get(VALUE thread) {
+    return gvl_profiling_state_get(thread_from_thread_object(thread));
+  }

-static inline void gvl_profiling_state_thread_object_set(VALUE thread, intptr_t value) {
-  gvl_profiling_state_set(thread_from_thread_object(thread), value);
-}
+  static inline void gvl_profiling_state_thread_object_set(VALUE thread, intptr_t value) {
+    gvl_profiling_state_set(thread_from_thread_object(thread), value);
+  }
 #endif
51 changes: 51 additions & 0 deletions ext/datadog_profiling_native_extension/private_vm_api_access.c
@@ -755,3 +755,54 @@ static inline int ddtrace_imemo_type(VALUE imemo) {
     return GET_VM()->objspace;
   }
 #endif
+
+#ifdef USE_GVL_PROFILING_3_2_WORKAROUNDS // Ruby 3.2
+  #include "gvl_profiling_helper.h"
+
+  gvl_profiling_thread thread_from_thread_object(VALUE thread) {
+    return (gvl_profiling_thread) {.thread = thread_struct_from_object(thread)};
+  }
+
+  // Hack: In Ruby 3.3+ we attach gvl profiling state to Ruby threads using the
+  // rb_internal_thread_specific_* APIs. These APIs did not exist on Ruby 3.2. On Ruby 3.2 we instead store the
+  // needed data inside the `rb_thread_t` structure, specifically in `stat_insn_usage` as a Ruby FIXNUM.
+  //
+  // Why `stat_insn_usage`? We needed some per-thread storage, and while looking at the Ruby VM sources I noticed
+  // that `stat_insn_usage` has been in `rb_thread_t` for a long time, but is not used anywhere in the VM
+  // code. There's a comment attached to it "/* statistics data for profiler */" but other than marking this
+  // field for GC, I could not find any place in the VM commit history or on GitHub where this has ever been used.
+  //
+  // Thus, since this hack is only for 3.2, which presumably will never see this field either removed or used
+  // during its remaining maintenance release period we... kinda take it for our own usage. It's ugly, I know...
+  intptr_t gvl_profiling_state_get(gvl_profiling_thread thread) {
+    if (thread.thread == NULL) return 0;
+
+    VALUE current_value = ((rb_thread_t *)thread.thread)->stat_insn_usage;
+    intptr_t result = current_value == Qnil ? 0 : FIX2LONG(current_value);
+    return result;
+  }
+
+  void gvl_profiling_state_set(gvl_profiling_thread thread, intptr_t value) {
+    if (thread.thread == NULL) return;
+    ((rb_thread_t *)thread.thread)->stat_insn_usage = LONG2FIX(value);
+  }
+
+  // Because Ruby 3.2 does not give us the current thread when calling the RUBY_INTERNAL_THREAD_EVENT_READY and
+  // RUBY_INTERNAL_THREAD_EVENT_RESUMED APIs, we need to figure out this info ourselves.
+  //
+  // Specifically, this method was created to be called from a RUBY_INTERNAL_THREAD_EVENT_RESUMED callback --
+  // when it's triggered, we know the thread the code gets executed on is holding the GVL, so we use this
+  // opportunity to initialize our thread-local value.
+  gvl_profiling_thread gvl_profiling_state_maybe_initialize(void) {
+    gvl_profiling_thread current_thread = gvl_waiting_tls;
+
+    if (current_thread.thread == NULL) {
+      // threads.sched.running is the thread currently holding the GVL, which when this gets executed is the
+      // current thread!
+      current_thread = (gvl_profiling_thread) {.thread = (void *) rb_current_ractor()->threads.sched.running};
+      gvl_waiting_tls = current_thread;
+    }
+
+    return current_thread;
+  }
+#endif
4 changes: 2 additions & 2 deletions lib/datadog/profiling/component.rb
@@ -443,9 +443,9 @@ def self.build_profiler_component(settings:, agent_settings:, optional_tracer:)
   end

   private_class_method def self.enable_gvl_profiling?(settings)
-    if RUBY_VERSION < "3.3"
+    if RUBY_VERSION < "3.2"
       if settings.profiling.advanced.preview_gvl_enabled
-        Datadog.logger.warn("GVL profiling is currently not supported in Ruby < 3.3 and will not be enabled.")
+        Datadog.logger.warn("GVL profiling is currently not supported in Ruby < 3.2 and will not be enabled.")
       end

       return false
@@ -149,7 +149,7 @@
   end

   context "when gvl_profiling_enabled is true on an unsupported Ruby" do
-    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION >= "3.3." }
+    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION >= "3.2." }

     let(:gvl_profiling_enabled) { true }

2 changes: 1 addition & 1 deletion spec/datadog/profiling/collectors/thread_context_spec.rb
@@ -1833,7 +1833,7 @@ def sample_and_check(expected_state:)
   end

   context "on legacy Rubies" do
-    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION >= "3.3." }
+    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION >= "3.2." }

     it "is not set" do
       per_thread_context.each do |_thread, context|
8 changes: 4 additions & 4 deletions spec/datadog/profiling/component_spec.rb
@@ -598,8 +598,8 @@
     settings.profiling.advanced.gc_enabled = false
   end

-  context "on Ruby < 3.3" do
-    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION >= "3.3." }
+  context "on Ruby < 3.2" do
+    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION >= "3.2." }

     it "does not enable GVL profiling" do
       expect(Datadog::Profiling::Collectors::CpuAndWallTimeWorker)
@@ -615,8 +615,8 @@
     end
   end

-  context "on Ruby >= 3.3" do
-    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION < "3.3." }
+  context "on Ruby >= 3.2" do
+    before { skip "Behavior does not apply to current Ruby version" if RUBY_VERSION < "3.2." }

     context "when timeline is enabled" do
       before { settings.profiling.advanced.timeline_enabled = true }
4 changes: 2 additions & 2 deletions spec/datadog/profiling/spec_helper.rb
@@ -123,8 +123,8 @@ def self.maybe_fix_label_range(key, value)
   end

   def skip_if_gvl_profiling_not_supported(testcase)
-    if RUBY_VERSION < "3.3."
-      testcase.skip "GVL profiling is only supported on Ruby >= 3.3"
+    if RUBY_VERSION < "3.2."
+      testcase.skip "GVL profiling is only supported on Ruby >= 3.2"
     end
   end
 end
