-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SDTEST-523] Expand test impact analysis with allocation tracing #197
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #197 +/- ##
==========================================
- Coverage 98.87% 98.86% -0.01%
==========================================
Files 231 235 +4
Lines 10368 10477 +109
Branches 475 481 +6
==========================================
+ Hits 10251 10358 +107
- Misses 117 119 +2 ☔ View full report in Codecov by Sentry. |
…source location multiple times; test that allocations between tests are not attributed to a test
…sts. Fix them by safely getting source location with suppressing exceptions.
6e18534
to
b78f095
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few notes! I didn't do a pass very deep into the get_source_location
part (ran out of time -- I can do it in the next pass :D) but hopefully this helps
… in st.c instead and process them once when test ends
just tested this branch as of the latest commit (0b1618c) and verified the segfault is gone and the test impact analysis is working as expected for models. |
Thank you for testing and your feedback @devinburnette! I'll do a couple more passes on Monday and will release soon |
…object_allocation_tracepoint to nil instead
@ivoanjo this is ready for another pass - seems to be working now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left another round of suggestions, but I think overall my comments fall in the "extra things" bucket, and this PR seems reasonable even as-is.
ext/datadog_cov/datadog_cov.c
Outdated
enum ruby_value_type type = rb_type(new_object); | ||
if (type != RUBY_T_OBJECT && type != RUBY_T_STRUCT) | ||
{ | ||
return; | ||
} | ||
|
||
// if ignored_path is provided and the current filename is located under the ignored_path, we skip it too | ||
// this is useful for ignoring bundled gems location | ||
if (dd_cov_data->ignored_path_len != 0 && strncmp(dd_cov_data->ignored_path, filename_ptr, dd_cov_data->ignored_path_len) == 0) | ||
VALUE klass = rb_class_of(new_object); | ||
if (klass == Qnil || klass == 0) | ||
{ | ||
return; | ||
} | ||
// Skip anonymous classes starting with "#<Class". | ||
// it allows us to skip the source location lookup that will always fail | ||
const char *name = rb_obj_classname(new_object); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a follow-up, on the discussion of what we can do in the new object tracepoint, in some cases rb_obj_classname
will definitely cause new objects to be allocated.
Thus doing this check here may be not be safe (although arguably we've been doing it in the profiler and I've never seen issues...) and you may want to delay it perhaps.
The good news is maybe there's a few, even better options. I was looking what rb_obj_classname
does, and maybe there's an alternative that is even better for our purposes because it doesn't even need to allocate the string to represent the anonymous class -- I'm thinking of rb_mod_name
or some of the other ones that exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your rb_mod_name
suggestion - it returns Qnil
if klass is anonymous, so it works for filtering out anonymous classes, and works a bit faster
// Get source location for a given class name | ||
static VALUE get_source_location(VALUE klass_name) | ||
{ | ||
return rb_funcall(rb_cObject, rb_intern("const_source_location"), 1, klass_name); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So on the topic of performance, it occurs to me that if you're starting and stopping coverage on every individual test case, you may be redoing these kinds of lookups again and again.
Maybe not for this PR, but perhaps it's worth considering having some kind of a cache that would live across coverage start/stops? (Of course with proper sizing, etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, interesting thing that I tried two kind of caches:
- hashtable (st.c) that tracks source file for every class
- hashtable that for every class stores boolean value indicating whether the source file is from inside project or from external gem
Every combination of these 2 approaches ether do not change overhead for rubocop test source or makes it on about 4% slower (on average)! I did not investigate it further so I don't know yet why intuition does not work in this case: maybe if I have time I'll do another pass.
For now, I am leaving it as is without additional caches as I believe that every optimisation must be backed by solid data, otherwise it will be just another source of bugs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable 👍
…ses_table is NULL
Problem statement
Using line coverage for test impact analysis has a major limitation in Ruby: consider the following examples:
The test "instantiate MyClass" does not cover MyClass because there are no executable lines in MyClass. If initializer was inherited from OtherClass, then this test will have
other_class.rb
in the list of covered files but notmy_class.rb
.This leads to a major intelligent test runner bug: if we change initializer of MyClass like that:
then the test above will start failing because
MyClass.new
expects argument now. But becausemy_class.rb
is not covered by this test, intelligent test runner will skip test by default! It causes broken tests to be merged in the default branch.If this example might seem artificial, unfortunately the same happens with ActiveRecord models or with ActiveModel classes:
Solution
We cannot overcome this limitation by using line coverage: the code coverage approach works correctly in this case and this is just how line coverage works. We need to go deeper in Ruby VM tracing using techniques that are already used by continuous profiler.
For this limitation, I've chosen to reach out for heap allocation tracepoint: it is possible to spy on every new object allocation that happens in Ruby heap. Even if no code from this class is executed during the test, it is enough for us to know that the test instantiates instances of this class to add its filename to the list of impacted files.
Notes on implementation:
RUBY_INTERNAL_EVENT_NEWOBJ
event type is usedrb_tracepoint_new
to registerRUBY_INTERNAL_EVENT_NEWOBJ
tracepointModule.const_source_location(klass_name)
Ruby API is used to get the source of a constant (every class name is a constant). This API is available from Ruby 2.7 - this is exactly the oldest Ruby version supported for test visibility product. We use this API from C withrb_funcall(rb_cObject, rb_intern("const_source_location"), 1, klass_name)
rb_protect
to ignore exceptions when getting source code location of a class (it fails for many internal classes)#<Class:0ff0eabcde>
- we explicitly ignore them because getting source location for these classes always failsKnown limitations
before
hook or during the test). If the test suite has some models cached in global state and shared between tests, and these models don't have any methods implemented on them, the test impact analysis will still miss them. We need to change docs on Intelligent test runner in Ruby reiterating that global state is harmful and can cause flakiness and incompatibility with ITR.rb_tracepoint_new
cannot be attached to a specific thread, so the allocation tracing is enabled for multi threaded coverage mode only (this mode is the default one and the only one that can work for Rails, so it is not a major problem)How to test the change?
Tested using by running test suites of the following open source projects:
Unit tests reproducing the original problem are provided. See performance evaluation below.
Performance evaluation
Median performance overhead for some OSS projects' test suites before this change:
Results from benchmarks after this change:
Overall this change increases code coverage overhead on test suites by 30-40% (compared to overhead before the change) in relative numbers. In absolute numbers depending on project's size and characteristics it means from 7% up to 30% more time spent in tests (the maximum overhead is from rubocop which is particularly challenging for profiling: 21k relatively fast tests).