-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PROF-9226] Add test for Ruby profiler when extension dir is moved and a relative rpath is needed #39
[PROF-9226] Add test for Ruby profiler when extension dir is moved and a relative rpath is needed #39
Conversation
…d relative rpath is needed **What does this PR do?** This PR adds a new test case that validates that DataDog/dd-trace-rb#3582 and DataDog/dd-trace-rb#3683 keep working fine. **Motivation:** As described in DataDog/dd-trace-rb#3683, this a somewhat annoying thing to test, but important to avoid regressing. **Additional Notes:** You can actually see the evolution of both of those fixes in this test. E.g. here's dd-trace-rb 1.21.0 (prior to DataDog/dd-trace-rb#3582 ): ``` W, [2024-06-12T09:34:08.759519 #7] WARN -- ddtrace: [ddtrace] (/app/vendor-moved/bundle/ruby/3.3.0/gems/ddtrace-1.21.1/lib/datadog/core/configuration/components.rb:115:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.3.2_x86_64-linux due to /app/vendor-moved/bundle/ruby/3.3.0/gems/ddtrace-1.21.1/lib/datadog/profiling/../../datadog_profiling_native_extension.3.3.2_x86_64-linux.so: cannot open shared object file: No such file or directory' at '/app/vendor-moved/bundle/ruby/3.3.0/gems/ddtrace-1.21.1/lib/datadog/profiling/load_native_extension.rb:26:in `<top (required)>'' --- FAIL: TestScenarios/scenarios/ruby_extension_dir_and_rpath (14.86s) ``` in this version, we failed because we couldn't load the native extension. Then here's dd-trace-rb 1.23.1 (without DataDog/dd-trace-rb#3683 ) and if we don't move the `vendor` folder (but still delete the so from the lib folder): ``` --- PASS: TestScenarios/scenarios/ruby_extension_dir_and_rpath (18.96s) ``` ...but if we additionally move the vendor folder (aka what this PR does in the Dockerfile): ``` W, [2024-06-12T09:37:33.517188 #6] WARN -- ddtrace: [ddtrace] (/app/vendor-moved/bundle/ruby/3.3.0/gems/ddtrace-1.23.1/lib/datadog/core/configuration/components.rb:116:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.3.2_x86_64-linux due to libdatadog_profiling.so: cannot open shared object file: No such file or directory' at '/app/vendor-moved/bundle/ruby/3.3.0/gems/ddtrace-1.23.1/lib/datadog/profiling/load_native_extension.rb:39:in `<top (required)>'' --- FAIL: TestScenarios/scenarios/ruby_extension_dir_and_rpath (3.25s) ``` Notice it fails BUT the error is now different from the one above -- the error is relating to loading `libdatadog_profiling.so`, not `datadog_profiling_native_extension.3.3.2_x86_64-linux.so`. And with the change in DataDog/dd-trace-rb#3683 (which will be in 1.23.2): ``` --- PASS: TestScenarios/scenarios/ruby_extension_dir_and_rpath (9.60s) ``` **NOTE**: For this test, unlike other Ruby tests we have, we're pulling in the latest **released** gem version (e.g. with `gem 'datadog'` on the `gems.rb` file), not the latest from git (as we do for other Ruby tests). This is because gems get installed in different paths when bundler downloads them directly from git, and we want to validate the path when a stable version is installed. This also means that this PR will show up as failed until the latest datadog release (which will be 2.2.0) gets released. (Or 1.23.2, but I left the test setup to test the latest 2.x releases, not the 1.x ones, although I used 1.x on my tests above to show the evolution of the issue).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I'll wait until release 2.2.0 goes out with the fix to merge this, which may take a bit longer because of the DASH freeze. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for adding this in!
We just want to do a sanity check on data, so let's shorten the test duration.
As I was preparing to merge #39 I noticed that CI had not failed for Ruby prior to version 2.2.0 of the Ruby library being released. (Specifically: it had failed, but not for Ruby, which is why I had not spotted it before). The whole point of this test was that it was _supposed_ to be failing prior to 2.2.0 being released. Under closer examination, what happened was that I had tested with the 1.x branch prior to opening the PR (and all my examples in the PR description are from 1.x), and of course I missed that the test setup was not correct for triggering the issue on 2.x due to the library getting renamed. I've now fixed this, confirmed it's broken with versions < 2.2.0 of the Ruby library, and confirmed it passes for the right reason on >= 2.2.0. This PR is thus ready to be merged :)
As I was preparing to merge #39 I noticed that CI had not failed for Ruby prior to version 2.2.0 of the Ruby library being released. (Specifically: it had failed, but not for Ruby, which is why I had not spotted it before). The whole point of this test was that it was supposed to be failing prior to 2.2.0 being released. Under closer examination, what happened was that I had tested with the 1.x branch prior to opening the PR (and all my examples in the PR description are from 1.x), and of course I missed that the test setup was not correct for triggering the issue on 2.x due to the library getting renamed. I've now fixed this, confirmed it's broken with versions < 2.2.0 of the Ruby library, and confirmed it passes for the right reason on >= 2.2.0. This PR is thus ready to be re-reviewed/merged :) |
Changing the default `test_duration` in the Ruby sources is actually not enough >_>
This whole test is more of a "profiler runs/it doesn't" so let's simplify the assertions on the resulting data as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What does this PR do?
This PR adds a new test case that validates that DataDog/dd-trace-rb#3582 and DataDog/dd-trace-rb#3683 keep working fine.
Motivation:
As described in DataDog/dd-trace-rb#3683, this a somewhat annoying thing to test, but important to avoid regressing.
Additional Notes:
You can actually see the evolution of both of those fixes in this test.
E.g. here's dd-trace-rb 1.21.0 (prior to DataDog/dd-trace-rb#3582 ):
in this version, we failed because we couldn't load the native extension.
Then here's dd-trace-rb 1.23.1 (without DataDog/dd-trace-rb#3683 ) and if we don't move the
vendor
folder (but still delete the so from the lib folder):...but if we additionally move the vendor folder (aka what this PR does in the Dockerfile):
Notice it fails BUT the error is now different from the one above -- the error is relating to loading
libdatadog_profiling.so
, notdatadog_profiling_native_extension.3.3.2_x86_64-linux.so
.And with the change in DataDog/dd-trace-rb#3683 (which will be in 1.23.2 / 2.2.0):
NOTE: For this test, unlike other Ruby tests we have, we're pulling in the latest released gem version (e.g. with
gem 'datadog'
on thegems.rb
file), not the latest from git (as we do for other Ruby tests).This is because gems get installed in different paths when bundler downloads them directly from git, and we want to validate the path when a stable version is installed.
This also means that this PR will show up as failed until the latest datadog release (which will be 2.2.0) gets released. (Or 1.23.2, but I left the test setup to test the latest 2.x releases, not the 1.x ones, although I used 1.x on my tests above to show the evolution of the issue).