-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PROF-9926] Fix rpath for linking to libdatadog when loading from extension dir (cherry-pick from 1.x-stable) #3706
Merged
ivoanjo
merged 2 commits into
master
from
ivoanjo/extend-relative-rpath-extensions-folder-master
Jun 12, 2024
Merged
[PROF-9926] Fix rpath for linking to libdatadog when loading from extension dir (cherry-pick from 1.x-stable) #3706
ivoanjo
merged 2 commits into
master
from
ivoanjo/extend-relative-rpath-extensions-folder-master
Jun 12, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…ension dir **What does this PR do?** This PR is a follow-up to #3582 . In that PR, we fixed loading the profiling native extension so that it could be loaded from the Ruby extensions directory (see the original PR for more details). It turns out this was not enough! Specifically, the customer reported that they saw the following error > Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling > native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux > due to libdatadog_profiling.so: cannot open shared object file: No such file or directory Specifically, what this message tells is that we're finding the profiling native extension BUT it's failing to load BECAUSE the dynamic loader is not able to find its `libdatadog_profiling.so` dependency. From debugging the issue with the customer, I suspect that what we're seeing here is a repeat of #2067 / #2125 , that is, the paths where the profiler is compiled are changed at deployment, and so we also need to adjust the relative rpath to account for this. I haven't yet confirmed with the customer that this is their issue, BUT I was able to reproduce the exact problem if I moved the installation of the library in the way I mention above (see "how to test the change", below). **Motivation:** Fix this weird corner case that made the profiler not load. **Additional Notes:** This is a really really weird corner case, so I'm happy to further describe what the issue is if my description above + the comments in the code are still too cryptic to understand. **How to test the change?** I've added test code for the helper, but actually validating the whole rpath thing is a bit annoying. Here's how I triggered the issue myself, and then used it to validate the fix: ``` # Build fixed gem into folder, will be used later $ bundle exec rake build datadog 2.0.0.rc1 built to pkg/datadog-2.0.0.rc1.gem. # Open a clean Ruby docker installation $ docker run --network=host -ti -v `pwd`:/working ruby:3.2.2-bookworm /bin/bash # I've created a minimal test gemfile ahead of time /working/rpathtest# cat gems.rb source 'https://rubygems.org' gem 'datadog' # Tell bundler to install the gem into a folder /working/rpathtest# bundle config set --local path 'vendor/bundle' /working/rpathtest# bundle install # Confirm profiler works: /working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1" # ... No errors loading profiler ... # Now let's simulate the native extension being loaded from the # extensions directory: /working/rpathtest# find | grep \.so$ | grep datadog ./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so ./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_loader.3.2.2_x86_64-linux.so ./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so ./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux-musl/libdatadog-x86_64-alpine-linux-musl/lib/libdatadog_profiling.so ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so /working/rpathtest# rm ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so # Confirm profiler still works: /working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1" # ... No errors loading profiler ... # Now let's simulate the folders being moved (the issue being fixed): /working/rpathtest# cat /usr/local/bundle/config --- BUNDLE_PATH: "vendor/bundle" # Update this to vendor2... working/rpathtest# cat /usr/local/bundle/config --- BUNDLE_PATH: "vendor2/bundle" # and move the folder /working/rpathtest# mv vendor/ vendor2 # Now we've triggered the exact same error message as reported by the # customer /working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1" W, [2024-06-05T15:51:12.488843 #517] WARN -- datadog: [datadog] Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux due to libdatadog_profiling.so: cannot open shared object file: No such file or directory' at '/working/rpathtest/vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog/profiling/load_native_extension.rb:41:in `<top (required)>'' # Now let's test the fix. Let's start by recreating the issue: # Put the fixed version into the bundler cache... /working/rpathtest# cp /working/pkg/datadog-2.0.0.rc1.gem vendor2/bundle/ruby/3.2.0/cache/datadog-2.0.0.rc1.gem # force bundler to reinstall... working/rpathtest# rm -rf vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/ working/rpathtest# bundle install # Force gem to be loaded from extension directory /working/rpathtest# rm ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so # Confirm it works: /working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1" # ... No errors loading profiler ... # Let's now change the vendor folder again: /working/rpathtest# cat /usr/local/bundle/config --- BUNDLE_PATH: "vendor3/bundle" /working/rpathtest# mv vendor2/ vendor3 # And it now doesn't fail: /working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1" # ... No errors loading profiler ... # And extra confirmation that the relative paths are working: /working/rpathtest# ldd ./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so libdatadog_profiling.so => /working/rpathtest/./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/../../../../gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so (0x00007ff127c00000) ```
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3706 +/- ##
==========================================
- Coverage 98.11% 98.11% -0.01%
==========================================
Files 1225 1225
Lines 72806 72836 +30
Branches 3482 3485 +3
==========================================
+ Hits 71431 71460 +29
- Misses 1375 1376 +1 ☔ View full report in Codecov by Sentry. |
p-datadog
approved these changes
Jun 12, 2024
ivoanjo
deleted the
ivoanjo/extend-relative-rpath-extensions-folder-master
branch
June 12, 2024 13:31
Merged
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What does this PR do?
This is a cherry-pick of #3683 which was first applied to the 1.x-stable branch. See that PR for details.
The diff is slightly different in the spec file since 1.x-stable needed a few tweaks to support older Rubies, which 2.x doesn't need. Everything else is the same.
Motivation:
Fix the rpath issue.
Additional Notes:
N/A
How to test the change?
See #3683 && and additional integration test for this was added on DataDog/prof-correctness#39 .