Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a build matrix for Ruby with ASAN #10

Merged
merged 1 commit into from
Jun 14, 2024

Conversation

KJTsanaktsidis
Copy link
Contributor

It compiles some of Ruby's key dependencies with ASAN enabled, and statically links them. It builds with Clang 18, which is the minimum required for ASAN to work.

We might want to make ASAN a replacement for debug instead of in addition to it, but I want to see if I can get this to build first and then we can decide.

@KJTsanaktsidis
Copy link
Contributor Author

I don't think this is going to build yet (building with OpenSSL with ASAN enabled causes a test failure on my machine) - @eregon is there a way to run the build on the branch and find out?

@KJTsanaktsidis KJTsanaktsidis force-pushed the ktsanaktsidis/build_with_asan branch 2 times, most recently from dfa5f8f to ea17618 Compare June 1, 2024 11:45
@eregon
Copy link
Member

eregon commented Jun 2, 2024

is there a way to run the build on the branch and find out?

You should be able to run it on your fork, e.g. via workflow_dispatch at https://github.com/KJTsanaktsidis/ruby-dev-builder/actions/workflows/build.yml or just adding a commit to trigger on push

@KJTsanaktsidis KJTsanaktsidis force-pushed the ktsanaktsidis/build_with_asan branch 11 times, most recently from 02d4f4e to 9a41694 Compare June 7, 2024 05:37
@KJTsanaktsidis
Copy link
Contributor Author

@eregon this should be ready for a review now - WDYT?

Green build with this patch: https://github.com/KJTsanaktsidis/ruby-dev-builder-workflow-runs/actions/runs/9411904718

Note I also wanted to compile libffi with ASAN as well, but I need to get libffi/libffi#839 merged upstream first.

@@ -118,6 +120,9 @@ jobs:
with:
ruby-version: 3.2

- run: chmod 755 $HOME # https://github.com/actions/virtual-environments/issues/267
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? The Rubies built by setup-ruby remove this warnings so it shouldn't happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was already there, I was just moving it around. 755 does seem like the wrong permission for $HOME...

I'll try removing it in a separate PR perhaps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, it was not clear in the diff. I think it can be removed, I'll try it.

@eregon
Copy link
Member

eregon commented Jun 7, 2024

Ah so you merged the ASAN stuff with ruby-debug now?

I wonder if it should be separate. The fact ASAN needs to disable M:N threading makes me think it's better to be separate.

It looks like the ASAN stuff doesn't work on macOS? But that should be no problem we can just use matrix include.

One thing which is not ideal is it seems to take ~30min with ASAN vs ~20min before for the CI jobs here.
But I guess there is little we can do about it.
But the build was already taking ~25 min total due to macOS. so it seems no big deal.

.github/workflows/build.yml Outdated Show resolved Hide resolved
.github/workflows/build.yml Outdated Show resolved Hide resolved
.github/workflows/build.yml Outdated Show resolved Hide resolved
run: |
# These flags get applied only to the Ruby build, not to the libs.
# Clang > 17 does not work with M:N threading: https://bugs.ruby-lang.org/issues/20243
echo "cppflags=-DENABLE_PATH_CHECK=0 -DRUBY_DEBUG=1 -DVM_CHECK_MODE=1 -DUSE_MN_THREADS=0" >> $GITHUB_ENV
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some duplication + override and it's far from the previous usage, that seems problematic.
I think it could be addressed by moving the other configure command lines closer to here.

.github/workflows/build.yml Outdated Show resolved Hide resolved
@@ -129,13 +134,80 @@ jobs:
echo "optflags=-O3 -fno-inline" >> $GITHUB_ENV
if: matrix.name == 'debug'

- name: Install Clang 18 (debug asan)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the clang(s) available on the runners not recent enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in Ubuntu 22.04 and 20.04. 24.04's is.

.github/workflows/build.yml Outdated Show resolved Hide resolved
@KJTsanaktsidis
Copy link
Contributor Author

I wonder if it should be separate

I'm happy to do that if you prefer - I don't have a strong opinion on it. But..

The fact ASAN needs to disable M:N threading

This isn't an ASAN thing. It's actually totally broken in Clang > 17 regardless of ASAN being enabled or not. Clang has started optimising out loads from the thread-local storage pointer register (%fs0?) in some circumstances that cause M:N threading to read the wrong rb_execution_context_t. There are some assertions in VM_CHECK_MODE=1 that make it blow up, but I'm sure it can't be doing the right thing even without those assertions.

@KJTsanaktsidis
Copy link
Contributor Author

It looks like the ASAN stuff doesn't work on macOS?

I honestly haven't tried very hard on macOS yet. I was thinking to kind of finish the job with ASAN on linux before I tried to get it properly working on macOS.

It compiles some of Ruby's key dependencies with ASAN enabled, ships
those dynamic libraries to a location in Ruby's prefix (which is part of
the tarball deployed with setup-ruby). -Wl,-rpath is used to make sure
that these ASANified libs are used in preference to any
distribution-provided libs at runtime.

Only Ubuntu 24.04 is supported, because the version of clang in 22.04 is
too old. MacOS is also not yet supported.
@KJTsanaktsidis
Copy link
Contributor Author

@eregon I've simplified it all a little bit and separated the ASAN build out into it's own thing. It's now doing dynamic linking (with RPATH) instead of static as well. WDYT?

Passing build: https://github.com/KJTsanaktsidis/ruby-dev-builder-workflow-runs/actions/runs/9441682848/job/26002778721

# Set Ruby configure flags
# Clang > 17 does not work with M:N threading, so we disable it: https://bugs.ruby-lang.org/issues/20243
echo "cppflags=-DENABLE_PATH_CHECK=0 -DRUBY_DEBUG=1 -DVM_CHECK_MODE=1 -DUSE_MN_THREADS=0" >> $GITHUB_ENV
echo "optflags=-O3 -fno-omit-frame-pointer" >> $GITHUB_ENV
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to set optflags, is -fno-omit-frame-pointer needed?

@@ -118,6 +120,9 @@ jobs:
with:
ruby-version: 3.2

- run: chmod 755 $HOME # https://github.com/actions/virtual-environments/issues/267
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, it was not clear in the diff. I think it can be removed, I'll try it.

@eregon
Copy link
Member

eregon commented Jun 14, 2024

This looks good, I'll merge now. Sorry for the delay.

@eregon eregon merged commit 8cee0b2 into ruby:master Jun 14, 2024
ivoanjo added a commit to DataDog/dd-trace-rb that referenced this pull request Aug 23, 2024
**What does this PR do?**

This PR builds atop the work from #3852 and #3862 to enable running
the profiler test suite using the AddressSanitizer ("asan") tool
(see https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer ).

This check will enable us to find bugs in the profiler that may not
be otherwise caught.

**Motivation:**

Improve validation of the profiler native extension.

**Additional Notes:**

The "asan" tool is built into the clang compiler toolchain, and
needs Ruby to be built with a special configuration.

The special configuration is not yet available in the upstream
`ruby/setup-ruby` github action, so I needed to fork it and push
a small tweak to make it available.

(The Ruby builds were added in
ruby/ruby-dev-builder#10 ).

**How to test the change?**

Here's a passing CI run:
https://github.com/DataDog/dd-trace-rb/actions/runs/10524502494/job/29161364590

I've also tested locally by adding a memory leak (e.g. for instance
commenting out `ddog_Vec_Tag_drop(tags);` in `http_transport.c` and
confirmed the issue was flagged.
ivoanjo added a commit to DataDog/dd-trace-rb that referenced this pull request Aug 23, 2024
**What does this PR do?**

This PR builds atop the work from #3852 and #3862 to enable running
the profiler test suite using the AddressSanitizer ("asan") tool
(see https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer ).

This check will enable us to find bugs in the profiler that may not
be otherwise caught.

**Motivation:**

Improve validation of the profiler native extension.

**Additional Notes:**

The "asan" tool is built into the clang compiler toolchain, and
needs Ruby to be built with a special configuration.

The special configuration is not yet available in the upstream
`ruby/setup-ruby` github action, so I needed to fork it and push
a small tweak to make it available.

(The Ruby builds were added in
ruby/ruby-dev-builder#10 ).

**How to test the change?**

Here's a passing CI run:
https://github.com/DataDog/dd-trace-rb/actions/runs/10524502494/job/29161364590

I've also tested it by adding a memory leak (e.g. for instance
commenting out `ddog_Vec_Tag_drop(tags);` in `http_transport.c` and
confirmed the issue was flagged.

Here's the failing CI run:
https://github.com/DataDog/dd-trace-rb/actions/runs/10524803685/job/29162274392
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants