Segmentation fault when application stops (Ruby 3.2.2, tcmalloc or jemalloc) #292

ukolovda · 2023-09-01T07:43:19Z

A segfault occurred when I stopped the Docker container.

I'm not sure this is a mini_racer problem, but the call stack is in it.

Dump is in the attachment.

tisba · 2023-11-05T19:25:36Z

Can you provide us with some more information? Most importantly: Can you reproduce it? And if so, can you provide a minimal example?

jasoncodes · 2023-11-29T01:53:53Z

Here’s a small Dockerfile which reproduces this issue on both amd64 and arm64:

FROM ruby:3.2.2-slim-bookworm

RUN \
  apt-get update --yes && \
  DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    build-essential libjemalloc2 && \
  ln -s /usr/lib/*-linux-gnu/libjemalloc.so.2 /usr/lib/libjemalloc.so.2
ENV LD_PRELOAD=/usr/lib/libjemalloc.so.2

RUN gem install mini_racer

CMD ruby -r mini_racer -e 'p MiniRacer::Context.new.eval("1+2")'

I did some testing with other base images and it seems like ruby:3.1-slim-bullseye with libjemalloc2 5.2.1-3 works fine but ruby:3.1-slim-bookworm with 5.3.0-1 does not. Installing the older jemalloc into Bookworm did not work so presumably it must be something else other than the jemalloc version?

tisba · 2023-11-29T08:57:58Z

ah, jemalloc 😞

We had several Ruby and mini_racer issues with jemalloc in the past, e.g. #242.

I think that might be something @lloeki or @SamSaffron can take a look at?

ukolovda · 2023-11-29T09:16:02Z

I use jemalloc too...

@jasoncodes , thank you!

lloeki · 2023-11-29T16:33:53Z

I'm wondering whether it would get confused at times as to which malloc/free symbol is which depending on how the ruby jemalloc linking happens (I think glibc is always dynamic, dunno about musl).

https://github.com/jemalloc/jemalloc/wiki/Getting-Started

lloeki · 2023-11-29T16:37:35Z

Oh wait, I didn't see that:

ENV LD_PRELOAD=/usr/lib/libjemalloc.so.2

So you're injecting jemalloc, not linking Ruby against jemalloc at Ruby build time? Do you get the same result with a Ruby build against jemalloc?

jasoncodes · 2023-12-01T02:57:23Z

> Do you get the same result with a Ruby build against jemalloc?

Good question. It looks to work fine if Ruby is built with jemalloc instead of having it injected.

I tested this using the first Ruby 3.2.2 w/ jemalloc base image I found which I could confirm has a working jemalloc with MALLOC_CONF=stats_print:true ruby -e exit:

FROM jiting/ruby:3.2.2-slim-bookworm
RUN apt-get update && apt-get -y install build-essential
RUN gem install mini_racer
CMD ruby -r mini_racer -e 'p MiniRacer::Context.new.eval("1+2")'

lloeki · 2023-12-01T08:04:00Z

Okay so I think what might happen is:

1. libv8 is built against glibc and links to it dynamically.
2. This usually resolves the dynamic malloc+free symbols to glibc.
3. When LD_PRELOAD is used to hook jemalloc in, it applies to Ruby, but it also applies to anything else that dynamically links to libc's malloc+free (which would be any Ruby extension linked against glibc).
4. We have mini_racer_loader, which binds symbols with RTLD_LOCAL and RTLD_DEEPBIND when dlopening mini_racer_extension. This is supposed to hide mini_racer_extension's symbols from the outside and make it prefer its own internal symbols, but there is no malloc+free inside, so they resolve to glibc's malloc+free.
5. Even if there were malloc+free internal to mini_racer_extension, depending on how linking is done inside the mini_racer_extension shared library for these symbols (e.g. malloc+free are internal but dynamic symbols), LD_PRELOAD might take precedence over RTLD_DEEPBIND. Not the case here, but we can entertain the thought.
6. Thus, with LD_PRELOAD, libv8 will use jemalloc instead of glibc's malloc+free.
7. libv8 does not like that; hilarity ensues.

When Ruby is built statically against jemalloc, the malloc+free symbols are entirely static (and might even be elided entirely, replaced by pure addresses), in which case they are unavailable to extensions. Thus Ruby can do its thing with jemalloc and libv8 its thing with glibc, and since ownership is split (nothing allocated by one is ever freed by the other) they both live happily ever after.

The remaining question is why libv8 would not like having its allocator swapped with jemalloc via LD_PRELOAD but I guess that's an upstream libv8 question.
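
To check which allocator libraries actually end up mapped into a given Ruby process, something like the following might help. It is only a rough sketch for Linux: `mapped_allocators` and `ALLOCATOR_PATTERN` are made-up names, it only shows that a shared library is mapped (not which `malloc` each symbol lookup resolves to), and a Ruby statically built against jemalloc would not show up here at all.

```ruby
# Hypothetical helper (not part of mini_racer): list allocator libraries that
# appear in the text of /proc/<pid>/maps. Linux only.
ALLOCATOR_PATTERN = /jemalloc|tcmalloc/

def mapped_allocators(maps_text)
  maps_text.each_line
           .map { |line| line.split(" ", 6)[5] } # sixth column is the pathname, if any
           .compact
           .map(&:strip)
           .select { |path| path.match?(ALLOCATOR_PATTERN) }
           .uniq
end

# Inspect the current process when /proc is available.
if File.readable?("/proc/self/maps")
  found = mapped_allocators(File.read("/proc/self/maps"))
  puts found.empty? ? "no jemalloc/tcmalloc mapping found" : found.join("\n")
end
```

Running this once with and once without LD_PRELOAD set should make the difference between the two setups visible.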

jasoncodes · 2023-12-13T04:12:11Z

This morning I came across docker-library/ruby#182 (comment) which suggests using patchelf instead of LD_PRELOAD:

FROM ruby:3.2.2-slim-bookworm

RUN \
  apt-get update --yes && \
  DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    build-essential libjemalloc2 patchelf && \
  patchelf --add-needed libjemalloc.so.2 /usr/local/bin/ruby

RUN gem install mini_racer

CMD ruby -r mini_racer -e 'p MiniRacer::Context.new.eval("1+2")'

This seems like a nicer way to patch an existing Ruby as it avoids affecting all processes. Unfortunately it doesn’t solve the problem here as dynamic linking still results in libv8 using jemalloc and crashing on exit. I’m guessing within the single Ruby process, patchelf and LD_PRELOAD are effectively the same thing.

Only option right now to use mini_racer with jemalloc seems to be building Ruby with jemalloc.

tisba · 2023-12-13T10:56:36Z

> Only option right now to use mini_racer with jemalloc seems to be building Ruby with jemalloc.

At least it seems to be more reliable. I've been running with LD_PRELOAD in production for a while and haven't had an issue so far; I'm still using ruby:3.2.2-bullseye (docker) though.

Since this keeps popping up over time, I'm wondering if we should add something to the troubleshooting section of the README recommending not to use jemalloc for now, or to use a Ruby statically built against it. What do you think, @lloeki?

jasoncodes · 2024-02-10T00:40:02Z

FYI Rails has merged a PR to LD_PRELOAD jemalloc in their default Dockerfile template: rails/rails#50943.

In lieu of something in the README, may I suggest renaming this Issue to mention jemalloc?

SamSaffron · 2024-08-14T04:12:22Z

@lloeki maybe we simply try this?

#309

no loader if LD_PRELOAD is set to *malloc.

LD_PRELOAD can potentially cause issues with the "symbols hidden" loader for mini_racer. This will unconditionally skip the loader when LD_PRELOAD is specified and points to *malloc (e.g. jemalloc or tcmalloc). See: #292 (comment). (Also removes minitest/pride since it is no longer loading in CI.)
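
For anyone skimming, the rule described above might look roughly like this in Ruby. This is only a sketch of the idea, not the code from #309: `malloc_preloaded?` and `extension_entry_point` are made-up names, and the substring check on "malloc" is my reading of "points to *malloc".

```ruby
# Hypothetical helper: treat LD_PRELOAD as overriding the allocator when its
# value contains the substring "malloc" (jemalloc, tcmalloc, ...).
def malloc_preloaded?(env = ENV)
  env["LD_PRELOAD"].to_s.include?("malloc")
end

# Illustrative only: name the file that would be required in each case.
# With an allocator preloaded, the loader (and its symbol hiding) is skipped.
def extension_entry_point(env = ENV)
  malloc_preloaded?(env) ? "mini_racer_extension" : "mini_racer_loader"
end

puts extension_entry_point({ "LD_PRELOAD" => "/usr/lib/libjemalloc.so.2" })
puts extension_entry_point({})
```

The trade-off is that a process with a preloaded allocator gives up the symbol isolation the loader normally provides.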

SamSaffron · 2024-08-14T04:40:22Z

Note I have confirmation from @tgxworld that it resolved the issue for us, I went ahead and merged and pushed a release.

I will follow up with a simple run of CI with jemalloc so this does not regress.

nightpool · 2024-08-14T04:45:57Z

> In lieu of something in the README, may I suggest renaming this Issue to mention jemalloc?

I don't really understand the loaders or dynamic linking enough to comment on the meat of this issue, but this suggestion seems sensible to me to help anyone trying to figure out what's crashing in the future ^

SamSaffron · 2024-08-14T04:54:35Z

that is easy enough @nightpool ... edited the title.

The fix feels fine; few of the memory allocators people use in production lack the word malloc in their name.

tisba · 2024-08-14T05:17:28Z

> will follow up with a simple run of CI with jemalloc so this does not regress

Do you have a stable case to reproduce, @SamSaffron? I tried several times for a while, but either issues were Ruby related and have been fixed upstream, or I could not reliably reproduce it.

SamSaffron · 2024-08-14T06:06:13Z

I am not sure how easy it is to get a repro; a lot of issues of this style tend to require a rather hefty application behind them.

For example, #300 is very hard to repro. The only reliable repro I have is to spin up an entire Discourse application and run stuff on it. When I fish out all the evals mini_racer makes, we no longer get a segfault.

I suspect this one is similar: we would probably need a reasonably heavy Ruby process with lots of allocations, and then to spin it down.

At a minimum though, we can validate that require mini_racer_extension continues to work over time. People who use LD_PRELOAD do not get the symbol isolation (and can no longer run multiple versions of v8 under a single Ruby process) ... we can document this in the README.
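
A CI smoke test along these lines could be sketched as below. It is an assumption-heavy sketch: `run_and_classify` is a made-up helper, the jemalloc path is simply the one from the Dockerfiles earlier in this thread, and a tiny script like this may well not trigger the crash that a heavy application does.

```ruby
require "rbconfig"

# Hypothetical helper: run a command with extra environment variables and
# report how it ended. A crash on shutdown shows up as a signal, not an exit code.
def run_and_classify(env, *cmd)
  system(env, *cmd, out: File::NULL, err: File::NULL)
  status = $?
  return :crashed if status.signaled?
  status.success? ? :ok : :failed
end

# Path taken from the Dockerfiles in this thread; adjust for your image.
preload = "/usr/lib/libjemalloc.so.2"
script  = 'MiniRacer::Context.new.eval("1+2") == 3 or exit 1'

if File.exist?(preload)
  result = run_and_classify({ "LD_PRELOAD" => preload },
                            RbConfig.ruby, "-r", "mini_racer", "-e", script)
  puts "mini_racer under LD_PRELOAD: #{result}"
end
```

Running the check in a child process matters here, since the segfault in this issue happens while the process is stopping and would otherwise take the test runner down with it.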

tisba added the bug label Nov 29, 2023

tisba added the bug/crash and jemalloc labels May 6, 2024

pascalbetz mentioned this issue May 22, 2024

Ruby with jemalloc docker-library/ruby#182

Open

SamSaffron mentioned this issue Aug 14, 2024

FIX: skip mini_racer_loader when malloc is overridden #309

Merged

SamSaffron changed the title ~~Segmentation fault when application stops (Ruby 3.2.2,~~ Segmentation fault when application stops (Ruby 3.2.2, tcmalloc or jemalloc) Aug 14, 2024

SamSaffron closed this as completed Aug 14, 2024
