-
Notifications
You must be signed in to change notification settings - Fork 549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Passenger resets SIGPROF signal handlers causing profilers to crash application #2489
Comments
This seems fairly reasonable, if you submit a PR, please be sure to sign the contributor agreement. |
Sounds good, I should have a PR for this within the next days! (Travelling a bit right now!) |
ivoanjo
added a commit
to DataDog/passenger
that referenced
this issue
Aug 31, 2023
**What does this PR do?** This PR tweaks the list of trappable signals (`Signal.list_trappable` in `ruby_core_enhancements.rb`) to consider `SIGPROF` a non-trappable signal. This fixes the issue reported in phusion#2489 where a profiler (or other application) installed a `SIGPROF` handler, and then the handler was reset and thus the application would crash upon receiving it. **How to test the change?** The following `config.ru` reproducer is able to show the problem: ``` puts "Installing SIGPROF signal handler!" Signal.trap("PROF") do puts " !! Got sigprof!" end puts "Testing signal handler!" Process.kill("PROF", Process.pid) puts "Running rack app!" app = ->(env) { puts " ** Got request!" if env['REQUEST_URI'].start_with?('/signal') puts "Sending signal to myself!" Process.kill("PROF", Process.pid) end [200, {}, ['Hello, World!']] } run app ``` Before this change: ``` $ bundle exec passenger start =============== Phusion Passenger(R) Standalone web server started =============== PID file: passenger.3000.pid Log file: passenger.3000.log Environment: development Accessible via: http://0.0.0.0:3000/ You can stop Phusion Passenger(R) Standalone by pressing Ctrl-C. Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/ =============================================================================== [ N 2023-07-25 10:13:12.5640 204259/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours) App 204333 output: Installing SIGPROF signal handler! App 204333 output: Testing signal handler! App 204333 output: !! Got sigprof! App 204333 output: Running rack app! App 204368 output: ** Got request! # <-- Regular request App 204368 output: ** Got request! # <-- I hit /signal App 204368 output: Sending signal to myself! [ W 2023-07-25 10:13:27.9117 204259/Te age/Cor/Con/InternalUtils.cpp:96 ]: [Client 4-1] Sending 502 response: application did not send a complete response [ W 2023-07-25 10:13:30.0219 204259/T3 age/Cor/App/Poo/AnalyticsCollection.cpp:101 ]: Process (pid=204368, group=example (development)) no longer exists! Detaching it from the pool. [ N 2023-07-25 10:13:30.0220 204259/T3 age/Cor/CoreMain.cpp:1146 ]: Checking whether to disconnect long-running connections for process 204368, application example (development) ``` (Every request fails because the worker crashes, and then gets replaced with another one that crashes when it gets the next request, and so on) After this change: ``` $ bundle exec passenger start =============== Phusion Passenger(R) Standalone web server started =============== PID file: passenger.3000.pid Log file: passenger.3000.log Environment: development Accessible via: http://0.0.0.0:3000/ You can stop Phusion Passenger(R) Standalone by pressing Ctrl-C. Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/ =============================================================================== [ N 2023-08-25 14:39:17.6888 540283/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours) App 540356 output: Installing SIGPROF signal handler! App 540356 output: Testing signal handler! App 540356 output: !! Got sigprof! App 540356 output: Running rack app! App 540391 output: ** Got request! App 540391 output: ** Got request! App 540391 output: Sending signal to myself! # <-- No issues in request! App 540391 output: !! Got sigprof! App 540391 output: ** Got request! App 540391 output: Sending signal to myself! ! <-- Works! :) App 540391 output: !! Got sigprof! ``` I've also checked with the dd-trace-rb, it no longer needs a workaround when run under passenger (this workaround reduces the accuracy of profiling data, hence my interest in not having it). Once this fix gets released, I plan to add newer versions of passenger to the allow-list, so dd-trace-rb will not apply the workaround for them. Fixes phusion#2489
CamJN
pushed a commit
that referenced
this issue
Sep 11, 2023
**What does this PR do?** This PR tweaks the list of trappable signals (`Signal.list_trappable` in `ruby_core_enhancements.rb`) to consider `SIGPROF` a non-trappable signal. This fixes the issue reported in #2489 where a profiler (or other application) installed a `SIGPROF` handler, and then the handler was reset and thus the application would crash upon receiving it. **How to test the change?** The following `config.ru` reproducer is able to show the problem: ``` puts "Installing SIGPROF signal handler!" Signal.trap("PROF") do puts " !! Got sigprof!" end puts "Testing signal handler!" Process.kill("PROF", Process.pid) puts "Running rack app!" app = ->(env) { puts " ** Got request!" if env['REQUEST_URI'].start_with?('/signal') puts "Sending signal to myself!" Process.kill("PROF", Process.pid) end [200, {}, ['Hello, World!']] } run app ``` Before this change: ``` $ bundle exec passenger start =============== Phusion Passenger(R) Standalone web server started =============== PID file: passenger.3000.pid Log file: passenger.3000.log Environment: development Accessible via: http://0.0.0.0:3000/ You can stop Phusion Passenger(R) Standalone by pressing Ctrl-C. Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/ =============================================================================== [ N 2023-07-25 10:13:12.5640 204259/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours) App 204333 output: Installing SIGPROF signal handler! App 204333 output: Testing signal handler! App 204333 output: !! Got sigprof! App 204333 output: Running rack app! App 204368 output: ** Got request! # <-- Regular request App 204368 output: ** Got request! # <-- I hit /signal App 204368 output: Sending signal to myself! [ W 2023-07-25 10:13:27.9117 204259/Te age/Cor/Con/InternalUtils.cpp:96 ]: [Client 4-1] Sending 502 response: application did not send a complete response [ W 2023-07-25 10:13:30.0219 204259/T3 age/Cor/App/Poo/AnalyticsCollection.cpp:101 ]: Process (pid=204368, group=example (development)) no longer exists! Detaching it from the pool. [ N 2023-07-25 10:13:30.0220 204259/T3 age/Cor/CoreMain.cpp:1146 ]: Checking whether to disconnect long-running connections for process 204368, application example (development) ``` (Every request fails because the worker crashes, and then gets replaced with another one that crashes when it gets the next request, and so on) After this change: ``` $ bundle exec passenger start =============== Phusion Passenger(R) Standalone web server started =============== PID file: passenger.3000.pid Log file: passenger.3000.log Environment: development Accessible via: http://0.0.0.0:3000/ You can stop Phusion Passenger(R) Standalone by pressing Ctrl-C. Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/ =============================================================================== [ N 2023-08-25 14:39:17.6888 540283/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours) App 540356 output: Installing SIGPROF signal handler! App 540356 output: Testing signal handler! App 540356 output: !! Got sigprof! App 540356 output: Running rack app! App 540391 output: ** Got request! App 540391 output: ** Got request! App 540391 output: Sending signal to myself! # <-- No issues in request! App 540391 output: !! Got sigprof! App 540391 output: ** Got request! App 540391 output: Sending signal to myself! ! <-- Works! :) App 540391 output: !! Got sigprof! ``` I've also checked with the dd-trace-rb, it no longer needs a workaround when run under passenger (this workaround reduces the accuracy of profiling data, hence my interest in not having it). Once this fix gets released, I plan to add newer versions of passenger to the allow-list, so dd-trace-rb will not apply the workaround for them. Fixes #2489
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Issue report
Question 1: What is the problem?
Hey there 👋! I work at @DataDog on our continuous profiler for Ruby. We've had a couple of customers report issues when our profiler is used together with passenger.
TL;DR when Passenger starts it resets a bunch of signal handlers:
passenger/src/ruby_supportlib/phusion_passenger/request_handler.rb
Lines 329 to 345 in 3e0794a
...including the
SIGPROF
signal handler (which is returned bySignal.list_trappable
).This breaks profiling Ruby apps because the following happens:
SIGPROF
signal handler used for data collection.SIGPROF
signal, causing the Ruby app to crash because it no longer has a signal handler for this signal.To add a bit more context, using
SIGPROF
for profiling is a common approach, not just something weird we do at Datadog:I suspect the solution would be to extend the existing
list_trappable
with aresult.delete("PROF")
so thatSIGPROF
signal handlers remain intact.Applications that have a
SIGPROF
signal handler installed and then receive aSIGPROF
signal correctly handle it, rather than crashing.These applications crash (as in, Ruby process goes poof)
Here's an example pure-Ruby reproducer
config.ru
:Here's how it looks in puma:
and in passenger:
Question 2: Passenger version and integration mode:
6.0.18
(Latest version at time of writing)Question 3: OS or Linux distro, platform (including version):
Question 4: Passenger installation method:
[x] RubyGems + Gemfile
Question 5: Your app's programming language (including any version managers) and framework (including versions):
Question 6: Are you using a PaaS and/or containerization? If so which one?
Can be reproduced locally.
Question 7: Anything else about your setup that we should know?
Happy to submit a PR :)
We strive for quality and appreciate you taking the time to submit a report! Please note that if you want guaranteed response times and priority issue support we encourage you to join our enterprise customer base. They also provide us with the means to continue our high level of open source support!
The text was updated successfully, but these errors were encountered: