-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deadlock error since version 1.14.0 #3385
Comments
to add some details here, We maintain a ruby app that uses Ougai for structured logging. We have followed your instructions for adding trace correlation to our structured logging. So we are adding Datadog trace correlation data to the Ougai logger, and we are configuring Datadog to use the Ougai logger as its logger. It seems that something changed in 1.14.0 to make that no longer supported. Is this a bug in this gem, or is there something we need to change about our usage of it? |
Hey! This is definitely an interesting one. The ddtrace logger can get called during initialization, and by chaining it back to ddtrace, it hits this issue. 😅 I'm guessing even if we didn't have locking, we might've had some recursion on our hands... 🤔 What may be happening here is that something is trying to log information in 1.19 that didn't before in 1.14. To investigate this further, do you mind sharing the full backtrace from the deadlock? The backtrace you shared unfortunately doesn't show what thread deadlocked and in what part of the code it was when it tried to log. |
@ivoanjo take a look at this: https://gist.github.com/benhutton/c28099f46d6a582c3c9e567dc481d32e That's a backtrace I got when trying to book up a console on heroku. I think the "real" error is the bottom one, then it triggered the ones above it. Also, to be clear, 1.13.1 is the last version that work. Whatever broke it broke it in 1.14.0. |
Ahh, it's the environment logger... that would fit with the change in #3020 that we merged for 1.14.0. That information used to be logged later in the gem life cycle, and we moved it to earlier. It's possible to disable this behavior by setting the Datadog.configure { |c| c.diagnostics.startup_logs.enabled = false } |
That solution gets us going. What's the long term fix for this scenario? |
Great! Let me bring this back to the team and see what we can do to avoid this sharp edge. |
**What does this PR do?** This PR fixes #3385 by preventing components initialization when calling the `Datadog::Tracing.correlation` API. Instead, we use the `allow_initialization: false` option, which was added exactly to help break these kinds of initialization cycles. **Motivation:** This fixes a deadlock issue such as: ``` lib/datadog/core/configuration.rb:228:in `synchronize': deadlock; recursive locking (ThreadError) from lib/datadog/core/configuration.rb:228:in `safely_synchronize' from lib/datadog/core/configuration.rb:195:in `components' from lib/datadog/tracing.rb:134:in `components' from lib/datadog/tracing.rb:138:in `tracer' from lib/datadog/tracing.rb:76:in `correlation' from repro.rb:5:in `add' from logger-1.6.0/lib/logger.rb:691:in `debug' from lib/datadog/core/remote/component.rb:29:in `initialize' from lib/datadog/core/remote/component.rb:156:in `new' from lib/datadog/core/remote/component.rb:156:in `build' from lib/datadog/core/configuration/components.rb:90:in `initialize' from datadog-ci-0.7.0/lib/datadog/ci/configuration/components.rb:31:in `initialize' from lib/datadog/core/configuration.rb:248:in `new' from lib/datadog/core/configuration.rb:248:in `build_components' from lib/datadog/core/configuration.rb:89:in `block in configure' from lib/datadog/core/configuration.rb:230:in `block in safely_synchronize' from lib/datadog/core/configuration.rb:228:in `synchronize' from lib/datadog/core/configuration.rb:228:in `safely_synchronize' from lib/datadog/core/configuration.rb:84:in `configure' from lib/datadog/tracing/contrib/extensions.rb:64:in `configure' from repro.rb:9:in `<main>' ``` that happens when a logger is configured to also print correlation **Additional Notes:** While it may seem that I've changed the API semantics by returning an `Identifier` when there is no tracer, the previous code was actually misleading. There is actually no way (that I know of) for `tracer` to be `nil` on dd-trace-rb currently. It can be disabled, but an instance will still exist. The disabled instance, when called, returns an empty `Identifier`, so here I'm replicating that behavior, without needing the components to be initialized. I guess this is open for debate? We could return `nil` instead in the future. **How to test the change?** This tiny snippet can be used to simulate the exact logger issue reported by the customer: ```ruby require 'ddtrace' class TestLogger < Datadog::Core::Logger def add(severity, message = nil, progname = nil, &block) puts Datadog::Tracing.correlation end end Datadog.configure do |c| c.logger.instance = TestLogger.new(STDOUT) end ``` Fixes #3385
PR to fix this #3426 :) |
Thank you. |
👋 @joeosburn , |
Success! Works great. |
I've been using DataDog for several years without issue on a ruby project. Since version 1.14.0, up to the most recent version, 1.19.0, I've run into a deadlocking issue related to the logger configuration I use.
I now get this error:
It appears to be related to the interaction of DataDog within my logger code.
Here is what my DataDog configuration looks like:
If I comment out the
c.logger.instance =
line, the error goes away.In my
Api.logger
code, I have this code:This is because my logger instance needs tracing data from DataDog. Again, if I comment this code out, the error goes away.
This was working fine until version 1.14.0 and then something changed. I am currently on Ruby 3.2.2
The text was updated successfully, but these errors were encountered: