Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rollbar can quietly ignores errors outside of web processes. #183

Closed
jonah-williams opened this issue Nov 24, 2014 · 17 comments
Closed

Rollbar can quietly ignores errors outside of web processes. #183

jonah-williams opened this issue Nov 24, 2014 · 17 comments

Comments

@jonah-williams
Copy link

I was recently surprised to discover that Rollbar (using rollbar-resque) was not reporting errors for failing Resque jobs. Digging further I found that there are a number of cases where the Rollbar gem could surprisingly fail to report errors to the service. In a Rails web process Rollbar avoids this issue thanks to RollbarMiddleware making a call to Rollbar.reset_notifier!. However background job processes (Resque, Sidekiq), rake tasks, console sessions, or any other process which loads the Rails environment but does not run middleware can fail to report errors throughout its life.

I see two paths to get into this state:

  1. Something makes a call to Rollbar.log prior to the Rollbar initializer calling Rollbar.configure. In my case this was an info level warning in another initializer.
  2. A new thread is created prior to the Rollbar initializer calling Rollbar.configure. In my case an app had a call to segment.io's AnalyticsRuby.init which eventually calls Thread.new.

In the first case Rollbar.log delegates to Rollbar.notifier which returns Thread.current[:_rollbar_notifier] ||= Notifier.new(self) (rollbar.rb#L705). In the second case Thread.initialize_with_rollbar calls Rollbar.notifier.scope. Either way we have a call to Rollbar.notifier which sets Thread.current[:_rollbar_notifier].

The Notifier instance created for the thread is passed the Rollbar class as its parent_notifier. The Notifier then calls parent_notifier.configuration.clone (rollbar.rb#L42) where configuration is implemented as @configuration ||= Configuration.new (rollbar.rb#L683). Now we have a Notifier for the current thread configured with a new Configuration. Since this Configuration has not yet been enabled Notifier.log will abort due to return 'disabled' unless configuration.enabled (rollbar.rb#L112).

In a Rails web process the Rollbar initializer would overwrite any existing Configuration when it calls Rollbar.configure do .... If a Notifier has already been created as above then it remains disabled and missing any configuration changes from the initializer. That's ok because RollbarMiddleware eventually calls Rollbar.reset_notifier! (rollbar.rb#L16) which calls self.notifier = nil which leads to Thread.current[:_rollbar_notifier] = notifier and the unconfigured Notifier is removed. The next calls to Rollbar.notifier will then create a new Notifier and inherit the Rollbar.configuration set in the initializer.

A similar call to reset_notifier! exists in the Rollbar Rack Builder and Sinatra middleware.

Is there a better hook we could use to trigger these reset_notifier! calls? Would it be reasonable to always reset the current thread's notifier in Rollbar.configure so that every integration does not need to repeat this setup?

As is it seems easy for users of rollbar-resque to miss error reports. At a glance I don't see a call to reset_notifier! in https://github.com/allenwei/sidekiq-rollbar either but I'm not familiar with the forking model in use there so maybe it isn't an issue in that case. having added a Rollbar initializer I expected any environment which loads that initializer to be able to successfully report errors to Rollbar. Does that seem like a reasonable expectation, if not maybe we can at least make a readme update?

jonah-williams pushed a commit to jonah-williams/resque-rollbar that referenced this issue Nov 24, 2014
Fixes cases where a `Notifier` may have been created prior to the Rollbar initializer calling `Rollbar.configure`. Previously Resque would continue to use this prematurely created `Notifier` and never use the `Configuration` defined by the initializer. See rollbar/rollbar-gem#183
@jondeandres
Copy link
Contributor

@jonah-williams isn't the initialization/configuration order issues solved with this PR? #170

We fixed this so every Thread create before execute Rollbar.configure doesn't create a new notifier by thread.

The call to Rollbar.reset_notifier is used to remove the scope options we set for every received request. So between request the options are cleaned.

Please let us know which version is failing for you and we'll try to fix it soon :-D.

@jonah-williams
Copy link
Author

@jondeandres it might well be, I was running against 1.2.7. Let me take a look.

@jonah-williams
Copy link
Author

@jondeandres At a glance I think #170 only resolves one of the two paths above. Deferring the monkey patch to Thread would mean that calls to Thread.new do not immediately create a Notifier but it looks like any call to a Rollbar log method prior to the Rollbar initializer being run will still call Rollbar.notifer which will still attach an unconfigured Notifer to the current thread (see rollbar.rb#L705).

In my case this app had a line something like Rollbar.warn('<optional dependency> unavailable') in another initializer and when that initializer ran before Rollbar's I lost the ability to report Resque job failures.

@jondeandres
Copy link
Contributor

@jonah-williams that wrong configured Notifier instance is disabled? And then all the next calls will be also disabled?

Perhaps we can reset the notifier when Rollbar.configure is called.

@jonah-williams
Copy link
Author

@jondeandres exactly, the current thread gets a disabled Notifier which is never replaced. I agree the "simple" fix seems to be to call reset_notifier! in Rollbar.configure but I don't know if that would have surprising side effects for any gem users.

@jondeandres
Copy link
Contributor

Perhaps we can just don't cache the notifier in Rollbar.notifier if it's not configured yet, and save a @configured = true in Rollbar::Configuration.

@jonah-williams
Copy link
Author

@jondeandres I don't think that (discarding disabled Notifiers) works as well. Take a look at what resque-rollbar does here: lib/resque/rollbar.rb#L6. I think it's reasonable to expect to be able to amend a configuration like that but then I'd want to discard any existing Notifier using the old configuration afterwards. In that case the existing Notifier is likely to be enabled but we still want to remove it or at least update it's Configuration.

Updating a Configuration like resque-rollbar does could still be surprising since we won't know if child Notifiers inherited the old Configuration from their parent but I don't have an example where that would actually be an issue. Probably only a concern if someone is creating their own Notifiers instead of using the thread's default. Similarly reset_notifier! only resets the current thread's Notifier so anyone changing a Configuration post-fork needs to figure out how to reset each thread.

@jondeandres
Copy link
Contributor

ok @jonah-williams.

I think we can try to reset the notifier in Rollbar.configure and check how it's working. All reports sent before that call are with an unconfigured Notifier so reseting it should not retrieve expected problems.

BTW, if the initialization process of a app sends errors to rollbar before the gem is configured, perhaps users should take account of this and prepend the Rollbar configuration before any other one.

We'll fix this and check how it works ok?

Thank you @jonah-williams!

@jonah-williams
Copy link
Author

Agreed, those early reports are probably a warning sign that should be fixed but it would be great if they didn't also disable reporting more broadly ;)

I like moving the call to reset_notifier! into configure because I think that means you could remove it from all the middlewares (but don't take my word for that):

jondeandres added a commit that referenced this issue Dec 2, 2014
Reset thread's notifier when Rollbar is configured. Fixes #183.
@jonah-williams
Copy link
Author

Great, thanks for the fix!

@brianr
Copy link
Member

brianr commented Dec 2, 2014

Released in 1.2.11. Thanks for investigating and reporting this!

@pcasaretto
Copy link

This might be related.
When on the heroku console, Rollbar seems to ignore any exceptions raised.
I'm on 1.5.0.

@jondeandres
Copy link
Contributor

@pcasaretto is your environment being loaded correctly? Did you check config.enabled option? How are you trying it? Just Rollbar.error(exception) ?

@pcasaretto
Copy link

I ran a piece of code that raised an exception, and also tried a raise/fail manually.

> Rollbar.configure { |c| p c.enabled }
true

Using Rollbar.error worked normally.

@jondeandres
Copy link
Contributor

@pcasaretto can you paste that code?

I think you are expecting that any exception in the console will be reported. That automatic reports are only done in the middleware, so when there is a request. Exceptions in the console are not reported unless you report them manually.

So you should wrap your code with:

begin
  # your code
rescue => e
  Rollbar.error(e)
end

@pcasaretto
Copy link

AH, I see. Thanks for clarifying that.

@jondeandres
Copy link
Contributor

Thanks to you @pcasaretto!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants