Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP error: can't move from the enclosed thread group #713

Closed
hirogeek opened this issue Feb 9, 2023 · 11 comments
Closed

HTTP error: can't move from the enclosed thread group #713

hirogeek opened this issue Feb 9, 2023 · 11 comments

Comments

@hirogeek
Copy link

hirogeek commented Feb 9, 2023

Hi,
I've this bug, but not specific call of Notify, but in my aribrake.log every 5 second, I've
**Airbrake: HTTP error: can't move from the enclosed thread group

Almost similar of #576

I use the latest airbrake and airbre-ruby gem, with ruby 3.2 ans rails 7.0.4

@gamecreature
Copy link

Same here! Rails 7 / Ruby 3.2.0
Workaround for now for me is to add the following to the initializer:
(effectively disabling async :( )

def Airbrake.notify(...)
  Airbrake.notify_sync(...)
end

More info about this workaround

davidbalbert added a commit to recursecenter/community that referenced this issue Feb 9, 2023
Hopefully it fixes this issue: airbrake/airbrake-ruby#713
@gamecreature
Copy link

gamecreature commented Feb 10, 2023

I've been debugging the airbrake-ruby/airbrake code. The problem is that Net:HTTP of ruby starts a timer-thread for add a timeout on the HTTP request that being done.

It tries to add the watcher/time thread to the default thread-group, which has been enclosed. (/Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:123)

    ThreadGroup::Default.add(watcher)

This is the backtrace of the error message: 'can't move from the enclosed thread group'

#<Thread:0x0000000107da1e40 /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/thread_pool.rb:131 run> terminated with exception (report_on_exception is true):
/Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:123:in `add': can't move from the enclosed thread group (ThreadError)
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:123:in `create_timeout_thread'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:134:in `block in ensure_timeout_thread_created'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:132:in `synchronize'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:132:in `ensure_timeout_thread_created'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:181:in `timeout'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1269:in `connect'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1248:in `do_start'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1237:in `start'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1817:in `request'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-13.0.3/lib/airbrake/rails/net_http.rb:11:in `block in request'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-13.0.3/lib/airbrake/rack.rb:21:in `capture_timing'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-13.0.3/lib/airbrake/rails/net_http.rb:10:in `request'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/sync_sender.rb:49:in `send'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/async_sender.rb:51:in `block in thread_pool'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/thread_pool.rb:135:in `block in spawn_worker'	

Looking at this backtrace, it's possible the airbrake gem is the cause of it (instead of the airbrake-ruby gem)

@gamecreature
Copy link

gamecreature commented Feb 10, 2023

It can be 'solved' when I remove the 'enclose' from the airbrake-gem gem code.
But I don't think the enclose of the thread-group is optional.

lib/airbrake-ruby/thread_pool.rb

def spawn_workers
  @worker_size.times { @workers.add(spawn_worker) }
  # @workers.enclose  # << When I comment out this, the eclosing issues is gone (of course)
end

What happens is the following situation. (Found after debugging)

  • The timeout-function from the ruby Timeout core creates a thread via Thread.new.
  • This thread is created in the @worker group thread (which has been enclosed!!)
  • The timeout-code from ruby tries to move this thread to the ThreadGroup::Default which causes this error

Why does this happen?

  • the worker thread runs in the enclosed @worker threadpool.
  • the workers spawns a timer-thread for the HTTP request. This causes the thread to be created in the @worker threadpool, which has been enclosed
  • the ruby Timeout.timeout code tries to move tis thread to the ThreadPool::Default
  • this fails, because the @worker threadpool is enclosed and doesn't allow the thread to be moved.

Beste way to solve it?

  • I don't know what the best way is, I hope the airbrake writers can figure out the best solution for this. (disable enclose doesn't probably isn't the best way to do this)
  • I think it's strange the Timeout.timeout moves the timer-thread to the default thread group. This means that a Timeout.timeout call never be performed in an enclosed threadgroup. (Because Net::Http uses this Timeout code it breaks and cannot be used in this situation)

This commit seems related: ruby/net-http@98caa38

@gamecreature
Copy link

gamecreature commented Feb 11, 2023

I've opened an issue at the Timer repository of Ruby.

They mentioned the enclosing support for thread groups is pretty limited.

Is it really required for Airbrake to enclose this thread group?

@gamecreature
Copy link

A better workaround is to call the following method in for example puma.rb, before any Airbrake initialization

Timeout.ensure_timeout_thread_created

This makes sure the Timeout thread is already created and can be moved to the ThreadGroup::Default

davidbalbert added a commit to recursecenter/community that referenced this issue Feb 13, 2023
@davidbalbert
Copy link

davidbalbert commented Feb 13, 2023

I spent a while banging my head against the wall trying to figure out why this was happening to us in production, but not in development. This might be helpful for others trying to reproduce:

Per what @gamecreature has written above, the de facto contract for Timeout.timeout is "make sure the first call to this method isn't in an enclosed, non-default ThreadGroup." That's not a good contract, but right now, that's what it is.

Coincidentally, as long as you have Airbrake::Config#remote_config set to true (which is the default value), airbrake-ruby makes a Net::HTTP request which uses Timeout.timeout in a non-enclosed ThreadGroup during initialization (probably on the main thread/default thread group, but I haven't verified). This means that most of the time, airbrake-ruby actually fulfills the above contract.

So, there are two ways to reproduce this issue:

  • Run Airbrake with remote_config set to false.
  • Run Airbrake in a forking webserver, but load airbrake before forking (e.g. Puma in cluster mode with preload_app! set).

The latter is what made it hard for me to reproduce in development – we had ENV["WEB_CONCURRENCY"] set in production, but not in development. When WEB_CONCURRENCY=N is set, Puma forks N workers and turns on preload_app!.

Getting rid of WEB_CONCURRENCY in production is a good enough workaround for us until this issue gets fixed.

If you need a workaround in cluster-mode Puma, you can call Timeout.ensure_timeout_thread_created in an on_worker_boot block in your Puma config file.

@eregon
Copy link

eregon commented Feb 15, 2023

IMHO airbrake should not use enclose here, see ruby/timeout#24 (comment) and ruby/timeout#24 (comment)

@phylor
Copy link

phylor commented Mar 17, 2023

I'm using the default puma configuration generated by Rails 7. Adding Timeout.ensure_timeout_thread_created as the first line in config/puma.rb as suggested by @gamecreature fixed the error reporting for me.

@gamecreature
Copy link

gamecreature commented Mar 17, 2023

@phylor the ruby timeout gem solves the issue. (Updating to the latest 0.3.2, fixes the issue)

@phylor
Copy link

phylor commented Mar 17, 2023

@gamecreature That also works, thank you! 🎉

@mmcdaris
Copy link
Member

Thanks for all the details provided it really made replicating and fixing this one a lot easier. @thompiler and I have removed the usage of enclose and released version 6.2.1 of airbrake-ruby. Please let us know if you have any issues 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants