Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving Threads to Threadgroup::Default should not happen for enclosed ThreadGroups #24

Closed
gamecreature opened this issue Feb 10, 2023 · 10 comments · Fixed by #25
Closed

Comments

@gamecreature
Copy link
Contributor

gamecreature commented Feb 10, 2023

Moving the Thread to ThreadGroup::Default causes issues with enclosed ThreadGroups. (this change happend in commit c4f1385)

It goes wrong in for example the airbrake-ruby gem. (airbrake/airbrake-ruby#713 (comment))

  • The airbrake-ruby notifier creates a ThreadGroup with workers.
  • It encloses this group.
  • A worker starts a Net::Http request. This request creates the timeout.
  • The timeout thread is created in the enclosed worker ThreadGroup
  • An attempt is made to move the Thread to ThreadGroup::Default
  • Which results in the following exception:
#<Thread:0x0000000107da1e40 /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/thread_pool.rb:131 run> terminated with exception (report_on_exception is true):
/Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:123:in `add': can't move from the enclosed thread group (ThreadError)
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:123:in `create_timeout_thread'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:134:in `block in ensure_timeout_thread_created'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:132:in `synchronize'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:132:in `ensure_timeout_thread_created'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/timeout.rb:181:in `timeout'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1269:in `connect'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1248:in `do_start'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1237:in `start'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/net/http.rb:1817:in `request'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-13.0.3/lib/airbrake/rails/net_http.rb:11:in `block in request'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-13.0.3/lib/airbrake/rack.rb:21:in `capture_timing'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-13.0.3/lib/airbrake/rails/net_http.rb:10:in `request'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/sync_sender.rb:49:in `send'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/async_sender.rb:51:in `block in thread_pool'
	from /Users/rick/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/airbrake-ruby-6.2.0/lib/airbrake-ruby/thread_pool.rb:135:in `block in spawn_worker'	

This problem could be solved by making lib/timeout.rb:123 conditional

ThreadGroup::Default.add(watcher) unless watcher.group.enclosed?

Example to illustrate this behavior:

# This sample fails!
t1 = Thread.new {
  sleep(2)
  t2 = Thread.new { }
  puts "t2.group: #{t2.group}"
  ThreadGroup::Default.add(t2) 
  puts "Success!"
}
group = ThreadGroup.new
group.add(t1)
group.enclose
puts "t1.group: #{t1.group}"
t1.join
# This sample succeeds by adding the conditional
t1 = Thread.new {
  sleep(2)
  t2 = Thread.new { }
  puts "t2.group: #{t2.group}"
  ThreadGroup::Default.add(t2) unless t2.group.enclosed?
  puts "Success!"
}
group = ThreadGroup.new
group.add(t1)
group.enclose
puts "t1.group: #{t1.group}"
t1.join
@gamecreature gamecreature changed the title Moving Threads to Threadgroup::Default should not happen for enclosed ThreadGroups. Moving Threads to Threadgroup::Default should not happen for enclosed ThreadGroups Feb 10, 2023
@eregon
Copy link
Member

eregon commented Feb 11, 2023

I think this shows Ruby should really support creating a Thread in a different ThreadGroup than the caller's thread's group, so we could create the timeout thread directly in the default ThreadGroup (or with its own ThreadGroup or no ThreadGroup if that was possible)

It's quite wrong that the Timeout thread stays in the airbrake ThreadGroup as with your suggested change.
What if airbrake decides to kill all threads in that ThreadGroup or assumes something about all threads in that group? It wouldn't work.

Part of this is why does airbrake use ThreadGroup#enclose?
That seems a not well supported API, the docs also show it's not well defined.
Particularly the fact new threads can be added to the group, but can't be removed or added, that's pretty strange, it seems a useless feature that shouldn't be used to me.

------------------------------------------------------------------------
  thgrp.enclose   -> thgrp

------------------------------------------------------------------------

Prevents threads from being added to or removed from the receiving
ThreadGroup.

New threads can still be started in an enclosed ThreadGroup.

  ThreadGroup::Default.enclose        #=> #<ThreadGroup:0x4029d914>
  thr = Thread.new { Thread.stop }    #=> #<Thread:0x402a7210 sleep>
  tg = ThreadGroup.new                #=> #<ThreadGroup:0x402752d4>
  tg.add thr
  #=> ThreadError: can't move from the enclosed thread group

I get this is an annoying failure, which also depends on what uses Timeout first.
Could you raise it as a airbrake issue, see what they think and why they use ThreadGroup#enclose?

@gamecreature
Copy link
Contributor Author

I totally agree with that. Creating a thread in specific ThreadGroup is the best solution. You could have a timer group for example.

I’ve already opened the issue at Airbrake, I tried tot dive in deeper and found the cause of it, in this Timer class.

I hope Ruby improves the support for creating threads in a specific group.

Why airbrake encloses it, I don't know. I don’t think it’s very useful there.

@gamecreature
Copy link
Contributor Author

gamecreature commented Feb 11, 2023

The proposed change could be a fix/workaround for other libraries which are experiencing strange issues.

Net::Http uses this Timer, which makes the library useless in enclosed Threads. (Whatever reasons the authors had to enclose the threads). (This change was added to Net::Http, October last year, so suddenly the library is broken)

Btw I really don't think it's strange those timers are kept in the enclosed group. It's the place where the timers have been created. They cannot, perhaps should not 'escape' the group.
It could be incorrect to kill those timers, but probably less of an issue then the current situation, which prevents the usage of Timers in an enclosed group.

@eregon
Copy link
Member

eregon commented Feb 11, 2023

Right, the workaround seems the best we can do for existing Ruby releases, besides airbrake/gems not using ThreadGroup#enclose.
Could you make a PR and add a test for it?

@kyrylo
Copy link

kyrylo commented Feb 11, 2023

@eregon
Part of this is why does airbrake use ThreadGroup#enclose?

I'm the author of the code in question. The code has been there for quite some time, and if I am not mistaken, it's there just to ensure that we do not accidentally add more worker threads to the thread group because the library has an option that specifies how many workers it should spawn.

I inspected the current code of the airbrake gem and it seems like we can safely remove the enclose call. However I no longer work for Airbrake, so it's up to the next maintainer.

That seems a not well supported API, the docs also show it's not well defined.

Could you please show me where the docs say it's not well-supported? I am checking ruby-doc.org but I cannot see any mentions of the API being unstable or poorly defined.

@gamecreature
An attempt is made to move the Thread to ThreadGroup::Default

I know nothing about the implementation of the timeout library, but I am curious, why does it need to call ThreadGroup::Default.add(watcher)?

@gamecreature
Copy link
Contributor Author

gamecreature commented Feb 12, 2023

Currently I'm writing the PR.
I noticed I could not (directly) simulate the situation in a test-case.
Same code which directly worked in irb, didn't trigger in the test.

The reason for this, is that the Timeout thread was already created outside the enclosed thread group, by the other test-cases. In irb no timeouts had been started yet.

So another workaround for libraries is to ensure the thread is created before starting any Timeouts.

Timeout.ensure_timeout_thread_created

See remark of @eregon in next message, method above is an internal api.

Better to just start a timeout, to ensure the thread is created

Timeout.timeout(0.001) {}  # Use > 0 as timeout value, or it will not start the thread

@eregon
Copy link
Member

eregon commented Feb 14, 2023

So another workaround for libraries is to ensure the thread is created before starting any Timeouts.
Timeout.ensure_timeout_thread_created

Note this method should be considered private API (the only public API is Timeout[.#]timeout), there is no guarantee it stays and keeps that name.
It's not actually a private class method because the timeout method needs to call it and unclear if the overhead of send is worth that.

@eregon
Copy link
Member

eregon commented Feb 15, 2023

@kyrylo Thank you for the reply.

Part of this is why does airbrake use ThreadGroup#enclose?

I'm the author of the code in question. The code has been there for quite some time, and if I am not mistaken, it's there just to ensure that we do not accidentally add more worker threads to the thread group because the library has an option that specifies how many workers it should spawn.

I inspected the current code of the airbrake gem and it seems like we can safely remove the enclose call. However I no longer work for Airbrake, so it's up to the next maintainer.

That seems a not well supported API, the docs also show it's not well defined.

Could you please show me where the docs say it's not well-supported? I am checking ruby-doc.org but I cannot see any mentions of the API being unstable or poorly defined.

Right, this is my interpretation of it.
I would think the main use case would be as you said above "to not accidentally add more threads", but this API doesn't prevent that since Thread.new {} inside a Thread already in the group will not raise and just proceed.

And then the lack of API to create a Thread directly in a ThreadGroup seems a big oversight, it's inconvenient and racy to move it only after it's created.
Finally, ThreadGroup#list is quite inefficient, so overall there is basically no useful method in ThreadGroup and as far as I can see there seems to be almost no useful functionality in it.
Maybe we should consider deprecating ThreadGroup in Ruby: https://bugs.ruby-lang.org/issues/19440

@kyrylo
Copy link

kyrylo commented Feb 15, 2023

Thanks for the response, @eregon. I can see now why it's poorly defined.
P.S. It looks like the ThreadGroup API hasn't been touched since 2006, when YARV was introduced.

@rubyFeedback
Copy link

This should probably be put to koichi-san's attention since he kind of oversees a lot of the thread-related aspects in ruby (e. g. the whole ractor model), if the code is old such as 2006 or even older.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

4 participants