Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection Reaping #125

Closed
arjes opened this issue Apr 27, 2020 · 16 comments
Closed

Connection Reaping #125

arjes opened this issue Apr 27, 2020 · 16 comments

Comments

@arjes
Copy link

arjes commented Apr 27, 2020

Hello,

I wanted to open a conversation up about connection reaping and see if this is a feature you would be accept if I opened a PR to provide it.

Our usecase:

We are using AWS's Serverless Aurora, and one of their scaling thresholds is based on number of concurrent connections. Currently we have implemented this by having our connection a layer below ConnectionPool track last query execution time reap the actual RDS connection.

This is getting us our desired behavior, but lacks in one critical area: ConnectionPool still considers that object a full fledged connection and will route requests for connections to them. Ideally the connections would be reaped from the ConnectionPool queue so new connection requests are sent to still-live connections.

If this is behavior you would be willing to accept I'll open the conversation some approaches.

The most obvious approach

 ConnectionPool.new(size: 5, timeout: 5, reap_in: 3600, reaping_frequency: 5) { Dalli::Client.new }

Every 5 seconds a connection which hasn't been yielded in 3600s receives shutdown(or some other name) and is removed from the queue. It is eligible to be replaced should it be requested.

Advantages

  • It is probably how people expect this feature to work
  • Puts all the configuration in one place

Disadvantages

  • A reaping monitoring thread is one more thing for this library to handle
  • It makes assumptions on the pooled objects shape
    • while it would be simple enough to check for a shutdown method not having it would probably lead to a leaked connection or memory. So the lack of a shutdown method would probably require reaping be disabled

Option 2

connection_pool_instance.remove_from_pool(instance)

Let the connection itself handle it's removal and just expose a method for it to call. While your internal structure is called a Stack you are currently using an array under the hood making this an O(n) operation. For the size of most connection pools I don't consider this a concern.

Advantages

  • Removes the responsibility for reaping from this library
  • Code to implement would be simple

Disadvantages

  • Removes the responsibility for reaping from this library, it likely won't be implemented since most connection libraries are not going to implement this. Any reaping would need to be written into a wrapper like we are currently doing.

Thanks for your time, if you are not interested in this feature no problem I'll just fork it for my needs! If not I'll prep a PR with the features we agree are useful.

@bmalinconico
Copy link

I see you mentioned some validity to the overall idea in the other health check thread. I'd be happy to craft the PR for this functionality if we can agree on public facing shape.

If you plan on implementing it I'll just keep an eye out.

@mperham
Copy link
Owner

mperham commented Apr 30, 2020

I can see a use case for idle connection reaping. I don't like the thought of spinning up a single thread just to reap the pool every 5 minutes (for example) but I don't see any alternative other than checking upon every checkin, which can happen 1000s of times per second.

I'll let others implement since I'm not one that would take advantage of the feature.

@bmalinconico
Copy link

@mperham I reviewed the connection reaping in ActiveRecord and it is spawning a sleeping thread (https://github.com/rails/rails/blob/157920aead96865e3135f496c09ace607d5620dc/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L335)

If ConnectionPool will handle the termination of the Connections, should it also take a argument to configure the shutdown command?

  • reddis-rb uses disconnect
  • pg uses finish
  • mysql appears to use close
ConnectionPool.new(size: 5, timeout: 5, reap_in: 3600, reaping_frequency: 5, reaping_command: :finish) do  
   PG.connect
end

@mperham
Copy link
Owner

mperham commented Apr 30, 2020

What's old is new again. Sigh.

redis/redis-rb#516

@mperham
Copy link
Owner

mperham commented Apr 30, 2020

This is ridiculous, there's no reason for every client to have their own bespoke method name for the same purpose. I would implement it as conn.close if conn.respond_to?(:close) and force the drivers to implement or alias close. Print out a warning if we reap something that doesn't implement close?

@byroot
Copy link
Collaborator

byroot commented Apr 30, 2020

there's no reason for every client to have their own bespoke method name for the same purpose

I mean.. It's not like there's a standardisation body or something like that. So different libraries are bound to use different synonyms.

As said in your PR I'm fine with merging the alias in redis-rb, but ultimately you'd have dozens and dozens of similar PRs to submit to many different gems. So allowing to pass a block or method name to ConnectionPool.new to customize the reaping behavior might be the pragmatic solution, at least short term.

@bmalinconico
Copy link

I'm fine with picking an expected method and calling it a day.

Print out a warning if we reap something that doesn't implement close?

Printing a warning would be fine, but I'm not sure the connection should be reaped since there is no guarantee it is GCable. We could end up leaking the connections by just forgetting them in ConnectionPool.

@mperham
Copy link
Owner

mperham commented Apr 30, 2020

Reaping will be opt-in; users need to enable it. The usecase is SaaS databases that have connection limits and/or autoscale based on connections in use. Once connections become idle, they can be closed and then autoscale down events can fire.

@zaidakram
Copy link

Any update on this? I've a usecase where we have a connection pool of IMAP connections. One pool per account. Gmail and Office365 close the connections uncermoniously after some idle timeout.

Any idea what are my options? Reloading the whole pool on one connection timeout seems counter-intutive. Any help is appreciated.

Thanks

@mperham
Copy link
Owner

mperham commented Jul 5, 2021

@zaidakram send a PR to implement it. No one has done the work.

@arjes
Copy link
Author

arjes commented Jul 5, 2021 via email

@zaidakram
Copy link

@arjes This looks like a very nice start. Thanks. I'll definately give it a spin. And I agree, that with @mperham's guidance this can be turned into an even better solution to this problem.

@mperham
Copy link
Owner

mperham commented Jul 5, 2021

@arjes I don't want a separate reaping thread.

@arjes
Copy link
Author

arjes commented Jul 5, 2021 via email

@shayonj
Copy link
Contributor

shayonj commented Mar 29, 2024

From what I understand - we could either have a separate reaping thread and close conns by tracking last_used or similar, or reap conns during push or pop, but that could lead to some performance tax. Are there other approaches that would be more suitable here?

I can see why connection pool having its own thread might create a sort of an indirect liability on applications using it. Esp. considering the after fork behaviors when a fork happens inside process like puma/sidekiq. The thread based reaping would also be in contrast with how activerecord is handling it - https://github.com/rails/rails/blob/a2a870a7360e9590e14e1be989792cd5b607f944/activerecord/lib/active_record/connection_adapters/abstract/connection_pool/reaper.rb#L41C17-L41C29

At the same time, I wonder if it still OK to have a dedicated reaping thread if this feature would be opt-in only?

Curious about other possible approaches too. Also happy to propose a PR accordingly.

@mperham
Copy link
Owner

mperham commented Apr 1, 2024

I want to keep connection pool simple and have no plans to implement this in the gem. As always, connections should be self-repairing and the caller can shutdown old pools and create new pool instances as necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants