
Support manual pool termination #299

Open

sasa1977 wants to merge 3 commits into main

Conversation

@sasa1977 commented Feb 6, 2025

This PR introduces Finch.stop_pool/2, which can be used to manually stop the connection pool of a given SHP (scheme/host/port).
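A minimal usage sketch (the Finch instance name and URL are illustrative, and the exact shape of the second argument and the return value of stop_pool/2 are my assumptions, not taken from this PR):

```elixir
{:ok, _} = Finch.start_link(name: MyFinch)

request = Finch.build(:get, "https://internal.example.com/health")
{:ok, _response} = Finch.request(request, MyFinch)

# Once we know no further requests will target this SHP, stop its pool.
# (Argument shape and return value assumed for illustration.)
Finch.stop_pool(MyFinch, "https://internal.example.com")
```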

We have tried it out in production; it works fine and helped us fix our process leak.

See #294 for details.

It's worth noting that this function might not be enough on its own. The client needs to make sure that no request to the given SHP is taking place while the pool is being stopped, which can actually be tricky. The simple solution of channeling all requests through a single process means that you can only have one request at a time to the given SHP, which might be a problem.

In our case we've implemented a basic reference-counted GC mechanism, which tracks all the processes currently making a request to the given SHP. Once no requests are taking place anymore, the pool is stopped.
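This is not code from the PR, but a rough sketch of that kind of reference-counted tracker, with hypothetical names (PoolTracker, on_stop) and without monitoring of crashed callers:

```elixir
defmodule PoolTracker do
  # Hypothetical module, not part of Finch or of this PR: a rough sketch of
  # the reference-counted tracking described above.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # Called by a client process right before it issues a request to the tracked SHP.
  def checkout, do: GenServer.call(__MODULE__, :checkout)

  # Called by the client process once its request has finished.
  def checkin, do: GenServer.call(__MODULE__, :checkin)

  # Ask that the pool be stopped as soon as no requests are in flight.
  def stop_when_idle, do: GenServer.call(__MODULE__, :stop_when_idle)

  @impl true
  def init(opts),
    do: {:ok, %{active: 0, stop_requested?: false, on_stop: Keyword.fetch!(opts, :on_stop)}}

  @impl true
  def handle_call(:checkout, _from, state),
    do: {:reply, :ok, %{state | active: state.active + 1}}

  def handle_call(:checkin, _from, state),
    do: {:reply, :ok, maybe_stop(%{state | active: state.active - 1})}

  def handle_call(:stop_when_idle, _from, state),
    do: {:reply, :ok, maybe_stop(%{state | stop_requested?: true})}

  # When a stop was requested and nothing is in flight, run the stop callback,
  # which could for instance call the Finch.stop_pool/2 proposed in this PR.
  defp maybe_stop(%{active: 0, stop_requested?: true} = state) do
    state.on_stop.()
    %{state | stop_requested?: false}
  end

  defp maybe_stop(state), do: state
end
```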

I'd be open to submitting a PR for something like that, but a proper implementation would be much more complicated, because it has to handle the differences between the HTTP/1 and HTTP/2 pool implementations, especially for async requests.

@sneako (Owner) commented Feb 12, 2025

Thank you @sasa1977!

It seems that this implementation will only shut down ONE pool, but Finch allows you to start many pools for a scheme/host/port.
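(For context, the multiple-pools behaviour comes from the `count` option in Finch's pool configuration; the instance name and URL below are illustrative:)

```elixir
{:ok, _} = Finch.start_link(
  name: MyFinch,
  pools: %{
    # count: 2 starts two independent pools for this scheme/host/port
    "https://internal.example.com" => [size: 25, count: 2]
  }
)
```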

What exactly happens if you shut down a pool with an ongoing request? I suppose we could consider sending a terminate message to each pool process instead of immediately telling the supervisor to terminate the child.

If a user is ok with sending all of their requests through a single process, then there is no reason for them to be using Finch in the first place; they should probably just use Mint directly in that case.

I don't really like the idea of adding reference counting. It's probably minimal overhead, but even just a little bit of overhead is not worth it to support this feature IMO as it goes against the original design goals of the library.

@sasa1977 (Author)

> It seems that this implementation will only shut down ONE pool, but Finch allows you to start many pools for a scheme/host/port.

Ah, good catch! I can adapt the implementation to stop them all, assuming you're still ok with adding support for this.

> What exactly happens if you shut down a pool with an ongoing request? I suppose we could consider sending a terminate message to each pool process instead of immediately telling the supervisor to terminate the child.

I'm not really sure, but I suppose the likely outcome is a failed request, or some kind of process crash, since the pool is taken down while the request is running.

> If a user is ok with sending all of their requests through a single process, then there is no reason for them to be using Finch in the first place; they should probably just use Mint directly in that case.

Finch provides additional value with a much simpler API. Another benefit is that we can use req for more convenient goodies and features. Reimplementing all of this with Mint would require a lot of extra work, not to mention that the implementation would not be as mature, stable, or feature-rich as what Finch + req can give us.

Furthermore, in our case we need this behaviour only for some SHPs (essentially short-lived internal servers which are dynamically started in our cluster, à la FLAME). For the other SHPs we don't stop the pool, so we don't need to channel those requests through a single process.

Finally, it's worth noting that we're not channeling the actual requests through that single process. Only an initial process registration is performed in this fashion (before the request is issued). The requests themselves are issued concurrently, and deregistration is again channeled through the single process.
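Using the hypothetical PoolTracker sketched earlier, each caller would wrap its request roughly like this; only the checkout/checkin calls go through the tracker process, while the request itself runs concurrently in the caller's process (url and MyFinch are placeholders):

```elixir
:ok = PoolTracker.checkout()

try do
  Finch.build(:get, url)
  |> Finch.request(MyFinch)
after
  # Always deregister, even if the request raised.
  :ok = PoolTracker.checkin()
end
```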

> I don't really like the idea of adding reference counting. It's probably minimal overhead, but even just a little bit of overhead is not worth it to support this feature IMO as it goes against the original design goals of the library.

That's fair enough; I just thought I'd mention it.

@sneako (Owner) commented Feb 12, 2025

Gotcha!

I'm definitely still on board with adding this feature as long as you update it to terminate all pools associated with a SHP.

I don't think it would be all that complicated to implement this as sending a message to all associated pools asking them to terminate, which would probably be more of a graceful shutdown. But I'm also fine with the current immediate-termination approach, as long as we mention in the docs that terminating a pool that is actively being used could cause some errors, if that sounds reasonable to you.

@sasa1977 (Author)

> as long as we mention in the docs that terminating a pool that is actively being used could cause some errors

I've already mentioned something like that in the docs:

> Note that this function is not safe with respect to concurrent requests. Invoking it while another request to the same SHP is taking place might result in the failure of that request. It is the responsibility of the client to ensure that no request to the same SHP is taking place while this function is being invoked.

Is that enough, or do you need something else?

@sneako (Owner) commented Feb 12, 2025

Right, of course. That's perfect.

@sasa1977 (Author)

> I don't think it would be all that complicated to implement this as sending a message to all associated pools asking them to terminate

This could actually lead to subtle race conditions, because the pool might only be stopped later, after another request has already started. So we really need a synchronous stop.
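A small sketch of the distinction, with placeholder names (pool_pid, pool_sup, pool_id); this only illustrates async versus sync termination in general, not Finch internals:

```elixir
# Asynchronous: returns immediately, so the caller may issue a new request
# against a pool that is only about to be stopped.
send(pool_pid, :terminate)

# Synchronous: returns only once the child has actually been terminated,
# so afterwards no pool for this SHP is still accepting requests.
:ok = Supervisor.terminate_child(pool_sup, pool_id)
```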

@sneako (Owner) commented Feb 12, 2025

> > I don't think it would be all that complicated to implement this as sending a message to all associated pools asking them to terminate
>
> This could actually lead to subtle race conditions, because the pool might only be stopped later, after another request has already started. So we really need a synchronous stop.

Good point! Ok let's roll with the sync route.

Without start_pool_metrics?: true, the final assertion always succeeds. With this option added, the assertion may occasionally fail if run immediately, due to asynchrony (the process may not have been deregistered yet).
@sasa1977 (Author)

> I'm definitely still on board with adding this feature as long as you update it to terminate all pools associated with a SHP.

Did it. Could you please take a look when you find the time?
