-
Notifications
You must be signed in to change notification settings - Fork 426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Single timeout thread never stops and could block #909
Comments
Hi @sehrope , Thank you for creating the issue. We will investigate and get back to you. |
Hi @sehrope , Thank you for bringing this to our attention. The only concern I have is the change in behavior. In a scenario when all connections are closed and the application requests another connection, the driver will have to spin up a new timer thread. @shayaantx thoughts? |
If the lifecycle of the shared timer is tied to the connections then that shouldn't be an issue as an active application will continuously have idle connections in a pool that will retain a reference to the timer. The timer would shutdown only when either:
Yes, an application transitioning from idle to active and specifying a query timeout would need to spin up a new timer thread. But that same application must also have a pool size of zero and that means it'd have to open a new TCP socket, perform an SSL handshake, handle TDS init, authenticate, etc. All of that is going to take much longer than spinning up a new thread. Separately, spawning a new thread to handle the In practice I doubt many applications are invoking the |
@ulvii I think @sehrope brings up some valid points. If the classloader unloads the driver classes (either through garbage collection or someone manually unloading them), then this thread would hang around. Although I don't usually see a lot of that happen happening since people usually keep idle connections around in some pool for the duration of their application. Regardless seems worth addressing. The possible blocking behavior in the lone timeout thread would of existed prior to my changes, as I reused the interrupt logic that already existed (it would of been even worse before since we used an unlimited thread pool for timeout handling). I only looked briefly and it looks like non blocking sockets are being used in the tds channel for tds writes/reads (didn't look that in depth to fully confirm), so the likelihood of this type of behavior occurring seems low. If the interrupt lock had synchronization issues, then yes this could happen. Regardless I think addressing both would be ideal, I can attempt to open a PR for these changes this weekend. Let me know of any questions or concerns Thanks |
@ulvii one thought I had regarding an actual implementation, wouldn't be appropriate to just start the timer thread when the driver is constructed and use finalize to stop the timer thread (in the sqlserverdriver class)? In the scenario where the driver gets unloaded, when it gets registered again, the timer would start. Then if the driver gets unregistered finalize would be called and we can stop the timer thread. If the driver never gets unloaded, then the thread just sits idle. Normally I wouldn't use finalize, but with this presented use case I think this would work well here. I guess its possible the jvm only unloads some of the mssql package. Is there already something where I can check all the active connections easily? Thoughts? |
Hi @sehrope, @shayaantx, Agreed that the issues needs to be addressed. Please also note that the driver uses blocking sockets only. I also think the driver should not start a new timer thread every time it is constructed, simply because query timeouts are not used that often. @shayaantx, would you be able to create a PR? |
I can make a PR, I'm not entirely sure regarding the complete fix. Regarding the interrupt blocking: If I'm understanding sehropes recommendation (please let me know if I'm missing something obvious), use a ScheduledThreadPoolExecutor per connection for timeouts? That would be essentially old implementation of the timeouts where you could have many timeout threads (basically a timeout thread per connection). If the suggestion is to use a single threaded pool for all timeouts and set some arbitrary timeout for how long we wait for an interrupt, that makes most sense to me so far. Regarding the class loader leak: The other thing I don't really know if its available in the api, are we recording anywhere in the driver where I can check all connections are closed or open? Are we storing some in-memory collection of connections anywhere? If not seems like I'd need to implement that to actually check if the timeout thread can be shutdown when all connections are closed. Otherwise I can stop/restart the thread every-time timeouts become available. Let me know Thanks |
-1 to using finalizers. Better to handle things directly as finalizers both slow things down and add non-deterministic behavior. The model I'm suggesting for this is similar to the pgjdbc code I linked to above with the main change being to internally use a ScheduledThreadPoolExecutor rather than a Timer (since we're on JDK 8+). With a core pool size of 1 it acts similar to a Timer with a cleaner API. Let's say the new class is called, The To work around issues with tasks blocking the cancellation thread they should be executed in their own thread so the primary thread receives the work to perform the cancellation but immediately defers it to a new thread. While this does incur some thread thrashing during timeouts, it's only paid by actions that actually time out which should be the minority of real world usage (most actions are expected to succeed, not fail). This model has a couple of advantages:
|
Driver version
7.1.3 (preview release)
Problem description
The shared single threaded timeout handler introduced in #842, once started via running a query on a connection with a non-zero
queryTimeout
, never stops. It keeps running even after all connections are closed. This will lead to a classloader issue in any environment with multiple classloaders (ex: Tomcat) that dynamically add and remove the driver class as the associated classes can never be unloaded.I haven't recreated that situation myself (I don't use Tomcat) but recognized it as an issue as the PostgreSQL JDBC driver had a similar issue a couple years back. For background, the original pgjdbc issue reported by an end user is is here: pgjdbc/pgjdbc#188, PR that fixes it is here: pgjdbc/pgjdbc#197.
The
TimeoutPoller
also executes the timeout specificinterrupt()
action in the single timer thread. From skimming through the code I don't think that's an issue right now as the interrupts being executed are non-blocking. If that changes down the road, then it'd be possible for a single hanginginterrupt()
execution to cause the timer thread to stall indefinitely. That'd stall all timeouts on all connections using the same classloader as the thread is shared. Having the interrupt execution happen in a separate thread (say from a cached thread pool) should fix that.Again, I'm not sure if that specific problem could happen as the only IO operations inside the
interrupt()
calls I could see are closing sockets but it could happen down the road with unrelated code changes. Synchronized blocks may lead to to a similar issue.Expected Behavior
interrupt()
operations from one command should not be able to stall timeouts for other commands.Regarding actual implementation, since the newer versions of the driver target JDK 8+, I'd suggest using a shared
ScheduledThreadPoolExecutor
with a pool size of one (1) that could shutdown when there are no more connections referencing it. That'd be a slightly more modern version of what pgjdbc implemented.One other advantage compared vs the current implementation of a single thread in a sleep / check loop is that that the standard implementation of ScheduledThreadPoolExecutor uses a heap to sort the timeout commands rather than looping through the entire list every time and it sleeps until the next timeout period rather than waking up every second.
If there's confirmation and internal API agreement that command specific
interrupt()
calls cannot invoke any blocking operation then it should be fine to have a single thread invoking them. Otherwise, theinterupt()
calls should be handled via a separate thread pool to ensure that one operation does not block all other timeouts.The text was updated successfully, but these errors were encountered: