Replies: 2 comments 3 replies
-
Hi,
Note that the wording in the issue is to try and better organize tbe issue tracker and the discussions. People can of course submit issues! I would argue that in this case a discussion is the correct one, since I don't think there is an issue in sqlalchemy, but maybe it could do things better? Regarding the timeout, can't you pass a different connection timeout to the connect_args? About using terminate instead of close i guess we could consider it at least when discarding a connection. @zzzeek what do you think? |
Beta Was this translation helpful? Give feedback.
-
I had no idea there was a terminate command and it's even not a coroutine, which means it can fix #8145 . we can finally close connections without them being awaitable. it's not clear if it's appropriate to call terminate() for all closes. I dont know what the difference is between these two methods. "gracefully" vs. "without waiting for pending data", is the latter not "graceful" ? What does the server see ?
not sure if you are doing this but you need to set pool_recycle to a value less than that timeout. so your application would not be exposed to connections that have been shut off from the server side. we definitely should have some terminate() in the mix but it's not clear where without understanding what this method does first. any mixture of close() / terminate() is going to require API changes within the pool and such and in particular some detailed changes within finalize_fairy, there likely needs to be a new dialect method do_terminate() that differentiates between do_close(). hopefully pool_recycle can fix your issues for now because this is more of a 2.0 thing. |
Beta Was this translation helpful? Give feedback.
-
Hello, I'm not allowed to submit a bug, so I'm opening a discussion:
Hello,
I'm implementing a FastAPI web-app with SQLAlchemy, asyncio and asyncpg. Since several weeks we're fighting with a very hard to track down problem of long time-outs on in response to queries that happen from time to time.
It turned out that these problems were related to long idle-times of connections pooled by SQLAlchemy to our DB. There is a load-balancer fronting our DB that reaps idle connections on a specified timeout. Since our service is relatively new, apparently there are large portions of time that the system is not used actively and connections to DB are dropped, which is fine, but the problem arises when SA tries to "recycle" the connections. Using
asyncpg
dbapi SA tries toclose()
the dropped connection when it detects that this connection is in a bad state. However, callingclose()
on asyncpg connection is waiting for response from DB server for the termination request. Since SA is not supplying timeout value for that termination request, the default timeout is used of 15min (!see the logs below). Unfortunately, the DB server doesn't respond to that termination request, but drops the request -- here, I can't give details to why this is happening and if it's normal. Anyway, this triggers a ConnectionTimeout error that we can see in the logs below.We've tried to combat the problem using the Pessimistic (with, poo_preping=True) and Optimistic (pool_recycle=) approaches, both of them can't overcome the issue.
In the case of pre-ping: After connection reaping the pre-ping discovers that the connection is dead, but again, tries to call
close()
to finish the invalidation logic which would lead to the mentioned problem. This renders the pre-ping with asyncpg non-working.In the case of pool_recycle: With recycle time set lower than the reap timeout, if we trigger request before the reap timeout, the recycle logic will work, but if it's after, it'll fail on
close()
.Now, I looked into the code of asyncpg and they provide two methods for connection close: close() and terminate(). The difference is that close waits for response from server, while terminate is fire-and-forget type of action. Since SA calling
.rollback()
before closing, I think that it's safe to useterminate()
instead ofclose()
?Error on trying to close the dropped connection (please see the timestamp: there is a 15min timeout, not controlled by SQLAlchemy):
Details on the system:
Beta Was this translation helpful? Give feedback.
All reactions