Use aiohttp for chainlet <-> chainlet communication #1194
Conversation
truss-chains/truss_chains/stub.py
# Check `_client_cycle_needed` before and after locking to avoid
# needing a lock each time the client is accessed.
if self._client_cycle_needed(self._cached_async_client):
    async with self._async_lock:
        if self._client_cycle_needed(self._cached_async_client):
            connector = aiohttp.TCPConnector(
                limit=self._client_limits.max_connections,
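The double-checked pattern in the snippet above (check, lock, re-check) can be sketched in isolation. This is a self-contained illustration with a stand-in client class, not the PR's actual code:

```python
import asyncio

class _FakeClient:
    """Stand-in for aiohttp.ClientSession, so the sketch has no dependencies."""
    closed = False

class LazyAsyncClient:
    def __init__(self):
        self._lock = asyncio.Lock()
        self._client = None
        self.creations = 0  # counts how many times a client was actually built

    def _cycle_needed(self, client):
        # Recreate when there is no client yet, or the cached one was closed.
        return client is None or client.closed

    async def get(self):
        # Fast path: skip the lock entirely when a usable client is cached.
        if self._cycle_needed(self._client):
            async with self._lock:
                # Re-check under the lock: another task may have created
                # the client while we were waiting to acquire it.
                if self._cycle_needed(self._client):
                    self._client = _FakeClient()
                    self.creations += 1
        return self._client

async def main():
    lazy = LazyAsyncClient()
    clients = await asyncio.gather(*(lazy.get() for _ in range(50)))
    print(lazy.creations, all(c is clients[0] for c in clients))  # → 1 True

asyncio.run(main())
```

The re-check under the lock is what makes this safe: without it, two tasks that both saw a stale client before acquiring the lock would each build a fresh one.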
nit: is max_connections still 1000? wondering if we should have a lower number since we saw decreased performance at 1000?
It is -- I think the tough thing with this is that the right number is highly dependent on workload/hardware run. I am strongly leaning towards keeping this high & leaving it up to users to limit the # of outgoing connections.
yeah that makes sense. 👍 Probably something we should document somewhere, either in docs or truss example (or both)
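If we do document the "leave it high, users cap their own outgoing connections" guidance, one stdlib-only pattern worth showing is a semaphore cap. This is an illustrative sketch, not part of this PR, and `run_limited` is a hypothetical helper name:

```python
import asyncio

async def run_limited(coros, max_concurrency: int):
    """Run awaitables while allowing at most `max_concurrency` in flight."""
    sem = asyncio.Semaphore(max_concurrency)

    async def _guarded(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(_guarded(c) for c in coros))

async def main():
    peak = 0
    active = 0

    async def fake_call(i):
        # Stand-in for an actual outgoing chainlet RPC.
        nonlocal peak, active
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0)
        active -= 1
        return i

    results = await run_limited((fake_call(i) for i in range(100)), max_concurrency=8)
    print(len(results), peak <= 8)  # → 100 True

asyncio.run(main())
```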
I want to approve, but this creates a big gotcha of sync vs async. Can we please keep sync in lockstep with async here?
@@ -81,17 +82,20 @@ def _client_sync(self) -> httpx.Client:
         assert self._cached_sync_client is not None
         return self._cached_sync_client[0]

-    async def _client_async(self) -> httpx.AsyncClient:
+    async def _client_async(self) -> aiohttp.ClientSession:
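One way to read "lockstep" here: the sync path can mirror the exact same double-checked shape as the async one, just with `threading.Lock` in place of `asyncio.Lock`. A hypothetical stand-alone sketch with a stand-in client class (names are illustrative, not the PR's code):

```python
import threading

class _FakeSyncClient:
    """Stand-in for httpx.Client, so the sketch has no dependencies."""
    is_closed = False

class LazySyncClient:
    """Sync twin of the async client cache: same check-lock-recheck shape,
    threading.Lock instead of asyncio.Lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._client = None

    def _cycle_needed(self, client):
        return client is None or client.is_closed

    def get(self):
        if self._cycle_needed(self._client):
            with self._lock:
                # Re-check: another thread may have built the client already.
                if self._cycle_needed(self._client):
                    self._client = _FakeSyncClient()
        return self._client

lazy = LazySyncClient()
print(lazy.get() is lazy.get())  # → True (cached after first build)
```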
I think the typing of everything in the BasetenSession class definition needs to change. Additionally, the sync client created above also needs to change. We should ideally try to drop the httpx import and the httpx limits definition above.
what would you propose changing sync to? `requests`? I don't think there's a problem with using httpx for sync; this bug is only apparent in async.
okay. that's fine then. Maybe worth adding a comment
I'd like to drop the sync path anyway, but there is an unresolved question around that: https://basetenlabs.slack.com/archives/C06RAC0JT5J/p1727900427264919
@@ -277,36 +306,31 @@ def handle_response(response: httpx.Response, remote_name: str) -> Any:
                "Could not get JSON from error response. Status: "
                f"`{response.status_code}`."
            ) from e
    _handle_response_error(response_json=response_json, remote_name=remote_name)
Good call to refactor this, thanks!
@@ -108,7 +109,7 @@ httpx = { extras = ["cli"], version = "*" }
 mypy = "^1.0.0"
 pytest-split = "^0.8.1"
 requests-mock = ">=1.11.0"
-types-requests = ">=2.31.0.2"
+types-requests = "==2.31.0.2"
FYI: I did this downgrade to fix an issue where `poetry lock --no-update` hangs forever. We can't use later versions of types-requests because they depend on urllib3 >= 2, which is not supported by boto.
🚀 What
While working on a chain in production, I noticed some latency issues in the chain <-> chain communication. After a bunch of debugging, I stumbled across this unresolved GitHub issue: encode/httpx#3215, which suggests that httpx performs poorly in highly concurrent scenarios.
After swapping out httpx for aiohttp, I saw a big performance increase on a toy example; see this loom: https://www.loom.com/share/fb1fbe06529e473499901766711787b4.
Note: I didn't touch the synchronous path, but it isn't appropriate for highly concurrent scenarios anyway.
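For anyone who wants to reproduce the comparison from the loom locally, a small harness along these lines fires N concurrent calls against a pluggable async `fetch` and reports tail latency. `fake_fetch` is a placeholder to swap for a real httpx or aiohttp request; nothing here is from the PR itself:

```python
import asyncio
import statistics
import time

async def measure(fetch, concurrency: int):
    """Fire `concurrency` calls at once; return (p50, p99) latencies in seconds."""
    async def timed():
        start = time.perf_counter()
        await fetch()
        return time.perf_counter() - start

    latencies = await asyncio.gather(*(timed() for _ in range(concurrency)))
    latencies.sort()
    return statistics.median(latencies), latencies[int(len(latencies) * 0.99) - 1]

async def fake_fetch():
    # Placeholder: replace with e.g. an httpx.AsyncClient or
    # aiohttp.ClientSession request to compare the two clients.
    await asyncio.sleep(0.001)

p50, p99 = asyncio.run(measure(fake_fetch, concurrency=100))
print(p50 <= p99)  # → True
```

Running the same harness against both client implementations at increasing concurrency levels is the simplest way to see where the p99 curves diverge.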
💻 How
Swap the async client in the chainlet stub from httpx.AsyncClient to aiohttp.ClientSession (the sync httpx path is unchanged), and pull response error handling out into `_handle_response_error`.
🔬 Testing
Tested this on a production chain (using an RC) and also ran the chain integration tests.
After this goes out, we should redeploy chains and see what kind of speed up we get from this.