
Use aiohttp for chainlet <-> chainlet communication #1194

Merged
merged 7 commits into main on Oct 30, 2024

Conversation

squidarth (Collaborator)

🚀 What

While working on a chain in production, I noticed latency issues in the chain <-> chain communication. After a bunch of debugging, I stumbled across this unresolved GitHub issue: encode/httpx#3215, which suggests that httpx performs poorly in highly concurrent scenarios.

After swapping httpx out for aiohttp, I saw a big performance increase on a toy example; see this Loom: https://www.loom.com/share/fb1fbe06529e473499901766711787b4.

Note: I didn't touch the synchronous path, but that path isn't appropriate for highly concurrent scenarios anyway.

💻 How

🔬 Testing

Tested this out on a production chain (using an RC) & also ran the chain integration tests.

After this goes out, we should redeploy chains and see what kind of speed up we get from this.

# Check `_client_cycle_needed` before and after locking to avoid
# needing a lock each time the client is accessed.
if self._client_cycle_needed(self._cached_async_client):
    async with self._async_lock:
        if self._client_cycle_needed(self._cached_async_client):
            connector = aiohttp.TCPConnector(
                limit=self._client_limits.max_connections,
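The excerpt above uses double-checked locking so the async lock is only taken when the cached client actually needs to be recreated. A minimal standalone sketch of the pattern, with a plain `object()` standing in for the `aiohttp.ClientSession` (the class and method names below are illustrative, not from the PR):

```python
import asyncio


class SessionCache:
    """Illustrative double-checked locking around a cached client."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._client = None
        self.creations = 0  # counts how many times the client was (re)built

    def _cycle_needed(self, client) -> bool:
        # The real code also checks staleness/limits; here: missing only.
        return client is None

    async def get(self):
        # Fast path: no lock taken when the cached client is still good.
        if self._cycle_needed(self._client):
            async with self._lock:
                # Re-check under the lock: another task may have already
                # rebuilt the client while we were waiting to acquire it.
                if self._cycle_needed(self._client):
                    self.creations += 1
                    self._client = object()  # stand-in for ClientSession
        return self._client


async def demo():
    cache = SessionCache()
    clients = await asyncio.gather(*(cache.get() for _ in range(50)))
    # All 50 concurrent callers share one client, built exactly once.
    return cache.creations, len({id(c) for c in clients})
```

Without the second check inside the lock, tasks queued on the lock would each rebuild the client after acquiring it in turn.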
Collaborator

nit: is max_connections still 1000? wondering if we should have a lower number since we saw decreased performance at 1000?

Collaborator Author

> nit: is max_connections still 1000? wondering if we should have a lower number since we saw decreased performance at 1000?

It is. I think the tough thing here is that the right number is highly dependent on the workload and hardware it runs on. I'm strongly leaning toward keeping this high and leaving it up to users to limit the number of outgoing connections.

Collaborator

Yeah, that makes sense. 👍 Probably something we should document somewhere, either in the docs or a truss example (or both).
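Following the point about leaving connection limits to users: a caller-side cap on concurrent outgoing requests can be sketched with a stdlib semaphore. The names here (`bounded_gather`, `fake_request`) are made up for illustration; this is not the PR's API:

```python
import asyncio


async def bounded_gather(coros, max_concurrency: int):
    # Never run more than `max_concurrency` coroutines at once,
    # regardless of how high the connector's own `limit` is set.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(coro):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(bounded(c) for c in coros))


async def demo_capped():
    peak = current = 0

    async def fake_request(i: int) -> int:
        nonlocal peak, current
        current += 1
        peak = max(peak, current)
        await asyncio.sleep(0)  # yield control, as a real network call would
        current -= 1
        return i

    results = await bounded_gather(
        [fake_request(i) for i in range(20)], max_concurrency=5
    )
    return results, peak
```

The semaphore caps in-flight work on the caller's side, so a high `TCPConnector(limit=...)` can stay in place as a safety ceiling rather than the primary throttle.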

@bolasim (Collaborator) left a comment

I want to approve, but this creates a big gotcha of sync vs async. Can we please keep sync in lockstep with async here?

pyproject.toml (resolved)
@@ -81,17 +82,20 @@ def _client_sync(self) -> httpx.Client:
         assert self._cached_sync_client is not None
         return self._cached_sync_client[0]

-    async def _client_async(self) -> httpx.AsyncClient:
+    async def _client_async(self) -> aiohttp.ClientSession:
Collaborator

I think the typing of everything in the BasetenSession class definition needs to change. Additionally, the sync client created above also needs to change. We should ideally try to drop the httpx import and the httpx limit definition above.

Collaborator Author

What would you propose changing sync to? requests? I don't think there's a problem with using httpx for sync; this bug only shows up in async.

Collaborator

Okay, that's fine then. Maybe worth adding a comment.

Contributor

I'd like to drop the sync path anyway, but there is an unresolved question around that: https://basetenlabs.slack.com/archives/C06RAC0JT5J/p1727900427264919

@@ -277,36 +306,31 @@ def handle_response(response: httpx.Response, remote_name: str) -> Any:
                 "Could not get JSON from error response. Status: "
                 f"`{response.status_code}`."
             ) from e
     _handle_response_error(response_json=response_json, remote_name=remote_name)
Contributor

Good call to refactor this, thanks!

@squidarth squidarth merged commit f544e05 into main Oct 30, 2024
5 checks passed
@squidarth squidarth deleted the sshanker/aiohttp-chains branch October 30, 2024 21:49
@@ -108,7 +109,7 @@ httpx = { extras = ["cli"], version = "*" }
 mypy = "^1.0.0"
 pytest-split = "^0.8.1"
 requests-mock = ">=1.11.0"
-types-requests = ">=2.31.0.2"
+types-requests = "==2.31.0.2"
Collaborator Author

FYI: I made this downgrade to fix an issue where `poetry lock --no-update` hangs forever. We can't use later versions of types-requests because they depend on urllib3 >= 2, which is not supported by boto.
