[Performance]: Block manager v2 has low throughput with prefix caching warmup #7619
Labels
performance
Performance-related issues
Comments
More observations:
some investigation results:
likely a bug in |
This was referenced Aug 21, 2024
Report of performance regression
Benchmarking prefix caching with block manager v1 and v2 on an L4 GPU:
v1:
v2:
We can see that v2 takes 10 seconds longer on the warmup batch, while the latency of the second batch is the same as with v1. So, if we change the warmup batch size to 1:
v1
v2
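For anyone trying to reproduce the comparison, the measurement described above (time a warmup batch, then time a second batch that can reuse the cached prefix) can be sketched roughly as follows. This is not the benchmark script used for the numbers in this report. `time_warmup_and_second_batch` and its `generate` parameter are hypothetical names, and `generate` stands in for whatever call runs a batch of prompts through the engine under test.

```python
import time
from typing import Callable, Sequence, Tuple


def time_warmup_and_second_batch(
    generate: Callable[[Sequence[str]], object],
    prompts: Sequence[str],
    warmup_batch_size: int,
) -> Tuple[float, float]:
    """Return (warmup_seconds, second_batch_seconds).

    `generate` is a hypothetical stand-in for the engine call being
    benchmarked; it receives a batch of prompts and blocks until done.
    """
    # Warmup: run only the first `warmup_batch_size` prompts so the
    # shared prefix gets computed and cached.
    warmup_prompts = prompts[:warmup_batch_size]
    start = time.perf_counter()
    generate(warmup_prompts)
    warmup_seconds = time.perf_counter() - start

    # Second batch: run the full prompt set, which can now hit the cache.
    start = time.perf_counter()
    generate(prompts)
    second_seconds = time.perf_counter() - start

    return warmup_seconds, second_seconds
```

Running this once per block manager version, first with the full batch as warmup and then with `warmup_batch_size=1`, separates the cost of the warmup batch from the latency of the cached second batch.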