Add concurrency option for benchmark #2136

Merged
merged 6 commits into from
Nov 23, 2024

cermeng
Contributor

@cermeng cermeng commented Nov 23, 2024

Motivation

Because the server can't reject requests, a deployment may put a load balancer in front of it to control request traffic. This PR lets the benchmark simulate that setup. The code is borrowed from vllm-project/vllm#9390.

Modifications

  • Add a --max-concurrency option to bench_serving.py.
  • Ensure that no more than max-concurrency requests are in flight to the server at any one time.
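
For readers unfamiliar with the pattern, here is a minimal sketch of how a concurrency cap like this can be enforced with `asyncio.Semaphore`. It is not the code from this PR: `run_with_max_concurrency`, `InFlightTracker`, and `fake_request` are hypothetical names used only for illustration, and the fake request just sleeps instead of calling a server.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def run_with_max_concurrency(
    request_funcs: List[Callable[[], Awaitable[int]]],
    max_concurrency: Optional[int] = None,
) -> List[int]:
    """Run all requests, keeping at most `max_concurrency` in flight.

    Hypothetical helper for illustration; None means "no limit".
    """
    semaphore = asyncio.Semaphore(max_concurrency) if max_concurrency else None

    async def limited(func: Callable[[], Awaitable[int]]) -> int:
        if semaphore is None:
            return await func()
        # A request waits here until one of the slots is free.
        async with semaphore:
            return await func()

    tasks = [asyncio.create_task(limited(f)) for f in request_funcs]
    return await asyncio.gather(*tasks)


class InFlightTracker:
    """Stand-in for a server: records how many requests overlap (assumption)."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0

    async def fake_request(self) -> int:
        self.current += 1
        self.peak = max(self.peak, self.current)
        await asyncio.sleep(0.01)  # pretend to wait for a response
        self.current -= 1
        return self.peak


if __name__ == "__main__":
    tracker = InFlightTracker()
    asyncio.run(
        run_with_max_concurrency([tracker.fake_request] * 10, max_concurrency=3)
    )
    print("peak in-flight requests:", tracker.peak)
```

With the cap left at `None` every request starts immediately, which is the behavior the benchmark had before an option such as `--max-concurrency` existed.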

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@zhyncs zhyncs requested a review from merrymercy November 23, 2024 08:38
@zhyncs zhyncs merged commit 60769be into sgl-project:main Nov 23, 2024
@zhyncs
Member

zhyncs commented Nov 23, 2024

I'll fix the SimpleNamespace issue cc @merrymercy @cermeng

@zhyncs zhyncs mentioned this pull request Nov 23, 2024
3 tasks
2 participants