[serve] remove support for nested DeploymentResponses
#47209
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
@zcin let's make sure a good error message is raised if someone tries to pass a nested deployment response. You can do this by defining a
Let's add a test for it too, then LGTM
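A rough sketch of what such an error and its test could look like, using only the standard library. Everything here is an assumption for illustration: `FakeResponse` is a hypothetical stand-in rather than Ray's `DeploymentResponse`, and raising from `__reduce__` is just one possible hook, not necessarily the one used in this PR.

```python
import pickle


class FakeResponse:
    """Hypothetical stand-in for a DeploymentResponse (not Ray's class)."""

    def __reduce__(self):
        # Assumption: refusing to pickle is the hook that surfaces the error,
        # since a nested response would only be discovered at serialization time.
        raise RuntimeError(
            "FakeResponse is not serializable. Pass it as a top-level argument "
            "instead of nesting it inside a list or dict."
        )


def try_serialize(obj) -> str:
    """Return "ok" if obj pickles, otherwise the error message that was raised."""
    try:
        pickle.dumps(obj)
        return "ok"
    except RuntimeError as exc:
        return str(exc)


# A response nested in a container fails with the explicit message.
nested_result = try_serialize({"inputs": [FakeResponse()]})
# Ordinary payloads are unaffected.
plain_result = try_serialize({"inputs": [1, 2, 3]})
```

A test in this style only needs to assert that the nested case fails with a message pointing the user at top-level arguments, and that plain payloads still serialize.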
Why are these changes needed?
Remove support for passing DeploymentResponses in nested objects (e.g. inside a list or dict) to downstream Serve deployment handle calls.

This doesn't affect the no-op latency, but improves latencies of requests that carry a large payload.
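To illustrate why this would help large payloads but not no-op calls, here is a minimal standard-library sketch. It assumes that finding nested responses requires walking the whole argument tree, while top-level responses can be found by checking each argument once. `FakeResponse`, `resolve_top_level`, and `count_visited_nested` are hypothetical names and do not reflect Ray's actual implementation.

```python
from typing import Any, Dict, Tuple


class FakeResponse:
    """Hypothetical stand-in for a pending DeploymentResponse (not Ray's class)."""

    def __init__(self, value: Any) -> None:
        self.value = value


def resolve_top_level(args: tuple, kwargs: dict) -> Tuple[tuple, Dict[str, Any]]:
    """Resolve responses passed directly as arguments.

    The cost grows with the number of arguments, not with payload size.
    Responses nested inside containers are left untouched.
    """
    def resolve(obj: Any) -> Any:
        return obj.value if isinstance(obj, FakeResponse) else obj

    return tuple(resolve(a) for a in args), {k: resolve(v) for k, v in kwargs.items()}


def count_visited_nested(obj: Any) -> int:
    """Count the objects a recursive scan must visit to find nested responses."""
    if isinstance(obj, dict):
        return 1 + sum(count_visited_nested(v) for v in obj.values())
    if isinstance(obj, (list, tuple, set)):
        return 1 + sum(count_visited_nested(v) for v in obj)
    return 1


payload = list(range(100_000))

# Top-level: two isinstance checks, regardless of how large the payload is.
args, kwargs = resolve_top_level((FakeResponse("hello"), payload), {})

# Nested scan: every element of the payload has to be visited.
nested_visits = count_visited_nested((FakeResponse("hello"), payload))

# A response hidden inside a list is no longer resolved.
nested_args, _ = resolve_top_level(([FakeResponse(1)],), {})
```

Under this model, callers that previously passed a list of responses would resolve them first, or pass each one as its own top-level argument.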
Current handle latencies:
New handle latencies:
Related issue number
closes #46428
Checks
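I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.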