[BUGFIX] [FRONTEND] Correct chat logprobs #5029
Conversation
Seems the failing builds are only due to some infrastructure issues...
I think […]
Also, the existing code for […]
Good catch. But this means that some clients might break. I have to test this, but I think clients in statically typed languages will not work with this, since it's not part of the official API.
Can you share some details about what the type checker is complaining about? I'm not seeing any errors.
Currently, cross-directory checking is disabled (mypy / pyright […]).
@DarkLight1337 if you can help review and shepherd this PR that would be great (I'll merge upon your approval). Regarding the output compliance, if it is proven widespread (can type-safe languages ignore extra fields? Otherwise it would be difficult for forward compatibility, I would imagine), we can maybe return it via extra headers. (cc @njhill)
Re additional fields in outputs, I feel that sticking to only what's supported by the current OpenAI APIs would be too restrictive, just like how we have a bunch of extra request parameters (though I understand that doesn't have the same client-compatibility concern). AFAIK clients can ignore unrecognized response fields in most cases. If this turns out to be a problem, we could consider adding a "strict API response compatibility" config flag and omit any extras if it's set?
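A minimal sketch of how such an opt-in flag could work, assuming a Pydantic response model (the flag name, the extra field, and the helper below are hypothetical and not part of vLLM):

```python
# Hypothetical sketch: drop vLLM-specific extras when a strict-compatibility
# flag is set, so the payload matches the official OpenAI schema exactly.
from typing import Optional

from pydantic import BaseModel


class ChatCompletionResponseChoice(BaseModel):
    index: int
    message: dict
    logprobs: Optional[dict] = None
    finish_reason: Optional[str] = None
    # Example of a vLLM-specific extra that is not in the OpenAI schema.
    stop_reason: Optional[str] = None


NON_OPENAI_FIELDS = {"stop_reason"}


def serialize_choice(choice: ChatCompletionResponseChoice,
                     strict_openai_compat: bool = False) -> dict:
    """Serialize a choice, omitting non-OpenAI fields in strict mode."""
    exclude = NON_OPENAI_FIELDS if strict_openai_compat else None
    return choice.model_dump(exclude=exclude)
```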
I have tested today with the Azure OpenAI client for Java (cf. https://platform.openai.com/docs/libraries/azure-openai-libraries) and it worked. So at least with this client the extra field does not cause problems. While this is not an exhaustive proof, I'll add it back. Then I'll see to merging the changes from @DarkLight1337 into this PR.
I've added the following changes based on your PR: […]
There are some parts of your PR that I disagree with: […]
Thanks for going into depth regarding the API and raising these issues!
My PR does follow this, as specified in […]. Edit: Note that the spec allows […]
I've looked through the docs again, and it indeed seems that […]. Edit: I have also updated my PR to test that the length of the output is exactly equal to […].
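Assuming the truncated sentence above refers to the requested top_logprobs value, the added check presumably looks something like this sketch (the client fixture, model name, and prompt are placeholders, not the actual test code from the PR):

```python
# Hypothetical test sketch using the OpenAI Python client against a vLLM server.
async def test_top_logprobs_length(client, model_name="placeholder-model"):
    requested_top = 5
    completion = await client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": "Say hello"}],
        max_tokens=5,
        logprobs=True,
        top_logprobs=requested_top,
    )
    content = completion.choices[0].logprobs.content
    assert content is not None
    # Each generated token should carry exactly `requested_top` alternatives.
    for token_entry in content:
        assert len(token_entry.top_logprobs) == requested_top
```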
@DarkLight1337 I'm confused. This is from your PR: […] This does not conform to an object like this (taken from the examples in the official OpenAI API spec):

```json
"choices": [
  {
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I assist you today?"
    },
    "logprobs": {
      "content": [
        {
          "token": "Hello",
          "logprob": -0.31725305,
          "bytes": [72, 101, 108, 108, 111],
          "top_logprobs": [
            {
              "token": "Hello",
              "logprob": -0.31725305,
              "bytes": [72, 101, 108, 108, 111]
            },
            {
              "token": "Hi",
              "logprob": -1.3190403,
              "bytes": [72, 105]
            }
          ]
        },
        ...
```

The "root" logprob already contains the values for the selected token (token, logprob, bytes). Can you help me understand where we are misunderstanding each other?
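For reference, the layout above maps onto three nested objects; here is a minimal sketch of the corresponding models (class names are illustrative, not necessarily the ones used in vllm/entrypoints/openai/protocol.py):

```python
# Illustrative models for the OpenAI chat logprobs layout shown above.
from typing import List, Optional

from pydantic import BaseModel


class TopLogprob(BaseModel):
    token: str
    logprob: float
    bytes: Optional[List[int]] = None


class TokenLogprob(BaseModel):
    # The "root" entry repeats token/logprob/bytes for the chosen token...
    token: str
    logprob: float
    bytes: Optional[List[int]] = None
    # ...and additionally lists the requested alternatives.
    top_logprobs: List[TopLogprob]


class ChoiceLogprobs(BaseModel):
    content: Optional[List[TokenLogprob]] = None
```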
You're right. I think this is an error in the OpenAI spec, though. I don't see when this output is ever going to be generated, since […]:

```json
"logprobs": {
  "content": null
}
```

Currently vLLM will only ever generate […]. I'd prefer to keep things like this, as it reduces unnecessary complexity in the code.
Notice that […]
Yeah, I think that is fine as well. I originally copied the types from OpenAI's Python library; feel free to remove the […]
@DarkLight1337 I believe all issues are now addressed. Thanks for your help!
I'm having an issue with the formatting. It seems the formatter wants the imports in a different way than isort...
@simon-mo this is the formatting produced by isort after the yapf formatting. When run locally, there is no check whether yapf or isort changed the sources. When run in the CI environment, the fact that yapf changes this leads to build errors.
@rkooo567 any suggestions for this?
There are a few other places where yapf is disabled via comments. Perhaps you also have to do that here.
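Something along these lines should work; the specific import block below is only illustrative, and what matters is wrapping the offending lines in yapf's disable/enable comments:

```python
# isort and yapf disagree on how to wrap this import, so yapf is disabled here.
# yapf: disable
from vllm.entrypoints.openai.protocol import (ChatCompletionRequest,
                                              ChatCompletionResponse,
                                              ErrorResponse)
# yapf: enable
```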
While you're here, please also resolve the merge conflict.
@simon-mo can enable auto-merge.
#5031 was just merged, which seems to create a conflict.
@simon-mo conflicts resolved. The build should be through soon.
@simon-mo all tests green. We can merge!
@br3no I'll merge this on behalf of Simon - thank you for fixing this issue!
Co-authored-by: Breno Faria <breno.faria@intrafind.com>
FIX #5008 – correct logprobs format

BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE
PR Checklist (Click to Expand)
Thank you for your contribution to vLLM! Before submitting the pull request, please ensure the PR meets the following criteria. This helps vLLM maintain the code quality and improve the efficiency of the review process.
PR Title and Classification
Only specific types of PRs will be reviewed. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:
- [Bugfix] for bug fixes.
- [CI/Build] for build or continuous integration improvements.
- [Doc] for documentation fixes and improvements.
- [Model] for adding a new model or improving an existing model. Model name should appear in the title.
- [Frontend] for changes on the vLLM frontend (e.g., OpenAI API server, LLM class, etc.)
- [Kernel] for changes affecting CUDA kernels or other compute kernels.
- [Core] for changes in the core vLLM logic (e.g., LLMEngine, AsyncLLMEngine, Scheduler, etc.)
- [Hardware][Vendor] for hardware-specific changes. Vendor name should appear in the prefix (e.g., [Hardware][AMD]).
- [Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.
Code Quality
The PR needs to meet the following code quality standards:
- Use format.sh to format your code.
- Add documentation to docs/source/ if the PR modifies the user-facing behaviors of vLLM. It helps vLLM users understand and utilize the new features or changes.

Notes for Large Changes
Please keep the changes as concise as possible. For major architectural changes (>500 LOC excluding kernel/data/config/test), we would expect a GitHub issue (RFC) discussing the technical design and justification. Otherwise, we will tag it with rfc-required and might not go through the PR.

What to Expect for the Reviews
The goal of the vLLM team is to be a transparent reviewing machine. We would like to make the review process transparent and efficient and make sure no contributor feels confused or frustrated. However, the vLLM team is small, so we need to prioritize some PRs over others. Here is what you can expect from the review process:
- The reviewer will put an action-required label on the PR if there are changes required. The contributor should address the comments and ping the reviewer to re-review the PR.

Thank You
Finally, thank you for taking the time to read these guidelines and for your interest in contributing to vLLM. Your contributions make vLLM a great tool for everyone!