Fix: incorrect top_logprobs in chat completion #2088
Merged
Motivation
I noticed that chat completion with a defined `top_logprobs` value was returning incorrect `top_logprobs` beyond position 0 in the sequence. In the return object's `logprobs.content` field, each token in the sequence should have some number of top log probabilities, captured from the decoding step for that token. However, each element in `logprobs.content` had the same "top tokens" (i.e., each `ChatCompletionTokenLogprob` object had an identical `top_logprobs` field). In other words, the *n*th element in `logprobs.content` contained the top logprob tokens from the 0-position decoding step instead of the *n*-position step.

Below is an example. Observe the `top_logprobs` fields.

Code:
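A minimal sketch of a request that surfaces the issue, assuming a locally running sglang server behind its OpenAI-compatible endpoint; the base URL, port, model name, and prompt here are placeholders, not the original snippet:

```python
# Hypothetical reproduction; URL, port, model, and prompt are assumptions.
import openai

client = openai.OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Name three colors."}],
    logprobs=True,
    top_logprobs=5,
    max_tokens=16,
)

# With the bug, every position prints the same top-5 candidate list
# (position 0's candidates) instead of per-position candidates.
for i, item in enumerate(resp.choices[0].logprobs.content):
    print(i, item.token,
          [(t.token, round(t.logprob, 3)) for t in item.top_logprobs])
```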
Output: every `ChatCompletionTokenLogprob` entry carried the same `top_logprobs` list as position 0.
Modifications
I modified the function `v1_chat_generate_response` in `sglang/srt/openai_api/adapter.py`, making a minor change to the logprob list indexing to produce the desired behavior; the sketch below illustrates the kind of fix.
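A minimal sketch of the bug class and the fix, with hypothetical names and types (this is not the actual diff; the real change lives in `v1_chat_generate_response`):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TopLogprob:  # stand-in for the OpenAI-style response type
    token: str
    logprob: float

@dataclass
class ChatCompletionTokenLogprob:
    token: str
    logprob: float
    top_logprobs: List[TopLogprob]

def build_logprobs_content(
    tokens: List[str],
    token_logprobs: List[float],
    top_logprobs_per_position: List[List[Tuple[str, float]]],
) -> List[ChatCompletionTokenLogprob]:
    content = []
    for i, (token, logprob) in enumerate(zip(tokens, token_logprobs)):
        # Bug class described in this PR: indexing the per-position list with
        # a constant (position 0) instead of the current position i, so every
        # token repeats position 0's top candidates.
        # Buggy: top = top_logprobs_per_position[0]
        top = top_logprobs_per_position[i]
        content.append(
            ChatCompletionTokenLogprob(
                token=token,
                logprob=logprob,
                top_logprobs=[TopLogprob(t, lp) for t, lp in top],
            )
        )
    return content
```

With the constant index, every position's printed candidate list is identical; indexing by `i` gives each token the candidates from its own decoding step.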
Checklist