Fix: incorrect top_logprobs in chat completion #2088

ajwaitz · 2024-11-19T04:07:24Z

Motivation

I noticed that chat completions with top_logprobs set returned incorrect top_logprobs for every sequence position after 0. In the returned object's logprobs.content field, each token should carry the top log probabilities captured at that token's own decoding step. Instead, every element of logprobs.content had the same "top tokens" (i.e. each ChatCompletionTokenLogprob object had an identical top_logprobs field). In other words, the nth element of logprobs.content contained the top-logprob tokens from the position-0 decoding step rather than from step n.

Below is an example. Observe the top_logprobs fields.

Code:

import openai

# server_url is the base URL of the running server's OpenAI-compatible API.
client = openai.Client(base_url=server_url, api_key="None")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-1B-Instruct",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0,
    max_tokens=128,
    logprobs=True,
    top_logprobs=3
)

print(response.choices[0].logprobs.content)

Output:

[ChatCompletionTokenLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)]),
 ChatCompletionTokenLogprob(token=' capital', bytes=[32, 99, 97, 112, 105, 116, 97, 108], logprob=-7.915183232398704e-05, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)]),
 ChatCompletionTokenLogprob(token=' of', bytes=[32, 111, 102], logprob=-0.00027843413408845663, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)]),
 ChatCompletionTokenLogprob(token=' France', bytes=[32, 70, 114, 97, 110, 99, 101], logprob=-1.3470558769768104e-05, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)]),
 ChatCompletionTokenLogprob(token=' is', bytes=[32, 105, 115], logprob=-2.5033637939486653e-05, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)]),
 ChatCompletionTokenLogprob(token=' Paris', bytes=[32, 80, 97, 114, 105, 115], logprob=-0.0005902693956159055, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)]),
 ChatCompletionTokenLogprob(token='.', bytes=[46], logprob=-0.00025328766787424684, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)]),
 ChatCompletionTokenLogprob(token='<|eot_id|>', bytes=[60, 124, 101, 111, 116, 95, 105, 100, 124, 62], logprob=-0.004491835366934538, top_logprobs=[TopLogprob(token='The', bytes=[84, 104, 101], logprob=-0.006512844935059547), TopLogprob(token='Paris', bytes=[80, 97, 114, 105, 115], logprob=-5.084637641906738), TopLogprob(token='R', bytes=[82], logprob=-9.506512641906738)])]
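
The symptom in the output above can also be checked programmatically. The snippet below is a minimal sketch, not part of this PR: the helper names are hypothetical, and the stand-in objects only mimic the shape of ChatCompletionTokenLogprob (a token plus a top_logprobs list) so that it runs without a server.

```python
from types import SimpleNamespace


def top_tokens_per_position(content):
    """Return the top-candidate tokens reported at each position, as tuples."""
    return [tuple(t.token for t in entry.top_logprobs) for entry in content]


def all_positions_identical(content):
    """True if there are at least two positions and all report the same candidates."""
    per_position = top_tokens_per_position(content)
    return len(per_position) > 1 and len(set(per_position)) == 1


def make_entry(token, tops):
    """Build a stand-in for one logprobs.content element (hypothetical helper)."""
    return SimpleNamespace(
        token=token,
        top_logprobs=[SimpleNamespace(token=t) for t in tops],
    )


# Mirrors the first two entries of the buggy output above.
buggy = [
    make_entry("The", ["The", "Paris", "R"]),
    make_entry(" capital", ["The", "Paris", "R"]),
]
print(all_positions_identical(buggy))  # True: the bug's signature
```

With a real response, the same check would be called on response.choices[0].logprobs.content. A True result for a multi-token completion strongly suggests the bug, since distinct decoding steps rarely share an identical candidate list.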

Modifications

I modified the function v1_chat_generate_response in sglang/srt/openai_api/adapter.py, making a minor change to the logprob list indexing so that each token's top_logprobs comes from its own decoding step.
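
The diff itself is not reproduced in this description. As a rough illustration of the kind of indexing error described, the sketch below contrasts always reading step 0 with reading the step that matches each token. The function names, data layout, and candidate values are all hypothetical and are not taken from adapter.py.

```python
def attach_top_logprobs_buggy(tokens, top_per_step):
    """Bug pattern: every token receives the candidates from decoding step 0."""
    return [
        {"token": tok, "top_logprobs": top_per_step[0]}
        for tok in tokens
    ]


def attach_top_logprobs_fixed(tokens, top_per_step):
    """Fixed pattern: token i receives the candidates from decoding step i."""
    return [
        {"token": tok, "top_logprobs": top_per_step[i]}
        for i, tok in enumerate(tokens)
    ]


# Illustrative values only; the step-1 candidates are made up for the example.
tokens = ["The", " capital"]
top_per_step = [
    ["The", "Paris", "R"],             # candidates at step 0
    [" capital", " city", " answer"],  # candidates at step 1 (hypothetical)
]

buggy = attach_top_logprobs_buggy(tokens, top_per_step)
fixed = attach_top_logprobs_fixed(tokens, top_per_step)
print(buggy[1]["top_logprobs"])  # step-0 candidates repeated at position 1
print(fixed[1]["top_logprobs"])  # step-1 candidates, as intended
```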

Checklist

Format your code according to the Contributor Guide.
Add unit tests as outlined in the Contributor Guide.
Update documentation as needed, including docstrings or example tutorials.

merrymercy

Good catch!

fix: incorrect logprobs in sequence positions beyond 0

2757e60

ajwaitz requested review from merrymercy, Ying1123, hnyls2002, zhyncs, ispobock and ByronHsu as code owners November 19, 2024 04:07

merrymercy approved these changes Nov 19, 2024

Merge branch 'main' into logprobs-fix

c8ca7e7

merrymercy enabled auto-merge (squash) November 19, 2024 12:04

merrymercy merged commit 929c762 into sgl-project:main Nov 19, 2024
13 checks passed
