fmmoret changed the title from "[Bug]: batched prefill returning gibberish in some cases." to "[Bug]: Chunked prefill returning gibberish in some cases." on May 10, 2024
Your current environment
main branch, Dockerfile.rocm, default dependencies.
🐛 Describe the bug
--max-num-batched-tokens=131072 --enable-chunked-prefill -> perfect response (temp 0)
--max-num-batched-tokens=16384 --enable-chunked-prefill -> gibberish response (temp 0)
Using a prompt with a sequence length of 100001 and generating 100 tokens.
With temp 0, the gibberish does NOT match itself across iterations.
E.g.: Good response 1 =
What? What?”\n\n“Why, the bridge was mined [...]
Bad response 1 =
So far as Jiedgilliesgillies-illies-illies-er. A Jemel-er-illies-ied-: \xa0 [...]
Bad response 2 is entirely different from 1 =
\xa0gillies in England-ied. A Jiedgeld-eren [...]
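
For reference, here is a minimal sketch of the repro described above, using the offline `vllm.LLM` entrypoint; the model name and prompt are placeholders, since the issue does not name them:

```python
from vllm import LLM, SamplingParams

# Hypothetical model; the issue does not specify one.
llm = LLM(
    model="<model>",
    enable_chunked_prefill=True,
    max_num_batched_tokens=16384,  # 131072 -> perfect response; 16384 -> gibberish
)

params = SamplingParams(temperature=0, max_tokens=100)
long_prompt = "..."  # placeholder for the ~100001-token prompt described above

# With temperature 0 the two generations should be identical; in the bad
# config they differ from each other and from the good config's output.
out1 = llm.generate([long_prompt], params)[0].outputs[0].text
out2 = llm.generate([long_prompt], params)[0].outputs[0].text
print(out1 == out2)
```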
I haven't looked into the vLLM implementation yet. It seems like the tensors might not be initialized correctly somewhere and are inheriting whatever values were already in memory at the time.
I have seen this kind of thing happen before when someone uses `x = torch.empty(size)`, which leaves the tensor holding whatever values were already in that memory segment, when they actually wanted zeros.
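
A minimal illustration of that failure mode (not vLLM code, just the general pattern):

```python
import torch

size = (4,)

# torch.empty returns a tensor backed by uninitialized memory: its contents
# are whatever happened to be in that allocation, and can vary run to run.
x = torch.empty(size)
print(x)  # garbage values, nondeterministic

# torch.zeros actually initializes the memory.
y = torch.zeros(size)
print(y)  # tensor([0., 0., 0., 0.])
```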