
The get_ppl missed the last token of each iteration during multi-iter prefill #2499

Merged: 14 commits into InternLM:main on Sep 26, 2024

Conversation

lvhan028 (Collaborator) commented on Sep 23, 2024:

  1. Fix the last token that was missed in each prefill iteration.
  2. Cap the memory footprint of the logits at 2 GB, preventing OOM when batch_size * seq_len is too large (both points are sketched below).
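For context, here is a minimal sketch of what the two fixes amount to. It is not lmdeploy's code: `decode` stands for a hypothetical stateful callable that keeps the KV cache across iterations and returns one row of logits per input token, and the 128k vocabulary, fp32 logits, and 2 GB budget are illustrative numbers.

```python
import torch


def get_ppl(token_ids, decode, vocab_size=128_000, max_logits_bytes=2 * 1024**3):
    """Perplexity over `token_ids` using multi-iteration (chunked) prefill."""
    # Point 2: bound the tokens per iteration so the fp32 logits of one chunk
    # stay under ~2 GB, preventing OOM when batch_size * seq_len is too large.
    max_seq_len = max(1, max_logits_bytes // (vocab_size * 4))

    nll, n_tokens = 0.0, 0
    for start in range(0, len(token_ids), max_seq_len):
        end = min(start + max_seq_len, len(token_ids))
        chunk = token_ids[start:end]
        logits = decode(chunk)  # [len(chunk), vocab_size]; KV cache keeps earlier context

        # Point 1: the logit at position p predicts token p + 1, so the last
        # logit of a chunk predicts the first token of the NEXT chunk. The bug
        # dropped that last logit in every iteration; keeping it means pairing
        # logits[start..end-1] with labels token_ids[start+1..end].
        labels = token_ids[start + 1:end + 1]
        if not labels:  # final chunk of length 1: nothing left to predict
            break
        logits = logits[:len(labels)]
        nll += torch.nn.functional.cross_entropy(
            logits, torch.as_tensor(labels), reduction='sum').item()
        n_tokens += len(labels)

    return float(torch.exp(torch.tensor(nll / n_tokens)))
```

Under these assumed numbers the 2 GB budget caps each iteration at roughly 4k tokens, so a long prompt is scored over several prefill passes instead of materializing one huge logits tensor.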

Review comment on lmdeploy/serve/utils.py (outdated; resolved)
lvhan028 requested a review from grimoire on September 23, 2024 13:56
lvhan028 (Collaborator, Author) commented:
@grimoire, can we remove decode from class Engine?
The pipeline calls engine_instance's decode and get_ppl directly.

grimoire (Collaborator) replied:

"can we remove decode from class Engine?"

Accepted.

Further review comments on lmdeploy/serve/utils.py (outdated; resolved)
lvhan028 changed the title from "The last token of each iteration is missed during multi-iter prefill" to "The get_ppl missed the last token of each iteration during multi-iter prefill" on Sep 26, 2024
irexyc (Collaborator) commented on Sep 26, 2024.
lvhan028 merged commit 4812b5a into InternLM:main on Sep 26, 2024
4 of 5 checks passed
4 participants