Optimize retract #440

Merged
merged 10 commits into from
May 25, 2024

@hnyls2002 (Collaborator) commented May 14, 2024

  • Deprecate the old approach of appending the fast-forwarded string directly to input_ids; introduce prev_output_str and prev_output_ids instead.
  • When prefilling, input_ids = origin_input_ids + prev_output_ids, so the cache can still be hit.
  • Add the retracted tokens to prev_output_ids instead of discarding them.
  • Make the new bookkeeping compatible with logprobs.
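
The bookkeeping in the list above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Req` class, its methods, and the `output_ids` field are hypothetical, and only `origin_input_ids`, `prev_output_ids`, and `input_ids` are names taken from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Req:
    """Hypothetical request state used only for this illustration."""

    origin_input_ids: List[int]
    # Tokens produced before the current run; kept across retractions.
    prev_output_ids: List[int] = field(default_factory=list)
    # Assumed name: tokens decoded since the last prefill.
    output_ids: List[int] = field(default_factory=list)
    input_ids: List[int] = field(default_factory=list)

    def prepare_prefill(self) -> List[int]:
        # The prefill input is the original prompt plus all earlier output,
        # so a cached prefix for the same tokens can still match.
        self.input_ids = self.origin_input_ids + self.prev_output_ids
        return self.input_ids

    def retract(self) -> None:
        # Move the decoded tokens into prev_output_ids instead of dropping them.
        self.prev_output_ids = self.prev_output_ids + self.output_ids
        self.output_ids = []


def demo() -> List[int]:
    req = Req(origin_input_ids=[1, 2, 3])
    req.prepare_prefill()
    req.output_ids.extend([10, 11])  # pretend two tokens were decoded
    req.retract()                    # the request is retracted
    return req.prepare_prefill()     # input for the next prefill
```

In this sketch, `demo()` rebuilds the prefill input after a retraction from the prompt plus the two retracted tokens, so nothing that was already decoded has to be generated again.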

@merrymercy (Contributor) commented

@hnyls2002 please fix the conflicts. Is this ready for merge?

@hnyls2002 merged commit f06e90c into main on May 25, 2024
@hnyls2002 deleted the optimize-retract branch on May 25, 2024, 16:07