server : fix smart selection of available slot #10120

Merged: 3 commits, Nov 1, 2024

Conversation

sasha0552
Contributor

Fixes the smart selection of an available slot, which was broken in #10023; replaces the algorithm with Longest Common Subsequence (due to #9866); and cleans up unnecessary code.
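
For readers unfamiliar with the technique, here is a minimal sketch of a Longest Common Subsequence length computation and of how it could be used to pick a slot. It is not the code from this PR: `token_seq`, `lcs_length`, and `pick_slot` are hypothetical names, tokens are assumed to be plain integers, and the rule "choose the slot whose cached tokens share the longest common subsequence with the prompt" is an assumption made for illustration. Any thresholds or fallbacks the server applies are not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical token sequence type (assumption: tokens are plain integers).
using token_seq = std::vector<int>;

// Length of the longest common subsequence of a and b.
// Classic dynamic programming, keeping only two rows to save memory.
static size_t lcs_length(const token_seq & a, const token_seq & b) {
    std::vector<size_t> prev(b.size() + 1, 0);
    std::vector<size_t> curr(b.size() + 1, 0);

    for (size_t i = 1; i <= a.size(); i++) {
        for (size_t j = 1; j <= b.size(); j++) {
            if (a[i - 1] == b[j - 1]) {
                // matching tokens extend the subsequence ending before both
                curr[j] = prev[j - 1] + 1;
            } else {
                // otherwise carry over the better of the two neighbours
                curr[j] = std::max(prev[j], curr[j - 1]);
            }
        }
        std::swap(prev, curr);
    }

    // after the final swap, prev holds the last computed row
    return prev[b.size()];
}

// Hypothetical selection rule: return the index of the cached sequence that
// shares the longest common subsequence with the prompt, or -1 if no cached
// sequence shares any token with it.
static int pick_slot(const std::vector<token_seq> & slot_caches, const token_seq & prompt) {
    int    best     = -1;
    size_t best_len = 0;

    for (size_t i = 0; i < slot_caches.size(); i++) {
        const size_t len = lcs_length(slot_caches[i], prompt);
        if (len > best_len) {
            best_len = len;
            best     = (int) i;
        }
    }

    return best;
}
```

Unlike a common-prefix comparison, this measure still gives credit to a slot whose cached tokens match the prompt after a mismatch, which is presumably why it pairs well with context reuse.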

cc: @chrisstankevitz, @ngxson

@sasha0552 changed the title from "Fix smart selection of available slot" to "server : fix smart selection of available slot" on Nov 1, 2024
Owner

@ggerganov ggerganov left a comment

Thanks. With the recent context reuse updates, it makes sense to use LCS (you were on the right track back in #7728!).

Collaborator

@ngxson ngxson left a comment

Looking very good, thanks for implementing this.

@ggerganov Could you have a quick look too? Btw with this change, does it make sense to increase the default value of n_cache_reuse?

Edit: hah sorry, I didn't see your comment above

Review thread on examples/server/utils.hpp (outdated, resolved)
@ggerganov
Owner

> Btw with this change, does it make sense to increase the default value of n_cache_reuse?

Probably yes. Though I am hoping to see more feedback from this option to make sure it does not have some side-effects, as it is a bit fringe science at this point. So far I think it works fine based on my code-completion experiments, but I could be testing a very narrow case.

@ngxson ngxson merged commit d865d14 into ggerganov:master Nov 1, 2024
53 checks passed
@sasha0552 sasha0552 deleted the sss-fixes branch November 1, 2024 14:21
@chrisstankevitz

Thanks for dealing with this @sasha0552

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* Fix smart selection of available slot

* minor fix

* replace vectors of tokens with shorthands
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
* Fix smart selection of available slot

* minor fix

* replace vectors of tokens with shorthands