server : fix smart selection of available slot #10120
Thanks. With the recent context reuse updates, it makes sense to use LCS (you were on the right track back in #7728!).
Looking very good, thanks for implementing this.
@ggerganov Could you have a quick look too? Btw, with this change, does it make sense to increase the default value of n_cache_reuse?
Edit: hah, sorry, I didn't see your comment above.
Probably yes, though I am hoping to see more feedback on this option first to make sure it does not have side effects, as it is a bit of fringe science at this point. So far I think it works fine based on my code-completion experiments, but I could be testing a very narrow case.
Thanks for dealing with this @sasha0552
* Fix smart selection of available slot
* minor fix
* replace vectors of tokens with shorthands
Fixes the smart selection of an available slot, which was broken in #10023; replaces the selection algorithm with Longest Common Subsequence (due to #9866); and cleans up unnecessary code.
cc: @chrisstankevitz, @ngxson
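
For readers unfamiliar with the idea, here is a minimal sketch of picking a slot by Longest Common Subsequence. It is not the code from this PR: the names `tokens`, `lcs_length`, `pick_slot`, and `min_similarity` are hypothetical, and scoring a slot as LCS length divided by prompt length is an assumption made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sequence of token ids.
using tokens = std::vector<int>;

// Textbook LCS length using two rolling rows:
// O(|a| * |b|) time, O(|b|) extra space.
static std::size_t lcs_length(const tokens & a, const tokens & b) {
    std::vector<std::size_t> prev(b.size() + 1, 0);
    std::vector<std::size_t> curr(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1]) {
                curr[j] = prev[j - 1] + 1;                 // extend a match
            } else {
                curr[j] = std::max(prev[j], curr[j - 1]);  // skip one token
            }
        }
        std::swap(prev, curr); // the row just computed becomes "prev"
    }
    return prev[b.size()];
}

// Returns the index of the slot whose cached tokens share the largest
// fraction of the prompt (LCS length / prompt length), or -1 if no slot
// scores strictly above min_similarity. Assumed scoring, for illustration.
static int pick_slot(const std::vector<tokens> & slot_caches,
                     const tokens & prompt,
                     double min_similarity) {
    assert(min_similarity >= 0.0);
    if (prompt.empty()) {
        return -1;
    }
    int    best     = -1;
    double best_sim = min_similarity;
    for (std::size_t i = 0; i < slot_caches.size(); ++i) {
        const double sim =
            static_cast<double>(lcs_length(slot_caches[i], prompt)) /
            static_cast<double>(prompt.size());
        if (sim > best_sim) {
            best     = static_cast<int>(i);
            best_sim = sim;
        }
    }
    return best;
}
```

Unlike a common-prefix comparison, this score still rewards a slot whose cached tokens match the prompt after an edit in the middle, which is presumably why LCS pairs well with context reuse. A caller would fall back to some other policy (for example, the least recently used slot) when `pick_slot` returns -1.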