Feat: add alternative choices selection methods #835
Really cool to see progress on this issue, as it's a major blocker we're facing. The most effective solution I've found is to use the probability of some kind of end token to distinguish the priority of choices which are prefixes of other choices. It does require me to specify which suffix token(s) to take into account. E.g. if I'm generating a JSON string, I look for a double quote; if I ask the model to answer with only the choice and nothing else, I look for EOT, etc. Would be nice to be able to support that through this same API.
Could you resolve the conflicts? I will review it later this week.
Done, thanks!
Hi @AidanCooper I've fixed the CI issue with the fork. Could you merge the latest main branch? Thanks.
8ab9126 to 3a50179
This looks great! Although you call it a proposal, I like the overall design. I left some minor comments. We can merge this after you resolve them.
Thanks @Ying1123! The downside to resolving the default behaviour at the API layer is that we can't specify backend-dependent values, but it's workable in this instance. Thanks for merging this. I think it's possible that the new selection algorithms could be further optimised for real-world use with some further tweaking, but that will be easier done in follow-on PRs.
This will likely need refinement and optimisation, so consider this a proposal that I'd like to seek feedback on.
Motivation
SGLang's current choices normalisation method (token length normalised) often performs poorly due to bias towards longer-token options. This arises in cases where the later tokens of an option with many tokens are highly predictable based on its earlier tokens. This is the most succinct example I can come up with that illustrates the flaw, which will trip up even highly capable models:
This PR provides solutions to the above example, and should resolve #523, #608, and possibly other open issues.
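The bias described above can also be reproduced numerically. The following is a minimal sketch with made-up log-probabilities, not the example from this PR and not SGLang code; the function names are hypothetical. It shows how averaging per-token log-probabilities can favour a long option whose later tokens are almost certain, even though its first token is unlikely.

```python
import math


def mean_logprob(token_logprobs):
    """Token length normalised score: the average per-token log-probability."""
    return sum(token_logprobs) / len(token_logprobs)


def select_token_length_normalised(choice_logprobs):
    """Return the index of the choice with the highest mean log-probability."""
    scores = [mean_logprob(lp) for lp in choice_logprobs]
    return max(range(len(scores)), key=scores.__getitem__)


def select_joint(choice_logprobs):
    """For comparison only: pick the choice with the highest summed log-probability."""
    scores = [sum(lp) for lp in choice_logprobs]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical values (assumption): a one-token option the model finds likely,
# and a four-token option with an unlikely first token followed by three
# near-certain continuation tokens.
short_option = [-0.5]
long_option = [-1.6, -0.01, -0.01, -0.01]
choices = [short_option, long_option]

# The mean for the long option is about -0.41, which beats -0.5, so the
# length-normalised rule picks the long option (index 1).
picked_by_mean = select_token_length_normalised(choices)

# The joint probability tells the opposite story: exp(-0.5) is roughly 0.61,
# while exp(-1.63) is roughly 0.20.
picked_by_joint = select_joint(choices)
joint_probs = [math.exp(sum(lp)) for lp in choices]
```

The predictable tail of the long option dilutes the penalty for its unlikely first token, which is the failure mode the motivation describes.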
Modification
This PR enables the choices normalisation methodology to be configurable, and alongside the existing token length normalised strategy, introduces two new alternatives:
Both of these implementations probably need to be further refined. One potential issue I've noticed with greedy selection is that if there are differences in the tokens prepended to the options for token healing purposes, then the selection will be based on these prepended tokens rather than the actual option, which doesn't seem right. It's outside the scope of this PR, but the token healing process in general seems to have an outsized impact on the choices selection.
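To make the token healing concern concrete, here is a rough sketch of a greedy, position-by-position selection rule. It is an illustration under assumptions and not necessarily how this PR implements greedy selection: the function names are hypothetical, the log-probabilities are made up, and padding exhausted options with their mean log-probability is a choice made for this sketch.

```python
def score_at(token_logprobs, pos):
    """Log-probability at a position; exhausted options fall back to their mean (sketch assumption)."""
    if pos < len(token_logprobs):
        return token_logprobs[pos]
    return sum(token_logprobs) / len(token_logprobs)


def greedy_select(choice_logprobs):
    """Walk token positions left to right, keeping only the best-scoring choices at each step."""
    candidates = list(range(len(choice_logprobs)))
    max_len = max(len(lp) for lp in choice_logprobs)
    for pos in range(max_len):
        best = max(score_at(choice_logprobs[i], pos) for i in candidates)
        candidates = [i for i in candidates if score_at(choice_logprobs[i], pos) == best]
        if len(candidates) == 1:
            break
    # If several choices are still tied after the last position, take the first.
    return candidates[0]


# Hypothetical case: position 0 holds a token prepended for token healing,
# position 1 holds the option itself. The healing tokens differ, so the
# decision is made at position 0 and the option token is never compared.
option_a = [-0.2, -3.0]  # likely healing token, unlikely option token
option_b = [-0.9, -0.1]  # less likely healing token, likely option token
healed_pick = greedy_select([option_a, option_b])

# When the prepended tokens are identical, the option tokens decide.
tied_pick = greedy_select([[-0.3, -2.0], [-0.3, -0.5]])
```

In the first case the sketch returns option A purely because of its prepended token, which mirrors the behaviour flagged above.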
Checklist
Ensure pre-commit run --all-files or other linting tools are used to fix potential lint issues.