Fix #53; replace beam search with greedy sampling #60

Merged
merged 2 commits into edgenai:main
Mar 19, 2024

Conversation

nkoppel
Contributor

@nkoppel nkoppel commented Mar 14, 2024

Fixes #53. Replaces all mutexes and rwlocks with their blocking std counterparts, and uses tokio::task::block_in_place so that UnboundedReceiver::blocking_recv can be called from an async context.

Also replaces the beam search in LlamaSession::start_completing with a call to LlamaSession::start_completing_with, passing a greedy sampler and setting max_predictions to the number of unused tokens in the context. I did this because I'm fairly sure the beam search wasn't working correctly and wasn't updating LlamaSessionInner.tokens properly.
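For readers unfamiliar with the terms, here is a rough std-only sketch of what "greedy sampler with max_predictions set to the unused context" means. It is not the code from this PR: `greedy_pick`, `remaining_predictions`, `complete_greedy`, the `next_logits` closure, and the `eos` parameter are all hypothetical names invented for illustration, and the real session API may differ.

```rust
/// Greedy choice: index of the largest logit.
/// Returns `None` for an empty slice or if every value is NaN.
/// Ties resolve to the lowest index.
fn greedy_pick(logits: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &logit) in logits.iter().enumerate() {
        if logit.is_nan() {
            continue; // ignore NaNs rather than let them win comparisons
        }
        match best {
            Some((_, b)) if logit <= b => {}
            _ => best = Some((i, logit)),
        }
    }
    best.map(|(i, _)| i)
}

/// Number of tokens that still fit in the context (never underflows).
fn remaining_predictions(context_size: usize, tokens_used: usize) -> usize {
    context_size.saturating_sub(tokens_used)
}

/// Hypothetical completion loop: `next_logits` stands in for the model.
/// Appends greedily chosen tokens to `tokens` until the context is full
/// or `eos` is picked, and returns how many tokens were appended.
fn complete_greedy<F>(
    mut next_logits: F,
    tokens: &mut Vec<usize>,
    context_size: usize,
    eos: usize,
) -> usize
where
    F: FnMut(&[usize]) -> Vec<f32>,
{
    let max_predictions = remaining_predictions(context_size, tokens.len());
    let mut produced = 0;
    for _ in 0..max_predictions {
        let logits = next_logits(tokens.as_slice());
        let token = match greedy_pick(&logits) {
            Some(t) => t,
            None => break,
        };
        if token == eos {
            break;
        }
        // Record every accepted token so the stored history stays in sync.
        tokens.push(token);
        produced += 1;
    }
    produced
}
```

Unlike beam search, this loop keeps a single hypothesis, so the token history is extended in exactly one place per step, which makes it easier to keep something like LlamaSessionInner.tokens consistent.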

@pedro-devv pedro-devv merged commit fa6c834 into edgenai:main Mar 19, 2024
3 checks passed
@nkoppel nkoppel deleted the fix_async branch March 19, 2024 15:11
@pedro-devv pedro-devv mentioned this pull request Mar 19, 2024
Development

Successfully merging this pull request may close these issues.

'block_on' loops infinitely after a certain number of calls when used in an async context
2 participants