Fix #53; replace beam search with greedy sampling #60

nkoppel · 2024-03-14T02:42:40Z

Fixes #53. Replaces all tokio mutexes/rwlocks with their blocking std counterparts and uses tokio::task::block_in_place so that UnboundedReceiver::blocking_recv can be called from an async context.

This PR also replaces the beam search in LlamaSession::start_completing with a call to LlamaSession::start_completing_with, passing a greedy sampler and setting max_predictions to the number of unused tokens in the context. I did this because I'm fairly sure the beam search wasn't working correctly and wasn't updating LlamaSessionInner.tokens correctly.
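To illustrate the idea in general terms, here is a minimal std-only sketch of greedy sampling with a prediction budget equal to the unused context. It does not use the llama_cpp crate's API: `greedy_pick`, `remaining_budget`, `complete_greedy`, and the `next_logits` callback are hypothetical names, and token ids are plain `usize` indices into the logits vector.

```rust
/// Greedy sampling: pick the index of the largest logit.
/// Returns None for an empty slice.
fn greedy_pick(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}

/// Number of tokens that still fit in the context (never underflows).
fn remaining_budget(context_size: usize, tokens_used: usize) -> usize {
    context_size.saturating_sub(tokens_used)
}

/// Generates tokens greedily until the budget is spent or `eos` is produced.
/// `next_logits` is a hypothetical stand-in for a model forward pass.
/// Returns the number of tokens appended to `tokens`.
fn complete_greedy<F>(
    tokens: &mut Vec<usize>,
    context_size: usize,
    eos: usize,
    mut next_logits: F,
) -> usize
where
    F: FnMut(&[usize]) -> Vec<f32>,
{
    let max_predictions = remaining_budget(context_size, tokens.len());
    let mut produced = 0;
    for _ in 0..max_predictions {
        let logits = next_logits(tokens.as_slice());
        let Some(tok) = greedy_pick(&logits) else { break };
        // Record every sampled token so the stored history stays in sync
        // with what was generated.
        tokens.push(tok);
        produced += 1;
        if tok == eos {
            break;
        }
    }
    produced
}
```

Because each sampled token is pushed onto `tokens` immediately, the token history cannot drift from what was generated, which is the kind of bookkeeping the beam search appeared to get wrong.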

crates/llama_cpp/src/session/completion.rs

nkoppel added 2 commits March 13, 2024 17:55

Use std locks instead of tokio locks

09a7aa4

Remove all calls to block_on

2f34ff7

pedro-devv reviewed Mar 19, 2024


crates/llama_cpp/src/session/completion.rs

pedro-devv merged commit fa6c834 into edgenai:main Mar 19, 2024
3 checks passed

nkoppel deleted the fix_async branch March 19, 2024 15:11

pedro-devv mentioned this pull request Mar 19, 2024

parking_lot #64

Open
