how to get unbuffered responses? #4

I noticed that .predict returns a complete string containing the model's response, but I want to give the user the feeling of watching the model "type": the client should receive the text while it is being generated, not only after prediction has finished. How can I get the answer as the model predicts it?

Comments
This is not implemented yet; however, it's something I'm interested in supporting.
One way would be to restrict the binding to sending data back directly to a Go channel, for instance: https://github.com/matiasinsaurralde/cgo-channels/tree/master. However, I can see that this could still incur a large penalty, since context switches between Go and C inside a loop have a high computational cost. I think we could offer low-level functionality to address this specific case, scoped to expose just a few functions, but I wouldn't suggest using it when performance is a requirement.
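For readers who want to see what that could look like, here is a minimal sketch of the channel-bridge idea in the spirit of the linked repo. All names here (emitToken, start_prediction, tokenCh, Tokens) are hypothetical and not part of this binding:

```go
// file: bridge.go
package bridge

/*
// Declarations only: cgo forbids C definitions in a Go file that uses
// //export. A separate bridge.c in this package would implement
// start_prediction and call emitToken once per generated token.
void start_prediction(void);
*/
import "C"

// tokenCh is a hypothetical channel the C side feeds, one token per send.
var tokenCh = make(chan string, 64)

//export emitToken
func emitToken(token *C.char) {
	// Called from C for every token; each call crosses the C/Go boundary,
	// which is the per-token cost discussed above.
	tokenCh <- C.GoString(token)
}

// Tokens runs the C-side prediction loop on a goroutine and returns the
// stream of tokens as they arrive.
func Tokens() <-chan string {
	go func() {
		C.start_prediction() // blocks until generation finishes
		close(tokenCh)
	}()
	return tokenCh
}
```

A consumer would then just `for token := range bridge.Tokens() { ... }`; the boundary crossing happens once per token, which is exactly where the overhead mentioned above accumulates.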
Can you create this functionality?
To answer the original question:

```go
llama.SetTokenCallback(func(token string) bool {
	fmt.Print(token)
	return true // we want the predictor to continue
})
```
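For completeness, here is one way this callback might be wired into a full streaming loop. This is a sketch only: the import path and the llama.New/l.Predict signatures are assumptions, and only SetTokenCallback comes from the snippet above:

```go
package main

import (
	"fmt"
	"log"

	llama "github.com/go-skynet/go-llama.cpp" // assumed import path
)

func main() {
	// llama.New and l.Predict are assumed signatures, used for illustration.
	l, err := llama.New("model.bin")
	if err != nil {
		log.Fatal(err)
	}

	tokens := make(chan string, 64)

	// Instead of printing directly, forward each token into a channel so a
	// consumer (e.g. an HTTP handler) can relay it to the user's client.
	llama.SetTokenCallback(func(token string) bool {
		tokens <- token
		return true // keep the predictor running
	})

	go func() {
		defer close(tokens)
		if _, err := l.Predict("Hello"); err != nil {
			log.Println("predict:", err)
		}
	}()

	// The "typing" effect: tokens show up as they are generated.
	for t := range tokens {
		fmt.Print(t)
	}
	fmt.Println()
}
```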
I think we can close this now, thanks @noxer ❤️!