[Feature] Existing streaming latency still takes time #417
PRs welcome.
Please compile the model, or try the quantized version.
@PoTaTo-Mika What do you mean by "compile the model"? And how do I use the quantized version? I have only followed the inference steps in the English documentation: https://speech.fish.audio/en/inference/#2-create-a-directory-structure-similar-to-the-following-within-the-ref_data-folder
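For reference, "the quantized version" in PyTorch projects usually means dynamic int8 quantization of the `Linear` layers, which shrinks weights and typically speeds up CPU inference. A minimal sketch with a generic stand-in module (the actual fish-speech model and its loading code are not shown, and this is not the project's own API):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a TTS decoder; any module with Linear layers
# can be quantized the same way.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 64)).eval()

# Dynamic quantization converts Linear weights to int8 at load time;
# activations stay in float and are quantized on the fly per batch.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = qmodel(torch.randn(1, 256))
print(out.shape)  # torch.Size([1, 64])
```

"Compile the model" most likely refers to `torch.compile(model)`, which traces and fuses the forward pass into optimized kernels; it is a one-line wrapper around an existing `nn.Module`, at the cost of a slow first call while compilation happens.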
This issue is stale because it has been open for 30 days with no activity.
Streaming on a 4090 takes more than 2 seconds, depending on the number of tokens. Is there a way to yield/return audio while the engine is still generating?
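What the request describes is generator-style streaming: rather than waiting for the full token sequence, decode and yield an audio chunk as soon as each block of tokens is ready, so playback can begin while generation continues. A minimal pure-Python sketch; `generate_tokens` and `decode_chunk` are hypothetical stand-ins, not the fish-speech API:

```python
import time
from typing import Iterator

def generate_tokens(n: int) -> Iterator[int]:
    """Stand-in for the autoregressive token generator."""
    for i in range(n):
        time.sleep(0.001)  # simulate per-token latency
        yield i

def decode_chunk(tokens: list[int]) -> bytes:
    """Stand-in for vocoder decoding of one token block."""
    return bytes(len(tokens))

def stream_tts(n_tokens: int, chunk_size: int = 8) -> Iterator[bytes]:
    """Yield audio as each chunk of tokens completes, instead of
    blocking until the whole sequence is generated."""
    buf: list[int] = []
    for tok in generate_tokens(n_tokens):
        buf.append(tok)
        if len(buf) >= chunk_size:
            yield decode_chunk(buf)
            buf.clear()
    if buf:  # flush the remaining tail tokens
        yield decode_chunk(buf)

chunks = list(stream_tts(20))
print(len(chunks))  # 3 chunks: 8 + 8 + 4 tokens
```

A caller (e.g. an HTTP handler) can iterate `stream_tts(...)` and send each chunk immediately, which reduces time-to-first-audio even though total generation time is unchanged.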