chat history #85
hoarycrippl3 started this conversation in General
Replies: 1 comment, 1 reply
-
It has been replaced by the "maximum prompt size in tokens" option, which was suggested here: #77 The idea is that, since message sizes can vary a lot, it is more robust to limit the prompt size in terms of tokens. Try sending lots of messages with this parameter set to a low value (like 500), then increase it by 100 at a time before each new message until the script crashes. Then you will know the limit for your GPU.
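The trimming behavior described above can be sketched roughly as follows: drop the oldest chat messages until what remains fits the token budget. This is only an illustration, not the webui's actual code; `count_tokens` here is a whitespace stand-in, whereas the real implementation counts tokens with the model's tokenizer, so actual counts will differ.

```python
def count_tokens(text):
    # Stand-in tokenizer: one "token" per whitespace-separated word.
    # A real setup would use the model's tokenizer instead.
    return len(text.split())

def build_prompt(history, new_message, max_prompt_tokens):
    """Keep the newest messages whose combined size fits the budget."""
    messages = history + [new_message]
    kept = []
    total = 0
    # Walk from newest to oldest, stopping once the budget would be exceeded.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > max_prompt_tokens:
            break
        kept.append(msg)
        total += cost
    # Restore chronological order for the final prompt.
    return "\n".join(reversed(kept))
```

With a budget of 500 tokens, older messages silently fall out of the prompt as the conversation grows, which is why a low value keeps VRAM usage bounded regardless of chat length.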
-
Hello, I was able to run the text generation webui with the pygmalion-6b model on my RTX 2070 Super with 8 GB VRAM by using the following options:
--load-in-8bit --auto-devices --disk --gpu-memory 6 --no-stream --share
but I had to set the chat history option to "6" instead of "0" (unlimited). Now that option seems to be gone. I am still able to run the model, but I get out-of-memory errors much sooner, and sometimes it just doesn't want to run at all. Am I overlooking this option, or has the chat history option been removed? Thank you!