chat history #85
hoarycrippl3 started this conversation in General
Replies: 1 comment, 1 reply
-
It has been replaced by the "maximum prompt size in tokens" option, which was suggested here: #77 The idea is that, since message sizes can vary a lot, it is more robust to limit the prompt size in terms of tokens. Try sending lots of messages with this parameter set to a low value (like 500), then increase it by 100 at a time before each new message until the script crashes. Then you will know the limit for your GPU.
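The trimming behavior described above can be sketched roughly as follows: drop the oldest chat messages until what remains fits the token budget. This is only an illustration, not the webui's actual code; `count_tokens` here is a whitespace stand-in, whereas the real implementation counts tokens with the model's tokenizer, so actual counts will differ.

```python
def count_tokens(text):
    # Stand-in tokenizer: one "token" per whitespace-separated word.
    # A real setup would use the model's tokenizer instead.
    return len(text.split())

def build_prompt(history, new_message, max_prompt_tokens):
    """Keep the newest messages whose combined size fits the budget."""
    messages = history + [new_message]
    kept = []
    total = 0
    # Walk from newest to oldest, stopping once the budget would be exceeded.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > max_prompt_tokens:
            break
        kept.append(msg)
        total += cost
    # Restore chronological order for the final prompt.
    return "\n".join(reversed(kept))
```

With a budget of 500 tokens, older messages silently fall out of the prompt as the conversation grows, which is why a low value keeps VRAM usage bounded regardless of chat length.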
-
Hello, I was able to run the text generation webui with the pygmalion-6b model on my RTX 2070 Super with 8 GB VRAM by using the following options:
--load-in-8bit --auto-devices --disk --gpu-memory 6 --no-stream --share
but I had to set the chat history option to "6" instead of "0" (unlimited). Now that option seems to be gone. I am still able to run the model, but I get out-of-memory errors much sooner, and sometimes it just doesn't want to run at all. Am I overlooking this option, or has the chat history option been removed? Thank you!