Initial prototype #1
Getting prompts out of the model was very easy:

```python
gpt_model = GPT4All(self.model.filename())
output = gpt_model.generate(prompt.prompt, streaming=True)
# yield from output
```

What was harder is that the underlying C++ libraries log a LOT of stuff to stdout and stderr, in a way that can't easily be suppressed from Python: the writes happen at the file descriptor level, below `sys.stdout` and `sys.stderr`. With GPT-4's help I figured out this pattern:

```python
import os
import sys


class SuppressOutput:
    def __enter__(self):
        # Save a copy of the current file descriptors for stdout and stderr
        self.stdout_fd = os.dup(1)
        self.stderr_fd = os.dup(2)
        # Open a file descriptor to /dev/null
        self.devnull_fd = os.open(os.devnull, os.O_WRONLY)
        # Replace stdout and stderr with /dev/null
        os.dup2(self.devnull_fd, 1)
        os.dup2(self.devnull_fd, 2)
        # Writes to sys.stdout and sys.stderr should still work
        self.original_stdout = sys.stdout
        self.original_stderr = sys.stderr
        sys.stdout = os.fdopen(self.stdout_fd, "w")
        sys.stderr = os.fdopen(self.stderr_fd, "w")

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Restore stdout and stderr to their original state
        os.dup2(self.stdout_fd, 1)
        os.dup2(self.stderr_fd, 2)
        # Close the saved copies of the original stdout and stderr file descriptors
        os.close(self.stdout_fd)
        os.close(self.stderr_fd)
        # Close the file descriptor for /dev/null
        os.close(self.devnull_fd)
        # Restore sys.stdout and sys.stderr
        sys.stdout = self.original_stdout
        sys.stderr = self.original_stderr
```

Used like this:

```python
class Response(llm.Response):
    def iter_prompt(self, prompt):
        with SuppressOutput():
            gpt_model = GPT4All(self.model.filename())
            output = gpt_model.generate(prompt.prompt, streaming=True)
            yield from output
```

I wanted
OK, it works well enough for a first draft. An interesting wart is that a lot of these models aren't configured for instructions. Instead, the JSON file at https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json includes suggested prompt templates to get them to respond to a question, e.g.:

```json
{
    "order": "a",
    "md5sum": "4acc146dd43eb02845c233c29289c7c5",
    "name": "Hermes",
    "filename": "nous-hermes-13b.ggmlv3.q4_0.bin",
    "filesize": "8136777088",
    "requires": "2.4.7",
    "ramrequired": "16",
    "parameters": "13 billion",
    "quant": "q4_0",
    "type": "LLaMA",
    "description": "<strong>Best overall model</strong><br><ul><li>Instruction based<li>Gives long responses<li>Curated with 300,000 uncensored instructions<li>Trained by Nous Research<li>Cannot be used commercially</ul>",
    "url": "https://huggingface.co/TheBloke/Nous-Hermes-13B-GGML/resolve/main/nous-hermes-13b.ggmlv3.q4_0.bin",
    "promptTemplate": "### Instruction:\n%1\n### Response:\n"
}
```

I'm not yet doing anything with those, but maybe I should.
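If I do, the substitution itself looks simple. A minimal sketch, assuming the templates use Qt-style `%1` placeholders for the user's prompt (the helper name here is mine, and the `%1` convention is inferred from the JSON above):

```python
def apply_template(prompt_template, user_prompt):
    # Replace the %1 placeholder with the user's prompt text
    return prompt_template.replace("%1", user_prompt)


# Example with the Hermes template from models.json above:
print(apply_template(
    "### Instruction:\n%1\n### Response:\n",
    "Ten fun names for a pet pelican",
))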
```
llm models list
```
I haven't set up aliases for these yet. I should probably include aliases for a specific list of the more popular models, once I figure out what those are.

Here's how the installation detection code works (lines 70 to 77 in 1cb087b).
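As a minimal sketch, assuming a model counts as installed once its file exists in a local cache directory (the path and helper name are assumptions, not the plugin's actual code):

```python
from pathlib import Path

# Assumed cache location; the real path logic lives in the plugin itself
CACHE_DIR = Path.home() / ".cache" / "gpt4all"


def is_installed(filename):
    # A model counts as installed once its file is present on disk
    return (CACHE_DIR / filename).exists()
```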
My code fetches https://gpt4all.io/models/models.json at most once an hour (lines 27 to 31 in 1cb087b).
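A minimal sketch of that once-an-hour cache, assuming an httpx client and a local cache path (both are assumptions; the URL is the one above):

```python
import json
import time
from pathlib import Path

import httpx  # assumption: any HTTP client would do

MODELS_JSON_URL = "https://gpt4all.io/models/models.json"
CACHE_PATH = Path.home() / ".cache" / "gpt4all" / "models.json"  # assumed path


def fetch_models(max_age=3600):
    # Serve the cached copy if it is less than an hour old
    if CACHE_PATH.exists() and time.time() - CACHE_PATH.stat().st_mtime < max_age:
        return json.loads(CACHE_PATH.read_text())
    response = httpx.get(MODELS_JSON_URL)
    response.raise_for_status()
    CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
    CACHE_PATH.write_text(response.text)
    return response.json()
```

Keying freshness off the cached file's mtime keeps the check stateless: there is no separate timestamp to record.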
```
llm -m 'ggml-replit-code-v1-3b' 'A python function that donwloads a JSON file and saves it to disk, but only if it has not yet been saved'
```
That "ggml_metal_free: deallocating" note at the end seems to be some C++ logging code that escaped from my It does at least go to stderr so |
Best result so far:

```
llm -m 'ggml-vicuna-13b-1' 'Ten fun names for a pet pelican'
```
Using the GPT4All Python bindings documented at https://docs.gpt4all.io/gpt4all_python.html