OpenAI default plugin should support registering additional models #107
Potential solution: the OpenAI default plugin could support a configuration like this:

```yaml
- model_id: gpt-4-0613
  name: gpt-4-0613
  aliases: ["4-0613"]
```

It could then extend to support external models with compatible APIs like this:

```yaml
- model_id: your-model
  name: your-model.bin
  OPENAI_API_BASE: "http://localhost:8080/"
```
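A rough sketch of how the plugin could consume a file like that - field names follow the proposal above, and a fuller patch appears later in this thread:

```python
# Sketch only: parse the proposed YAML list of extra models.
# The file name here matches the one settled on further down the thread.
import yaml


def load_extra_models(path):
    with open(path) as f:
        return yaml.safe_load(f) or []


if __name__ == "__main__":
    for entry in load_extra_models("extra-openai-models.yaml"):
        print(entry["model_id"], entry.get("aliases", []))
```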
---

Relevant code from the OpenAI CLI utility: https://github.com/openai/openai-python/blob/b82a3f7e4c462a8a10fa445193301a3cefef9a4a/openai/_openai_scripts.py#L62-L75

```python
openai.debug = True
if args.api_key is not None:
    openai.api_key = args.api_key
if args.api_base is not None:
    openai.api_base = args.api_base
if args.organization is not None:
    openai.organization = args.organization
if args.proxy is not None:
    openai.proxy = {}
    for proxy in args.proxy:
        if proxy.startswith('https'):
            openai.proxy['https'] = proxy
        elif proxy.startswith('http'):
            openai.proxy['http'] = proxy
```
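That CLI just mutates module-level attributes on `openai` before making any calls. A minimal sketch of the same global-configuration style, pointed at a local OpenAI-compatible server (the model name and key below are placeholders):

```python
# Sketch: configure the pre-1.0 openai library via its module-level globals,
# the same way the CLI snippet above does.
import openai

openai.api_key = "sk-placeholder"          # a local server may not check this
openai.api_base = "http://localhost:8080"  # OpenAI-compatible endpoint

response = openai.ChatCompletion.create(
    model="orca-mini-3b.ggmlv3",
    messages=[{"role": "user", "content": "Say hello in French"}],
)
print(response.choices[0].message.content)
```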
---

I don't like how those look like global variables on the `openai` module. From this code it looks like there's a way to avoid that:

```python
class APIRequestor:
    def __init__(
        self,
        key=None,
        api_base=None,
        api_type=None,
        api_version=None,
        organization=None,
    ):
        self.api_base = api_base or openai.api_base
        self.api_key = key or util.default_api_key()
        self.api_type = (
            ApiType.from_str(api_type)
            if api_type
            else ApiType.from_str(openai.api_type)
        )
        self.api_version = api_version or openai.api_version
        self.organization = organization or openai.organization
```

Which led me to: https://github.com/openai/openai-python/blob/b82a3f7e4c462a8a10fa445193301a3cefef9a4a/openai/openai_object.py#L11-L39

```python
class OpenAIObject(dict):
    api_base_override = None

    def __init__(
        self,
        id=None,
        api_key=None,
        api_version=None,
        api_type=None,
        organization=None,
        response_ms: Optional[int] = None,
        api_base=None,
        engine=None,
        **params,
    ):
        super(OpenAIObject, self).__init__()

        if response_ms is not None and not isinstance(response_ms, int):
            raise TypeError(f"response_ms is a {type(response_ms).__name__}.")
        self._response_ms = response_ms

        self._retrieve_params = params

        object.__setattr__(self, "api_key", api_key)
        object.__setattr__(self, "api_version", api_version)
        object.__setattr__(self, "api_type", api_type)
        object.__setattr__(self, "organization", organization)
        object.__setattr__(self, "api_base_override", api_base)
        object.__setattr__(self, "engine", engine)
```
---

Yes, it looks like `create()` accepts `api_base` directly:

```python
    @classmethod
    def create(
        cls,
        api_key=None,
        api_base=None,
        api_type=None,
        request_id=None,
        api_version=None,
        organization=None,
        **params,
    ):
```
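So, as a sketch (not code from the issue), the override could be passed per call instead of mutating the module-level globals:

```python
# Sketch: pass api_base/api_key per request, relying on create() accepting
# them as in the signature above (pre-1.0 openai library).
import openai

completion = openai.ChatCompletion.create(
    model="orca-mini-3b.ggmlv3",
    messages=[{"role": "user", "content": "Say hello in French"}],
    api_base="http://localhost:8080",  # per-request override
    api_key="sk-placeholder",          # a local server may not check this
    stream=False,
)
print(completion.choices[0].message.content)
```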
---

The structure of the stream of chunks that comes back from LocalAI isn't quite the same as the OpenAI API. It looks like this:

```json
{
  "object": "chat.completion.chunk",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "delta": {
        "role": "assistant"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
```

```json
{
  "object": "chat.completion.chunk",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "delta": {
        "content": " Hello"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
```

Then at the end:

```json
{
  "object": "chat.completion.chunk",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "finish_reason": "stop",
      "delta": {}
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
```

This doesn't fit the expected shape, especially for this code:

llm/llm/default_plugins/openai_models.py, lines 202 to 224 in 58d1f92

It's missing the `id` key.
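For comparison, a streaming chunk from the OpenAI API itself looks roughly like this (written from memory, so treat the exact fields as approximate):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1689430000,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": " Hello"
      },
      "finish_reason": null
    }
  ]
}
```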
---

This almost works:

```diff
diff --git a/llm/default_plugins/openai_models.py b/llm/default_plugins/openai_models.py
index c6d74c5..002f899 100644
--- a/llm/default_plugins/openai_models.py
+++ b/llm/default_plugins/openai_models.py
@@ -8,6 +8,7 @@ from pydantic import field_validator, Field
 import requests
 from typing import List, Optional, Union
 import json
+import yaml
 
 
 @hookimpl
@@ -16,6 +17,21 @@ def register_models(register):
     register(Chat("gpt-3.5-turbo-16k"), aliases=("chatgpt-16k", "3.5-16k"))
     register(Chat("gpt-4"), aliases=("4", "gpt4"))
     register(Chat("gpt-4-32k"), aliases=("4-32k",))
+    # Load extra models
+    extra_path = llm.user_dir() / "extra-openai-models.yaml"
+    if not extra_path.exists():
+        return
+    with open(extra_path) as f:
+        extra_models = yaml.safe_load(f)
+    for model in extra_models:
+        model_id = model["model_id"]
+        aliases = model.get("aliases", [])
+        model_name = model["model_name"]
+        api_base = model.get("api_base")
+        register(
+            Chat(model_id, model_name=model_name, api_base=api_base),
+            aliases=aliases,
+        )
 
 
 @hookimpl
@@ -141,9 +157,11 @@ class Chat(Model):
         return validated_logit_bias
 
-    def __init__(self, model_id, key=None):
+    def __init__(self, model_id, key=None, model_name=None, api_base=None):
         self.model_id = model_id
         self.key = key
+        self.model_name = model_name
+        self.api_base = api_base
 
     def __str__(self):
         return "OpenAI Chat: {}".format(self.model_id)
 
@@ -169,13 +187,17 @@ class Chat(Model):
             messages.append({"role": "system", "content": prompt.system})
         messages.append({"role": "user", "content": prompt.prompt})
         response._prompt_json = {"messages": messages}
+        kwargs = dict(not_nulls(prompt.options))
+        if self.api_base:
+            kwargs["api_base"] = self.api_base
+        if self.key:
+            kwargs["api_key"] = self.key
         if stream:
             completion = openai.ChatCompletion.create(
-                model=prompt.model.model_id,
+                model=self.model_name or self.model_id,
                 messages=messages,
                 stream=True,
-                api_key=self.key,
-                **not_nulls(prompt.options),
+                **kwargs,
             )
             chunks = []
             for chunk in completion:
@@ -186,10 +208,10 @@ class Chat(Model):
             response.response_json = combine_chunks(chunks)
         else:
             completion = openai.ChatCompletion.create(
-                model=prompt.model.model_id,
+                model=self.model_name or self.model_id,
                 messages=messages,
-                api_key=self.key,
                 stream=False,
+                **kwargs,
             )
             response.response_json = completion.to_dict_recursive()
             yield completion.choices[0].message.content
@@ -209,11 +231,11 @@ def combine_chunks(chunks: List[dict]) -> dict:
                 role = choice["delta"]["role"]
             if "content" in choice["delta"]:
                 content += choice["delta"]["content"]
-            if choice["finish_reason"] is not None:
+            if choice.get("finish_reason") is not None:
                 finish_reason = choice["finish_reason"]
 
     return {
-        "id": chunks[0]["id"],
+        "id": chunks[0].get("id") or "no-id",
         "object": chunks[0]["object"],
         "model": chunks[0]["model"],
         "created": chunks[0]["created"],
```

I put this in `extra-openai-models.yaml` in the LLM user directory:

```yaml
- model_id: orca-openai-compat
  model_name: orca-mini-3b.ggmlv3
  api_base: "http://localhost:8080"
```

Then ran this:

```bash
llm -m 'orca-openai-compat' 'Say hello in french'
```

And got back:
---

Idea: a command which hits the models API for a custom endpoint and writes out a cached file recording those models, so they can show up automatically as registered models. Example from LocalAI:

```bash
curl http://localhost:8080/v1/models | jq
```

```json
{
  "object": "list",
  "data": [
    {
      "id": "ggml-gpt4all-j",
      "object": "model"
    },
    {
      "id": "orca-mini-3b.ggmlv3",
      "object": "model"
    }
  ]
}
```
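A rough sketch of what that command could do (not part of LLM; the cache file name and how it would be wired into the CLI are assumptions):

```python
# Sketch: fetch the model list from an OpenAI-compatible endpoint and cache it
# as JSON so the models can later be registered without hitting the network.
import json
import pathlib

import requests


def cache_models(api_base: str, cache_path: pathlib.Path) -> list:
    response = requests.get(api_base.rstrip("/") + "/v1/models", timeout=10)
    response.raise_for_status()
    # Shape matches the LocalAI example above: [{"id": "...", "object": "model"}, ...]
    models = response.json()["data"]
    cache_path.write_text(json.dumps(models, indent=2))
    return models


if __name__ == "__main__":
    print(cache_models("http://localhost:8080", pathlib.Path("localai-models.json")))
```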
---

Maybe it's OK to hit that endpoint every time the LLM command runs, rather than messing around with caching. Then again, I don't want to hit that endpoint URL on localhost every time I use `llm`.
---

Got this working. In `extra-openai-models.yaml` in the LLM user directory:

```yaml
- model_id: orca-openai-compat
  model_name: orca-mini-3b.ggmlv3
  api_base: "http://localhost:8080"
```

Then:

```bash
llm -m orca-openai-compat '3 names for a pet cow'
llm -c '2 more with descriptions'
llm logs -n 1
```

```json
[
  {
    "id": "01h5d4nthj4ncdntz2ap56ffz5",
    "model": "orca-openai-compat",
    "prompt": "2 more with descriptions",
    "system": null,
    "prompt_json": {
      "messages": [
        {
          "role": "user",
          "content": "3 names for a pet cow"
        },
        {
          "role": "assistant",
          "content": " I can do that! Here are three different names for a pet cow: \n1. Milo 2. Daisy 3. Max"
        },
        {
          "role": "user",
          "content": "2 more with descriptions"
        }
      ]
    },
    "options_json": {},
    "response": " Thank you for your prompt service! Here are two more options for a pet cow's name:\n\n1. Lily - She's gentle and kind, just like a lily.\n2. Thunder - He's strong and fierce, just like thunderstorms on a summer day.",
    "response_json": {
      "content": " Thank you for your prompt service! Here are two more options for a pet cow's name:\n\n1. Lily - She's gentle and kind, just like a lily.\n2. Thunder - He's strong and fierce, just like thunderstorms on a summer day.",
      "role": "assistant",
      "finish_reason": "stop",
      "object": "chat.completion.chunk",
      "model": "orca-mini-3b.ggmlv3"
    },
    "conversation_id": "01h5d4my74mqyjxc24fhcf86ry",
    "duration_ms": 8729,
    "datetime_utc": "2023-07-15T16:03:17.655636",
    "conversation_name": "3 names for a pet cow",
    "conversation_model": "orca-openai-compat"
  }
]
```
---

Needs documentation and tests.
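A test for the YAML loading could look something like this hypothetical sketch - it assumes the `openai_models` module resolves `llm.user_dir()` through the `llm` module object, so monkeypatching `llm.user_dir` is enough to point it at a temporary directory:

```python
# Hypothetical pytest sketch for the extra-models loading in the patch above.
import yaml

import llm
from llm.default_plugins import openai_models


def test_registers_extra_openai_models(tmp_path, monkeypatch):
    (tmp_path / "extra-openai-models.yaml").write_text(
        yaml.dump([{
            "model_id": "orca-openai-compat",
            "model_name": "orca-mini-3b.ggmlv3",
            "api_base": "http://localhost:8080",
        }])
    )
    # Point the plugin at the temporary user directory
    monkeypatch.setattr(llm, "user_dir", lambda: tmp_path)
    registered = []
    openai_models.register_models(
        lambda model, aliases=None: registered.append(model)
    )
    assert "orca-openai-compat" in [m.model_id for m in registered]
```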
---

While thinking about:

I realized that there's a limitation hard-coded into LLM here:

llm/llm/default_plugins/openai_models.py, lines 13 to 18 in 3f1388a

What if OpenAI release a new model with a new name, like they did with gpt-4-0613 - at the moment, there's no way to use that in LLM without releasing a new version of the software (or using a custom plugin).
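Roughly what that hard-coded block looks like (reconstructed from the diff context earlier in this thread; the exact aliases are an assumption, and `Chat` is the chat model class defined in the same module):

```python
from llm import hookimpl


@hookimpl
def register_models(register):
    # Fixed list of models - a new OpenAI model name can't be used
    # without shipping a new release of LLM or writing a plugin.
    register(Chat("gpt-3.5-turbo"), aliases=("3.5", "chatgpt"))
    register(Chat("gpt-3.5-turbo-16k"), aliases=("chatgpt-16k", "3.5-16k"))
    register(Chat("gpt-4"), aliases=("4", "gpt4"))
    register(Chat("gpt-4-32k"), aliases=("4-32k",))
```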