
OpenAI default plugin should support registering additional models #107

Closed
simonw opened this issue Jul 14, 2023 · 12 comments
Labels: enhancement (New feature or request)

simonw commented Jul 14, 2023

While thinking about a related issue, I realized that there's a limitation hard-coded into LLM here:

@hookimpl
def register_models(register):
    register(Chat("gpt-3.5-turbo"), aliases=("3.5", "chatgpt"))
    register(Chat("gpt-3.5-turbo-16k"), aliases=("chatgpt-16k", "3.5-16k"))
    register(Chat("gpt-4"), aliases=("4", "gpt4"))
    register(Chat("gpt-4-32k"), aliases=("4-32k",))

What if OpenAI releases a new model with a new name, as they did with gpt-4-0613? At the moment there's no way to use it in LLM without shipping a new version of the software (or writing a custom plugin).

simonw added the enhancement label Jul 14, 2023
simonw commented Jul 14, 2023

Potential solution: the OpenAI default plugin could support a $USER_DIR/openai-extra-models.yml file which looks something like this:

- model_id: gpt-4-0613
  name: gpt-4-0613
  aliases: ["4-0613"]

It could then be extended to support external models with compatible APIs, like this:

- model_id: your-model
  name: your-model.bin
  OPENAI_API_BASE: "http://localhost:8080/"
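
A minimal sketch of how the plugin could consume that file (assuming PyYAML is available; the loader name is hypothetical and the key names follow the proposal above):

import yaml

def load_extra_models(user_dir):
    # Hypothetical loader: yields one (model_id, name, api_base, aliases)
    # tuple per entry in the proposed YAML file
    path = user_dir / "openai-extra-models.yml"
    if not path.exists():
        return
    for entry in yaml.safe_load(path.read_text()) or []:
        yield (
            entry["model_id"],
            entry.get("name", entry["model_id"]),
            entry.get("OPENAI_API_BASE"),
            entry.get("aliases", []),
        )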

simonw commented Jul 14, 2023

Relevant code from the OpenAI CLI utility: https://github.com/openai/openai-python/blob/b82a3f7e4c462a8a10fa445193301a3cefef9a4a/openai/_openai_scripts.py#L62-L75

    openai.debug = True
    if args.api_key is not None:
        openai.api_key = args.api_key
    if args.api_base is not None:
        openai.api_base = args.api_base
    if args.organization is not None:
        openai.organization = args.organization
    if args.proxy is not None:
        openai.proxy = {}
        for proxy in args.proxy:
            if proxy.startswith('https'):
                openai.proxy['https'] = proxy
            elif proxy.startswith('http'):
                openai.proxy['http'] = proxy

simonw commented Jul 14, 2023

I don't like how those look like global variables on the openai module. I want to be able to use these APIs in a threaded web environment which might have multiple calls happening at the same time, so changes made to openai.api_base must not affect other prompts running concurrently.

From this code it looks like there's a way to avoid that:

https://github.com/openai/openai-python/blob/b82a3f7e4c462a8a10fa445193301a3cefef9a4a/openai/api_requestor.py#L128-L145

class APIRequestor:
    def __init__(
        self,
        key=None,
        api_base=None,
        api_type=None,
        api_version=None,
        organization=None,
    ):
        self.api_base = api_base or openai.api_base
        self.api_key = key or util.default_api_key()
        self.api_type = (
            ApiType.from_str(api_type)
            if api_type
            else ApiType.from_str(openai.api_type)
        )
        self.api_version = api_version or openai.api_version
        self.organization = organization or openai.organization

Which led me to: https://github.com/openai/openai-python/blob/b82a3f7e4c462a8a10fa445193301a3cefef9a4a/openai/openai_object.py#L11-L39

class OpenAIObject(dict):
    api_base_override = None

    def __init__(
        self,
        id=None,
        api_key=None,
        api_version=None,
        api_type=None,
        organization=None,
        response_ms: Optional[int] = None,
        api_base=None,
        engine=None,
        **params,
    ):
        super(OpenAIObject, self).__init__()

        if response_ms is not None and not isinstance(response_ms, int):
            raise TypeError(f"response_ms is a {type(response_ms).__name__}.")
        self._response_ms = response_ms

        self._retrieve_params = params

        object.__setattr__(self, "api_key", api_key)
        object.__setattr__(self, "api_version", api_version)
        object.__setattr__(self, "api_type", api_type)
        object.__setattr__(self, "organization", organization)
        object.__setattr__(self, "api_base_override", api_base)
        object.__setattr__(self, "engine", engine)

And ChatCompletion is a subclass of a subclass of that, so I think I should be able to pass those arguments to the ChatCompletion constructor.

simonw commented Jul 14, 2023

Yes, it looks like ChatCompletion.create() ends up here: https://github.com/openai/openai-python/blob/b82a3f7e4c462a8a10fa445193301a3cefef9a4a/openai/api_resources/abstract/engine_api_resource.py#L127-L151

    @classmethod
    def create(
        cls,
        api_key=None,
        api_base=None,
        api_type=None,
        request_id=None,
        api_version=None,
        organization=None,
        **params,
    ):
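
So a per-request override ought to be as simple as passing those arguments straight to create() (a sketch against the pre-v1 openai client quoted above; the model, api_base and key values are placeholders):

import openai

# Per-request overrides: nothing is assigned to the module-level
# openai.api_base, so concurrent calls can target different endpoints
completion = openai.ChatCompletion.create(
    model="orca-mini-3b.ggmlv3",
    messages=[{"role": "user", "content": "Say hello in french"}],
    api_base="http://localhost:8080",
    api_key="sk-placeholder",
    stream=False,
)
print(completion.choices[0].message.content)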

simonw commented Jul 14, 2023

The structure of the stream of chunks that comes back from LocalAI isn't quite the same as the OpenAI API - it looks like this:

{
  "object": "chat.completion.chunk",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "delta": {
        "role": "assistant"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
{
  "object": "chat.completion.chunk",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "delta": {
        "content": " Hello"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

Then at the end:

{
  "object": "chat.completion.chunk",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "finish_reason": "stop",
      "delta": {}
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

This doesn't fit the expected shape, especially for this code:

def combine_chunks(chunks: List[dict]) -> dict:
    content = ""
    role = None
    for item in chunks:
        for choice in item["choices"]:
            if "role" in choice["delta"]:
                role = choice["delta"]["role"]
            if "content" in choice["delta"]:
                content += choice["delta"]["content"]
            if choice["finish_reason"] is not None:
                finish_reason = choice["finish_reason"]
    return {
        "id": chunks[0]["id"],
        "object": chunks[0]["object"],
        "model": chunks[0]["model"],
        "created": chunks[0]["created"],
        "index": chunks[0]["choices"][0]["index"],
        "role": role,
        "content": content,
        "finish_reason": finish_reason,
    }

It's missing the id and created fields, and some records are missing the finish_reason key.

simonw commented Jul 14, 2023

This almost works:

diff --git a/llm/default_plugins/openai_models.py b/llm/default_plugins/openai_models.py
index c6d74c5..002f899 100644
--- a/llm/default_plugins/openai_models.py
+++ b/llm/default_plugins/openai_models.py
@@ -8,6 +8,7 @@ from pydantic import field_validator, Field
 import requests
 from typing import List, Optional, Union
 import json
+import yaml
 
 
 @hookimpl
@@ -16,6 +17,21 @@ def register_models(register):
     register(Chat("gpt-3.5-turbo-16k"), aliases=("chatgpt-16k", "3.5-16k"))
     register(Chat("gpt-4"), aliases=("4", "gpt4"))
     register(Chat("gpt-4-32k"), aliases=("4-32k",))
+    # Load extra models
+    extra_path = llm.user_dir() / "extra-openai-models.yaml"
+    if not extra_path.exists():
+        return
+    with open(extra_path) as f:
+        extra_models = yaml.safe_load(f)
+    for model in extra_models:
+        model_id = model["model_id"]
+        aliases = model.get("aliases", [])
+        model_name = model["model_name"]
+        api_base = model.get("api_base")
+        register(
+            Chat(model_id, model_name=model_name, api_base=api_base),
+            aliases=aliases,
+        )
 
 
 @hookimpl
@@ -141,9 +157,11 @@ class Chat(Model):
 
             return validated_logit_bias
 
-    def __init__(self, model_id, key=None):
+    def __init__(self, model_id, key=None, model_name=None, api_base=None):
         self.model_id = model_id
         self.key = key
+        self.model_name = model_name
+        self.api_base = api_base
 
     def __str__(self):
         return "OpenAI Chat: {}".format(self.model_id)
@@ -169,13 +187,17 @@ class Chat(Model):
             messages.append({"role": "system", "content": prompt.system})
         messages.append({"role": "user", "content": prompt.prompt})
         response._prompt_json = {"messages": messages}
+        kwargs = dict(not_nulls(prompt.options))
+        if self.api_base:
+            kwargs["api_base"] = self.api_base
+        if self.key:
+            kwargs["api_key"] = self.key
         if stream:
             completion = openai.ChatCompletion.create(
-                model=prompt.model.model_id,
+                model=self.model_name or self.model_id,
                 messages=messages,
                 stream=True,
-                api_key=self.key,
-                **not_nulls(prompt.options),
+                **kwargs,
             )
             chunks = []
             for chunk in completion:
@@ -186,10 +208,10 @@ class Chat(Model):
             response.response_json = combine_chunks(chunks)
         else:
             completion = openai.ChatCompletion.create(
-                model=prompt.model.model_id,
+                model=self.model_name or self.model_id,
                 messages=messages,
-                api_key=self.key,
                 stream=False,
+                **kwargs,
             )
             response.response_json = completion.to_dict_recursive()
             yield completion.choices[0].message.content
@@ -209,11 +231,11 @@ def combine_chunks(chunks: List[dict]) -> dict:
                 role = choice["delta"]["role"]
             if "content" in choice["delta"]:
                 content += choice["delta"]["content"]
-            if choice["finish_reason"] is not None:
+            if choice.get("finish_reason") is not None:
                 finish_reason = choice["finish_reason"]
 
     return {
-        "id": chunks[0]["id"],
+        "id": chunks[0].get("id") or "no-id",
         "object": chunks[0]["object"],
         "model": chunks[0]["model"],
         "created": chunks[0]["created"],

I put this in /Users/simon/Library/Application Support/io.datasette.llm/extra-openai-models.yaml:

- model_id: orca-openai-compat
  model_name: orca-mini-3b.ggmlv3
  api_base: "http://localhost:8080"

Then ran this:

llm -m 'orca-openai-compat' 'Say hello in french'

And got back:

Hello!

To complete the request, I will need to practice speaking French for a bit and become familiar with common greetings. Additionally, it would be helpful to have a copy of the French language dictionary nearby to look up some common vocabulary words that may not be familiar to someone who is just starting out in learning a new language.Error: 'created'

french

simonw commented Jul 14, 2023

The Error: 'created' failure is because this API doesn't return the created field. Need to fix that.
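
One possible fix is something like this (a sketch with a hypothetical helper; it falls back to the current Unix timestamp when an OpenAI-compatible backend omits the field):

import time

def created_timestamp(first_chunk: dict) -> int:
    # Use the chunk's "created" value when present; LocalAI omits it,
    # so fall back to the current Unix timestamp
    return first_chunk.get("created") or int(time.time())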

simonw commented Jul 14, 2023

Idea: a command which hits the models API for a custom endpoint and writes out a cached file recording those models, so they can show up automatically as registered models.

Example from LocalAI:

curl http://localhost:8080/v1/models | jq
{
  "object": "list",
  "data": [
    {
      "id": "ggml-gpt4all-j",
      "object": "model"
    },
    {
      "id": "orca-mini-3b.ggmlv3",
      "object": "model"
    }
  ]
}
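
A sketch of what that command could do (hypothetical helper; assumes the requests library and a cache file somewhere in the llm user directory):

import json
import pathlib
import requests

def cache_openai_compatible_models(api_base: str, cache_path: pathlib.Path) -> list:
    # Fetch the model list from an OpenAI-compatible endpoint and cache
    # it to disk so later invocations can skip the network round-trip
    response = requests.get(api_base.rstrip("/") + "/v1/models", timeout=5)
    response.raise_for_status()
    models = response.json()["data"]
    cache_path.write_text(json.dumps(models, indent=2))
    return [model["id"] for model in models]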

simonw commented Jul 14, 2023

Maybe it's OK to hit that endpoint every time the LLM command runs rather than messing around with caching.

I don't want to hit that endpoint URL on localhost every time I use llm for regular ChatGPT though.

simonw commented Jul 15, 2023

Got this working.

In /Users/simon/Library/Application Support/io.datasette.llm/extra-openai-models.yaml:

- model_id: orca-openai-compat
  model_name: orca-mini-3b.ggmlv3
  api_base: "http://localhost:8080"

Then:

llm -m orca-openai-compat '3 names for a pet cow'
 I can do that! Here are three different names for a pet cow: 
1. Milo 2. Daisy 3. Max
llm -c '2 more with descriptions'
 Thank you for your prompt service! Here are two more options for a pet cow's name:

1. Lily - She's gentle and kind, just like a lily.
2. Thunder - He's strong and fierce, just like thunderstorms on a summer day.
llm logs -n 1
[
  {
    "id": "01h5d4nthj4ncdntz2ap56ffz5",
    "model": "orca-openai-compat",
    "prompt": "2 more with descriptions",
    "system": null,
    "prompt_json": {
      "messages": [
        {
          "role": "user",
          "content": "3 names for a pet cow"
        },
        {
          "role": "assistant",
          "content": " I can do that! Here are three different names for a pet cow: \n1. Milo 2. Daisy 3. Max"
        },
        {
          "role": "user",
          "content": "2 more with descriptions"
        }
      ]
    },
    "options_json": {},
    "response": " Thank you for your prompt service! Here are two more options for a pet cow's name:\n\n1. Lily - She's gentle and kind, just like a lily.\n2. Thunder - He's strong and fierce, just like thunderstorms on a summer day.",
    "response_json": {
      "content": " Thank you for your prompt service! Here are two more options for a pet cow's name:\n\n1. Lily - She's gentle and kind, just like a lily.\n2. Thunder - He's strong and fierce, just like thunderstorms on a summer day.",
      "role": "assistant",
      "finish_reason": "stop",
      "object": "chat.completion.chunk",
      "model": "orca-mini-3b.ggmlv3"
    },
    "conversation_id": "01h5d4my74mqyjxc24fhcf86ry",
    "duration_ms": 8729,
    "datetime_utc": "2023-07-15T16:03:17.655636",
    "conversation_name": "3 names for a pet cow",
    "conversation_model": "orca-openai-compat"
  }
]

simonw commented Jul 15, 2023

Needs documentation and tests.

simonw closed this as completed in e2072f7 Jul 15, 2023
simonw added this to the 0.6 milestone Jul 15, 2023
simonw added a commit that referenced this issue Jul 18, 2023