
Creating an LLM Handler


This page explains how to create a handler for a given LLM API/library.

First of all, read Creating handlers.

An LLM API can be in one of these three categories:

It uses APIs compatible with OpenAI

You can recognize that a service falls into this category if:

  • In the Python examples, they use the openai library
  • In the curl examples, they have endpoints ending in /chat/completions

In these cases, you can just extend the OpenAIHandler class and make minimal changes.

Generally, we don't add handlers for specific services that use OpenAI-compatible APIs, since you can easily use them by choosing the OpenAI option in settings and just changing the URL.

However, we might consider adding one anyway if the API is good and free or very cheap; OpenRouterHandler and GroqHandler are examples of this.

If you are building an extension, you can add whatever you want, and you can extend that class.

The API is already implemented by gpt4free

If the API is already implemented in gpt4free and it's working, you can consider creating a handler that extends G4FHandler.

It uses a totally different format

Then you need to code some methods yourself. Please refer to the documentation of LLMHandler inside the code to understand every method. Usually, you only need to do these steps (a minimal skeleton is sketched after this list):

  • Specify extra requirements
  • Specify extra settings
  • Convert Newelle history to the right format
  • Override the generate_text and generate_text_stream functions
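
If it helps to see the shape before diving in, here is a minimal skeleton of such a handler. This is a sketch only: the class name and pip package are placeholders, the import path for LLMHandler depends on where it lives in the codebase, and the method signatures follow the Ollama example later on this page.

from typing import Any, Callable
# Assumption: adjust this import to wherever LLMHandler is defined in the codebase
from .handler import LLMHandler

class MyAPIHandler(LLMHandler):
    key = "myapi"

    @staticmethod
    def get_extra_requirements() -> list:
        # Extra pip packages the user can install from the settings
        return ["myapi-client"]

    def get_extra_settings(self) -> list:
        # Settings shown to the user (API key, model, streaming toggle...)
        return []

    def convert_history(self, history: list, prompts: list | None = None) -> list:
        # Translate Newelle's history into the format your API expects
        return []

    def generate_text(self, prompt: str, history: list[dict[str, str]] = [], system_prompt: list[str] = []) -> str:
        # Single non-streaming completion
        return ""

    def generate_text_stream(self, prompt: str, history: list[dict[str, str]] = [], system_prompt: list[str] = [], on_update: Callable[[str], Any] = lambda _: None, extra_args: list = []) -> str:
        # Streaming completion: call on_update with the partial message as chunks arrive
        return ""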

Vision

Vision is the ability of the LLM to receive and analyze images.

If a handler has support for vision, you have to override the supports_vision method. The method is not static, so you can also return the value based on the user settings (e.g. the model):

    def supports_vision(self) -> bool:
        """ Return if the LLM supports receiving images"""
        return True
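
For example, a handler could enable vision only when the selected model is multimodal (a sketch; the model names here are hypothetical):

    def supports_vision(self) -> bool:
        # Hypothetical check: only some models are multimodal
        return self.get_setting("model") in ("llava", "llama3.2-vision")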

When a user sends an image to the model, it's treated like a normal message but it starts with an image codeblock.

The content of the codeblock can be:

  • The path to the image
  • A base64 encoded image (if it's pasted)

To make extracting the image simpler, you can use the extra.extract_image function to split a message into its text and image. You can see an example in the Ollama handler tutorial below.
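
In short, the usage looks like this (a fragment taken from the Ollama handler below; see that section for the full context):

from .extra import extract_image

image, text = extract_image(message["Message"])
# image is the path or the base64 string (or None if the message has no image)
# text is the message content without the image codeblock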

Examples

Hyperbolic

We will now build a handler for hyperbolic.xyz. From the documentation, we can clearly see that this service uses OpenAI-compatible APIs.

  1. So we create a handler that extends OpenAIHandler:
class HyperbolicHandler(OpenAIHandler):
  2. Now, we specify the API endpoint (the base_url parameter) in the constructor of the class:
class HyperbolicHandler(OpenAIHandler):
    def __init__(self, settings, path):
        super().__init__(settings, path)
        self.set_setting("endpoint", "https://api.hyperbolic.xyz/v1")

If the API does not support parameters like temperature, max tokens, etc., also add

self.set_setting("advanced_params", False)

to disable them.

  3. Then, we change the settings to specify the relevant parameters. You might want to take a look at settings management.

Note: always add the streaming setting in OpenAIHandler subclasses.

    def get_extra_settings(self) -> list:
        plus = [
            {
                "key": "api",
                "title": _("API Key"),
                "description": _("API Key for Hyperbolic"),
                "type": "entry",
                "default": ""
            },
            {
                "key": "model",
                "title": _("Hyperbolic Model"),
                "description": _("Name of the Hyperbolic Model"),
                "type": "entry",
                "default": "meta-llama/Meta-Llama-3.1-70B-Instruct",
                "website": "https://app.hyperbolic.xyz/models",
            }, 
        ]
        plus += [super().get_extra_settings()[3]]
        return plus

Where plus += [super().get_extra_settings()[3]] adds the streaming setting.
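
Since indexing the parent settings by position is fragile if their order ever changes, an equivalent alternative (assuming, as in all the examples on this page, that every setting dict carries a "key" field) is to filter by key:

plus += [s for s in super().get_extra_settings() if s["key"] == "streaming"]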

Full code:

class HyperbolicHandler(OpenAIHandler):
    key = "hyperbolic"

    def __init__(self, settings, path):
        super().__init__(settings, path)
        self.set_setting("endpoint", "https://api.hyperbolic.xyz/v1/")
        self.set_setting("advanced_params", False)

    def get_extra_settings(self) -> list:
        plus = [
            {
                "key": "api",
                "title": _("API Key"),
                "description": _("API Key for Hyperbolic"),
                "type": "entry",
                "default": ""
            },
            {
                "key": "model",
                "title": _("Hyperbolic Model"),
                "description": _("Name of the Hyperbolic Model"),
                "type": "entry",
                "default": "meta-llama/Meta-Llama-3.1-70B-Instruct",
                "website": "https://app.hyperbolic.xyz/models",
            }, 
        ]
        plus += [super().get_extra_settings()[3]]
        return plus

DuckDuckGo

DuckDuckGo falls in the category of API already implemented by gpt4free.

  1. We will create a handler that extends G4FHandler:
class DDGHandler(G4FHandler):
    key = "ddg" 
  2. In the constructor, we create a g4f client that only uses the DuckDuckGo provider:
    def __init__(self, settings, path):
        import g4f
        super().__init__(settings, path)
        self.client = g4f.client.Client(provider=g4f.Provider.DDG)        
  3. We add the extra settings, in which we specify the available models and the default one. You might want to take a look at settings management.
    def get_extra_settings(self) -> list:
        return [
            {
                "key": "model",
                "title": _("Model"),
                "description": _("The model to use"),
                "type": "combo",
                "values": self.get_model(),
                "default": "gpt-4o-mini",
            }
        ] + super().get_extra_settings()

    def get_model(self):
        import g4f
        res = tuple()
        for model in g4f.Provider.DDG.models:
            res += ((model, model), )
        return res

Note: always add the streaming setting in G4FHandler subclasses; you can add it by simply appending super().get_extra_settings().

Full code:

class DDGHandler(G4FHandler):
    key = "ddg" 
    
    def __init__(self, settings, path):
        import g4f
        super().__init__(settings, path)
        self.client = g4f.client.Client(provider=g4f.Provider.DDG)

    def get_extra_settings(self) -> list:
        return [
            {
                "key": "model",
                "title": _("Model"),
                "description": _("The model to use"),
                "type": "combo",
                "values": self.get_model(),
                "default": "gpt-4o-mini",
            }
        ] + super().get_extra_settings()

    def get_model(self):
        import g4f
        res = tuple()
        for model in g4f.Provider.DDG.models:
            res += ((model, model), )
        return res

Ollama

We will now build a handler for Ollama using the ollama Python library.

  1. This library is not compatible with any of the existing handlers, so we create a class that extends LLMHandler:
class OllamaHandler(LLMHandler):
    key = "ollama"
  2. The ollama library is not preinstalled in Newelle, so we add it to the extra requirements. The user will be able to install these requirements from the settings:
    @staticmethod
    def get_extra_requirements() -> list:
        return ["ollama"]
  3. This library supports vision and multimodal models, so we override the supports_vision method. Note: you can also check whether the handler's current settings support vision (for example, whether the selected model is multimodal):
    def supports_vision(self) -> bool:
        return True
  4. We create a convert_history method. Newelle saves the history in a very different format, so we have to convert it. We also add vision support here. Note: messages with images are normal user messages, but they start with an image codeblock:
from .extra import extract_image
    def convert_history(self, history: list, prompts: list | None = None) -> list:
        if prompts is None:
            prompts = self.prompts
        result = []
        result.append({"role": "system", "content": "\n".join(prompts)})
        for message in history:
            if message["User"] == "Console":
                result.append({
                    "role": "user",
                    "content": "Console: " + message["Message"]
                })
            else:
                # Extract text and image using a helper function
                image, text = extract_image(message["Message"])
                
                msg = {
                    "role": message["User"].lower() if message["User"] in {"Assistant", "User"} else "system",
                    "content": text
                }
                # If the image is not None, attach it to the message
                if message["User"] == "User" and image is not None:
                    if image.startswith("data:image/png;base64,"):
                        image = image[len("data:image/png;base64,"):]
                    msg["images"] = [image]
                result.append(msg)
        return result
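
To make the conversion concrete, here is what a (made-up) Newelle history looks like and roughly what the method produces for it; the exact image codeblock syntax is illustrative:

# Newelle history: a list of {"User": ..., "Message": ...} dicts
history = [
    {"User": "User", "Message": "```image\n/tmp/cat.png\n```\nWhat is in this image?"},
    {"User": "Assistant", "Message": "A cat."},
]
# convert_history(history) would yield something like:
# [{"role": "system", "content": "..."},
#  {"role": "user", "content": "What is in this image?", "images": ["/tmp/cat.png"]},
#  {"role": "assistant", "content": "A cat."}]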

Note: you can use this method every time there is an OpenAI-like history.

  5. We specify the settings that the user can change:

    def get_extra_settings(self) -> list:
        return [ 
            {
                "key": "endpoint",
                "title": _("API Endpoint"),
                "description": _("API base url, change this to use interference APIs"),
                "type": "entry",
                "default": "http://localhost:11434"
            },
            {
                "key": "model",
                "title": _("Ollama Model"),
                "description": _("Name of the Ollama Model"),
                "type": "entry",
                "default": "llama3.1:8b"
            },
            {
                "key": "streaming",
                "title": _("Message Streaming"),
                "description": _("Gradually stream message output"),
                "type": "toggle",
                "default": True
            },
        ]

Note: the streaming setting is automatically used to decide whether streaming is enabled, so there is no need to override the stream_enabled method.

  6. We create a function to generate text without streaming:

    def generate_text(self, prompt: str, history: list[dict[str, str]] = [], system_prompt: list[str] = []) -> str:
        from ollama import Client
        messages = self.convert_history(history, system_prompt)
        messages.append({"role": "user", "content": prompt})

        client = Client(
            host=self.get_setting("endpoint")
        )
        try:
            response = client.chat(
                model=self.get_setting("model"),
                messages=messages,
            )
            return response["message"]["content"]
        except Exception as e:
            return str(e)
  7. We create a function to generate text with streaming:
from typing import Callable, Any
    def generate_text_stream(self, prompt: str, history: list[dict[str, str]] = [], system_prompt: list[str] = [], on_update: Callable[[str], Any] = lambda _: None, extra_args: list = []) -> str:
        from ollama import Client
        messages = self.convert_history(history, system_prompt)
        messages.append({"role": "user", "content": prompt})
        client = Client(
            host=self.get_setting("endpoint")
        )
        try:
            response = client.chat(
                model=self.get_setting("model"),
                messages=messages,
                stream=True
            )
            full_message = ""
            prev_message = ""
            for chunk in response:
                full_message += chunk["message"]["content"]
                args = (full_message.strip(), ) + tuple(extra_args)
                if len(full_message) - len(prev_message) > 1:
                    on_update(*args)
                    prev_message = full_message
            return full_message.strip()
        except Exception as e:
            return str(e)
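
To try the handler outside of Newelle, you could do something like the following sketch. It assumes you have the settings object and path that Newelle normally passes to the constructor:

handler = OllamaHandler(settings, path)
# Non-streaming generation
print(handler.generate_text("Hello!"))
# Streaming generation: print each partial message as it arrives
handler.generate_text_stream("Tell me a story", on_update=lambda msg: print(msg))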