
[Roadmap]: Roadmap about Enhanced non-OpenAI models #2946

Open
6 of 9 tasks
qingyun-wu opened this issue Jun 14, 2024 · 18 comments
Labels
roadmap Issues related to roadmap of AutoGen

Comments

@qingyun-wu
Collaborator

qingyun-wu commented Jun 14, 2024

One major milestone in releases v0.2.30-v0.2.32 will be enhanced support for non-OpenAI models.

Plan for the next release v0.2.32:

Tasks about enhanced non-OpenAI Model Support

@marklysze, @Hk669, @yiranwu0, feel free to add tasks to the task list.

💡 Feel free to suggest any other pressing features you want to see, issues you hope to be addressed, or PRs to be merged in the next release!

And let's make it happen together 🏆💪!

Finished tasks in releases v0.2.30 and v0.2.31

Tasks about enhanced non-OpenAI Model Support

@Hk669
Collaborator

Hk669 commented Jun 14, 2024

thanks @qingyun-wu.

@Hk669 Hk669 added the roadmap Issues related to roadmap of AutoGen label Jun 14, 2024
@Josephrp
Collaborator

May I suggest the 01.AI Yi model family APIs? I just got the docs, and bringing them to AutoGen was actually already on my list :-) They are OpenAI drop-in compatible already.

@Josephrp
Collaborator

Also, the Cohere API has high potential since they have a performant function-calling model, among other interesting offerings. Just a thought.

@qingyun-wu
Collaborator Author

May I suggest the 01.AI Yi model family APIs? I just got the docs, and bringing them to AutoGen was actually already on my list :-) They are OpenAI drop-in compatible already.

Good ideas! Let's see if we can find volunteers to add those!

@qingyun-wu qingyun-wu pinned this issue Jun 14, 2024
@PanQiWei

Hi, I find this roadmap is mainly about client implementations. Here is an optimization suggestion I would very much appreciate if you could also implement in 0.2.30 🙏

Suggestion

Implement an object pool to cache clients, to avoid repeatedly instantiating clients with the same key.

Reason

This really hurts agents' init speed when there are tons of agents and tools. Especially for tools/functions: every time a tool is registered to an LLM, a new OpenAIWrapper is created to update the config. That would be fine if only the request config (payload) were updated, but in the current implementation this always creates a new client such as openai.OpenAI, because _register_default_client is called whenever an OpenAIWrapper is initialized, no matter what.

Below is a screenshot of my project's agent-init profiling result; as you can see, it took up to ~3 s in total to initialize all agents when tools are registered and an llm_config is provided (even though they are all the same for each agent).
[screenshot: agent initialization profiling result]

The root cause is load_verify_locations in the ssl package, which httpx uses under the hood in the openai client. Thus, if a caching mechanism (such as an object pool) were implemented at the client level, it would greatly speed up agent initialization for projects that use many agents and tools at the same time, and make production deployment truly feasible.

[screenshot: profiling trace showing the load_verify_locations cost]
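As a lighter-weight variant of the same idea, here is a hedged sketch (not from this thread): the openai v1 SDK accepts an http_client argument, so reusing one httpx.Client across all OpenAI instances builds the SSL context only once. The helper name below is illustrative.

# Hedged sketch: share a single httpx.Client so the SSL context (load_verify_locations)
# is created once per process instead of once per openai.OpenAI instance.
import httpx
from openai import OpenAI

_shared_http_client = httpx.Client()  # created once, reused by every client below


def make_openai_client(**config) -> OpenAI:
    # Illustrative helper; config holds the usual api_key / base_url / etc. kwargs.
    return OpenAI(http_client=_shared_http_client, **config)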

@PanQiWei

Here is my simple implementation for caching clients; hope it helps:

import json
import logging
import sys
from hashlib import md5
from typing import Any, Dict
from threading import Lock

from autogen import OpenAIWrapper
from autogen.oai.client import PlaceHolderClient
from flaml.automl.logger import logger_formatter

from omne._types import ThreadLevelSingleton  # project-specific per-thread singleton base class

logger = logging.getLogger(__name__)
if not logger.handlers:
    # Add the console handler.
    _ch = logging.StreamHandler(stream=sys.stdout)
    _ch.setFormatter(logger_formatter)
    logger.addHandler(_ch)


def _config_to_key(config: Dict[str, Any]) -> str:
    return md5(json.dumps(config, sort_keys=True).encode()).hexdigest()


class ClientCache(ThreadLevelSingleton):
    def __init__(self):
        self._client_creation_lock = Lock()

        self._oai_clients = {}
        self._aoai_clients = {}
        self._google_clients = {}

    def _get_client(self, cache: dict, config: Dict[str, Any], client_class: Any):
        key = _config_to_key(config)
        if key not in cache:
            with self._client_creation_lock:
                if key not in cache:
                    cache[key] = client_class(**config)
        return cache[key]

    def create_or_get_oai_client(self, config: Dict[str, Any]):
        from autogen.oai.client import OpenAIClient
        from openai import OpenAI
        return OpenAIClient(client=self._get_client(self._oai_clients, config, OpenAI).copy())

    def create_or_get_aoai_client(self, config: Dict[str, Any]):
        from autogen.oai.client import OpenAIClient
        from openai import AzureOpenAI
        return OpenAIClient(client=self._get_client(self._aoai_clients, config, AzureOpenAI).copy())

    def create_or_get_google_client(self, config: Dict[str, Any]):
        try:
            from autogen.oai.gemini import GeminiClient
        except ImportError as e:
            raise ImportError("Please install `google-generativeai` to use the Google Gemini API.") from e
        return self._get_client(self._google_clients, config, GeminiClient)


def _register_default_client(self, config: Dict[str, Any], openai_config: Dict[str, Any]) -> None:
    client_cache = ClientCache()

    openai_config = {**openai_config, **{k: v for k, v in config.items() if k in self.openai_kwargs}}
    api_type = config.get("api_type")
    model_client_cls_name = config.get("model_client_cls")
    if model_client_cls_name is not None:
        # a config for a custom client is set
        # adding placeholder until the register_model_client is called with the appropriate class
        self._clients.append(PlaceHolderClient(config))
        logger.info(
            f"Detected custom model client in config: {model_client_cls_name}, model client can not be used until register_model_client is called."
        )
    else:
        if api_type is not None and api_type.startswith("azure"):
            self._configure_azure_openai(config, openai_config)
            self._clients.append(client_cache.create_or_get_aoai_client(openai_config))
        elif api_type is not None and api_type.startswith("google"):
            self._clients.append(client_cache.create_or_get_google_client(openai_config))
        else:
            self._clients.append(client_cache.create_or_get_oai_client(openai_config))


def patch_openai_wrapper():
    OpenAIWrapper._register_default_client = _register_default_client


__all__ = ["patch_openai_wrapper"]

@qingyun-wu
Collaborator Author


Thanks @PanQiWei! This looks great! I wonder if you would like to contribute, or help to review/test it if we find contributors? We can chat in more detail on Discord: https://discord.com/invite/Yb5gwGVkE5. Thank you!

@Phodaie

Phodaie commented Jun 18, 2024

Regarding the Cohere Command R and Command R+ models, I have implemented a basic CohereAgent(ConversableAgent). Just as with GPTAssistantAgent, I think Cohere model support should take the form of a ConversableAgent extension (not a ModelClient). These models support parallel and sequential function calling, so a single prompt may result in the model calling multiple (dependent) functions/tools in sequence before returning its response.
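For context, a minimal sketch of what such a ConversableAgent extension could look like, assuming the cohere Python SDK's chat call and AutoGen's register_reply hook. This is not the CohereAgent referenced above; the class name, defaults, and single-turn handling are illustrative, and it omits the parallel/sequential tool-calling logic mentioned in the comment.

# Hedged sketch of a ConversableAgent subclass backed by Cohere's chat API.
# Illustrative only; not the implementation referenced in this comment.
import os

import cohere
from autogen import Agent, ConversableAgent


class CohereChatAgent(ConversableAgent):
    def __init__(self, name: str, model: str = "command-r-plus", **kwargs):
        super().__init__(name=name, llm_config=False, **kwargs)  # skip the default OpenAI client
        self._co = cohere.Client(api_key=os.environ["COHERE_API_KEY"])
        self._model = model
        # Route incoming messages to the Cohere-backed reply function.
        self.register_reply([Agent, None], CohereChatAgent._generate_cohere_reply)

    def _generate_cohere_reply(self, messages=None, sender=None, config=None):
        messages = messages or self._oai_messages[sender]
        # Send only the latest user message; a real agent would map the full history.
        response = self._co.chat(model=self._model, message=messages[-1]["content"])
        return True, response.text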

@geoffroy-noel-ddh

geoffroy-noel-ddh commented Jun 18, 2024

Better support for local models would be appreciated. See the issues reported in #2953. At a minimum, indicate in the documentation which examples have been successfully tested with local models. That would save a lot of time for developers new to AutoGen who are trying to understand why the examples work so differently once they use something other than OpenAI. If an example doesn't work, being upfront about the limitations would greatly help; if it works, saying which models it has been successfully tested with would also save users a lot of time and effort.

I think this is a common problem with many frameworks (e.g. LangChain). There are plenty of tutorials, examples, prompts, etc. designed primarily, and often tested exclusively, with OpenAI services, but assessing whether they work sufficiently well with local models (or how to make them work, or whether anyone has ever managed to make them work) requires a lot of experimentation, online searching, etc., which can quickly go beyond the resources of smaller development teams.

@qingyun-wu
Collaborator Author

Can we also address this issue in this release: #1262
@yiranwu0 , @Hk669, @marklysze Thanks!

@scruffynerf

Add #2929 and #2930

@Hk669
Collaborator

Hk669 commented Jun 19, 2024

Add #2929 and #2930

Thanks, I think the Anthropic client will close these issues.

@scruffynerf

scruffynerf commented Jun 19, 2024

Instructor clients? Instructor needs a custom client even when using the OpenAI API, because it wraps the calls and enforces the response model so it can re-request multiple times until it succeeds (or fails N times).
So while it might be possible to support it via a non-client method (you'd have to hook in multiple places and reproduce Instructor-ish behavior à la Guidance, but much better than that), allowing Instructor to be used as a client is much easier.

I have code for this to submit.
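For readers unfamiliar with Instructor, here is a hedged sketch of the wrap-and-retry pattern it provides, independent of AutoGen. This is not the client code mentioned above; the Pydantic model, prompt, and model name are illustrative.

# Hedged sketch of Instructor's retry-until-valid pattern, outside of AutoGen.
# The response_model, prompt, and model name are illustrative.
import instructor
from openai import OpenAI
from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int


client = instructor.from_openai(OpenAI())  # wraps the OpenAI client with validation and retries

user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserInfo,  # Instructor validates the reply against this schema
    max_retries=3,            # and re-requests up to N times on validation failure
    messages=[{"role": "user", "content": "John is 30 years old."}],
)
print(user.name, user.age)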

@scruffynerf

I also did an Ollama Raw client, though it's not really worth the effort (I did it for Mistral v0.3 tools, but it works fine with my 'toolsfortoolless' code without Raw). I'll probably put it someplace regardless, just so it's out there.

@garnermccloud
Contributor

It might be worth exploring the use of LiteLLM in AutoGen to see if we can offload the non-OpenAI model support to a dedicated library:
https://github.com/BerriAI/litellm

Has anyone looked into this yet? Is there functionality specific to AutoGen that isn't supported in LiteLLM?
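For what it's worth, one way this already works today is through the LiteLLM proxy's OpenAI-compatible endpoint, which AutoGen's existing OpenAI client can talk to. A hedged sketch follows; the base_url, model alias, and api_key value are placeholders for whatever the proxy is configured to serve.

# Hedged sketch: pointing AutoGen's OpenAI-compatible client at a LiteLLM proxy.
# base_url, model alias, and api_key are placeholders.
from autogen import ConversableAgent

llm_config = {
    "config_list": [
        {
            "model": "my-litellm-model",             # alias configured in the LiteLLM proxy
            "base_url": "http://localhost:4000/v1",  # proxy address (placeholder)
            "api_key": "not-needed",                 # the proxy holds the provider credentials
        }
    ]
}

assistant = ConversableAgent("assistant", llm_config=llm_config)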

@scruffynerf

scruffynerf commented Jun 20, 2024

It might be worth exploring the use of LiteLLM in AutoGen to see if we can offload the non-OpenAI model support to a dedicated library...

That doesn't really address the problem of wanting other clients supported directly. Yes, LiteLLM as a wrapper can work for some, and recommending it is fine, but it's not an answer for everyone. We already have some 'tweaks' in various clients to allow adjusting things as needed; using a 'universal wrapper' means you can't tune that way. If it could, we could just adjust the OpenAI wrapper we already use in the majority of cases.

I go back and forth between using Ollama and LLMStudio, and I've played with other local servers. Each has pros and cons. The same is true of wrappers like LiteLLM; there are tradeoffs. LiteLLM as a library gives you a translation from a common format to various provider-specific formats, but you also lose the ability to do those tweaks.

OpenAI API support is the most common, but not universal, API, and adding additional APIs with 'in repo' supported clients is a good thing, because there will always be other flavors of API out there.

I wouldn't be opposed to seeing a LiteLLM-based client, to be clear; I just don't want it to be 'the answer'.

@marklysze
Collaborator

I was thinking that this roadmap could cover the cloud-based inference providers.

Separately, I think it would be good to have a local-LLM-focused blog post and associated PRs on a roadmap. That could cover client classes for the likes of LiteLLM / Ollama / etc., as well as approaches/classes like @scruffynerf's "toolsfortoolless". Local LLMs are an area I started out in and, like @geoffroy-noel-ddh noted, found frustrating when trying to figure out the right LLM for the right setup. If that's something people want to work on, let's create it.

@qingyun-wu qingyun-wu changed the title [Roadmap]: v0.2.30 roadmap [Roadmap]: v0.2.32 roadmap Jun 22, 2024
@qingyun-wu qingyun-wu changed the title [Roadmap]: v0.2.32 roadmap [Roadmap]: Roadmap about Enhanced non-OpenAI models in v0.2.30, v0.2.31, and v0.2.32 Jun 22, 2024
@qingyun-wu qingyun-wu changed the title [Roadmap]: Roadmap about Enhanced non-OpenAI models in v0.2.30, v0.2.31, and v0.2.32 [Roadmap]: Roadmap about Enhanced non-OpenAI models Jun 22, 2024
@brycecf

brycecf commented Jun 24, 2024

It might be worth exploring the use of LiteLLM in AutoGen to see if we can offload the non-OpenAI model support to a dedicated library: https://github.com/BerriAI/litellm

Has anyone looked into this yet? Is there functionality specific to AutoGen that isn't supported in LiteLLM?

LiteLLM does not actually resolve the underlying issue that AutoGen is implemented assuming an OpenAI/GPT-style valid conversation flow; LiteLLM just creates an API proxy.

At least that was the case prior to the Anthropic PRs. I haven't tested since then to see whether it consistently works now (with LiteLLM).
