# feat(AI): Create abstractions for generic LLM calls within sentry (#68771)

### Background

For the User Feedback Spam Detection feature (#61372), we intend to use an LLM. Along with other use cases such as suggested fix and code integration, there is a need across the Sentry codebase to be able to call LLMs. Because Sentry is self-hosted, because some features use different LLMs, and because we want to provide modularity, we need to be able to configure different LLM providers, models, and use cases.

### Solution

- We define an options-based config for providers and use cases, where you can specify an LLM provider's options and settings.
- For each use case, you then define which LLM provider it uses and which model.

Within `sentry/llm` we define a `providers` module, which consists of a base class and implementations. To start, we have OpenAI, Google Vertex, and a Preview implementation used for testing. These use the provider options to initialize a client and connect to the LLM provider. The providers inherit from `LazyServiceWrapper`.

Also within `sentry/llm`, we define a `usecases` module, which simply consists of a `complete_prompt` function along with an enum of use cases. Each use case's options are passed to the LLM provider and can be configured via the option described above.

### Testing

I've added unit tests which mock the LLM calls, and I've tested in my local environment that calls to the actual services work.

### In practice

To use an LLM, you take the following steps:

1. Define your use case in the [usecase enum](https://github.com/getsentry/sentry/blob/a4e7a0e4af8c09a1d4007a3d7c53b71a2d4db5ff/src/sentry/llm/usecases/__init__.py#L14).
2. Call the `complete_prompt` function with your `usecase`, prompt, message, temperature, and max output tokens (see the sketch below).
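Putting those two steps together, a minimal usage sketch (the prompt and message strings are placeholders, and this assumes a provider and the `example` use case are already configured via the LLM options):

```python
from sentry.llm.usecases import LLMUseCase, complete_prompt

# Placeholder prompt/message; assumes `llm.provider.options` and
# `llm.usecases.options` are configured for the "example" use case.
response = complete_prompt(
    usecase=LLMUseCase.EXAMPLE,
    prompt="You are a helpful assistant that classifies user feedback.",
    message="This is the user-submitted text to evaluate.",
    temperature=0.2,  # normalized range [0, 1]
    max_output_tokens=256,
)
```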
### Limitations

Because each LLM currently has a different interface, this solution does not support provider-specific features such as OpenAI's "function calling", where the output is guaranteed to be in a specific JSON format. Advanced use cases beyond a simple prompt plus message with a single output are not currently supported. It is likely possible to add support for these on a case-by-case basis. LLM providers have not yet standardized on a consistent API, which makes supporting these somewhat difficult. Third parties have come up with various solutions: [LangChain](https://github.com/langchain-ai/langchain), [LiteLLM](https://github.com/BerriAI/litellm), [LocalAI](https://github.com/mudler/LocalAI), and [OpenRouter](https://openrouter.ai/). It will probably make sense to adopt one of these tools, or our own advanced tooling, once our use cases outgrow this solution.

There is also a possible future where we want different use cases to use different API keys, but for now, one provider has only one set of credentials.

### TODO

- [ ] Create develop docs for how to add a use case or a new LLM provider
- [x] Follow-up PR to replace suggested fix OpenAI calls with the new abstraction
- [ ] PR in getsentry to set provider / usecase values for SaaS
- [ ] Follow-up PR to add telemetry information
- [ ] We'll likely want to support streaming responses

---------

Co-authored-by: Michelle Zhang <56095982+michellewzhang@users.noreply.github.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
1 parent (729b942), commit 32f7e6f. Showing 21 changed files with 540 additions and 45 deletions.
Three files were deleted in this commit; their contents are not shown.
New module `sentry.llm.exceptions`:

class InvalidUsecaseError(ValueError):
    pass


class InvalidProviderError(ValueError):
    pass


class InvalidModelError(ValueError):
    pass


class InvalidTemperature(ValueError):
    pass
New module `sentry.llm.providers.base`:

from sentry.llm.exceptions import InvalidModelError, InvalidProviderError
from sentry.llm.types import ProviderConfig, UseCaseConfig
from sentry.utils.services import Service


class LlmModelBase(Service):
    def __init__(self, provider_config: ProviderConfig) -> None:
        self.provider_config = provider_config

    def complete_prompt(
        self,
        *,
        usecase_config: UseCaseConfig,
        prompt: str,
        message: str,
        temperature: float,
        max_output_tokens: int,
    ) -> str | None:
        self.validate_model(usecase_config["options"]["model"])

        return self._complete_prompt(
            usecase_config=usecase_config,
            prompt=prompt,
            message=message,
            temperature=temperature,
            max_output_tokens=max_output_tokens,
        )

    def _complete_prompt(
        self,
        *,
        usecase_config: UseCaseConfig,
        prompt: str,
        message: str,
        temperature: float,
        max_output_tokens: int,
    ) -> str | None:
        raise NotImplementedError

    def validate_model(self, model_name: str) -> None:
        if "models" not in self.provider_config:
            raise InvalidProviderError(f"No models defined for provider {self.__class__.__name__}")

        if model_name not in self.provider_config["models"]:
            raise InvalidModelError(f"Invalid model: {model_name}")
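For context (not part of this diff), a hedged sketch of how a new provider would plug into this base class: subclass `LlmModelBase`, set a `provider_name`, and implement `_complete_prompt`. The `EchoProvider` name and its behavior are hypothetical; a real provider would also need to be registered in `SENTRY_LLM_SERVICE_ALIASES` in the usecases module.

```python
from sentry.llm.providers.base import LlmModelBase
from sentry.llm.types import UseCaseConfig


class EchoProvider(LlmModelBase):
    """Hypothetical provider used only to illustrate the interface."""

    provider_name = "echo"

    def _complete_prompt(
        self,
        *,
        usecase_config: UseCaseConfig,
        prompt: str,
        message: str,
        temperature: float,
        max_output_tokens: int,
    ) -> str | None:
        # A real provider would call its LLM API here, using
        # self.provider_config["options"] for credentials and settings.
        # This placeholder just echoes the input, naively truncated.
        return f"{prompt} {message}"[:max_output_tokens]
```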
New module `sentry.llm.providers.openai`:

from openai import OpenAI

from sentry.llm.providers.base import LlmModelBase
from sentry.llm.types import UseCaseConfig


class OpenAIProvider(LlmModelBase):

    provider_name = "openai"

    def _complete_prompt(
        self,
        *,
        usecase_config: UseCaseConfig,
        prompt: str,
        message: str,
        temperature: float,
        max_output_tokens: int,
    ) -> str | None:
        model = usecase_config["options"]["model"]
        client = get_openai_client(self.provider_config["options"]["api_key"])

        response = client.chat.completions.create(
            model=model,
            # OpenAI's temperature range is [0.0, 2.0], so multiply our
            # normalized [0, 1] value by two.
            temperature=temperature * 2,
            messages=[
                {"role": "system", "content": prompt},
                {
                    "role": "user",
                    "content": message,
                },
            ],
            stream=False,
            max_tokens=max_output_tokens,
        )

        return response.choices[0].message.content


openai_client: OpenAI | None = None


class OpenAIClientSingleton:
    _instance = None
    client: OpenAI

    def __init__(self) -> None:
        raise RuntimeError("Call instance() instead")

    @classmethod
    def instance(cls, api_key: str) -> "OpenAIClientSingleton":
        if cls._instance is None:
            cls._instance = cls.__new__(cls)
            cls._instance.client = OpenAI(api_key=api_key)
        return cls._instance


def get_openai_client(api_key: str) -> OpenAI:
    return OpenAIClientSingleton.instance(api_key).client
New module `sentry.llm.providers.preview`:

from sentry.llm.providers.base import LlmModelBase
from sentry.llm.types import UseCaseConfig


class PreviewLLM(LlmModelBase):
    """
    A dummy LLM provider that does not actually send any requests to any LLM API.
    """

    provider_name = "preview"

    def _complete_prompt(
        self,
        *,
        usecase_config: UseCaseConfig,
        prompt: str,
        message: str,
        temperature: float = 0.7,
        max_output_tokens: int = 1000,
    ) -> str | None:
        return ""
New module `sentry.llm.providers.vertex`:

import logging

import google.auth
import google.auth.transport.requests
import requests

from sentry.llm.providers.base import LlmModelBase
from sentry.llm.types import UseCaseConfig

logger = logging.getLogger(__name__)


class VertexProvider(LlmModelBase):
    """
    A provider for Google Vertex AI. Uses default service account credentials.
    """

    provider_name = "vertex"
    candidate_count = 1  # we only want one candidate returned at the moment
    top_p = 1  # TODO: make this configurable?

    def _complete_prompt(
        self,
        *,
        usecase_config: UseCaseConfig,
        prompt: str,
        message: str,
        temperature: float,
        max_output_tokens: int,
    ) -> str | None:
        payload = {
            "instances": [{"content": f"{prompt} {message}"}],
            "parameters": {
                "candidateCount": self.candidate_count,
                "maxOutputTokens": max_output_tokens,
                "temperature": temperature,
                "topP": self.top_p,
            },
        }

        headers = {
            "Authorization": f"Bearer {self._get_access_token()}",
            "Content-Type": "application/json",
        }
        vertex_url = self.provider_config["options"]["url"]
        vertex_url += usecase_config["options"]["model"] + ":predict"

        response = requests.post(vertex_url, headers=headers, json=payload)

        if response.status_code == 200:
            logger.info("Request successful.")
        else:
            logger.info(
                "Request failed with status code and response text.",
                extra={"status_code": response.status_code, "response_text": response.text},
            )

        return response.json()["predictions"][0]["content"]

    def _get_access_token(self) -> str:
        # https://stackoverflow.com/questions/53472429/how-to-get-a-gcp-bearer-token-programmatically-with-python
        creds, _ = google.auth.default()
        creds.refresh(google.auth.transport.requests.Request())
        return creds.token
New module `sentry.llm.types`:

from typing import TypedDict


class ProviderConfig(TypedDict):
    options: dict[str, str]
    models: list[str]


class UseCaseConfig(TypedDict):
    provider: str
    options: dict[str, str]
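As a hedged illustration, values satisfying these TypedDicts might look like the following. The model names and the API key are placeholders, while the keys (`api_key`, `models`, `provider`, `model`) match what `OpenAIProvider` and `validate_model` read.

```python
from sentry.llm.types import ProviderConfig, UseCaseConfig

# Illustrative values only; in practice these come from the
# `llm.provider.options` and `llm.usecases.options` settings.
openai_provider_config: ProviderConfig = {
    "options": {"api_key": "sk-placeholder"},  # read by OpenAIProvider
    "models": ["gpt-3.5-turbo", "gpt-4"],  # allow-list checked by validate_model
}

suggested_fix_usecase_config: UseCaseConfig = {
    "provider": "openai",  # key in SENTRY_LLM_SERVICE_ALIASES
    "options": {"model": "gpt-3.5-turbo"},  # must be in the provider's models list
}
```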
New module `sentry.llm.usecases` (src/sentry/llm/usecases/__init__.py):

from enum import Enum

from sentry import options
from sentry.llm.exceptions import InvalidProviderError, InvalidTemperature, InvalidUsecaseError
from sentry.llm.providers.base import LlmModelBase
from sentry.llm.providers.openai import OpenAIProvider
from sentry.llm.providers.preview import PreviewLLM
from sentry.llm.providers.vertex import VertexProvider
from sentry.llm.types import ProviderConfig, UseCaseConfig

SENTRY_LLM_SERVICE_ALIASES = {
    "vertex": VertexProvider,
    "openai": OpenAIProvider,
    "preview": PreviewLLM,
}


class LLMUseCase(Enum):
    EXAMPLE = "example"  # used in tests / examples
    SUGGESTED_FIX = "suggestedfix"  # OG version of suggested fix


llm_provider_backends: dict[str, LlmModelBase] = {}


def get_llm_provider_backend(usecase: LLMUseCase) -> LlmModelBase:
    usecase_config = get_usecase_config(usecase.value)
    global llm_provider_backends

    if usecase_config["provider"] in llm_provider_backends:
        return llm_provider_backends[usecase_config["provider"]]

    if usecase_config["provider"] not in SENTRY_LLM_SERVICE_ALIASES:
        raise InvalidProviderError(f"LLM provider {usecase_config['provider']} not found")

    provider = SENTRY_LLM_SERVICE_ALIASES[usecase_config["provider"]]

    provider_config = get_provider_config(usecase_config["provider"])

    llm_provider_backends[usecase_config["provider"]] = provider(
        provider_config,
    )

    return llm_provider_backends[usecase_config["provider"]]


def complete_prompt(
    *,
    usecase: LLMUseCase,
    prompt: str,
    message: str,
    temperature: float = 0.5,
    max_output_tokens: int = 1000,
) -> str | None:
    """
    Complete a prompt with a message using the specified use case.

    Default temperature and max_output_tokens are set to hopefully reasonable
    values, but please consider what makes sense for your specific use case.

    Note that temperature should be between 0 and 1; we normalize it for
    providers that use a different range.
    """
    _validate_temperature(temperature)

    usecase_config = get_usecase_config(usecase.value)

    backend = get_llm_provider_backend(usecase)
    return backend.complete_prompt(
        usecase_config=usecase_config,
        prompt=prompt,
        message=message,
        temperature=temperature,
        max_output_tokens=max_output_tokens,
    )


def get_usecase_config(usecase: str) -> UseCaseConfig:
    usecase_options_all = options.get("llm.usecases.options")
    if not usecase_options_all:
        raise InvalidUsecaseError(
            "LLM usecase options not found. Please check llm.usecases.options"
        )

    if usecase not in usecase_options_all:
        raise InvalidUsecaseError(
            f"LLM usecase options not found for {usecase}. Please check llm.usecases.options"
        )

    return usecase_options_all[usecase]


def get_provider_config(provider: str) -> ProviderConfig:
    llm_provider_options_all = options.get("llm.provider.options")
    if not llm_provider_options_all:
        raise InvalidProviderError("LLM provider option value not found")
    if provider not in llm_provider_options_all:
        raise InvalidProviderError(f"LLM provider {provider} not found")
    return llm_provider_options_all[provider]


def _validate_temperature(temperature: float) -> None:
    if not (0 <= temperature <= 1):
        raise InvalidTemperature("Temperature must be between 0 and 1")