feat(AI): Create abstractions for generic LLM calls within sentry (#68771)

### Background
For the User Feedback Spam Detection feature (#61372), we intend to use an LLM. Along with other use cases such as suggested fix and code integration, there is a need across the Sentry codebase to be able to call LLMs.

Because Sentry is self-hosted, different features may use different LLMs, and we want to provide modularity, so we need to be able to configure different LLM providers, models, and use cases.

### Solution:

- We define an options-based config for providers and use cases, where you can specify an LLM provider's options and settings.
- For each use case, you then define which LLM provider and model it uses (see the sketch below).
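
For illustration, the two options might look roughly like this; the option names match `get_provider_config` / `get_usecase_config` in the diff below, while the API key and model names here are made-up examples:

```python
# Illustrative values only; the shapes follow ProviderConfig / UseCaseConfig
# in src/sentry/llm/types.py. Replace the API key and model names with real ones.
llm_provider_options = {
    "openai": {
        "options": {"api_key": "<OPENAI_API_KEY>"},  # credentials for the provider
        "models": ["gpt-4", "gpt-3.5-turbo"],        # models this provider may serve
    },
    "preview": {
        "options": {},
        "models": ["stub-1.0"],
    },
}

llm_usecase_options = {
    "suggestedfix": {"provider": "openai", "options": {"model": "gpt-4"}},
    "example": {"provider": "preview", "options": {"model": "stub-1.0"}},
}
```

These values would be set as the `llm.provider.options` and `llm.usecases.options` Sentry options.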


Within `sentry/llm` we define a `providers` module, which consists of a base class and implementations. To start, we have OpenAI, Google Vertex, and a Preview implementation used for testing. These use the provider options to initialize a client and connect to the LLM provider. The provider implementations inherit from a common `LlmModelBase` service class.

Also within `sentry/llm`, we define a `usecases` module, which consists of a `complete_prompt` function along with an enum of use cases. Each use case's provider and model options are configured via the option described above and passed to the LLM provider on each call.


### Testing
I've added unit tests which mock the LLM calls, and I've tested in my
local environment that calls to the actual services work.
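
As a rough illustration of the approach (not the exact tests added in this PR), a test can point a use case at the no-op `preview` provider and stub the option lookups; the option values below are invented:

```python
# Sketch of a unit test; assumes pytest-style tests and stubs sentry.options.get
# for the two LLM options so no real provider is contacted.
from unittest.mock import patch

from sentry.llm.usecases import LLMUseCase, complete_prompt

FAKE_OPTIONS = {
    "llm.provider.options": {"preview": {"options": {}, "models": ["stub-1.0"]}},
    "llm.usecases.options": {"example": {"provider": "preview", "options": {"model": "stub-1.0"}}},
}


def test_complete_prompt_uses_preview_provider() -> None:
    with patch("sentry.llm.usecases.options.get", side_effect=FAKE_OPTIONS.__getitem__):
        result = complete_prompt(
            usecase=LLMUseCase.EXAMPLE,
            prompt="You are a helpful assistant.",
            message="Say hello.",
            temperature=0.2,
            max_output_tokens=32,
        )
    # PreviewLLM always returns an empty string, so nothing external is called.
    assert result == ""
```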


### In practice:
To use an LLM, you take the following steps:
1. Define your use case in the [usecase enum](https://github.com/getsentry/sentry/blob/a4e7a0e4af8c09a1d4007a3d7c53b71a2d4db5ff/src/sentry/llm/usecases/__init__.py#L14).
2. Call the `complete_prompt` function with your `usecase`, prompt, message, temperature, and max_output_tokens (a sketch follows below).
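
A sketch of step 2 (the prompt and message values here are invented for illustration):

```python
from sentry.llm.usecases import LLMUseCase, complete_prompt

suggestion = complete_prompt(
    usecase=LLMUseCase.SUGGESTED_FIX,
    prompt="You are an assistant that suggests fixes for Python exceptions.",
    message="TypeError: 'NoneType' object is not subscriptable in views.py:42",
    temperature=0.3,        # normalized to [0, 1]; providers rescale as needed
    max_output_tokens=512,
)
```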



### Limitations:

Because each LLM currently exposes a different interface, this solution does not support provider-specific features such as OpenAI's "function calling", where output is guaranteed to be in a specific JSON format. Advanced use cases beyond a simple prompt plus message with a single output are not currently supported. It is likely possible to add support for these on a case-by-case basis.

LLM providers have not yet standardized on a consistent API, which makes supporting these features somewhat difficult. Third parties have come up with various solutions:
[LangChain](https://github.com/langchain-ai/langchain),
[LiteLLM](https://github.com/BerriAI/litellm),
[LocalAI](https://github.com/mudler/LocalAI),
[OpenRouter](https://openrouter.ai/).

It will probably make sense eventually to adopt one of these tools, or build our own more advanced tooling, once our use cases outgrow this solution.

There is also a possible future where we want different use cases to use different API keys, but for now, each provider has a single set of credentials.



### TODO

- [ ] Create develop docs for how to add a use case or a new LLM provider
- [x] Follow-up PR to replace suggested fix OpenAI calls with the new abstraction
- [ ] PR in getsentry to set provider / usecase values for SaaS
- [ ] Follow-up PR to add telemetry information
- [ ] We'll likely want to support streaming responses.

---------

Co-authored-by: Michelle Zhang <56095982+michellewzhang@users.noreply.github.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
3 people authored Apr 19, 2024
1 parent 729b942 commit 32f7e6f
Showing 21 changed files with 540 additions and 45 deletions.
1 change: 1 addition & 0 deletions pyproject.toml
@@ -594,6 +594,7 @@ module = [
"sentry.eventstore.reprocessing.redis",
"sentry.issues.related.*",
"sentry.lang.java.processing",
"sentry.llm.*",
"sentry.migrations.*",
"sentry.nodestore.base",
"sentry.nodestore.bigtable.backend",
5 changes: 0 additions & 5 deletions src/sentry/conf/server.py
@@ -2328,11 +2328,6 @@ def custom_parameter_sort(parameter: dict) -> tuple[str, int]:
SENTRY_CHART_RENDERER = "sentry.charts.chartcuterie.Chartcuterie"
SENTRY_CHART_RENDERER_OPTIONS: dict[str, Any] = {}

# User Feedback Spam Detection
SENTRY_USER_FEEDBACK_SPAM = "sentry.feedback.spam.stub.StubFeedbackSpamDetection"
SENTRY_USER_FEEDBACK_SPAM_OPTIONS: dict[str, str] = {}


# URI Prefixes for generating DSN URLs
# (Defaults to URL_PREFIX by default)
SENTRY_ENDPOINT: str | None = None
13 changes: 0 additions & 13 deletions src/sentry/feedback/spam/__init__.py

This file was deleted.

9 changes: 0 additions & 9 deletions src/sentry/feedback/spam/base.py

This file was deleted.

9 changes: 0 additions & 9 deletions src/sentry/feedback/spam/stub.py

This file was deleted.

Empty file added src/sentry/llm/__init__.py
Empty file.
14 changes: 14 additions & 0 deletions src/sentry/llm/exceptions.py
@@ -0,0 +1,14 @@
class InvalidUsecaseError(ValueError):
pass


class InvalidProviderError(ValueError):
pass


class InvalidModelError(ValueError):
pass


class InvalidTemperature(ValueError):
pass
Empty file.
45 changes: 45 additions & 0 deletions src/sentry/llm/providers/base.py
@@ -0,0 +1,45 @@
from sentry.llm.exceptions import InvalidModelError, InvalidProviderError
from sentry.llm.types import ProviderConfig, UseCaseConfig
from sentry.utils.services import Service


class LlmModelBase(Service):
def __init__(self, provider_config: ProviderConfig) -> None:
self.provider_config = provider_config

def complete_prompt(
self,
*,
usecase_config: UseCaseConfig,
prompt: str,
message: str,
temperature: float,
max_output_tokens: int,
) -> str | None:
self.validate_model(usecase_config["options"]["model"])

return self._complete_prompt(
usecase_config=usecase_config,
prompt=prompt,
message=message,
temperature=temperature,
max_output_tokens=max_output_tokens,
)

def _complete_prompt(
self,
*,
usecase_config: UseCaseConfig,
prompt: str,
message: str,
temperature: float,
max_output_tokens: int,
) -> str | None:
raise NotImplementedError

def validate_model(self, model_name: str) -> None:
if "models" not in self.provider_config:
raise InvalidProviderError(f"No models defined for provider {self.__class__.__name__}")

if model_name not in self.provider_config["models"]:
raise InvalidModelError(f"Invalid model: {model_name}")
60 changes: 60 additions & 0 deletions src/sentry/llm/providers/openai.py
@@ -0,0 +1,60 @@
from openai import OpenAI

from sentry.llm.providers.base import LlmModelBase
from sentry.llm.types import UseCaseConfig


class OpenAIProvider(LlmModelBase):

provider_name = "openai"

def _complete_prompt(
self,
*,
usecase_config: UseCaseConfig,
prompt: str,
message: str,
temperature: float,
max_output_tokens: int,
) -> str | None:
model = usecase_config["options"]["model"]
client = get_openai_client(self.provider_config["options"]["api_key"])

response = client.chat.completions.create(
model=model,
temperature=temperature
* 2, # open AI temp range is [0.0 - 2.0], so we have to multiply by two
messages=[
{"role": "system", "content": prompt},
{
"role": "user",
"content": message,
},
],
stream=False,
max_tokens=max_output_tokens,
)

return response.choices[0].message.content


openai_client: OpenAI | None = None


class OpenAIClientSingleton:
_instance = None
client: OpenAI

def __init__(self) -> None:
raise RuntimeError("Call instance() instead")

@classmethod
def instance(cls, api_key: str) -> "OpenAIClientSingleton":
if cls._instance is None:
cls._instance = cls.__new__(cls)
cls._instance.client = OpenAI(api_key=api_key)
return cls._instance


def get_openai_client(api_key: str) -> OpenAI:
return OpenAIClientSingleton.instance(api_key).client
21 changes: 21 additions & 0 deletions src/sentry/llm/providers/preview.py
@@ -0,0 +1,21 @@
from sentry.llm.providers.base import LlmModelBase
from sentry.llm.types import UseCaseConfig


class PreviewLLM(LlmModelBase):
"""
A dummy LLM provider that does not actually send any requests to any LLM API.
"""

provider_name = "preview"

def _complete_prompt(
self,
*,
usecase_config: UseCaseConfig,
prompt: str,
message: str,
temperature: float = 0.7,
max_output_tokens: int = 1000,
) -> str | None:
return ""
66 changes: 66 additions & 0 deletions src/sentry/llm/providers/vertex.py
@@ -0,0 +1,66 @@
import logging

import google.auth
import google.auth.transport.requests
import requests

from sentry.llm.providers.base import LlmModelBase
from sentry.llm.types import UseCaseConfig

logger = logging.getLogger(__name__)


class VertexProvider(LlmModelBase):
"""
A provider for Google Vertex AI. Uses default service account credentials.
"""

provider_name = "vertex"
candidate_count = 1 # we only want one candidate returned at the moment
top_p = 1 # TODO: make this configurable?

def _complete_prompt(
self,
*,
usecase_config: UseCaseConfig,
prompt: str,
message: str,
temperature: float,
max_output_tokens: int,
) -> str | None:

payload = {
"instances": [{"content": f"{prompt} {message}"}],
"parameters": {
"candidateCount": self.candidate_count,
"maxOutputTokens": max_output_tokens,
"temperature": temperature,
"topP": self.top_p,
},
}

headers = {
"Authorization": f"Bearer {self._get_access_token()}",
"Content-Type": "application/json",
}
vertex_url = self.provider_config["options"]["url"]
vertex_url += usecase_config["options"]["model"] + ":predict"

response = requests.post(vertex_url, headers=headers, json=payload)

if response.status_code == 200:
logger.info("Request successful.")
else:
logger.info(
"Request failed with status code and response text.",
extra={"status_code": response.status_code, "response_text": response.text},
)

return response.json()["predictions"][0]["content"]

def _get_access_token(self) -> str:
# https://stackoverflow.com/questions/53472429/how-to-get-a-gcp-bearer-token-programmatically-with-python

creds, _ = google.auth.default()
creds.refresh(google.auth.transport.requests.Request())
return creds.token
11 changes: 11 additions & 0 deletions src/sentry/llm/types.py
@@ -0,0 +1,11 @@
from typing import TypedDict


class ProviderConfig(TypedDict):
options: dict[str, str]
models: list[str]


class UseCaseConfig(TypedDict):
provider: str
options: dict[str, str]
104 changes: 104 additions & 0 deletions src/sentry/llm/usecases/__init__.py
@@ -0,0 +1,104 @@
from enum import Enum

from sentry import options
from sentry.llm.exceptions import InvalidProviderError, InvalidTemperature, InvalidUsecaseError
from sentry.llm.providers.base import LlmModelBase
from sentry.llm.providers.openai import OpenAIProvider
from sentry.llm.providers.preview import PreviewLLM
from sentry.llm.providers.vertex import VertexProvider
from sentry.llm.types import ProviderConfig, UseCaseConfig

SENTRY_LLM_SERVICE_ALIASES = {
"vertex": VertexProvider,
"openai": OpenAIProvider,
"preview": PreviewLLM,
}


class LLMUseCase(Enum):
EXAMPLE = "example" # used in tests / examples
SUGGESTED_FIX = "suggestedfix" # OG version of suggested fix


llm_provider_backends: dict[str, LlmModelBase] = {}


def get_llm_provider_backend(usecase: LLMUseCase) -> LlmModelBase:
usecase_config = get_usecase_config(usecase.value)
global llm_provider_backends

if usecase_config["provider"] in llm_provider_backends:
return llm_provider_backends[usecase_config["provider"]]

if usecase_config["provider"] not in SENTRY_LLM_SERVICE_ALIASES:
raise InvalidProviderError(f"LLM provider {usecase_config['provider']} not found")

provider = SENTRY_LLM_SERVICE_ALIASES[usecase_config["provider"]]

provider_config = get_provider_config(usecase_config["provider"])

llm_provider_backends[usecase_config["provider"]] = provider(
provider_config,
)

return llm_provider_backends[usecase_config["provider"]]


def complete_prompt(
*,
usecase: LLMUseCase,
prompt: str,
message: str,
temperature: float = 0.5,
max_output_tokens: int = 1000,
) -> str | None:
"""
Complete a prompt with a message using the specified usecase.
Default temperature and max_output_tokens set to a hopefully
reasonable value, but please consider what makes sense for
your specific use case.
Note that temperature should be between 0 and 1, and we will
normalize to any providers who have a different range
"""
_validate_temperature(temperature)

usecase_config = get_usecase_config(usecase.value)

backend = get_llm_provider_backend(usecase)
return backend.complete_prompt(
usecase_config=usecase_config,
prompt=prompt,
message=message,
temperature=temperature,
max_output_tokens=max_output_tokens,
)


def get_usecase_config(usecase: str) -> UseCaseConfig:
usecase_options_all = options.get("llm.usecases.options")
if not usecase_options_all:
raise InvalidUsecaseError(
"LLM usecase options not found. please check llm.usecases.options"
)

if usecase not in usecase_options_all:
raise InvalidUsecaseError(
f"LLM usecase options not found for {usecase}. please check llm.usecases.options"
)

return usecase_options_all[usecase]


def get_provider_config(provider: str) -> ProviderConfig:
llm_provider_options_all = options.get("llm.provider.options")
if not llm_provider_options_all:
raise InvalidProviderError("LLM provider option value not found")
if provider not in llm_provider_options_all:
raise InvalidProviderError(f"LLM provider {provider} not found")
return llm_provider_options_all[provider]


def _validate_temperature(temperature: float) -> None:
if not (0 <= temperature <= 1):
raise InvalidTemperature("Temperature must be between 0 and 1")
