feat(AI): Create abstractions for generic LLM calls within sentry #68771
Conversation
Bundle Report: Changes will decrease total bundle size by 552 bytes ⬇️
src/sentry/llm/usecases/__init__.py
Outdated
llm_provider_backends: dict[str, LlmModelBase] = {}


def get_llm_provider_backend(usecase: LlmUseCase) -> LlmModelBase:
an alternative to this solution would be to use the ServiceDelegator, but it seemed a bit of a heavy lift for what we needed here. open to other implementation ideas here.
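For illustration, a rough sketch of the registry approach being discussed; `get_usecase_config`, `get_provider_config`, and `provider_classes` are assumed names, not confirmed by the PR:

```python
# Hypothetical sketch only: caches one backend instance per provider name.
llm_provider_backends: dict[str, LlmModelBase] = {}


def get_llm_provider_backend(usecase: LlmUseCase) -> LlmModelBase:
    usecase_config = get_usecase_config(usecase)        # assumed helper
    provider_name = usecase_config["provider"]
    if provider_name not in llm_provider_backends:
        provider_cls = provider_classes[provider_name]  # assumed mapping of provider classes
        llm_provider_backends[provider_name] = provider_cls(
            get_provider_config(provider_name)          # assumed helper
        )
    return llm_provider_backends[provider_name]
```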
def complete_prompt(
    *,
i wasn't as familiar with this syntax, but this forces the rest of the arguments to be keyword arguments https://stackoverflow.com/a/14298976, which is a nice pythonic version of https://refactoring.com/catalog/introduceParameterObject.html
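For illustration, a tiny example of the bare `*` marker (not code from this PR):

```python
def complete_prompt(*, usecase: str, message: str, temperature: float = 0.0) -> None:
    ...


complete_prompt(usecase="spam_detection", message="hello")  # OK: keyword arguments
# complete_prompt("spam_detection", "hello")  # TypeError: takes 0 positional arguments
```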
logger = logging.getLogger(__name__)


class VertexProvider(LlmModelBase):
will note that this is for "PaLM 2" class models. Google, in classic fashion, has changed the API format for "Gemini" class models, so we'll need to write another provider; will rename this one.
(We aren't using the GCP AI module because it seemed to be quite a heavy dependency.) Will update the naming to reflect that.
if response.status_code == 200:
    logger.info("Request successful.")
else:
    logger.info(
would this mean we log info on 401, 500?
yeah, just added these to help with debugging since we were having issues getting this to work in prod. would more likely want to standardize it across the providers, but leaving this in for now and planning to follow up with more telemetry
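A possible shape for that follow-up, purely illustrative (log keys and levels are assumptions, not the PR's code):

```python
# Hypothetical: warn with the status code on non-200 responses instead of
# logging info unconditionally.
if response.status_code == 200:
    logger.info("llm.vertex.request.success")
else:
    logger.warning(
        "llm.vertex.request.failure",
        extra={"status_code": response.status_code},
    )
```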
from sentry.utils.services import Service


class LlmModelBase(Service):
Could use a docblock comment I think
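For example, a short docstring on the base class (wording is a suggestion, not taken from the PR):

```python
class LlmModelBase(Service):
    """
    Base class for LLM providers.

    Each subclass wraps a specific vendor (e.g. OpenAI or Google Vertex) and
    exposes a common prompt-completion interface, configured via the
    provider/use-case options described in this PR.
    """
```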
src/sentry/llm/providers/openai.py
Outdated
def get_openai_client(api_key: str) -> OpenAI:
    # TODO: make this global?
Why?
will have to look into it, but the client may make an oauth token request, so not caching the client may have it do that on every request.
Ah yeah I see what you mean. Like initialize once 👍
yup
If there is an oauth flow, does the client need to handle refresh flows to renew tokens? Or are tokens long-lived?
good question, will look into that.
src/sentry/llm/providers/openai.py
Outdated
def get_openai_client(api_key: str) -> OpenAI:
    # TODO: make this global?
If there is an oauth flow, does the client need to handle refresh flows to renew tokens? Or are tokens long-lived?
@@ -2390,3 +2390,23 @@
    default=[],
    flags=FLAG_PRIORITIZE_DISK | FLAG_AUTOMATOR_MODIFIABLE,
)

# Options for setting LLM providers and usecases
register("llm.provider.options", default={}, flags=FLAG_NOSTORE)
Do you need to configure these with options-automator or change them at runtime? If you have no scenarios where you need to change this at runtime, module-level constants/dictionaries could be a simpler solution.
just thinking through the easiest way for self-hosted users to configure these -- is it right that they can simply modify their config.yml with options and they'll be picked up?
> is it right that they can simply modify their config.yml with options and they'll be picked up?
Yes, option values can be defined in the yml/py config for self-hosted. Self-hosted also has a django settings file in python that they can update. Changes to either the yml or python will require a restart.
going with options so config.yml can be the source of config for this; confirmed with the open source team that this makes sense.
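For illustration, a guess at how a self-hosted install might set these via the python settings file mentioned above; the value shapes and the use-case option name are assumptions, only `llm.provider.options` appears in the diff:

```python
# sentry.conf.py (self-hosted) -- illustrative values only
SENTRY_OPTIONS["llm.provider.options"] = {
    "openai": {"options": {"api_key": "sk-..."}},
}
# Assumed companion option mapping each use case to a provider and model.
SENTRY_OPTIONS["llm.usecases.options"] = {
    "spam_detection": {"provider": "openai", "options": {"model": "gpt-3.5-turbo"}},
}
```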
78b8cbe to 13e5263
from what i understand, I think the OpenAI client itself handles this. we've been running this in prod for a bit now, so I think it is okay.
src/sentry/llm/providers/base.py
Outdated
def complete_prompt(
    self,
    *,
    usecase_config: dict[str, Any],
this should probably be either a real type or a TypedDict -- especially in brand new code we should not be using dict[str, Any]
i tried typing with a typed dict and it proved to be annoying as each provider can have different options. can try again
Added a TypedDict with more narrow types, with options being dict[str, str]. doesn't seem like narrowing the specific types for each option set further from there is easy.
from sentry.llm.providers.base import LlmModelBase


class PreviewLLM(LlmModelBase):
personally would try and avoid inheritance for this -- sentry historically has leaned on this a lot and it ends up being unmaintainably tangled (especially when attempting to type properly or reason about the implementation(s))
i feel like in this particular case inheritance seems sensible, as the different providers should all conform to the same interface -- curious what you think an alternative implementation could look like
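For comparison, a rough sketch of a non-inheritance (structural) version using `typing.Protocol`; the return type and any parameters beyond those visible in this PR are assumptions:

```python
from typing import Any, Protocol


class LlmProvider(Protocol):
    # Any class with a matching complete_prompt satisfies this interface,
    # no shared base class required.
    def complete_prompt(
        self,
        *,
        usecase_config: dict[str, Any],
        prompt: str,
        message: str,
        temperature: float,
    ) -> str:  # return type assumed for illustration
        ...
```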
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##           master   #68771     +/-   ##
=========================================
+ Coverage   79.70%   79.72%    +0.02%
=========================================
  Files        6421     6431       +10
  Lines      285320   285752      +432
  Branches    49161    49245       +84
=========================================
+ Hits       227420   227829      +409
- Misses      57464    57487       +23
  Partials      436      436
prompt: str,
message: str,
I would make it clearer that the prompt is the system message, as it holds a special role in most LLM models and is treated differently. Also, it's optional, not required; only the initial user message is.
got it. will add clarification.
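For context, in chat-style APIs the prompt usually becomes the system message and the message the first user turn; a hedged sketch of how a provider might assemble them (not the PR's exact code):

```python
# The system prompt is optional; the initial user message is required.
messages = []
if prompt:
    messages.append({"role": "system", "content": prompt})
messages.append({"role": "user", "content": message})
```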
temperature=temperature
* 2,  # open AI temp range is [0.0 - 2.0], so we have to multiply by two
Scaling the temperature feels a bit weird to me, but I can get behind it for making models easy to swap out. However, temperature scales differently across different models, so would it make sense to allow it to be configurable in the use case config too? That way, when someone switches out a model they can adjust to the corresponding temperature for it.
yeah definitely. can follow up on this.
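One possible shape for that follow-up, assuming a per-use-case `temperature_scale` option (name invented for illustration):

```python
# Hypothetical: read the scale factor from the use case options instead of
# hard-coding OpenAI's [0.0, 2.0] range.
scale = float(usecase_config["options"].get("temperature_scale", "2"))
temperature = temperature * scale
```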
Please remove the commented code before merging. The rest of my feedback is optional.
src/sentry/llm/providers/openai.py
Outdated
@lru_cache(maxsize=1)
def get_openai_client(api_key: str) -> OpenAI:
    return OpenAI(api_key=api_key)
Why not a singleton?
i like this, more clear than this atm
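A minimal sketch of the module-level singleton alternative being discussed (assumed code, not what the PR ships):

```python
from openai import OpenAI

_openai_client: OpenAI | None = None


def get_openai_client(api_key: str) -> OpenAI:
    # Lazily create one shared client instead of memoizing per api_key.
    global _openai_client
    if _openai_client is None:
        _openai_client = OpenAI(api_key=api_key)
    return _openai_client
```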
class UseCaseConfig(TypedDict):
    provider: str
    options: dict[str, str]
Mentioning this because of all the commented config definitions. We could make the base class (or the protocol definition) generic over type T. This would allow you to pass specifically typed options to specific implementations. But only if you want to take the typing that far.
i tried doing this and my type-fu wasn't up to it :( can look to improve this if necessary in a future PR
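A sketch of the generic-over-T idea, assuming each provider declares its own options TypedDict (all names are illustrative):

```python
from typing import Any, Generic, Mapping, TypedDict, TypeVar


class OpenAiProviderOptions(TypedDict):
    api_key: str
    model: str


TOptions = TypeVar("TOptions", bound=Mapping[str, Any])


class LlmModelBase(Generic[TOptions]):
    def __init__(self, options: TOptions) -> None:
        self.options = options


class OpenAIProvider(LlmModelBase[OpenAiProviderOptions]):
    # Type checkers now narrow self.options to OpenAiProviderOptions.
    ...
```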
Background
For the User Feedback (#61372) Spam Detection feature, we intend to use an LLM. Along with other use cases such as suggested fix and code integration, there is a need across the Sentry codebase to be able to call LLMs.
Because Sentry is self-hosted, because different features use different LLMs, and because we want to provide modularity, we need to be able to configure different LLM providers, models, and use cases.
Solution
- We define an options-based config for providers and use cases, where you can specify an LLM provider's options and settings.
- For each use case, you then define which LLM provider it uses, and which model.
Within `sentry/llm` we define a `providers` module, which consists of a base class and implementations. To start, we have OpenAI, Google Vertex, and a Preview implementation used for testing. These will use the provider options to initialize a client and connect to the LLM provider. The providers inherit from `LazyServiceWrapper`.
Also within `sentry/llm`, we define a `usecases` module, which simply consists of a function `complete_prompt`, along with an enum of use cases. These options are passed to the LLM provider per use case, and can be configured via the option above.
Testing
I've added unit tests which mock the LLM calls, and I've tested in my local environment that calls to the actual services work.
In practice
To use an LLM, you do the following steps (see the sketch after the list):
1. Define your use case in the [usecase enum](https://github.com/getsentry/sentry/blob/a4e7a0e4af8c09a1d4007a3d7c53b71a2d4db5ff/src/sentry/llm/usecases/__init__.py#L14).
2. Call the `complete_prompt` function with your `usecase`, prompt, content, temperature, and max_tokens.
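A minimal sketch of step 2, assuming an illustrative enum member and parameter spellings (the exact keyword names live in the PR):

```python
from sentry.llm.usecases import LlmUseCase, complete_prompt

# EXAMPLE is a stand-in; real members are defined in the use case enum.
answer = complete_prompt(
    usecase=LlmUseCase.EXAMPLE,
    prompt="You classify user feedback as spam or not spam.",
    message="Buy cheap watches at example.com!!!",
    temperature=0.1,
    max_tokens=20,
)
```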
Limitations
Because each LLM currently has a different interface, this solution does not support provider-specific features such as OpenAI "function calling", where an output is guaranteed to be in a specific JSON format. Advanced use cases beyond a simple prompt + text and a single output are not currently supported. It is likely possible to add support for these on a case-by-case basis.
LLM providers are not quite to the point where they have standardized on a consistent API, which makes supporting these somewhat difficult. Third parties have come up with various solutions: [LangChain](https://github.com/langchain-ai/langchain), [LiteLLM](https://github.com/BerriAI/litellm), [LocalAI](https://github.com/mudler/LocalAI), [OpenRouter](https://openrouter.ai/).
It will probably make sense eventually to adopt one of these tools, or our own advanced tooling, once our use cases outgrow this solution.
There is also a possible future where we want different use cases to use different API keys, but for now, one provider only has one set of credentials.
TODO
- [ ] create develop docs for how to add a usecase, or new LLM provider
- [x] Follow-up PR to replace suggested fix OpenAI calls with new abstraction
- [ ] PR in getsentry to set provider / usecase values for SaaS
- [ ] PR follow-up to add telemetry information
- [ ] We'll likely want to support streaming responses.

Co-authored-by: Michelle Zhang <56095982+michellewzhang@users.noreply.github.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>