
[Epic] User Feedback Spam Filtering #61372

Closed
17 tasks done
Tracked by #64671 ...
bruno-garcia opened this issue Dec 7, 2023 · 5 comments

Comments

@bruno-garcia
Member

bruno-garcia commented Dec 7, 2023

To help deal with possible spam, we need some controls.

KTTR: https://www.notion.so/sentry/User-Feedback-Spam-Filtering-4ed3526e469543d6b6f458bdeb87b467
UI Flagr: https://flagr.getsentry.net/#/flags/539

Feature flags:
- `organizations:user-feedback-spam-filter-ui`
- `organizations:user-feedback-spam-filter-ingest`
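For context, the ingest-side flag would normally be checked with Sentry's standard feature-flag helper, roughly like this (the wrapper function and call site are illustrative, not part of the actual implementation):

```python
# Illustrative sketch: features.has is Sentry's standard flag check; the helper
# function wrapping it here is hypothetical.
from sentry import features
from sentry.models.project import Project


def spam_filter_enabled(project: Project) -> bool:
    """Whether AI spam filtering should run for this project's feedback."""
    return features.has(
        "organizations:user-feedback-spam-filter-ingest", project.organization
    )
```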

Implementation

Surfacing in UI

  1. ryan953

Launch

@getsantry
Contributor

getsantry bot commented Dec 7, 2023

Routing to @getsentry/product-owners-user-feedback for triage ⏲️

@ryan953 ryan953 changed the title [User Feedback] Controls to deal with abuse or spam [Epic] Controls to deal with abuse or spam Jan 26, 2024
@ryan953 ryan953 changed the title [Epic] Controls to deal with abuse or spam [Epic] User Feedback controls to deal with abuse or spam Jan 26, 2024
@ryan953 ryan953 changed the title [Epic] User Feedback controls to deal with abuse or spam [Epic] User Feedback controls to deal with spam Jan 26, 2024
@michellewzhang michellewzhang changed the title [Epic] User Feedback controls to deal with spam [Epic] User Feedback Spam Filtering Feb 2, 2024
@JoshFerge
Member

This would be a good one to do in preparation for spam filtering:

#64950

@JoshFerge
Member

JoshFerge commented Feb 27, 2024

Next steps:

  • Add a project flag, similar to feat(feedback): add project option for enabling crash/user report notifs #65352, to enable AI-based spam detection
  • Move the spam logic to the getsentry/feedback folder (have the command still use this logic, just call it from the current file)
  • In sentry, create a new LazyServiceWrapper for feedback spam detection, along with a StubFeedbackSpamDetection function which sentry will use by default (a rough sketch of this layout follows this list)
  • In getsentry, inherit from the base service wrapper created in sentry, and call the spam detection logic from there
  • Create a new settings variable in sentry/server.py which determines which "service" Sentry should use for spam detection. By default it should be StubFeedbackSpamDetection; we'll override this in getsentry when we're ready to deploy in prod
  • Modify the create_feedback function in create_feedback.py to call out to the spam detection function, and add is_spam to the IssueEvidence in make_evidence (only if the project has spam detection enabled):
    def make_evidence(feedback, source: FeedbackCreationSource):
  • Check for the spam ingest feature flag in create_feedback: organizations:user-feedback-spam-filter-ingest
  • In post_process.py, modify
    def should_postprocess_feedback(job: PostProcessJob) -> bool:
    to also check whether the feedback issue had is_spam set to true, and if so, return false
  • TODO: look into how we can set the status of the issue on creation so it shows in the spam column
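A rough sketch of the wrapper/stub layout from the bullets above. `Service` and `LazyServiceWrapper` come from `sentry.utils.services`; the base class, the stub, the `is_spam` method, and the `SENTRY_FEEDBACK_SPAM_DETECTION` setting are illustrative assumptions, not the actual implementation (and the stub could equally be a plain function, as described):

```python
# Hypothetical sketch of the service layout described above. Only Service and
# LazyServiceWrapper are real Sentry utilities; every other name (the base
# class, the stub, the is_spam method, the settings variable) is illustrative.
from django.conf import settings

from sentry.utils.services import LazyServiceWrapper, Service


class FeedbackSpamDetectionBase(Service):
    """Interface shared by the stub and the getsentry (LLM-backed) backend."""

    def is_spam(self, feedback_message: str) -> bool:
        raise NotImplementedError


class StubFeedbackSpamDetection(FeedbackSpamDetectionBase):
    """Default backend for self-hosted Sentry: never flags feedback as spam."""

    def is_spam(self, feedback_message: str) -> bool:
        return False


# A settings variable (name assumed) selects the backend; getsentry overrides it
# with its own subclass that calls the real spam detection logic.
spam_detection = LazyServiceWrapper(
    FeedbackSpamDetectionBase,
    settings.SENTRY_FEEDBACK_SPAM_DETECTION,  # dotted path to the backend class
    {},
)
```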

@michellewzhang
Member

michellewzhang commented Mar 22, 2024

@michellewzhang michellewzhang removed their assignment Apr 1, 2024
JoshFerge added a commit that referenced this issue Apr 19, 2024
…8771)

### Background
For the User Feedback Spam Detection feature (#61372), we intend to use an
LLM. Along with other use cases such as suggested fix and code integration,
there is a need across the Sentry codebase to be able to call LLMs.

Because Sentry is self-hosted, because some features use different LLMs, and
because we want to provide modularity, we need to be able to configure
different LLM providers, models, and use cases.

### Solution:

- We define an options-based config for providers and use cases, where you
can specify an LLM provider's options and settings.
- For each use case, you then define which LLM provider and model it uses.


Within `sentry/llm` we define a `providers` module, which consists of a
base and implementations. To start, we have OpenAI, Google Vertex, and a
Preview implementation used for testing. These will use the provider
options to initialize a client and connect to the LLM provider. The
providers inherit from `LazyServiceWrapper`.

Also within `sentry/llm`, we define a `usecases` module, which simply
consists of a function `complete_prompt`, along with an enum of use
cases. These options are passed to the LLM provider per use case, and
can be configured via the above option.
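
As a rough illustration of that shape (the option keys, provider names, and
model strings here are assumptions drawn from this description, not
necessarily the exact production values):

```python
# Illustrative sketch of the provider/use-case options, e.g. in sentry.conf.py.
# Keys, provider names, and models are assumptions based on the description above.
SENTRY_OPTIONS["llm.provider.options"] = {
    "openai": {
        "models": ["gpt-4-turbo-1.0", "gpt-3.5-turbo"],
        "options": {"api_key": "<your-api-key>"},
    },
}
SENTRY_OPTIONS["llm.usecases.options"] = {
    "spamdetection": {
        "provider": "openai",
        "options": {"model": "gpt-3.5-turbo"},
    },
}
```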


### Testing
I've added unit tests which mock the LLM calls, and I've tested in my
local environment that calls to the actual services work.


### In practice:
So to use an LLM, you do the following steps:
1. Define your use case in the [usecase enum](https://github.com/getsentry/sentry/blob/a4e7a0e4af8c09a1d4007a3d7c53b71a2d4db5ff/src/sentry/llm/usecases/__init__.py#L14)
2. Call the `complete_prompt` function with your `usecase`, prompt, content, temperature, and max_tokens (see the sketch below)
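
A minimal usage sketch for those two steps. The enum name `LLMUseCase`, the
`SPAM_DETECTION` member, and the exact keyword argument names are assumptions
based on the description above rather than the confirmed API:

```python
# Sketch only: the enum member and keyword names follow the description above
# and may not match the real sentry.llm signature exactly.
from sentry.llm.usecases import LLMUseCase, complete_prompt

feedback_text = "BUY CHEAP WATCHES!!! visit my-totally-legit-site.example"

response = complete_prompt(
    usecase=LLMUseCase.SPAM_DETECTION,  # assumed use case member for this feature
    prompt="Decide whether the following user feedback is spam. Answer yes or no.",
    content=feedback_text,              # the user-submitted feedback body
    temperature=0.0,                    # deterministic classification
    max_tokens=16,
)
is_spam = bool(response) and response.strip().lower().startswith("yes")
```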



### Limitations:

Because each LLM currently has a different interface, this solution does not
yet support provider-specific features such as OpenAI's "function calling",
where output is guaranteed to be in a specific JSON format. Advanced use cases
beyond a simple prompt plus text with a single output are not currently
supported. It is likely possible to add support for these on a case-by-case
basis.

LLM providers are not quite to the point where they have standardized on a
consistent API, which makes supporting these somewhat difficult. Third parties
have come up with various solutions:
[LangChain](https://github.com/langchain-ai/langchain),
[LiteLLM](https://github.com/BerriAI/litellm),
[LocalAI](https://github.com/mudler/LocalAI),
[OpenRouter](https://openrouter.ai/).

It will probably make sense eventually to adopt one of these tools, or build
our own more advanced tooling, once our use cases outgrow this solution.

There is also a possible future where we want different use cases to use
different API keys, but for now, one provider only has one set of
credentials.



### TODO

- [ ] Create develop docs for how to add a use case or a new LLM provider
- [x] Follow-up PR to replace suggested-fix OpenAI calls with the new abstraction
- [ ] PR in getsentry to set provider / usecase values for SaaS
- [ ] Follow-up PR to add telemetry information
- [ ] We'll likely want to support streaming responses.

---------

Co-authored-by: Michelle Zhang <56095982+michellewzhang@users.noreply.github.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
MichaelSun48 pushed a commit that referenced this issue Apr 25, 2024
…8771)

@aliu39 aliu39 self-assigned this Jun 3, 2024
@aliu39
Member

aliu39 commented Jun 12, 2024

Done and GA'd! 🎉 Follow-ups:

@aliu39 aliu39 closed this as completed Jun 12, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jun 28, 2024