Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance powered by OTel #2251

Closed
sentrivana opened this issue Jul 14, 2023 · 5 comments · Fixed by #2272
Closed

Performance powered by OTel #2251

sentrivana opened this issue Jul 14, 2023 · 5 comments · Fixed by #2272
Assignees
Labels
Integration: OpenTelementry (OTel) Triaged Has been looked at recently during old issue triage

Comments

@sentrivana
Copy link
Contributor

sentrivana commented Jul 14, 2023

Milestone: Performance Powered by OTel

Problem Statement

At the time of Sentry's initial performance product development, OpenTelemetry was in the nascent stages of its lifecycle and was not yet optimized for our requirements. Nevertheless, we maintained similarities in our data models and paradigms with OpenTelemetry. Since then, OpenTelemetry has significantly matured, passed the test of time, and has been generally available (GA) for over a year. It now boasts an extensive ecosystem of integrations spanning multiple technologies, including databases, queues, and protocols.

This maturity means that now is the time for us to rework our Python Performance Monitoring to use OTel under the hood. This way, we can leverage all the functionality from the OTel ecosystem, and overall better align with the broader ecosystem.

Key goals of this project

  • Leverage OpenTelemetry instrumentation for Performance
  • Ensure a consistent user experience - it should not feel much different to users than the current SDK experience
  • Do not lose functionality - in places where our instrumentation may exceed what OTel currently offers, we want to either keep some of our own instrumentation, and/or in the medium term upstream improvements to the OTel instrumentations
  • TLDR: Users continue to use sentry_sdk.init() and everything just works™️.

Non-goals

For the initial work, we want to strive to optimize for easy setup & usage. Exposing OTel internals & providing more hooks etc. for users to manually add OTel stuff may come at a later point. The public API of the SDK will remain the same for the prototype.

Misc

Internally, this project is known as POTel (from performance powered by OTel).

References

@sentrivana sentrivana self-assigned this Jul 14, 2023
@stephanie-anderson stephanie-anderson changed the title Python Performance Monitoring powered by OpenTelemetry Performance powered by OTel Jul 14, 2023
@sentrivana
Copy link
Contributor Author

sentrivana commented Aug 17, 2023

We need to check whether the following works (repurposed from getsentry/opentelemetry-demo#12):

Product Features

What works out of the box? What doesn't? What works, but differently?

Note: See https://www.notion.so/sentry/SDKs-Performance-powered-by-OTel-POTEL-7f1900c5c1b04870bdc0f2cc8dd4d929?pvs=4#50f3fec4bef3402087ec035688329efa (internal link) for more in-depth description.

  • Error Tracking
    • Issues
      • 'POST /test/12344?test=true'
        • returns an error - should be reflected
      • Stack trace and code link
      • Breadcrumbs
    • Related transactions
    • Trace view
    • ❌ Performance Issues
  • Performance Monitoring
    • Distributed tracing
    • Transactions and Spans
      • Naming, descriptions, etc
      • Operation: http.server
        • parameterized transactions
          • GET /?test=true
          • POST /test/12344?test=true
        • http.method
        • related issues
      • Operation: queue.task.____
      • Operation: integration.http
      • Operation: db
    • Vitals
      • User Misery
      • Backend
    • TwP
  • Release Management
    • Adoption, Crash related features
  • Discover & Dashboards
  • Ecosystem & Workflow
    • consider product integration dependent upon incoming data
    • alerts for example
  • ❌ Profiling
  • Replays

@smeubank smeubank changed the title Performance powered by OTel [python] Performance powered by OTel Feb 20, 2024
@sentrivana
Copy link
Contributor Author

sentrivana commented Apr 17, 2024

As we're looking into continuing with the project again, there's a few questions that have come up since we last worked on this.

  • What do we gain and what do we lose by using OTel and how do we bridge the gap?

    • The quality of the spans produced and the data on them should be on par with the current Sentry experience, e.g., we have a lot of custom stuff on DB spans that are used in specialized parts of Sentry. This is potentially a huge topic that might require upstream PRs as well as changes on the product side.
    • Similarly, we'll likely need to adopt a more strict Python and library version support schedule.
    • Furthermore, we'll need to contribute instrumentation upstream if we have something that is not in OTel yet.
    • We need to identify the gaps and think about how to bridge them.
  • How does the scopes change affect the implementation?

    • The new scopes implementation is easier to map onto the OTel context propagation. However, it's still two separate context propagation systems. We need to figure out to what degree and how we can keep them in sync.
    • Since there are no context lifecycle hooks, we'll probably need to store our scopes on the OTel context to enrich the resulting span later in the processor.
  • Error monitoring and tracing are currently quite tightly coupled in the SDK. What challenges does this pose?

  • How does direct OTLP span ingestion affect the implementation?

    • Is the span processor still needed? (Might be, though not for creating Sentry span spans, but rather for augmenting the OTel spans with more Sentry data.)
  • How does span streaming affect the implementation?

@sentrivana
Copy link
Contributor Author

Couple of things that came up while trying to test the feature gap. Documenting them here for future reference.

Getting Things to Install & Run

OTel provides a package for each instrumentation, so there's e.g. opentelemetry-instrumentation-flask, opentelemetry-instrumentation-django, etc. There's also a way to install all available instrumentation packages at once by installing opentelemetry-contrib-instrumentations. However, this is broken in recent versions due to packaging issues with one of the instrumentation packages involved. So if we want to use the latest versions of everything, we need to install all the individual packages manually.

And we do in principle want to install the newest versions of everything -- for instance, the psycopg instrumentation was only added in the last release.

Instrumenting

We have two basic options how to instrument programmatically.

  1. We call the instrument() method of each of the Instrumentors (e.g., FlaskInstrumentor, DjangoInstrumentor, etc.). This has the advantage of being able to pass in optional arguments to tweak the instrumentation (e.g., adding DB spans for Django).
  2. We use the OTel auto-discovery mechanism to figure out what should be instrumented. Unknown if there's a way to provide additional arguments to the individual instrumentation since by default, it'll call instrument() without any args.

Note that whatever way we choose, classes imported before sentry_sdk.init will still be left uninstrumented. The solution for that is to either require folks to import sentry_sdk and init it before anything else has even been imported, which is not great; or seeking out and post-patching the imported classes.

@antonpirker
Copy link
Member

antonpirker commented May 16, 2024

One idea to tackle the "classes imported before sentry_sdk.init will left uninstrumented":

What opentelementry-instrumentation does is setting the PYTHONPATH env variable to '.../lib/python3.10/site-packages/opentelemetry/instrumentation/auto_instrumentation:/home/anton/my_project'
In .../lib/python3.10/site-packages/opentelemetry/instrumentation/auto_instrumentation there is a file called sitecustomize.py
And when the Python interpreter starts it picks up this file and in this file the instrumentation is setup.
This is a feature of Python (I did not know about): https://docs.python.org/3/library/site.html#module-site

Maybe we can do the same and put our own sentry_sdk.init in the sitecostumize.py? (Then we could people tell to set SENTRY_DSN env var and start their program and everything just works(tm).)

@szokeasaurusrex szokeasaurusrex added the Triaged Has been looked at recently during old issue triage label Jun 3, 2024
@sentrivana sentrivana changed the title [python] Performance powered by OTel Performance powered by OTel Jun 11, 2024
@sentrivana
Copy link
Contributor Author

We're still working on this but using a different tracking issue -- I will close this and link the new issue as soon as it's available (we need to move it to this repository first).

@sentrivana sentrivana closed this as not planned Won't fix, can't repro, duplicate, stale Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Integration: OpenTelementry (OTel) Triaged Has been looked at recently during old issue triage
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants