Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: formula queries in EndpointTraceItemTable #6844

Merged
merged 9 commits into from
Feb 4, 2025
Merged

Conversation

kylemumma
Copy link
Member

@kylemumma kylemumma commented Jan 30, 2025

This PR implements support for formulas in the TraceItemTable endpoint. It is relevant to this ticket https://github.com/orgs/getsentry/projects/284/views/1?pane=issue&itemId=85243940&issue=getsentry%7Ceap-planning%7C27
It enables queries such as sum(my_attribute) / count(my_attribute) which werent possible before

Major changes

Testing

I wrote 3 new tests for this feature:

  • one that tests a simple formula on aggregates sum(my_attribute) / count(my_attribute)
  • one that is the same as above uses extrapolation, to ensure formulas work with extrapolation.
  • one that tests formulas on attributes without aggregation my_attribute + my_other_attribute

design decisions

  • reliabilities dont work with formulas, if you do a formula on extrapolated aggregates you will not get any reliability information back. This decision was made because of the increased complexity it would add to support. If needed we can implement support for this in the future.
  • I realized while implementing this that formulas using constants are not supported such as my_attribute * 10 if we need support for this it must be implemented as a follow up. and will require further modification of our protobuf grammar.
  • it only supports spans not uptime or logs

Copy link

codecov bot commented Jan 30, 2025

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
2847 1 2846 11
View the top 1 failed tests by shortest run time
tests.web.rpc.v1.test_endpoint_trace_item_table.test_endpoint_trace_item_table.TestTraceItemTable::test_non_agg_formula
Stack Traces | 0.342s run time
Traceback (most recent call last):
  File ".../v1/test_endpoint_trace_item_table/test_endpoint_trace_item_table.py", line 2450, in test_non_agg_formula
    assert response.column_values == [
AssertionError: assert [attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 13\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\n] == [attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 13\n}\n]
  At index 0 diff: attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 13\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\n != attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 13\n}\n
  Full diff:
    [
     attribute_name: "kyles_measurement + my_other_attribute"
  + results {
  +   val_double: 13
  + }
  + results {
  +   val_double: 5
  + }
  + results {
  +   val_double: 5
  + }
    results {
      val_double: 0
    }
    results {
      val_double: 0
    }
    results {
      val_double: 0
    }
    results {
      val_double: 0
    }
  - results {
  -   val_double: 5
  - }
  - results {
  -   val_double: 5
  - }
  - results {
  -   val_double: 13
  - }
    ,
    ]

To view more test analytics, go to the Test Analytics Dashboard
📢 Thoughts on this report? Let us know!

@@ -29,7 +29,7 @@ python-rapidjson==1.8
redis==4.5.4
sentry-arroyo==2.19.12
sentry-kafka-schemas==0.1.129
sentry-protos==0.1.55
sentry-protos==0.1.58
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new protobuf definition with support for formulas, see getsentry/sentry-protos#105

@kylemumma kylemumma changed the title formula rpc init wip feat: formula queries in EndpointTraceItemTable Feb 1, 2025
@kylemumma kylemumma marked this pull request as ready for review February 1, 2025 00:07
@kylemumma kylemumma requested review from a team as code owners February 1, 2025 00:07
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if people try to use formulas in Timeseries Endpoint it returns a not-implemented-error. I will implement support for formulas in timeseries endpoint as my next PR.

@volokluev
Copy link
Member

So this implements formulas only for spans. How much work would it be to support it for all trace item types

@kylemumma
Copy link
Member Author

So this implements formulas only for spans. How much work would it be to support it for all trace item types

It would be non-trivial to extend this PR to support all data types, we will leave it as span only

@kylemumma kylemumma merged commit 8c1787b into master Feb 4, 2025
32 checks passed
@kylemumma kylemumma deleted the krm/formularpc branch February 4, 2025 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants