feat: formula queries in EndpointTraceItemTable #6844

kylemumma · 2025-01-30T23:03:16Z

This PR implements support for formulas in the TraceItemTable endpoint. It relates to this ticket: https://github.com/orgs/getsentry/projects/284/views/1?pane=issue&itemId=85243940&issue=getsentry%7Ceap-planning%7C27
It enables queries such as sum(my_attribute) / count(my_attribute), which weren't possible before.

Major changes

The column-to-expression conversion logic used to take place here. I extracted it into two new functions: _get_reliability_context_columns and _column_to_expression (i.e. the logic stayed the same, but it's now inside these functions).
I then extended _column_to_expression to support formulas: https://github.com/getsentry/snuba/pull/6844/files#diff-e1e06d7f875a7c2870cc11bc4301dd6ab9fba73263c76260452c0b3176f66110R180-R191
I also had to extend the existing result conversion and order-by logic to support formulas: https://github.com/getsentry/snuba/pull/6844/files#diff-e1e06d7f875a7c2870cc11bc4301dd6ab9fba73263c76260452c0b3176f66110R287-R288
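
For readers who have not opened the diff, the recursive shape of a column-to-expression conversion that handles formulas can be sketched as follows. This is a minimal illustration, not the code from this PR: the Attribute, Aggregation and BinaryFormula classes are hypothetical stand-ins for the protobuf messages, and the output is a plain string using ClickHouse-style function names (plus, minus, multiply, divide) rather than Snuba's expression objects.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical stand-ins for the protobuf column messages (assumption:
# the real sentry-protos types have different names and fields).
@dataclass(frozen=True)
class Attribute:
    name: str


@dataclass(frozen=True)
class Aggregation:
    function: str  # e.g. "sum" or "count"
    key: str       # attribute the aggregate is computed over


@dataclass(frozen=True)
class BinaryFormula:
    op: str        # one of "+", "-", "*", "/"
    left: "Column"
    right: "Column"


Column = Union[Attribute, Aggregation, BinaryFormula]

# Arithmetic operators mapped to ClickHouse-style function names.
_OPS = {"+": "plus", "-": "minus", "*": "multiply", "/": "divide"}


def column_to_expression(col: Column) -> str:
    """Render a column as an expression string, recursing into formulas."""
    if isinstance(col, Attribute):
        return col.name
    if isinstance(col, Aggregation):
        return f"{col.function}({col.key})"
    if isinstance(col, BinaryFormula):
        if col.op not in _OPS:
            raise ValueError(f"unsupported operator: {col.op}")
        left = column_to_expression(col.left)
        right = column_to_expression(col.right)
        return f"{_OPS[col.op]}({left}, {right})"
    raise TypeError(f"unsupported column type: {type(col).__name__}")
```

Because both operands of a formula are themselves columns, the same function covers formulas on aggregates, formulas on plain attributes, and nested formulas.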

Testing

I wrote 3 new tests for this feature:

one that tests a simple formula on aggregates: sum(my_attribute) / count(my_attribute)
one that is the same as above but uses extrapolation, to ensure formulas work with extrapolation
one that tests formulas on attributes without aggregation: my_attribute + my_other_attribute
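
The difference between the aggregate case and the non-aggregated case can be illustrated with plain Python over in-memory rows. This is only a sketch of the semantics being tested: the rows and helper names are made up, and extrapolation is not modeled.

```python
from typing import Dict, List

Row = Dict[str, float]

# Made-up sample rows; the real tests write spans to storage instead.
rows: List[Row] = [
    {"my_attribute": 2.0, "my_other_attribute": 3.0},
    {"my_attribute": 4.0, "my_other_attribute": 1.0},
]


def aggregate_formula(data: List[Row]) -> float:
    """sum(my_attribute) / count(my_attribute): one value for the whole set."""
    values = [r["my_attribute"] for r in data if "my_attribute" in r]
    if not values:
        raise ValueError("no rows contain my_attribute")
    return sum(values) / len(values)


def row_formula(data: List[Row]) -> List[float]:
    """my_attribute + my_other_attribute: one value per row."""
    return [r["my_attribute"] + r["my_other_attribute"] for r in data]
```

An aggregate formula collapses the rows into a single result, while a formula on raw attributes produces one result per row, so the two cases exercise different paths through result conversion.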

Design decisions

Reliabilities don't work with formulas: if you apply a formula to extrapolated aggregates, you will not get any reliability information back. This decision was made because supporting it would add considerable complexity. If needed, we can implement support for this in the future.
While implementing this, I realized that formulas using constants, such as my_attribute * 10, are not supported. If we need support for this, it must be implemented as a follow-up and will require further modification of our protobuf grammar.
It only supports spans, not uptime or logs.
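
The first decision amounts to a filter on which requested columns receive reliability information. A minimal sketch, assuming a hypothetical RequestedColumn record rather than the real request types:

```python
from dataclasses import dataclass
from typing import List


# Hypothetical description of a requested column (assumption: the real
# request uses protobuf messages, not this dataclass).
@dataclass(frozen=True)
class RequestedColumn:
    label: str
    is_formula: bool
    is_extrapolated_aggregate: bool


def columns_with_reliability(columns: List[RequestedColumn]) -> List[str]:
    """Return labels of columns that get reliability information.

    Only plain extrapolated aggregates qualify; formulas are skipped even
    when their operands are extrapolated aggregates.
    """
    return [
        c.label
        for c in columns
        if c.is_extrapolated_aggregate and not c.is_formula
    ]
```

A caller that requests both sum(my_attribute) and a formula built from it would, under this sketch, see reliability data for the former only.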

codecov · 2025-01-30T23:19:52Z

❌ 1 Tests Failed:

Tests completed	Failed	Passed	Skipped
2847	1	2846	11

View the top 1 failed tests by shortest run time

tests.web.rpc.v1.test_endpoint_trace_item_table.test_endpoint_trace_item_table.TestTraceItemTable::test_non_agg_formula

Stack Traces | 0.342s run time

Traceback (most recent call last):
  File ".../v1/test_endpoint_trace_item_table/test_endpoint_trace_item_table.py", line 2450, in test_non_agg_formula
    assert response.column_values == [
AssertionError: assert [attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 13\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\n] == [attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 13\n}\n]
  At index 0 diff: attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 13\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\n != attribute_name: "kyles_measurement + my_other_attribute"\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 0\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 5\n}\nresults {\n  val_double: 13\n}\n
  Full diff:
    [
     attribute_name: "kyles_measurement + my_other_attribute"
  + results {
  +   val_double: 13
  + }
  + results {
  +   val_double: 5
  + }
  + results {
  +   val_double: 5
  + }
    results {
      val_double: 0
    }
    results {
      val_double: 0
    }
    results {
      val_double: 0
    }
    results {
      val_double: 0
    }
  - results {
  -   val_double: 5
  - }
  - results {
  -   val_double: 5
  - }
  - results {
  -   val_double: 13
  - }
    ,
    ]
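
The diff above shows the same seven values on both sides, in opposite order (13, 5, 5, 0, 0, 0, 0 versus 0, 0, 0, 0, 5, 5, 13), which points at row ordering rather than at the formula result itself. One hedged way to confirm that kind of diagnosis is an order-insensitive comparison; the helper name is hypothetical and the values are copied from the traceback:

```python
from typing import List, Sequence


def same_values_any_order(actual: Sequence[float], expected: Sequence[float]) -> bool:
    """True when both sequences hold the same values, ignoring order."""
    return sorted(actual) == sorted(expected)


# val_double values taken from the assertion message above.
actual: List[float] = [13.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0]
expected: List[float] = [0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 13.0]
```

If ordering matters to the caller, an explicit order-by in the request and a matching expected order in the test is the stricter alternative to relaxing the assertion.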


kylemumma · 2025-01-31T22:31:14Z

requirements.txt

@@ -29,7 +29,7 @@ python-rapidjson==1.8
 redis==4.5.4
 sentry-arroyo==2.19.12
 sentry-kafka-schemas==0.1.129
-sentry-protos==0.1.55
+sentry-protos==0.1.58


new protobuf definition with support for formulas, see getsentry/sentry-protos#105

kylemumma · 2025-02-01T00:08:35Z

snuba/web/rpc/v1/resolvers/R_eap_spans/resolver_time_series.py

If people try to use formulas in the Timeseries endpoint, it returns a not-implemented error. I will implement support for formulas in the Timeseries endpoint in my next PR.
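
A guard of this kind can be sketched as below. The Expression class and its formula field are hypothetical, and the exception type and message used by the real resolver may differ.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


# Hypothetical request expression (assumption: not the real protobuf type).
@dataclass(frozen=True)
class Expression:
    label: str
    formula: Optional[str] = None  # set only when the expression is a formula


def reject_formulas(expressions: Sequence[Expression]) -> None:
    """Fail fast if any requested expression is a formula."""
    for expr in expressions:
        if expr.formula is not None:
            raise NotImplementedError(
                f"formulas are not yet supported in the time series endpoint: {expr.label}"
            )
```

Rejecting the request up front keeps an unsupported formula from being silently ignored or partially evaluated.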

volokluev · 2025-02-01T00:13:18Z

So this implements formulas only for spans. How much work would it be to support it for all trace item types?

snuba/web/rpc/v1/resolvers/R_eap_spans/resolver_trace_item_table.py

kylemumma · 2025-02-04T01:07:36Z

So this implements formulas only for spans. How much work would it be to support it for all trace item types?

It would be non-trivial to extend this PR to support all data types, so we will leave it as span-only.

kylemumma force-pushed the krm/formularpc branch from 144f7e0 to 793b372 Compare January 31, 2025 22:20

kylemumma commented Jan 31, 2025


kylemumma changed the title ~~formula rpc init wip~~ feat: formula queries in EndpointTraceItemTable Feb 1, 2025

kylemumma marked this pull request as ready for review February 1, 2025 00:07

kylemumma requested review from a team as code owners February 1, 2025 00:07

kylemumma commented Feb 1, 2025


phacops reviewed Feb 3, 2025


snuba/web/rpc/v1/resolvers/R_eap_spans/resolver_trace_item_table.py (outdated, resolved)

snuba/web/rpc/v1/resolvers/R_eap_spans/resolver_trace_item_table.py (resolved)

kylemumma added 8 commits February 3, 2025 20:42

init wip

9a816c9

fix mypy

bb00427

initial implementation, might add more test

b448332

extrapolation test

da0356c

update extrapolation test

1f1b38a

test non-agg-formula

c3c9762

error msg for logs and uptime

4292502

pr feedback

6d91e4e

kylemumma force-pushed the krm/formularpc branch from f524c98 to 6d91e4e Compare February 4, 2025 04:42

phacops approved these changes Feb 4, 2025


fix pr feedback

a53e0db

kylemumma merged commit 8c1787b into master Feb 4, 2025
32 checks passed

kylemumma deleted the krm/formularpc branch February 4, 2025 21:20
