fix(telemetry): rename the selector to get studio operation id #5337

Merged
merged 6 commits into from
Jun 5, 2024

Conversation

bnjjj
Contributor

@bnjjj bnjjj commented Jun 4, 2024

In 1.48.0 we introduced a new trace_id selector format, but it was misnamed: it returns the Apollo Studio operation ID, not a trace ID. If you want to access this selector under its corrected name, here is an example:

telemetry:
  instrumentation:
    spans:
      router:
        "studio.operation.id":
            studio_operation_id: true

Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible [1]
  • Documentation [2] completed
  • Performance impact assessed and acceptable
  • Tests added and passing [3]
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj requested review from a team as code owners June 4, 2024 15:10

@router-perf

router-perf bot commented Jun 4, 2024

CI performance tests

  • step - Basic stress test that steps up the number of users over time
  • reload - Reload test over a long period of time at a constant rate of users
  • step-with-prometheus - A copy of the step test with the Prometheus metrics exporter enabled
  • events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • xlarge-request - Stress test with 10 MB request payload
  • const - Basic stress test that runs with a constant number of users
  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • xxlarge-request - Stress test with 100 MB request payload
  • demand-control-uninstrumented - A copy of the step test, but with demand control monitoring enabled
  • no-graphos - Basic stress test, no GraphOS.
  • events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • large-request - Stress test with a 1 MB request payload
  • demand-control-instrumented - A copy of the step test, but with demand control monitoring and metrics enabled

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj requested review from garypen, Geal, BrynCooke and abernix June 4, 2024 15:16
Co-authored-by: Gary Pennington <gary@apollographql.com>
@abernix
Member

abernix commented Jun 5, 2024

I think tests are failing for a legitimate reason here.

Member

@abernix abernix left a comment

Tests are failing because the rename isn't quite complete.

bnjjj added 2 commits June 5, 2024 15:30
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj requested a review from abernix June 5, 2024 14:34
@bnjjj bnjjj merged commit 37ad992 into dev Jun 5, 2024
13 of 14 checks passed
@bnjjj bnjjj deleted the bnjjj/fix_studio_selector branch June 5, 2024 14:52
bnjjj added a commit that referenced this pull request Jun 5, 2024
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj mentioned this pull request Jun 5, 2024
@abernix abernix mentioned this pull request Jun 10, 2024
dotdat referenced this pull request in apollographql/rover Jun 11, 2024

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [apollographql/router](https://github.com/apollographql/router) | patch | `v1.48.0` -> `v1.48.1` |

---

### Release Notes

<details>
<summary>apollographql/router (apollographql/router)</summary>

### [`v1.48.1`](https://github.com/apollographql/router/releases/tag/v1.48.1)

[Compare Source](https://github.com/apollographql/router/compare/v1.48.0...v1.48.1-rc.0)

#### 🐛 Fixes

##### Improve error message produced when a subgraph response doesn't include an expected `content-type` header value ([Issue #5359](https://github.com/apollographql/router/issues/5359))

To make a common problem easier to debug, the error message produced
when a subgraph response doesn't contain an expected `content-type`
header value now includes additional details about the error.

Some examples of the improved error message:

- HTTP fetch failed from 'test': subgraph response contains invalid
'content-type' header value "application/json,application/json";
expected content-type: application/json or content-type:
application/graphql-response+json
- HTTP fetch failed from 'test': subgraph response does not contain
'content-type' header; expected content-type: application/json or
content-type: application/graphql-response+json
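
As a rough illustration of how messages of this shape could be assembled, here is a minimal Python sketch. It is not the router's Rust implementation: the function name `content_type_error` is hypothetical, and ignoring parameters such as `charset` before comparing is an assumption. Only the message wording is taken from the examples above.

```python
from typing import Optional

# Media types the example messages above list as acceptable.
EXPECTED = ("application/json", "application/graphql-response+json")


def content_type_error(service: str, header: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return an error message, or None if the header is fine."""
    expected = " or ".join(f"content-type: {t}" for t in EXPECTED)
    prefix = f"HTTP fetch failed from '{service}': subgraph response "
    if header is None:
        return f"{prefix}does not contain 'content-type' header; expected {expected}"
    # Assumption: compare only the media type, ignoring parameters like charset.
    media_type = header.split(";", 1)[0].strip().lower()
    if media_type not in EXPECTED:
        return (
            f"{prefix}contains invalid 'content-type' header value "
            f'"{header}"; expected {expected}'
        )
    return None


print(content_type_error("test", "application/json,application/json"))
print(content_type_error("test", None))
```

Including both the offending value and the accepted values in one message means the subgraph owner can usually spot the problem (here, a duplicated header value) without further logging.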

By [@IvanGoncharov](https://github.com/IvanGoncharov) in [https://github.com/apollographql/router/pull/5223](https://github.com/apollographql/router/pull/5223)

##### Update `apollo-compiler` for two small improvements ([PR #5347](https://github.com/apollographql/router/pull/5347))

Updated the `apollo-compiler` crate, part of our underlying `apollo-rs`
dependency, to bring in two nice improvements:

-   *Fix validation performance bug*

Adds a cache in fragment spread validation, fixing a situation where
validating a query with many fragment spreads against a schema with many
interfaces could take multiple seconds to validate.

-   *Remove ariadne byte/char mapping*

Generating JSON or CLI reports for apollo-compiler diagnostics used a
translation layer between byte offsets and character offsets, which cost
some computation and memory proportional to the size of the source text.
The latest version of `ariadne` allows us to remove this translation.
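
To show why such a translation layer scales with the source text, here is a small Python sketch. It is an illustration only, not code from `apollo-compiler` or `ariadne`; the helper name `byte_to_char_offsets` and the sample query are hypothetical.

```python
from typing import List


def byte_to_char_offsets(source: str) -> List[int]:
    """Hypothetical helper: map every UTF-8 byte offset to its character index."""
    table: List[int] = []
    for char_index, ch in enumerate(source):
        # A character occupying n bytes contributes n entries pointing at it.
        table.extend([char_index] * len(ch.encode("utf-8")))
    return table


source = "query { café }"  # 'é' takes two bytes in UTF-8
table = byte_to_char_offsets(source)
byte_offset = source.encode("utf-8").index(b"}")

# Byte and character positions of '}' diverge after the multi-byte character.
print(len(source), len(table), byte_offset, table[byte_offset])
```

The table holds one entry per byte of the source, so building and storing it costs time and memory proportional to the document size, which is the overhead the note above says is no longer needed.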

By [@goto-bus-stop](https://github.com/goto-bus-stop) in [https://github.com/apollographql/router/pull/5347](https://github.com/apollographql/router/pull/5347)

#### 📃 Configuration

##### Rename the telemetry selector which obtains the GraphOS operation id ([PR #5337](https://github.com/apollographql/router/pull/5337))

Renames the misnamed `trace_id` selector introduced in
[v1.48.0](https://github.com/apollographql/router/releases/tag/v1.48.0)
to `studio_operation_id`, which reflects what it actually returns: an
Apollo GraphOS operation ID rather than a trace ID. Apologies for the
confusion! Unfortunately, we aren't able to produce an Apollo GraphOS
trace ID at this time.

If you want to access this operation ID selector, here is an example of
how to apply it to your tracing spans:

```yaml
telemetry:
  instrumentation:
    spans:
      router:
        "studio.operation.id":
            studio_operation_id: true
```

This can be useful for more easily locating the operation in [GraphOS'
Insights](https://www.apollographql.com/docs/graphos/metrics/operations)
feature and finding applicable traces in Studio.

By [@bnjjj](https://github.com/bnjjj) in [https://github.com/apollographql/router/pull/5337](https://github.com/apollographql/router/pull/5337)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---


This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/apollographql/rover).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
@lrlna lrlna mentioned this pull request Jun 18, 2024
4 participants