Feat: Add EventToCarrier to AWS Lambda semantic conventions #164

Closed
47 commits
3aaa4cd
Feat: Add EventToCarrier to AWS Lambda semantic conventions
rapphil Jul 6, 2023
526dc65
Fix: fix TOC
rapphil Jul 6, 2023
ab15449
Fix: fix toc
rapphil Jul 6, 2023
8606301
Merge branch 'main' into rapphil-add-event-to-carrier
rapphil Jul 6, 2023
29fdde3
Update specification/faas/aws-lambda.md
rapphil Aug 14, 2023
71f4932
Moved resource semconv directly under the resource directory (#165)
AlexanderWert Jul 6, 2023
f4a5387
Renamed `specification` and `semantic_conventions` to `model` and `do…
AlexanderWert Jul 6, 2023
053fe9a
Fix path of md table generation make task (#173)
joaopgrassi Jul 7, 2023
86940d2
Add markdown file for url semantic conventions (#174)
ChrsMark Jul 7, 2023
c2b9767
[editorial] Add missing `README.md` for Resource Cloud Provider semco…
chalin Jul 7, 2023
02071de
[editorial] docs/rpc/json-rpc: fix title and in-page hypenation (#170)
chalin Jul 7, 2023
57bd011
[editorial] Rename docs files and folders: replace "_" by "-"; add ch…
chalin Jul 7, 2023
c35b1d4
Fix semantic-conventions README.md file links (#168)
YANG-DB Jul 7, 2023
aa85e62
[editorial] Fix doc page titles and add Hugo front matter (#175)
chalin Jul 7, 2023
31ca6ef
[editorial] Link directly to page, not to page's title (#180)
chalin Jul 7, 2023
f875722
Bump to latest version of the specification. (#185)
jsuereth Jul 12, 2023
5200be9
[CI] Report link-check error when external URL used for local doc pag…
chalin Jul 13, 2023
69f6bae
Update documentation for how to cut a release AND perform the action …
jsuereth Jul 13, 2023
12bd6b0
[editorial] markdownlint in less then 1.5 sec (#193)
chalin Jul 14, 2023
b7e10e6
[editorial] Rename general section pages by dropping `general` from t…
chalin Jul 14, 2023
89e1603
[editorial] Setup Prettier and run it on some files (#192)
chalin Jul 21, 2023
c31e03c
Editorial: Remove overlooked messaging.source attributes from aws lam…
lmolkova Jul 21, 2023
de2cf45
Fix the unit of metric.process.runtime.jvm.system.cpu.load_1m to be {…
zeitlinger Jul 21, 2023
be82642
Change company affiliation for Johannes (#207)
pyohannes Jul 25, 2023
fe818fd
`.count` metric naming convention only applies to UpDownCounters (#107)
trask Jul 25, 2023
97e92be
[editorial][CI] Ensure markdownlint has proper exit status (#210)
chalin Jul 27, 2023
c4b58a4
Update contributing documentation for restructuring of repository (#216)
jsuereth Jul 28, 2023
a46db0f
Re-enable the schema check to look at the website. (#217)
jsuereth Jul 28, 2023
1c4e58f
Add Alexander Wert as an Approver (#220)
reyang Aug 1, 2023
ff064a4
Bump semantic conventions tooling to v0.20.0 (#225)
arminru Aug 1, 2023
0cf0e8d
Add system.cpu.physical.count and system.cpu.logical.count metrics (#99)
frzifus Aug 1, 2023
740ebd6
Add PR template (#223)
reyang Aug 1, 2023
6561f68
Generate database metrics semconv from YAML (#90)
joaopgrassi Aug 1, 2023
a491f82
Generate RPC metrics from YAML (#93)
joaopgrassi Aug 2, 2023
790eac9
Rename `http.*.duration` to `http.*.request.duration` (#224)
lmolkova Aug 3, 2023
2f4d8b4
HTTP version should be 2 and 3 instead of 2.0 and 3.0 (#228)
lmolkova Aug 6, 2023
94b653e
Generate FaaS metrics semconv from YAML (#88)
joaopgrassi Aug 7, 2023
8dfd92c
Re-introduce namespace to describe the original destination (#156)
joaopgrassi Aug 7, 2023
50cf594
Move Joao Grassi from approver to maintainer (#239)
reyang Aug 8, 2023
d550c98
Update destination attribute briefs (#243)
trask Aug 10, 2023
a6c3202
Rename all JVM metrics from process.runtime.jvm.* to jvm.* (#241)
trask Aug 10, 2023
92ad618
Fix: fix TOC
rapphil Jul 6, 2023
1d3b5ca
Fix: fix toc
rapphil Jul 6, 2023
58c722c
Update EvenToCarrier based on feedback
rapphil Aug 14, 2023
7fd5a23
Merge branch 'main' into rapphil-add-event-to-carrier
rapphil Aug 14, 2023
94a3bf5
Update to simplify language
rapphil Sep 11, 2023
6dd68fe
Merge branch 'main' into rapphil-add-event-to-carrier
rapphil Sep 12, 2023
37 changes: 24 additions & 13 deletions specification/faas/aws-lambda.md
@@ -14,7 +14,8 @@ use cases.
<!-- toc -->

- [All triggers](#all-triggers)
* [AWS X-Ray Environment Span Link](#aws-x-ray-environment-span-link)
* [Determining the remote parent span context](#determining-the-remote-parent-span-context)
+ [Composite EventToCarrier](#composite-eventtocarrier)
- [API Gateway](#api-gateway)
- [SQS](#sqs)
* [SQS Event](#sqs-event)
@@ -54,21 +55,31 @@ and the [cloud resource conventions][cloud]. The following AWS Lambda-specific a
[faasres]: /specification/resource/semantic_conventions/faas.md (FaaS resource conventions)
[cloud]: /specification/resource/semantic_conventions/cloud.md (Cloud resource conventions)

### AWS X-Ray Environment Span Link
### Determining the remote parent span context

If the `_X_AMZN_TRACE_ID` environment variable is set, instrumentation SHOULD try to parse an
OpenTelemetry `Context` out of it using the [AWS X-Ray Propagator](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.21.0/specification/context/api-propagators.md). If the
resulting `Context` is [valid](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.21.0/specification/trace/api.md#isvalid) then a [Span Link][] SHOULD be added to the new Span's
[start options](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.21.0/specification/trace/api.md#specifying-links) with an associated attribute of `source=x-ray-env` to
indicate the source of the linked span.
Instrumentation MUST check if the context is valid before using it because the `_X_AMZN_TRACE_ID` environment variable can
contain an incomplete trace context which indicates X-Ray isn’t enabled. The environment variable will be set and the
`Context` will be valid and sampled only if AWS X-Ray has been enabled for the Lambda function. A user can
disable AWS X-Ray for the function if the X-Ray Span Link is not desired.
Lambda does not have HTTP headers to read from and instead stores the headers it was invoked with (including TraceID, etc.) as part of the invocation event. If AWS X-Ray tracing is used, the trace information is instead stored in the Lambda environment. Both can be populated at the same time, with different values. Finally, tracing information can also be propagated in an SQS message using the message's `AWSTraceHeader` system attribute. A single Lambda function can be triggered from multiple sources; however, a span can have only a single parent.
rapphil marked this conversation as resolved.

**Note**: When instrumenting a Java AWS Lambda, instrumentation SHOULD first try to parse an OpenTelemetry `Context` out of the system property `com.amazonaws.xray.traceHeader` using the [AWS X-Ray Propagator](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/api-propagators.md) before checking and attempting to parse the environment variable above.
To determine the parent span context, the Lambda instrumentation SHOULD use an `EventToCarrier`. `EventToCarrier` defines how the instrumentation should prepare a `Carrier` to be used by subsequent `TextMapPropagators`.

[Span Link]: https://opentelemetry.io/docs/concepts/signals/traces/#span-links
The `EventToCarrier` MUST implement the `Convert` operation to convert a Lambda `Event` into a `Carrier`.
Member

@Oberon00 Oberon00 Jul 6, 2023

This EventToCarrier sounds like an ordinary getter function that can be passed to the propagator along with the carrier. And in fact that is how it is implemented in some places, e.g. see .NET https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/Instrumentation.AWSLambda-1.1.0-beta.3/src/OpenTelemetry.Instrumentation.AWSLambda/Implementation/AWSLambdaUtils.cs#L68-L95

So I would suggest to change the wording here to avoid introducing this new intermediate concept.

Member

I disagree. The linked .NET implementation does not appear to be user-provided and thus does not account for any event type not known to the instrumentation. It also does not provide a mechanism for extracting context information from any source other than the specific sources it has implemented. Because it combines the context propagation and event processing it does not provide any opportunity for composition to enable the user to establish a chain of responsibility. In fact, it's not at all implemented as an ordinary getter function that can be passed to a propagator, it is a sealed context extraction mechanism with limited knowledge of how to process a few event types.

Member

This PR is also not proposing anything user-configurable beyond the sealed list of strings. If it does, please clarify the wording, or point me to where I missed it.

(Regarding the particular linked instrumentation, a custom context can be extracted by the user and then passed explicitly https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/Instrumentation.AWSLambda-1.1.0-beta.3/src/OpenTelemetry.Instrumentation.AWSLambda/AWSLambdaWrapper.cs#L97 which is much simpler to use, implement & understand in this case (but of course the concept would not be applicable to fully automatic / agent based instrumentations).)

Contributor Author

In the new revision I tried to make it clear that the EventToCarrier is configurable.


The `Convert` operation MUST have the following parameters:
`Carrier` - the carrier that will be populated from the `Event`
Member

Why must the caller provide the carrier to populate? Should this not be a pure function of event -> carrier?

`Event` - the lambda event.
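
For illustration only, the `Convert` operation described above could be sketched in Python as follows. The names `EventToCarrier`, `convert`, and `HttpHeadersEventToCarrier` are hypothetical renderings of the proposal, not an existing OpenTelemetry API, and the `headers` key assumes an API Gateway style event.

```python
from typing import Any, Mapping, MutableMapping, Protocol

# Assumption: a carrier is modeled as a mutable string-to-string mapping.
Carrier = MutableMapping[str, str]


class EventToCarrier(Protocol):
    """Hypothetical rendering of the proposed abstraction."""

    def convert(self, carrier: Carrier, event: Mapping[str, Any]) -> None:
        """Populate `carrier` from the Lambda `event`."""
        ...


class HttpHeadersEventToCarrier:
    """Copies the HTTP headers found in the event into the carrier."""

    def convert(self, carrier: Carrier, event: Mapping[str, Any]) -> None:
        # Assumption: headers live under the "headers" key and may be absent.
        headers = event.get("headers") or {}
        for name, value in headers.items():
            # Lower-case the names so later lookups are case-insensitive.
            carrier[name.lower()] = str(value)


carrier: Carrier = {}
event = {
    "headers": {
        "Traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    }
}
HttpHeadersEventToCarrier().convert(carrier, event)
```

The populated carrier would then be handed to whichever `TextMapPropagator` is configured; that step is left out of the sketch.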

#### Composite EventToCarrier

Implementations MUST provide a facility to group multiple `EventToCarrier`s. A composite `EventToCarrier` can be built from a list of `EventToCarrier`s. The resulting composite `EventToCarrier` will invoke the `Convert` operation of each individual `EventToCarrier` in the order they were specified, sequentially updating the carrier.
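
A minimal sketch of such a composite, assuming each `Convert` operation is modeled as a plain Python callable taking the carrier and the event. The helper names (`composite_event_to_carrier`, `from_runtime`, `from_headers`) and the trace header values are placeholders invented for the example.

```python
from typing import Any, Callable, Mapping, MutableMapping, Sequence

Carrier = MutableMapping[str, str]
# Assumption: Convert is a callable that mutates the carrier in place.
Convert = Callable[[Carrier, Mapping[str, Any]], None]


def composite_event_to_carrier(converters: Sequence[Convert]) -> Convert:
    """Build one Convert that runs each given Convert in order."""
    ordered = tuple(converters)  # freeze the order at construction time

    def convert(carrier: Carrier, event: Mapping[str, Any]) -> None:
        for step in ordered:
            # Each step sees the carrier as left by the previous step,
            # so later steps can overwrite earlier keys.
            step(carrier, event)

    return convert


def from_runtime(carrier: Carrier, event: Mapping[str, Any]) -> None:
    # Placeholder value standing in for data read from the environment.
    carrier["x-amzn-trace-id"] = "Root=1-00000000-000000000000000000000001"


def from_headers(carrier: Carrier, event: Mapping[str, Any]) -> None:
    for name, value in (event.get("headers") or {}).items():
        carrier[name.lower()] = value


convert = composite_event_to_carrier([from_runtime, from_headers])
carrier: Carrier = {}
convert(
    carrier,
    {"headers": {"X-Amzn-Trace-Id": "Root=1-00000000-000000000000000000000002"}},
)
# The header value wins here because from_headers runs last.
```

Because the steps update the carrier sequentially, the configured order determines which source takes precedence when two sources write the same key.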

The list of `EventToCarrier`s passed to the composite `EventToCarrier` MUST be configured using the `OTEL_AWS_LAMBDA_EVENT_TO_CARRIERS` environment variable, as a comma-separated list of values.
Member

Why? New environment-based configuration cannot be added to the specification at this time.

At the very least this needs to be reduced to SHOULD, but I think it should be removed entirely.

@pavankrish123 pavankrish123 Jul 10, 2023

Does the `OTEL_AWS_LAMBDA_EVENT_TO_CARRIERS` environment variable also serve the purpose of exposing the user-configurable event carrier asked for in @Oberon00's comment above? https://github.com/open-telemetry/semantic-conventions/pull/164/files#r1254641850

If that is the case, we should keep this as is, or provide some configuration if it is not already provided (the same ask as in @Oberon00's comment).


Valid values to configure the composite `EventToCarrier` are:
Member

Suggested change
- Valid values to configure the composite `EventToCarrier` are:
+ Valid values to configure the composite `EventToCarrier` include:

There may be reasons for other values to also be valid.

rapphil marked this conversation as resolved.

* `lambda_runtime` - populates the `Carrier` with a key `X-Amzn-Trace-Id` from the value of the `_X_AMZN_TRACE_ID` environment variable. (see note below)
Member

@Oberon00 Oberon00 Jul 6, 2023

I think there is actually no need to have a per-trigger-type distinction, since most functions will only have a single trigger type anyway. I think instead there are 3 things for which it may make sense to switch them on/off separately:

  1. Propagation using X-Ray from _X_AMZN_TRACE_ID (or equivalent Java system property)
  2. Propagation using X-Ray information in event payload (e.g. SQS system attributes, HTTP headers)
  3. Propagation using the default / configured propagator using sensible locations in the event payload (e.g. SQS message attributes, SNS message attributes, HTTP headers)

Member

This is not a per-trigger-type distinction, this is enumerating a set of pre-defined implementations of the EventToCarrier abstraction that can be selected at runtime.

As for the three toggles you propose, two and three are the same thing as long as a carrier can be extracted from the event and handed to the configured propagator. One is also effectively the same to the extent that an EventToCarrier implementation can always choose to include information in the carrier that was not in the incoming event but instead extracted from the operating environment or some other source.

The whole point of this is to separate the "Propagation using " from the second half of each of those sentences and to give the user the controls they need to choose what works best for them. We don't need to choose only three options for the user.

Member

Please note that all Lambda instrumentations I have seen that use X-Ray-dependent inputs also use a hard-coded X-Ray propagator, disregarding any configured propagator, which is sensible.

> This is not a per-trigger-type distinction, this is enumerating a set of pre-defined implementations of the EventToCarrier abstraction that can be selected at runtime.

I don't really get the difference, but what I meant is: There is no reason to make the user manually select "http" vs "sqs" when the event type can be automatically determined either from the function signature or by inspecting the payload JSON. Selecting them separately would only make sense if you have a function that receives both http and sqs events (not easily possible in statically typed languages like Java or .NET, but possible in principle, though probably still unusual) and want to extract context from only one of them (I can't really come up with a reasonable use case for that).

Member

I fully support not requiring the user to configure something that can be determined automatically. As such, separating sqs and http seems unnecessary.

* `http_headers` - populates the `Carrier` with the content of the HTTP headers.
* `sqs` - populates the `Carrier` with the content of the `AWSTraceHeader` system attribute of the message.
Member

This is a bit unclear/inconsistent. For the HTTP headers, you would also have the headers like traceparent, etc. available that may be usable with propagators other than X-Ray. For SQS you specify only the system attributes, which will only contain X-Ray.

Member

Agreed, extracting a carrier from an SQS event should probably include both message attributes and message system attributes.
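
As a rough sketch of what that could look like: the function below reads both kinds of attributes from a single SQS record. It assumes the record shape commonly seen in Lambda SQS events (`attributes` for system attributes, `messageAttributes` with a `stringValue` per entry) and assumes the X-Ray header is stored under the `X-Amzn-Trace-Id` carrier key. The function name is hypothetical.

```python
from typing import Any, Mapping, MutableMapping

Carrier = MutableMapping[str, str]


def sqs_record_to_carrier(carrier: Carrier, record: Mapping[str, Any]) -> None:
    """Populate the carrier from one SQS record (hypothetical helper)."""
    # System attributes: only the X-Ray trace header is of interest here.
    system_attributes = record.get("attributes") or {}
    trace_header = system_attributes.get("AWSTraceHeader")
    if trace_header:
        # Assumption: the X-Ray propagator looks for this carrier key.
        carrier["X-Amzn-Trace-Id"] = trace_header

    # Message attributes: may hold keys for other propagators,
    # for example "traceparent" for W3C Trace Context.
    for name, attribute in (record.get("messageAttributes") or {}).items():
        value = attribute.get("stringValue")
        if value is not None:
            carrier[name] = value


record = {
    "attributes": {
        "AWSTraceHeader": "Root=1-00000000-000000000000000000000001;Sampled=1"
    },
    "messageAttributes": {
        "traceparent": {
            "stringValue": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
            "dataType": "String",
        }
    },
}
carrier: Carrier = {}
sqs_record_to_carrier(carrier, record)
```

With both sources in the carrier, the choice between X-Ray and another format is left to the configured propagator rather than to the event conversion step.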

Contributor Author

I removed the list of predefined `EventToCarrier`s from the new revision.


**NOTE**: When instrumenting a Java AWS Lambda, instrumentation SHOULD first try to parse the `X-Amzn-Trace-Id` out of the system property `com.amazonaws.xray.traceHeader` before checking and attempting to parse the environment variable `_X_AMZN_TRACE_ID`.
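
The lookup order in this note can be sketched as below. Python has no JVM system properties, so a plain mapping stands in for them; the function names are hypothetical and only the two key names come from the text above.

```python
from typing import Mapping, MutableMapping, Optional

Carrier = MutableMapping[str, str]

SYSTEM_PROPERTY = "com.amazonaws.xray.traceHeader"
ENVIRONMENT_VARIABLE = "_X_AMZN_TRACE_ID"


def resolve_trace_header(
    system_properties: Mapping[str, str], environ: Mapping[str, str]
) -> Optional[str]:
    """Prefer the system property, then fall back to the environment."""
    # Empty strings are treated as unset so the fallback still applies.
    return system_properties.get(SYSTEM_PROPERTY) or environ.get(
        ENVIRONMENT_VARIABLE
    ) or None


def lambda_runtime_to_carrier(
    carrier: Carrier,
    system_properties: Mapping[str, str],
    environ: Mapping[str, str],
) -> None:
    """Store the resolved header under the X-Amzn-Trace-Id carrier key."""
    header = resolve_trace_header(system_properties, environ)
    if header is not None:
        carrier["X-Amzn-Trace-Id"] = header


carrier: Carrier = {}
lambda_runtime_to_carrier(
    carrier,
    system_properties={SYSTEM_PROPERTY: "Root=1-00000000-000000000000000000000001"},
    environ={ENVIRONMENT_VARIABLE: "Root=1-00000000-000000000000000000000002"},
)
```

In a real Java instrumentation the first mapping would correspond to `System.getProperty` and the second to the process environment.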

## API Gateway
