opentelemetry.io: instrumentation docs page #243

Merged
377 changes: 377 additions & 0 deletions website_docs/instrumentation.md
@@ -0,0 +1,377 @@
---
title: "Instrumentation"
weight: 30
---

Instrumentation is the act of adding observability code to your
application. You can do this either by calling the OpenTelemetry API directly
in your code or by including a dependency that calls the API and hooks into
your project, such as a middleware for an HTTP server.

# TracerProvider and Tracers

In OpenTelemetry each service being traced has at least one `TracerProvider`,
which holds configuration for the name and version of the service, which
sampler to use, and how to process and export spans. A `Tracer` is created by a
`TracerProvider` and has a name and version. In OpenTelemetry for Erlang/Elixir,
the name and version of each `Tracer` are the name and version of the OTP
Application that contains the module using the `Tracer`. If the call that uses a
`Tracer` is not in a module, for example in the interactive shell, the default
`Tracer` is used.

Each OTP Application has a `Tracer` registered for it when the `opentelemetry`
Application boots. This can be disabled by setting the Application environment
variable `register_loaded_applications` to `false`. If you want a more
specifically named `Tracer`, or have disabled the automatic registration, you
can register a `Tracer` either with a name and version or with an Application
name, in which case `opentelemetry` gets the version from the loaded
Application. Examples:

{{< tabs Erlang Elixir >}}

{{< tab >}}
opentelemetry:register_tracer(test_tracer, <<"0.1.0">>),
opentelemetry:register_application_tracer(myapp).
{{< /tab >}}

{{< tab >}}
OpenTelemetry.register_tracer(:test_tracer, "0.1.0")
OpenTelemetry.register_application_tracer(:myapp)
{{< /tab >}}

{{< /tabs >}}

Naming each `Tracer`, and in Erlang/Elixir using the Application name as that
name, makes it possible to disable traces from a particular Application. This
is useful if, for example, a dependency generates too many spans or spans that
are problematic in some other way.

Additionally, the name and version of the `Tracer` are exported as the
[`InstrumentationLibrary`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/glossary.md#instrumentation-library)
component of spans. This allows users to group and search spans by the
Application they came from.

You can look up a `Tracer` by name with `get_tracer/1` and pass that `Tracer`
to the tracing API through `otel_tracer` in Erlang or
`OpenTelemetry.Tracer` in Elixir:

{{< tabs Erlang Elixir >}}

{{< tab >}}
Tracer = opentelemetry:get_tracer(my_app),
SpanCtx = otel_tracer:start_span(Tracer, <<"hello-world">>, #{}),
...
otel_tracer:end_span(SpanCtx).
{{< /tab >}}

{{< tab >}}
tracer = OpenTelemetry.get_tracer(:my_app)
span_ctx = OpenTelemetry.Tracer.start_span(tracer, "hello-world", %{})
...
OpenTelemetry.Tracer.end_span(span_ctx)
{{< /tab >}}

{{< /tabs >}}

In most cases you will not need to manually register or look up a
`Tracer`. Simply use the macros provided, which are covered in the following
section, and the `Tracer` for the Application the macro is used in will be used
automatically.

# Starting Spans

A trace is a tree of spans, starting with a root span that has no parent. To
represent this tree, each span after the root has a parent span associated with
it. When a span is started, its parent is set based on the `context`. A
`context` can be implicit, meaning your code does not have to pass a `Context`
variable around to track the active `context`, or explicit, meaning your code
must pass the `Context` as an argument both to the OpenTelemetry functions and
to every function that needs the `context`, so that spans started in those
functions have the proper parent.

For implicit context propagation across functions within a process, the [process
dictionary](http://erlang.org/doc/reference_manual/processes.html#process-dictionary)
is used to store the context. When you start a span with the macro `with_span`,
the context in the process dictionary is updated to make the newly started span
the currently active span, and this span is ended when the block or
function completes. Starting a new span within the body of
`with_span` uses the active span as the parent of the new span, and the
parent becomes the active span again when the child's block or function body
completes:

{{< tabs Erlang Elixir >}}

{{< tab >}}
-include_lib("opentelemetry_api/include/otel_tracer.hrl").

parent_function() ->
    ?with_span(<<"parent">>, #{}, fun child_function/0).

child_function() ->
    %% this is the same process, so the span <<"parent">> set as the active
    %% span in the with_span call above will be the active span in this function
    ?with_span(<<"child">>, #{},
               fun() ->
                   %% do work here. when this function returns, <<"child">> will complete.
                   ok
               end).

{{< /tab >}}

{{< tab >}}
require OpenTelemetry.Tracer

def parent_function() do
  OpenTelemetry.Tracer.with_span "parent" do
    child_function()
  end
end

def child_function() do
  # this is the same process, so the span "parent" set as the active
  # span in the with_span call above will be the active span in this function
  OpenTelemetry.Tracer.with_span "child" do
    # do work here. when this block completes, "child" will complete.
    :ok
  end
end
{{< /tab >}}

{{< /tabs >}}
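
The save-and-restore behaviour described above can be sketched without the
library. This is a conceptual illustration only, not how `opentelemetry`
implements `with_span`: the module `ActiveSpanSketch`, its process dictionary
key, and the map standing in for a span are all hypothetical.

```elixir
defmodule ActiveSpanSketch do
  # Hypothetical process dictionary key; the real library uses its own.
  @key :sketch_active_span

  # Returns the span that is active in the calling process, or nil.
  def current, do: Process.get(@key)

  # Runs `fun` with a new span set as the active one, then restores the parent.
  def with_span(name, fun) do
    parent = current()
    span = %{name: name, parent: parent && parent.name}
    Process.put(@key, span)

    try do
      fun.(span)
    after
      # Restore whatever was active before, even if `fun` raised.
      if parent, do: Process.put(@key, parent), else: Process.delete(@key)
    end
  end
end

result =
  ActiveSpanSketch.with_span("parent", fn parent ->
    child = ActiveSpanSketch.with_span("child", fn child -> child end)
    # After the child completes, "parent" is the active span again.
    {parent.parent, child.parent, ActiveSpanSketch.current().name}
  end)

IO.inspect(result)
```

The tuple holds the parent recorded for the outer span (none), the parent
recorded for the inner span, and the span that is active once the inner call
has returned.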

## Cross Process Propagation

The examples in the previous section were spans with a child-parent relationship
within the same process where the parent is available in the process dictionary
when creating a child span. Using the process dictionary this way isn't possible
when crossing processes, either by spawning a new process or sending a message
to an existing process. Instead, the context must be manually passed as a variable.
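
The reason is that every process has its own process dictionary. A minimal
sketch using only the standard library (the module name and the key
`:demo_active_span` are made up for illustration and are not part of
OpenTelemetry):

```elixir
defmodule ProcessDictionaryDemo do
  # Hypothetical key standing in for the stored context.
  @key :demo_active_span

  # A new process starts without the caller's value, so the lookup
  # inside the task finds nothing.
  def implicit_lookup_in_new_process do
    Process.put(@key, "parent")
    task = Task.async(fn -> Process.get(@key) end)
    Task.await(task)
  end

  # Capturing the value in a variable and storing it again inside the
  # new process makes it visible there.
  def explicit_pass_to_new_process do
    Process.put(@key, "parent")
    value = Process.get(@key)

    task =
      Task.async(fn ->
        Process.put(@key, value)
        Process.get(@key)
      end)

    Task.await(task)
  end
end

IO.inspect(ProcessDictionaryDemo.implicit_lookup_in_new_process())
IO.inspect(ProcessDictionaryDemo.explicit_pass_to_new_process())
```

The first call returns `nil` and the second returns `"parent"`, which is the
same pattern the library calls in the following sections apply to the real
context.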

### Creating Spans for New Processes

To pass spans across processes we need to start a span that isn't connected to
a particular process. This can be done with the macro `start_span`. Unlike
`with_span`, the `start_span` macro does not set the new span as the currently
active span in the context of the process dictionary.

Connecting a span as a parent to a child in a new process can be done by attaching
the context and setting the new span as currently active in the process. The
whole context should be attached in order to not lose other telemetry data like
[baggage](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md).

{{< tabs Erlang Elixir >}}

{{< tab >}}
SpanCtx = ?start_span(<<"child">>),
Ctx = otel_ctx:get_current(),

proc_lib:spawn_link(fun() ->
                        otel_ctx:attach(Ctx),
                        ?set_current_span(SpanCtx),

                        %% do work here

                        ?end_span(SpanCtx)
                    end),
{{< /tab >}}

{{< tab >}}
span_ctx = OpenTelemetry.Tracer.start_span("child")
ctx = OpenTelemetry.Ctx.get_current()

task = Task.async(fn ->
  OpenTelemetry.Ctx.attach(ctx)
  OpenTelemetry.Tracer.set_current_span(span_ctx)
  # do work here

  # end span here or after `await` returns
end)

_ = Task.await(task)
OpenTelemetry.Tracer.end_span(span_ctx)
{{< /tab >}}

{{< /tabs >}}

### Linking the New Span

If the work being done by the other process is better represented as a `link`
(see [the `link` definition in the
specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/overview.md#links-between-spans)
for more on when that is appropriate), then pass the `SpanCtx` returned by
`start_span` to `link/1` to create a `link` that can be passed to `with_span`
or `start_span`:

{{< tabs Erlang Elixir >}}

{{< tab >}}
SpanCtx = ?current_span_ctx,
proc_lib:spawn_link(fun() ->
                        %% a new process has a new context so the span created
                        %% by the following `with_span` will have no parent
                        Link = opentelemetry:link(SpanCtx),
                        ?with_span(<<"other-process">>, #{links => [Link]},
                                   fun() -> ok end)
                    end),
{{< /tab >}}

{{< tab >}}
# capture the span to link to before leaving the current process
parent = OpenTelemetry.Tracer.current_span_ctx()

task = Task.async(fn ->
  # a new process has a new context so the span created
  # by the following `with_span` will have no parent
  link = OpenTelemetry.link(parent)
  Tracer.with_span "my-task", %{links: [link]} do
    :hello
  end
end)
{{< /tab >}}

{{< /tabs >}}

## Attributes

Attributes are key-value pairs that are applied as metadata to your spans and
are useful for aggregating, filtering, and grouping traces. Attributes can be
added at span creation, or at any other time during the life cycle of a span
before it has completed.

The key can be an atom or a UTF-8 string (a regular string in Elixir and a
binary, `<<"..."/utf8>>`, in Erlang). The value can be of any type. If
necessary, the key and value are converted to strings when the attribute is
exported in a span.

The following example shows the two ways of setting attributes on a span:
first in the start options, and then with `set_attributes` in the body of the
span operation:

{{< tabs Erlang Elixir >}}

{{< tab >}}
?with_span(<<"my-span">>, #{attributes => [{<<"start-opts-attr">>, <<"start-opts-value">>}]},
           fun() ->
               ?set_attributes([{<<"my-attribute">>, <<"my-value">>},
                                {another_attribute, <<"value-of-attribute">>}])
           end)
{{< /tab >}}

{{< tab >}}
Tracer.with_span "my-span", %{attributes: [{"start-opts-attr", "start-opts-value"}]} do
  Tracer.set_attributes([{"my-attribute", "my-value"},
                         {:another_attribute, "value-of-attribute"}])
end
{{< /tab >}}

{{< /tabs >}}

### Semantic Attributes

Semantic Attributes are attributes that are defined by the OpenTelemetry
Specification in order to provide a shared set of attribute keys across multiple
languages, frameworks, and runtimes for common concepts like HTTP methods,
status codes, user agents, and more. These attribute keys and values are
available in the header `opentelemetry_api/include/otel_resource.hrl`.

Tracing semantic conventions can be found [in this document](https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/trace/semantic_conventions).

## Events

An event is a human-readable message on a span that represents "something
happening" during its lifetime. For example, imagine a function that requires
exclusive access to a resource like a database connection from a pool. An event
could be created at two points: once when the connection is checked out from
the pool, and again when it is checked in.

{{< tabs Erlang Elixir >}}

{{< tab >}}
?with_span(<<"my-span">>, #{},
           fun() ->
               ?add_event(<<"checking out connection">>),
               %% acquire connection from connection pool
               ?add_event(<<"got connection, doing work">>),
               %% do some work with the connection and then return it to the pool
               ?add_event(<<"checking in connection">>)
           end)
{{< /tab >}}

{{< tab >}}
Tracer.with_span "my-span" do
  Span.add_event("checking out connection")
  # acquire connection from connection pool
  Span.add_event("got connection, doing work")
  # do some work with the connection and then return it to the pool
  Span.add_event("checking in connection")
end
{{< /tab >}}

{{< /tabs >}}


A useful characteristic of events is that their timestamps are displayed as
offsets from the beginning of the span, allowing you to easily see how much time
elapsed between them.
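
The arithmetic behind those offsets is simple. The sketch below uses made-up
millisecond timestamps for the three events from the example; real spans
record their own timestamps, and the variable names here are illustrative
only.

```elixir
# Hypothetical timestamps in milliseconds; real spans record these for you.
span_start = 1_000

events = [
  {"checking out connection", 1_002},
  {"got connection, doing work", 1_015},
  {"checking in connection", 1_140}
]

# Offset of each event from the start of the span.
offsets = Enum.map(events, fn {name, timestamp} -> {name, timestamp - span_start} end)

# Time elapsed between consecutive events.
gaps =
  offsets
  |> Enum.map(fn {_name, offset} -> offset end)
  |> Enum.chunk_every(2, 1, :discard)
  |> Enum.map(fn [earlier, later] -> later - earlier end)

IO.inspect(offsets)
IO.inspect(gaps)
```

With these numbers the offsets are 2, 15 and 140 ms, so acquiring the
connection took 13 ms and the work done with it took 125 ms.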

Events can also have attributes of their own:

{{< tabs Erlang Elixir >}}

{{< tab >}}
?add_event(<<"Process exited with reason">>, [{pid, Pid}, {reason, Reason}])
{{< /tab >}}

{{< tab >}}
Span.add_event("Process exited with reason", pid: pid, reason: reason)
{{< /tab >}}

{{< /tabs >}}

# Cross Service Propagators

Distributed traces extend beyond a single service, meaning some context must be
propagated across services to create the parent-child relationship between
spans. This requires cross service [_context
propagation_](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/overview.md#context-propagation),
a mechanism where identifiers for a trace are sent to remote processes.
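
With the W3C Trace Context format, those identifiers travel in a `traceparent`
header of the form `version-traceid-parentid-flags`, all lowercase hex. The
propagators do this encoding and decoding for you. The sketch below only
illustrates the wire format: `TraceparentSketch` is a hypothetical module, not
part of the library, and it skips some of the validation the specification
asks for (for example, rejecting all-zero ids).

```elixir
defmodule TraceparentSketch do
  # Builds a version "00" traceparent value from integer ids.
  def encode(trace_id, span_id, sampled?) do
    flags = if sampled?, do: 1, else: 0

    Enum.join(
      [
        "00",
        Base.encode16(<<trace_id::128>>, case: :lower),
        Base.encode16(<<span_id::64>>, case: :lower),
        Base.encode16(<<flags::8>>, case: :lower)
      ],
      "-"
    )
  end

  # Parses a version "00" traceparent value; anything else is an error.
  def decode(
        <<"00-", trace_hex::binary-size(32), "-", span_hex::binary-size(16), "-",
          flags_hex::binary-size(2)>>
      ) do
    with {:ok, <<trace_id::128>>} <- Base.decode16(trace_hex, case: :lower),
         {:ok, <<span_id::64>>} <- Base.decode16(span_hex, case: :lower),
         {:ok, <<flags::8>>} <- Base.decode16(flags_hex, case: :lower) do
      # The lowest bit of the flags field is the "sampled" flag.
      {:ok, %{trace_id: trace_id, span_id: span_id, sampled?: rem(flags, 2) == 1}}
    else
      _ -> :error
    end
  end

  def decode(_other), do: :error
end

# Example value taken from the W3C Trace Context specification.
header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
{:ok, parsed} = TraceparentSketch.decode(header)
round_trip = TraceparentSketch.encode(parsed.trace_id, parsed.span_id, parsed.sampled?)
IO.inspect(round_trip == header)
```

A receiving service that decodes this header has the trace id and the parent
span id it needs to start a child span in the same trace.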

In order to propagate trace context over the wire, a propagator must be
registered with OpenTelemetry. This can be done through configuration of the
`opentelemetry` application:

{{< tabs Erlang Elixir >}}

{{< tab >}}
%% sys.config
...
{text_map_propagators, [fun otel_baggage:get_text_map_propagators/0,
fun otel_tracer_default:w3c_propagators/0]},
...
{{< /tab >}}

{{< tab >}}
# runtime.exs
...
text_map_propagators:
[&:otel_baggage.get_text_map_propagators/0,
&:otel_tracer_default.w3c_propagators/0],
...
{{< /tab >}}

{{< /tabs >}}

If you instead need to use the [B3
specification](https://github.com/openzipkin/b3-propagation), originally from
the [Zipkin project](https://zipkin.io/), replace
`fun otel_tracer_default:w3c_propagators/0` with
`fun otel_tracer_default:b3_propagators/0` in Erlang, or
`&:otel_tracer_default.w3c_propagators/0` with
`&:otel_tracer_default.b3_propagators/0` in Elixir.

# Library Instrumentation

Library instrumentation, broadly speaking, refers to instrumentation code that
you didn't write but instead include through another library. OpenTelemetry for
Erlang/Elixir supports this through wrappers and helper functions around
many popular frameworks and libraries. You can find them in the
[opentelemetry-erlang-contrib
repo](https://github.com/open-telemetry/opentelemetry-erlang-contrib/) and the [registry](/registry).

# Creating Metrics

The metrics API, found in `apps/opentelemetry-experimental-api` of the
`opentelemetry-erlang` repository, is currently unstable. Documentation is TBA.