
inits Deployment contrib #2312

Closed · wants to merge 11 commits

Conversation

@mhausenblas (Member) commented Feb 9, 2023


The collector would then be configured like so:

{{< ot-tabs Traces Metrics Logs >}} {{< ot-tab lang="yaml">}}
Contributor:

FYI, we're trying to phase out the use of the ot-tabs shortcode (for details see #1820). Since you seem to be on a roll in terms of creating new content, would you be willing to switch to using the Docsy tabpane shortcode instead?

Here's an example:

{{< tabpane langEqualsHeader=true >}}
{{< tab TypeScript >}}
/*tracing.ts*/
import { BatchSpanProcessor, ConsoleSpanExporter } from "@opentelemetry/sdk-trace-base";
import { Resource } from "@opentelemetry/resources";
import { SemanticResourceAttributes } from "@opentelemetry/semantic-conventions";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { registerInstrumentations } from "@opentelemetry/instrumentation";

// Optionally register instrumentation libraries
registerInstrumentations({
  instrumentations: [],
});

const resource = Resource.default().merge(
  new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: "service-name-here",
    [SemanticResourceAttributes.SERVICE_VERSION]: "0.1.0",
  })
);

const provider = new NodeTracerProvider({
  resource: resource,
});
const exporter = new ConsoleSpanExporter();
const processor = new BatchSpanProcessor(exporter);
provider.addSpanProcessor(processor);
provider.register();
{{< /tab >}}
{{< tab JavaScript >}}
/*tracing.js*/
const opentelemetry = require("@opentelemetry/api");
const { Resource } = require("@opentelemetry/resources");
const { SemanticResourceAttributes } = require("@opentelemetry/semantic-conventions");
const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");
const { ConsoleSpanExporter, BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");

// Optionally register instrumentation libraries
registerInstrumentations({
  instrumentations: [],
});

const resource = Resource.default().merge(
  new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: "service-name-here",
    [SemanticResourceAttributes.SERVICE_VERSION]: "0.1.0",
  })
);

const provider = new NodeTracerProvider({
  resource: resource,
});
const exporter = new ConsoleSpanExporter();
const processor = new BatchSpanProcessor(exporter);
provider.addSpanProcessor(processor);
provider.register();
{{< /tab >}}
{{< /tabpane >}}

Member Author:

OK, I got it to work after some wrangling; here's what I'm using:

<!-- prettier-ignore-start -->
{{< tabpane persistLang=false >}}
  {{< tab header="Traces" lang="yaml" >}}
...

Member Author:

Note that the `lang` doesn't inherit, as per the docs, and without `persistLang=false` the rendering is broken.

@cartermp (Contributor) left a comment:

Left a review pass with some questions and suggestions. Thank you, this is really great content overall!

Comment on lines +7 to +10
The OpenTelemetry collector consists of a single binary which you can use in
different ways, for different use cases. This section describes deployment
patterns, their use cases along with pros and cons and best practices for
collector configurations for cross-environment and multi-backend deployments.
Contributor:

Suggested change
The OpenTelemetry collector consists of a single binary which you can use in
different ways, for different use cases. This section describes deployment
patterns, their use cases along with pros and cons and best practices for
collector configurations for cross-environment and multi-backend deployments.
The OpenTelemetry collector consists of a single binary that you can use in
different ways, for different use cases. This section describes deployment
patterns, their use cases along with pros and cons, and best practices for
collector configurations for cross-environment and multi-backend deployments.


## Normalizing

Normalize the metadata from different instrumentations
Contributor:

Suggested change
Normalize the metadata from different instrumentations
Normalize instrumentation from different sources.


## Multitenancy

You want to isolate different tenants (customers, teams, etc.)
Contributor:

Suggested change
You want to isolate different tenants (customers, teams, etc.)
Isolate different tenants (customers, teams, etc.)

Comment on lines +169 to +170
You want to aggregate signals from multiple environments (on-prem, Kubernetes,
etc.)
Contributor:

Suggested change
You want to aggregate signals from multiple environments (on-prem, Kubernetes,
etc.)
Aggregate signals from multiple environments (on-prem, Kubernetes,
etc.)

Comment on lines +174 to +175
Have one collector instance per signal type, for example, one dedicated to
Prometheus metrics, one dedicated to Jaeger traces.
Contributor:

Suggested change
Have one collector instance per signal type, for example, one dedicated to
Prometheus metrics, one dedicated to Jaeger traces.
One collector instance per signal type. For example, one dedicated to
Prometheus metrics and one dedicated to Jaeger traces.

Contributor:

Is this referring to export or ingest or both?


Cons:

- Effort
Contributor:

Compared to what?

If you want to try it out for yourself, you can have a look at the end-to-end
[Java][java-otlp-example] or [Python][py-otlp-example] examples.

## Tradeoffs
Contributor:

This and the other tradeoffs sections aren't quite clear to me. Is this comparing centralized vs. decentralized collector deployment patterns? If so, I'd call it out explicitly in both docs sections.

Comment on lines +8 to +13
The decentralized collector deployment pattern consists of
applications—[instrumented][instrumentation] with an OpenTelemetry SDK using
[OpenTelemetry protocol (OTLP)][otlp]—or other collectors (using the OTLP
exporter) that send telemetry signals to one or more [collectors][collector].
Each client-side SDK or downstream collector is configured with a collector
location:
Contributor:

I found this paragraph difficult to parse. I also found it hard to understand how this pattern differs from the other since you mention several SDKs or collectors sending data to other collectors. Is the difference just that there isn't a load balancer?

like so:

<!-- prettier-ignore-start -->
{{< ot-tabs Traces Metrics Logs >}}
Contributor:

Hmmm. Why is there an example for traces when the example scenario described above is just emitting metrics? I think the example scenario above should be changed to also include traces.

Contributor:

I'm also curious why this example uses the "one collector per signal type" pattern here. Is that something we should recommend for people, or is it just happenstance that it's the example chosen here?

- Simple to use (especially in a dev/test environment)
- No additional moving parts to operate (in production environments)

Cons:
Contributor:

Might be good to link to this section and/or amend it to have more examples.

patterns, their use cases along with pros and cons and best practices for
collector configurations for cross-environment and multi-backend deployments.

## Other information
Contributor:

One question that has come up fairly frequently regarding collector deployments is "when/why should I use the Collector Operator?"

Would be helpful for end-users if the intro mentioned how the operator (optionally) fits into k8s-based deployments.

Member:

@mhausenblas, wdyt? I am always a fan of having inter-linked pages to point people to other resources (not the repo, but the docs for k8s operator at /docs/k8s-operator/), but I don't see yet how to incorporate that to this page.

@mhausenblas mhausenblas self-assigned this Mar 6, 2023
@mhausenblas mhausenblas marked this pull request as ready for review March 6, 2023 09:37
@mhausenblas mhausenblas requested review from a team and codeboten and removed request for a team March 6, 2023 09:37
@mhausenblas (Member Author):

FYI: aiming to complete this entry in W10; I will also add the sidecar pattern to the Decentralized section.

weight: 4
---

Now that you are equipped with the essential deployment patterns for the
Member:

Link back to the "essential deployment patterns" so that people who land here first can figure out where to go for this.

Cons:

- Requires code changes if collection, processing, or ingestion changes
- Strong coupling between the application code and the backend
Member:

I think there are more disadvantages we should list:

  • The SDK needs to take care of authentication, connection management (reconnections, etc.), encryption, and so on
  • Long round-trip times to the backend increase overhead at the application level
  • ...

Comment on lines +22 to +27
A concrete example of the decentralized collector deployment pattern could look
as follows: you manually instrument, say, a [Java application to export
metrics][instrument-java-metrics] using the OpenTelemetry Java SDK. In the
context of the app, you would set the `OTEL_METRICS_EXPORTER` to `otlp` (which
is the default value) and configure the [OTLP exporter][otlp-exporter] with the
address of your collector, for example (in Bash or `zsh` shell):
Member:

Suggested change
A concrete example of the decentralized collector deployment pattern could look
as follows: you manually instrument, say, a [Java application to export
metrics][instrument-java-metrics] using the OpenTelemetry Java SDK. In the
context of the app, you would set the `OTEL_METRICS_EXPORTER` to `otlp` (which
is the default value) and configure the [OTLP exporter][otlp-exporter] with the
address of your collector, for example (in Bash or `zsh` shell):
A concrete example of the decentralized collector deployment pattern could look
as follows: you manually instrument a [Java application to export
metrics][instrument-java-metrics] using the OpenTelemetry Java SDK. In the
context of the app, you would set the `OTEL_METRICS_EXPORTER` to `otlp` (which
is the default value) and configure the [OTLP exporter][otlp-exporter] with the
address of your collector, for example (in Bash or `zsh` shell):
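As a rough sketch, the env-var setup this paragraph describes might look like the following in Bash or zsh; the collector address is the illustrative `collector.example.com` host mentioned elsewhere in this thread, not a real endpoint:

```shell
# Select the OTLP metrics exporter (the default value) ...
export OTEL_METRICS_EXPORTER=otlp
# ... and point the OTLP exporter at the collector.
# The address below is a placeholder; substitute your collector's address.
export OTEL_EXPORTER_OTLP_ENDPOINT=http://collector.example.com:4318
```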

weight: 3
---

The centralized collector deployment pattern consists of applications (or other
Member:

I’d have two main scenarios with separate pictures. First one with multiple applications sending to one Collector (basic setup). The second one could be this load-balanced scenario, but with multiple applications to visualize the centralized pattern.

Each client-side SDK or downstream collector is configured with a collector
location:

![Decentralized collector deployment concept](../../img/decentralized-sdk.svg)
Member:

For me, this picture doesn’t really capture the decentralized pattern and 1:1 relationship between the application and Collector. You could add 3 apps and 3 Collectors (1-1 connected) to make this visible.


- Simple to get started
- Clear 1:1 mapping between application and collector

Member:

You could mention the benefit of offloading batching, retry, encryption, compression, and more from applications.

- Centralized policy management

Cons:

Member:

These could be added as cons:

  • Single point of failure
  • Can lead to high load levels if not dimensioned properly

## Other information

- GitHub repo [OpenTelemetry Collector Deployment Patterns][gh-patterns]
- YouTube video [OpenTelemetry Collector Deployment Patterns][y-patterns]
@mx-psi (Member) commented Mar 9, 2023:

I would make it clear that this is a talk

Suggested change
- YouTube video [OpenTelemetry Collector Deployment Patterns][y-patterns]
- KubeCon NA 2021 Talk [OpenTelemetry Collector Deployment Patterns][y-patterns]

Contributor:

Good suggested change, but drop the quotes.

Member:

edited, thanks for the feedback :)


## Other information

- GitHub repo [OpenTelemetry Collector Deployment Patterns][gh-patterns]
Member:

Suggested change
- GitHub repo [OpenTelemetry Collector Deployment Patterns][gh-patterns]
- [Repository with full configuration examples for different deployment patterns][gh-patterns]

Comment on lines +7 to +10
The OpenTelemetry collector consists of a single binary which you can use in
different ways, for different use cases. This section describes deployment
patterns, their use cases along with pros and cons and best practices for
collector configurations for cross-environment and multi-backend deployments.
Member:

nit: make the first sentence structure simpler

Suggested change
The OpenTelemetry collector consists of a single binary which you can use in
different ways, for different use cases. This section describes deployment
patterns, their use cases along with pros and cons and best practices for
collector configurations for cross-environment and multi-backend deployments.
You can deploy the OpenTelemetry collector in different ways depending on your
use case. This section describes deployment
patterns, their use cases along with pros and cons and best practices for
collector configurations for cross-environment and multi-backend deployments.

Comment on lines +27 to +29
jaeger:
  endpoint: "https://jaeger.example.com:14250"
  insecure: true
Member:

We have deprecated the Jaeger exporter, so I would recommend using the OTLP exporter instead
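A sketch of what the swapped-in OTLP exporter block could look like; the backend address is a placeholder, not from the PR:

```yaml
exporters:
  otlp:
    endpoint: "backend.example.com:4317" # placeholder; your OTLP-capable backend
    tls:
      insecure: true # only for test setups without TLS
```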

receivers:
  otlp:
    protocols:
      grpc:
Member:

Using this configuration will produce a warning about DoS attacks (see here). We should set the endpoint explicitly
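For example, setting the endpoint explicitly could look like the following; binding to `localhost` is an assumption that works when the apps run on the same host:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:4317" # explicit endpoint avoids the unspecified-address warning
```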

receivers:
  otlp:
    protocols:
      grpc:
Member:

ditto

otlp:
  protocols:
    grpc:

Member:

ditto

receivers:
  otlp: # the OTLP receiver the app is sending traces to
    protocols:
      grpc:
Member:

ditto

Comment on lines +48 to +50
jaeger: # the Jaeger exporter, to ingest traces to backend
  endpoint: "https://jaeger.example.com:14250"
  insecure: true
Member:

ditto (use OTLP exporter)

receivers:
  otlp: # the OTLP receiver the app is sending logs to
    protocols:
      grpc:
Member:

ditto

@dmitryax (Member) commented Mar 9, 2023:

Are we introducing new deployment concepts or replacing Agent/Gateway with Decentralized/Centralized as 1:1? It doesn't seem to be 1:1 based on the docs:

  • Agent is supposed to represent an installation on one host so that instrumentation libraries can point to local endpoints like http://localhost:4318. The decentralized doc says collector.example.com:4318 instead. Also, the decentralized section mentions "Clear 1:1 mapping between application and collector" in "Pros" section, which is right for the agent term as well, but it confuses me when I read the first paragraph that seems to contradict:

The decentralized collector deployment pattern consists of applications—instrumented with an OpenTelemetry SDK using OpenTelemetry protocol (OTLP)—or other collectors (using the OTLP exporter) that send telemetry signals to one or more collectors.

  • Several Collector pull-based receivers are intended to run in the agent (decentralized?) mode, for example, hostmetrics receiver. If we are extending the documentation, I believe it's worth mentioning.

In general, I don't fully agree that decentralized/centralized terms are easier to understand than agent/gateway. I'd like to bring this to discussion for the Collector SIG meeting.

cc @open-telemetry/collector-approvers

@mhausenblas mhausenblas deleted the col-deploy branch March 13, 2023 15:00
@mhausenblas (Member Author):

Argh, didn't mean to close the PR, just catching up with main :(

@svrnm (Member) commented Mar 13, 2023:

Argh, didn't mean to close the PR, just catching up with main :(

you should be able to reopen it with a force push
