
[PoC][DO NOT MERGE] Pipelines without a service #5149

Closed

Conversation


@mx-psi mx-psi commented Apr 5, 2022

Description:

This is a PoC for thinking about what pipelines would look like without using service.Collector (or service.service, if it ever gets exposed). It is meant to spark discussion rather than to be merged as actual code.

Link to tracking Issue: Relates to discussion on #4970

Additional notes:

This only supports metrics (since that is what I had an easy test case to run against). It includes an example of how to use the API in ExampleNewBuilder.

There are several limitations that can be improved with more code:

  1. No support for extensions
  2. No support for multiple exporters or multiple receivers attached to the same pipeline. I am not sure whether the builder should handle this or whether a different API should be provided for it (say, a Compose(exporters ...component.Exporter) component.Exporter kind of function; a minimal sketch follows this list).
  3. Only supports building full pipelines (in a Go SDK one would want a consumer.Metrics|Traces|Logs instead).
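
As a rough illustration of point 2, here is a minimal sketch of what such a fan-out helper could look like. It is entirely hypothetical (nothing like it is in this PR): it uses consumer.Metrics rather than component.Exporter, since the consumer interface is what actually carries data, and it uses today's pmetric types (at the time of this PR the payload type was pdata.Metrics):

package pipeline

import (
	"context"

	"go.opentelemetry.io/collector/consumer"
	"go.opentelemetry.io/collector/pdata/pmetric"
	"go.uber.org/multierr"
)

// Compose returns a single consumer that fans each batch of metrics out
// to every given consumer; the collector's internal fanoutconsumer plays
// a similar role inside the service graph.
func Compose(consumers ...consumer.Metrics) (consumer.Metrics, error) {
	return consumer.NewMetrics(func(ctx context.Context, md pmetric.Metrics) error {
		var errs error
		for _, c := range consumers {
			errs = multierr.Append(errs, c.ConsumeMetrics(ctx, md))
		}
		return errs
	})
}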

There are several limitations that I find harder to overcome, mostly related to configuration. These are detailed in a comment below.

@codecov

codecov bot commented Apr 5, 2022

Codecov Report

Merging #5149 (28f38b3) into main (ff6a8f6) will decrease coverage by 0.33%.
The diff coverage is 58.82%.

@@            Coverage Diff             @@
##             main    #5149      +/-   ##
==========================================
- Coverage   90.34%   90.00%   -0.34%     
==========================================
  Files         182      183       +1     
  Lines       11031    11150     +119     
==========================================
+ Hits         9966    10036      +70     
- Misses        840      882      +42     
- Partials      225      232       +7     
Impacted Files Coverage Δ
pipeline/pipeline.go 58.82% <58.82%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Comment on lines +64 to +75
receiverFactory := otlpreceiver.NewFactory()
receiverCfg := receiverFactory.CreateDefaultConfig().(*otlpreceiver.Config)
receiverCfg.HTTP = nil // I need to explicitly nil HTTP, since this is done in Unmarshal usually
receiverCfg.GRPC = &configgrpc.GRPCServerSettings{
	NetAddr: confignet.NetAddr{
		Transport: "tcp",
		// I can't really express 'use the default gRPC settings here' as one can do by setting 'grpc:' on the YAML
		Endpoint: "0.0.0.0:4317",
	},
	// I only know this by reading the code
	ReadBufferSize: 512 * 1024,
}
mx-psi (Member Author)

This, to me, is the hardest problem to solve: declaring configuration for components is very brittle and has a large number of gotchas. The ones I have thought about are:

Problem 1: Can't express default configuration (nil but present sections)

In the OTLP receiver and other components that have a custom unmarshaling function, we rely on having empty-but-present sections in the configuration:

# Example: enable only HTTP with default endpoint
otlp:
  protocols:
    http:

We can't express that easily in the Go API (setting a section to nil disables it), and we need to declare the endpoint explicitly. In fact, for the OTLP receiver, we need to explicitly disable an endpoint by setting it to nil, rather than the other way around.
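
For illustration, a sketch (not code from this PR) of why the empty http: section has no direct Go equivalent: assigning a fresh settings struct drops the defaults that the empty YAML section would have implied, so they must be restated by hand (the endpoint value below is the receiver's documented default at the time):

receiverCfg := receiverFactory.CreateDefaultConfig().(*otlpreceiver.Config)
receiverCfg.GRPC = nil // enable only HTTP, mirroring the YAML above
receiverCfg.HTTP = &confighttp.HTTPServerSettings{
	// The empty `http:` section implies this default endpoint; in Go it
	// has to be spelled out (or the factory's default carefully preserved).
	Endpoint: "0.0.0.0:4318",
}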

Problem 2: Users can't rely on zero values

Configuration structs in the different components don't usually have the zero value of a field as its default. For example, one may be tempted to write the default configuration for a gRPC connection as:

// WRONG: ReadBufferSize is 0, which disables read buffering (not the default)
receiverCfg.GRPC = &configgrpc.GRPCServerSettings{
	NetAddr: confignet.NetAddr{
		Transport: "tcp",
		Endpoint:  "0.0.0.0:4317",
	},
}

While disabling the read buffer is not necessarily catastrophic, in other situations this can lead to more serious unexpected side effects. Components can partially mitigate this by assigning meaning to zero values, but that is not always possible (e.g. what about booleans?).
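
One pattern that sidesteps this particular trap, sketched under the same assumptions as the snippet above (the factory's default config has a populated GRPC section), is to mutate the factory defaults instead of constructing the literal from scratch:

receiverCfg := receiverFactory.CreateDefaultConfig().(*otlpreceiver.Config)
receiverCfg.HTTP = nil // HTTP still has to be disabled explicitly
// Override only what differs; ReadBufferSize and the other gRPC defaults
// set by the factory are preserved.
receiverCfg.GRPC.NetAddr.Endpoint = "0.0.0.0:4317"

This still doesn't let a component author distinguish false from "unset" once the struct reaches them, which is the booleans point above.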

Problem 3: Can't express overrides

Imagine a component with a config like the following:

timeout: <time.Duration, 0 disables timeout>

foo:
  timeout: <time.Duration, 0 disables timeout, overrides timeout for foos>
bar:
  timeout: <time.Duration, 0 disables timeout, overrides timeout for bars>

A component can express this today via a custom unmarshaling function, by checking whether foo::timeout or bar::timeout are set and overriding if so. In the Go API, we can't really do this:

cfg := component.Config{
	Timeout: 10 * time.Second,
	Foo:     Foo{Timeout: 30 * time.Second},
	// Pipeline doesn't know that Bar was unset, so it doesn't know that it
	// should ignore Bar's zero value and use the top-level timeout instead.
}
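
For contrast, a minimal sketch of what the YAML path can express today through a custom unmarshaling function; config.Map with its UnmarshalExact and IsSet are the collector APIs at the time of this PR, while Config/Foo/Bar are the hypothetical types above:

func (cfg *Config) Unmarshal(conf *config.Map) error {
	if err := conf.UnmarshalExact(cfg); err != nil {
		return err
	}
	// The component can observe that a key was never set and apply its
	// override rule; a plain Go struct literal carries no such signal.
	if !conf.IsSet("foo::timeout") {
		cfg.Foo.Timeout = cfg.Timeout
	}
	if !conf.IsSet("bar::timeout") {
		cfg.Bar.Timeout = cfg.Timeout
	}
	return nil
}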

I think all of these are solvable by adding a configuration builder for each component, like otlpreceiver.NewConfig(options ...otlpreceiver.ConfigOption), and nudging users toward using those, but it needs individual component support. Furthermore, since component configs are meant to have all fields public and behave like PODs, users may be tempted to use the configuration struct directly regardless. This is not very worrisome (nobody writes zap.Logger{}, even though you can), but it's yet another point of friction.
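
A sketch of what such a builder could look like, inside a component package like otlpreceiver; this is entirely hypothetical, and neither NewConfig nor ConfigOption exist in any component today:

type ConfigOption func(*Config)

func WithGRPCEndpoint(endpoint string) ConfigOption {
	return func(cfg *Config) { cfg.GRPC.NetAddr.Endpoint = endpoint }
}

func WithoutHTTP() ConfigOption {
	return func(cfg *Config) { cfg.HTTP = nil }
}

// NewConfig starts from the factory defaults, so nil-vs-present sections
// and non-zero defaults are handled by the component, not by the user.
func NewConfig(opts ...ConfigOption) *Config {
	cfg := NewFactory().CreateDefaultConfig().(*Config)
	for _, opt := range opts {
		opt(cfg)
	}
	return cfg
}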

Member

@mx-psi I understand all these possible problems, but I am still not sure I understand what you are trying to solve. In your case, if I understood correctly, you have a "hardcoded" pipeline, and you don't need to let users define the config. Do you care that you have to do some "hardcoded" config? What problem are you trying to solve with this proposal?

mx-psi (Member Author)

I am still not sure I understand what you are trying to solve. In your case, if I understood correctly, you have a "hardcoded" pipeline, and you don't need to let users define the config

Let me try to describe my use case in more detail. We use the Collector in the Datadog Agent, an observability daemon like the Collector. We have fixed pipelines for different telemetry signals with

  • an OTLP receiver,
  • some processing (e.g. batching) and
  • a custom exporter that converts to the Datadog format and forwards the telemetry data internally to existing components of the Datadog Agent.

Parts of the configuration for the pipelines are hardcoded (e.g. what components the pipeline has), while other parts are customizable by end-users via YAML (but not necessarily in the format of the Collector).

Do you care that you have to do some "hardcoded" config?

I am okay with hardcoding things. We currently use a custom ConfigProvider that assembles a config.Map from the YAML configuration and the fixed settings, and we run a service.Collector instead of an individual pipeline.

However, right now I either have to hardcode things in a config.Map or deal with the problems described in my message above; neither option is ideal.

What problem are you trying to solve with this proposal?

The overarching problem I am trying to solve is running a pipeline within another existing application and integrating it with that application (e.g. by integrating the logging, metrics, CLI or configuration). In particular, this PoC stemmed from my need to override telemetry providers so that we can eventually access the different components' telemetry.

I hoped to solve this with #5107, but I understand that service.Collector may not fit my use case, and I want to take your concerns seriously, so I am exploring the second option of getting rid of it entirely and just having a lightweight pipeline builder.

Member

Sorry, my comment about "What problem are you trying to solve with this proposal?" was not clear; I was referring to the config part, not the initial problem you are trying to solve.

So for configuration, let me see if I understood the problem: the Datadog Agent wants to provide configuration for some of the components in this hardcoded pipeline, in a YAML format that is not the Collector's YAML.

Some solutions that I see:

  1. You translate Datadog's YAML into the Collector's YAML (what you do right now). Here you do indeed need to take care of the default config, etc., because the "Unmarshal" function on the component config may make assumptions about the default configuration.
  2. You translate Datadog's YAML into the Collector's component Config (the unmarshalled config). Here you don't care about the "default config", etc.; you define your own rules on how to configure components.
  3. You want the same component YAML config for components that you use in the hardcoded pipeline, but in a different high-level YAML structure (no top-level fields like receiver, service, etc.). In this case you should parse it as a map[string]interface{} -> config.Map -> maybe do some conversion on the Map -> unmarshal into the component Config (see the sketch after this list).
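
A minimal sketch of option (3), assuming the config.Map API available at the time of this PR (config.NewMapFromStringMap and the OTLP receiver's exported Unmarshal method); the raw map stands in for whatever the embedding application parses from its own YAML:

raw := map[string]interface{}{
	"protocols": map[string]interface{}{
		"grpc": map[string]interface{}{
			"endpoint": "0.0.0.0:4317",
		},
	},
}
cm := config.NewMapFromStringMap(raw)
cfg := otlpreceiver.NewFactory().CreateDefaultConfig().(*otlpreceiver.Config)
// The component's own Unmarshal applies its usual defaulting rules, so
// the empty-section and zero-value problems described above don't apply.
if err := cfg.Unmarshal(cm); err != nil {
	return err
}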

mx-psi (Member Author)

Sorry, my comment about "What problem are you trying to solve with this proposal?" was not clear; I was referring to the config part, not the initial problem you are trying to solve.

Looks like I misunderstood you, but I think I managed to answer the question anyway :)

So for configuration, let me see if I understood the problem: the Datadog Agent wants to provide configuration for some of the components in this hardcoded pipeline, in a YAML format that is not the Collector's YAML.

I think this is a good summary; thanks for taking the time to work with me on this. Right now we are indeed using solution (1), which works, even if it is a bit cumbersome. Solution (2) has the pitfalls I described above, and I don't feel it's safe to use with the current API design.

In any case, I still don't feel this solves the original 'overriding telemetry providers' problem, in that I wouldn't want to use pipeline-building code like the one in this PR if it did not live in the Collector repo: things today are mostly designed with the 'Collector distro' use case in mind, and I would worry that I couldn't reliably keep using the latest version of upstream. Do you think it makes sense to have pipeline-building code on go.opentelemetry.io/collector? Is this better supported by eventually exposing service.Service, or do we need this different thing?

@github-actions
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.


mx-psi commented Apr 29, 2022

After thinking about this, I don't think this is a solution that offers many advantages over just eventually exposing service. I am closing this and will focus on

  1. exploring whether telemetry providers are a good idea and addressing concerns there, and
  2. refactoring service to see if the resulting API would make sense to expose as an alternative option.

@mx-psi mx-psi closed this Apr 29, 2022
@mx-psi mx-psi deleted the mx-psi/pipeline-no-service branch April 29, 2022 11:07