Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Use a V2 API to evolve interfaces #3920

Closed
MadVikingGod opened this issue Mar 22, 2023 · 13 comments · Fixed by #3968
Closed

Proposal: Use a V2 API to evolve interfaces #3920

MadVikingGod opened this issue Mar 22, 2023 · 13 comments · Fixed by #3968
Assignees
Labels
enhancement New feature or request

Comments

@MadVikingGod
Copy link
Contributor

MadVikingGod commented Mar 22, 2023

This is a potential solution to #3919

Cutting a new version is a potential solution to adding methods to interfaces. This issue is a collection of requirements for the PoC, and what questions this PoC should answer.

Requirements:

  • The new API must be at a different URL than the current one. It should be acceptable if this is done in a private repo.
  • The new API must be backward compatible with the old. This means that you can't change the signature of a method.
  • The SDK must be compatible with both APIs
  • The SDK must not take a V2 approach. If this is done in a seperate repo, the SDK should use the equalivent URL, eg. /sdk/trace
  • There should be 1 new API on the span, and it should have a measureable effect. It doesn't have to be AddLink, but we need to know if it was processed by V1 or V2 API.

Questions:

  • How does a user actively upgrade from V1 of the API to V2?
  • If a user of V1, takes a dependency that produces V2 traces, what happens to the dependency spans?
  • If a user has a dependency with V1 traces, what happens to the dependency's spans?

cc @katiehockman

@MadVikingGod
Copy link
Contributor Author

First roadblock in this adventure. We can't use the same method names in the V2 of the API and have them be directly compatible with V1.

The problem is if we modify an interface, for example Span, everything that uses or returns that interface will be different.
Because Tracer is defined as:

type Tracer interface {
    Start(...) (ctx, Span)
}

The function on the type that implements tracer can only implement Span V1 or Span V2. This is true even if SpanV2 is a superset of SpanV1.

Example code: https://go.dev/play/p/wj9GfKoW3B0

This means for the SDK types to directly be usable as both a V1 and a V2, the V2 interfaces will have to use a different name. For example, TracerV2, StartV2, etc.

An alternative solution would be to not have direct implementation, but include a method that allows you to get a V1 tracer from V2. The V2 Tracer might look like:

// This is the V2 so Span is the V2 span
type Tracer interface {
    Start(...) (ctx, Span)
    V1() v1.Tracer
}

The goal would then be to have the returned v1.Tracer be a wrapper around the implemented tracer that has the V1 methods. A potential drawback is this might end up with a lot of copying from V1 datatypes to v2.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 23, 2023

Issue with global provider functions

Starting with the sample instrumented application:

package main

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/trace"
)

type Application struct {
	tracer trace.Tracer
}

func NewApplication() Application {
	const name = "Application"
	return Application{tracer: otel.Tracer(name)}
}

func main() {
	_ = NewApplication()
}

Upgrading to go.opentelemetry.io/otel/trace/v2:

package main

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/trace/v2"
)

type Application struct {
	tracer trace.Tracer
}

func NewApplication() Application {
	const name = "Application"
	return Application{tracer: otel.Tracer(name)}
}

func main() {
	_ = NewApplication()
}

Causes:

$ go build
[...]: cannot use otel.Tracer(name) (value of type "go.opentelemetry.io/otel/trace".Tracer) as "go.opentelemetry.io/otel/trace/v2".Tracer value in struct literal: "go.opentelemetry.io/otel/trace".Tracer does not implement "go.opentelemetry.io/otel/trace/v2".Tracer (wrong type for method Start)
		have Start(context.Context, string, ..."go.opentelemetry.io/otel/trace".SpanStartOption) (context.Context, "go.opentelemetry.io/otel/trace".Span)
		want Start(context.Context, string, ..."go.opentelemetry.io/otel/trace/v2".SpanStartOption) (context.Context, "go.opentelemetry.io/otel/trace/v2".Span)

@MrAlias
Copy link
Contributor

MrAlias commented Mar 23, 2023

Issue with instrumentation options accepting impelementations

Starting with the sample application:

package main

import (
	sdk "go.opentelemetry.io/otel/sdk/trace"
	"go.opentelemetry.io/otel/trace"
)

type Application struct {
	tracer trace.Tracer
}

func NewApplication(t trace.TracerProvider) Application {
	const name = "Application"
	return Application{tracer: t.Tracer(name)}
}

func main() {
	_ = NewApplication(sdk.NewTracerProvider())
}

And upgrading to go.opentelemetry.io/otel/trace/v2:

package main

import (
	sdk "go.opentelemetry.io/otel/sdk/trace"
	"go.opentelemetry.io/otel/trace/v2"
)

type Application struct {
	tracer trace.Tracer
}

func NewApplication(t trace.TracerProvider) Application {
	const name = "Application"
	return Application{tracer: t.Tracer(name)}
}

func main() {
	_ = NewApplication(sdk.NewTracerProvider())
}

Causes the following error:

$ go build
[...]: cannot use sdk.NewTracerProvider() (value of type *"go.opentelemetry.io/otel/sdk/trace".TracerProvider) as "go.opentelemetry.io/otel/trace/v2".TracerProvider value in argument to NewApplication: *"go.opentelemetry.io/otel/sdk/trace".TracerProvider does not implement "go.opentelemetry.io/otel/trace/v2".TracerProvider (wrong type for method Tracer)
		have Tracer(string, ..."go.opentelemetry.io/otel/trace".TracerOption) "go.opentelemetry.io/otel/trace".Tracer
		want Tracer(string, ..."go.opentelemetry.io/otel/trace/v2".TracerOption) "go.opentelemetry.io/otel/trace/v2".Tracer

@MrAlias
Copy link
Contributor

MrAlias commented Mar 23, 2023

First roadblock in this adventure. We can't use the same method names in the V2 of the API and have them be directly compatible with V1.
[...]

From SIG meeting: this looks like a non-starter. We cannot use this approach because of this.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 23, 2023

If we want to solve the global question by having each API manage their own global impelementation, there is an unresolved issue where an instrumentation library imports the v2 global and ask for a provider but the operator has only registered a v1 global. The global packages could be made to return a wrapped version of the v1 provider, but the unimplemented methods would either panic or be no-ops. Both behaviors being something the instrumentation author did not intend.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 23, 2023

An alternative solution would be to not have direct implementation, but include a method that allows you to get a V1 tracer from V2. The V2 Tracer might look like:

// This is the V2 so Span is the V2 span
type Tracer interface {
    Start(...) (ctx, Span)
    V1() v1.Tracer
}

The goal would then be to have the returned v1.Tracer be a wrapper around the implemented tracer that has the V1 methods. A potential drawback is this might end up with a lot of copying from V1 datatypes to v2.

Similarly, but instead of extending the interfaces, the v2 API package could also provide a "compatibility" function:

func TracerProviderV1(tp TracerProvider) trace.TracerProvider {
	return v1TracerProvider{tp}
}

type v1TracerProvider struct {
	TracerProvider
}

func (tp v1TracerProvider) Tracer(name string, options ...trace.TracerOption) trace.Tracer {
	return v1Tracer{tp.TracerProvider.Tracer(name, v2TracerOptions(options)...)}
}

func v2TracerOptions(opt []trace.TracerOption) []TracerOption {
	out := make([]TracerOption, 0, len(opt))
	// TODO: implement.
	return out
}

type v1Tracer struct {
	Tracer
}

func (t v1Tracer) Start(ctx context.Context, spanName string, opts ...trace.SpanStartOption) (context.Context, trace.Span) {
	ctx, s := t.Tracer.Start(ctx, spanName, v2SpanStartOptions(opts)...)
	return ctx, v1Span{s}
}

func v2SpanStartOptions(opt []trace.SpanStartOption) []SpanStartOption {
	out := make([]SpanStartOption, 0, len(opt))
	// TODO: implement.
	return out
}

type v1Span struct {
	Span
}

func (s v1Span) End(options ...trace.SpanEndOption)
func (s v1Span) AddEvent(name string, options ...trace.EventOption)
func (s v1Span) IsRecording() bool
func (s v1Span) RecordError(err error, options ...trace.EventOption)
func (s v1Span) SpanContext() trace.SpanContext
func (s v1Span) SetStatus(code codes.Code, description string)
func (s v1Span) SetName(name string)
func (s v1Span) SetAttributes(kv ...attribute.KeyValue)
func (s v1Span) TracerProvider() trace.TracerProvider

This would mean the SDK would not simultaneously implement both API versions, but the v2 implementing version could still be used for the v1 with this conversion.

This still doesn't solve the issue with a v2 API being provided a v1 only implementing SDK.

@tedsuo
Copy link

tedsuo commented Mar 23, 2023

I don't think you have to solve the problem of a v2 API being handed a v1 implementation? In OTel we require you to upgrade your SDK.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 23, 2023

func TracerProviderV1(tp TracerProvider) trace.TracerProvider {
	return v1TracerProvider{tp}
}

type v1TracerProvider struct {
	TracerProvider
}

func (tp v1TracerProvider) Tracer(name string, options ...trace.TracerOption) trace.Tracer {
	return v1Tracer{tp.TracerProvider.Tracer(name, v2TracerOptions(options)...)}
}

func v2TracerOptions(opt []trace.TracerOption) []TracerOption {
	out := make([]TracerOption, 0, len(opt))
	// TODO: implement.
	return out
}

type v1Tracer struct {
	Tracer
}

func (t v1Tracer) Start(ctx context.Context, spanName string, opts ...trace.SpanStartOption) (context.Context, trace.Span) {
	ctx, s := t.Tracer.Start(ctx, spanName, v2SpanStartOptions(opts)...)
	return ctx, v1Span{s}
}

func v2SpanStartOptions(opt []trace.SpanStartOption) []SpanStartOption {
	out := make([]SpanStartOption, 0, len(opt))
	// TODO: implement.
	return out
}

type v1Span struct {
	Span
}

func (s v1Span) End(options ...trace.SpanEndOption)
func (s v1Span) AddEvent(name string, options ...trace.EventOption)
func (s v1Span) IsRecording() bool
func (s v1Span) RecordError(err error, options ...trace.EventOption)
func (s v1Span) SpanContext() trace.SpanContext
func (s v1Span) SetStatus(code codes.Code, description string)
func (s v1Span) SetName(name string)
func (s v1Span) SetAttributes(kv ...attribute.KeyValue)
func (s v1Span) TracerProvider() trace.TracerProvider

This would mean the SDK would not simultaneously implement both API versions, but the v2 implementing version could still be used for the v1 with this conversion.

I don't think that's going to work.

User's code that start with a v1 implementing SDK used with v1 API instrumentation would compile. They would upgrade to the latest version of the v1 API and the SDK would update to implementing the v2 API. This means by upgrading their SDK it would no longer support their instrumentation because it doesn't implement the v1 API anymore. They would need to go through some modification of their code to effectively "downgrade" their SDK to support an older version of the OTel API.

This does not comply with the OTel mission statement:

It MUST always be possible to upgrade to the latest minor version of the OpenTelemetry SDK, without creating compilation or runtime errors.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 24, 2023

We could also offer a v2 creation function in sdk/trace:

func NewTracerProviderV2(/*...*/) tracev2.TracerProvider { /*...*/ }

The returned TracerProvider would implement the v2 API and could use some "downgrade" conversion function (mentioned above) to also be registered with the v1 API.

The original NewTracerProvider would not be modified, it would continue to return a SDK implementing the v1 API. This mean existing code would continue to compile.

This has the following downsides:

  1. There would be two TracerProvider creation functions
    • The original NewTracerProvider could be deprecated in favor or ConvertV2ToV1(NewTracerProviderV2(...))
  2. While the upgrade process will not be broken in the sense that package versions can be incremented, the actual upgrade to support v2 will require application operators to take a positive action of switching to NewTracerProviderV2.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 24, 2023

The span context storage in a context.Context is also going to be an issue:

Currently the active span is stored in a context:

// ContextWithSpan returns a copy of parent with span set as the current Span.
func ContextWithSpan(parent context.Context, span Span) context.Context {
return context.WithValue(parent, currentSpanKey, span)
}

If a v2 API branched, it would have its own ContextWithSpan that is not compatible with the v1 API. You cannot store a v2 span using the v1 API.

This means a trace from instrumentation that uses the v2 API to instrumentation that uses the v1 API and vice versa is going to have incompatibilities.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 24, 2023

We could also offer a v2 creation function in sdk/trace:

func NewTracerProviderV2(/*...*/) tracev2.TracerProvider { /*...*/ }

The returned TracerProvider would implement the v2 API and could use some "downgrade" conversion function (mentioned above) to also be registered with the v1 API.

The original NewTracerProvider would not be modified, it would continue to return a SDK implementing the v1 API. This mean existing code would continue to compile.

This has the following downsides:

  1. There would be two TracerProvider creation functions

    • The original NewTracerProvider could be deprecated in favor or ConvertV2ToV1(NewTracerProviderV2(...))
  2. While the upgrade process will not be broken in the sense that package versions can be incremented, the actual upgrade to support v2 will require application operators to take a positive action of switching to NewTracerProviderV2.

This approach means that we would effectively be splitting the SDK and hiding the details within the same package. This doesn't truly address the original requirement that one SDK would implement the v1 and v2 APIs.

It means one SDK package, with two hidden implementation, could possibly1 be used to implement both APIs with the help of conversion functions

Footnotes

  1. there are bound to be incompatibilities still: https://github.com/open-telemetry/opentelemetry-go/issues/3920#issuecomment-1483371275

@MadVikingGod
Copy link
Contributor Author

Sorry I've been in the airport most of today. Can we sync on Monday and summarize where each of us stands on the different approaches? See if we can come to an agreement on which approach we want to use.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 30, 2023

The plan (from the SIG meeting today) is to summarize our decision and close this proposal as "declined".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging a pull request may close this issue.

3 participants