Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SDK] Restore Recordable API compatibility with versions < 1.14.0 #2547

Merged
merged 6 commits into from
Feb 23, 2024

Conversation

dbolduc
Copy link
Contributor

@dbolduc dbolduc commented Feb 22, 2024

Issue:

Recordable::SetTraceFlags(...) was added in v1.14.0. This was a breaking change. Consumers of this library would have to define an override for this method. Also, the latest version of opentelemetry-cpp would not be API compatible with older versions of consumers (which could not know about this new, pure virtual method).

#2488 (comment)

Solution:

Provides a default implementation for Recordable::SetTraceFlags(...).

The default implementation is sufficient for W3C Trace Context version 1, which is actually used.

To later support W3C Trace Context version 2, exporters are encouraged to capture trace flags, by providing an implementation in Recordable sub classes.

Limitation:

Note that this PR does not restore ABI compatibility for older versions.

The ABI change is required to support opentelemetry-proto 1.1.0, used by the OTLP exporter.

@lalitb
Copy link
Member

lalitb commented Feb 22, 2024

Agree with the change. This got overlooked in the review.

@dbolduc dbolduc marked this pull request as ready for review February 22, 2024 20:02
@dbolduc dbolduc requested a review from a team February 22, 2024 20:02
Copy link
Member

@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change implemented in Recordable in #2488:

  • is an SDK API breaking change (new pure virtual method)
  • is an SDK ABI breaking change (new virtual method)
  • is indeed not documented as breaking change in the CHANGELOG.md, and release notes for 1.14.0

As the author of #2488, I failed to document this, which caused unexpected breaks.

Point taken.

Now, about the idea to provide a default no-op implementation for this method, I failed to understand why it is considered desirable:

  • The default no-op implementation is broken. Every exporter not recording trace flags (by overriding SetTraceFlags) will cause bugs with sampling downstream, and yet this is what opentelemetry-cpp would cause by default.

  • Providing a default implementation might make the code "compatible" (but broken, see above) at the API level (old exporters will still build with both 1.13.0 and 1.14.0), but it resolves nothing at the ABI level:

    • An exporter compiled with 1.13.0 and linked (dynamically) with 1.14.0 will crash
    • An exporter compiled with 1.14.0 and linked (dynamically) with 1.13.0 will crash

@marcalff
Copy link
Member

See related #2549

@ThomsonTan
Copy link
Contributor

The change implemented in Recordable in #2488:

  • is an SDK API breaking change (new pure virtual method)
  • is an SDK ABI breaking change (new virtual method)
  • is indeed not documented as breaking change in the CHANGELOG.md, and release notes for 1.14.0

As the author of #2488, I failed to document this, which caused unexpected breaks.

Point taken.

Now, about the idea to provide a default no-op implementation for this method, I failed to understand why it is considered desirable:

  • The default no-op implementation is broken. Every exporter not recording trace flags (by overriding SetTraceFlags) will cause bugs with sampling downstream, and yet this is what opentelemetry-cpp would cause by default.

  • Providing a default implementation might make the code "compatible" (but broken, see above) at the API level (old exporters will still build with both 1.13.0 and 1.14.0), but it resolves nothing at the ABI level:

    • An exporter compiled with 1.13.0 and linked (dynamically) with 1.14.0 will crash
    • An exporter compiled with 1.14.0 and linked (dynamically) with 1.13.0 will crash

An exporter compiled with 1.13.0 but not the same exporter will not compile with 1.4.0 because no default implementation for the new virtual method. The exporter will need to add its implementation to pass compilation.

As 1.14.0 is already broken, this fix could make it into 1.40.1 as patch release instead of the next minor version update 1.15.0.

@dbolduc
Copy link
Contributor Author

dbolduc commented Feb 23, 2024

indeed not documented

No worries. My problem is more with the API break (which caused me to do work 😵‍💫), not a call-out in the release notes.

Now, about the idea to provide a default no-op implementation for this method, I failed to understand why it is considered desirable

My understanding is that providing a default no-op implementation does not change any behavior that existed in v1.13.0, right? So the PR can't make any current problems worse. (Please correct me if I am wrong.)

So what do we gain? I think some exporters do not care about trace flags1. If I understand correctly, these are not broken in v1.13.0. e.g. zipkin seems not to, and I don't think Cloud Trace does either. These exporters take an unnecessary API break in v1.14.0.

I think you are using the API breaking change to alert exporters to check whether they need to heed the trace flags. Are you sure you would rather

A. introduce an API break that forces exporters to update, whether they use trace flags or not

over...

B. provide a default implementation that might go unnoticed by exporters that should use the trace flags

If you still think option A, that is fine. Feel free to close this PR.

but it resolves nothing at the ABI level

Ack. Note that some consumers of opentelemetry-cpp only care about API compatibility, though.

Footnotes

  1. It is possible I don't understand how trace flags are supposed to be used. My assumption is that they are sent to the tracing backends. I could easily be wrong. Are the exporters supposed to use the IsSampled flag to decide whether to export given spans?

@marcalff
Copy link
Member

marcalff commented Feb 23, 2024

@dbolduc

Thanks for the details.

This is what I suggest then:

  1. Merge [DOC] Add missing 1.14.0 CHANGELOG #2549 to fix the release notes, it is better anyway.

  2. Merge this PR to provide a default no-op instrumentation.

It is correct that some exporters will never care about trace flags (like zipkin).

My first concern is with:

B. provide a default implementation that might go unnoticed by exporters that should use the trace flags

but it can be mitigated by other means (documentation) instead of forcing a decision with a build break (pure virtual method).

Ok to have 1.14.1 as well.

My second concern is with binary compatibility.

With or without a no op implementation, mixing binaries pre and post 1.14.0 will lead to crashes, so the claim that "Recordable is backward compatible" should be clarified and put into context, because there is still an ABI change that can not be ignored.

@dbolduc dbolduc changed the title fix: Recordable backwards compat for versions < 1.14.0 fix: restore Recordable API compatibility with versions < 1.14.0 Feb 23, 2024
@dbolduc
Copy link
Contributor Author

dbolduc commented Feb 23, 2024

This is what I suggest then...

Thanks for working with me on this.

The claim that "Recordable is backward compatible" should be clarified and put into context

I updated the PR title and description to try to capture the nuance. (sorry, I only ever think about API compatibility). Feel free to make edits.

@marcalff marcalff changed the title fix: restore Recordable API compatibility with versions < 1.14.0 [SDL] Restore Recordable API compatibility with versions < 1.14.0 Feb 23, 2024
@marcalff marcalff changed the title [SDL] Restore Recordable API compatibility with versions < 1.14.0 [SDK] Restore Recordable API compatibility with versions < 1.14.0 Feb 23, 2024
@marcalff
Copy link
Member

@dbolduc

Done with edits, please confirm you are ok with the rewording.

@dbolduc
Copy link
Contributor Author

dbolduc commented Feb 23, 2024

LGTM, thanks!

@marcalff marcalff merged commit 75f34e6 into open-telemetry:main Feb 23, 2024
47 checks passed
@dbolduc dbolduc deleted the recordable-back-compat branch February 23, 2024 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants