Make OTLP exporter memory mode API public #6469

Merged
merged 1 commit into from
May 30, 2024

Conversation

jack-berg
Member

The newly added memory mode setting works without a hitch as far as I can tell.

I talk about its benefits in a new blog post on opentelemetry.io called "Java Metric Systems Compared", currently pending review in PR open-telemetry/opentelemetry.io#4512.

Some considerations for making the API public:

  • I think we ought to make REUSABLE_DATA the default memory mode in the future. It currently defaults to IMMUTABLE_DATA, but we should be allowed to change the default.
  • If the default is eventually REUSABLE_DATA, then maybe it shouldn't be configurable at all? However, the memory mode impacts the behavior of the SDK for metrics (not the case for traces or logs): under REUSABLE_DATA it's not safe to wrap the OTLP metric exporters and hold onto MetricData after MetricExporter#export(Collection<MetricData>) returns. Although this type of thing is an edge case, it's still valid. Allowing memory mode to be configured as IMMUTABLE_DATA acts as an escape hatch. Logs and traces don't have the same use case, but maybe future JVMs will get so good at escape analysis that immutable data ends up outperforming data structure reuse. It seems harmless to make memory mode configurable for traces and logs.
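
The hazard in the second bullet can be sketched without the SDK. This is a minimal illustration, assuming hypothetical stand-in types: `Point`, `export`, and `retained` are invented for the example and are not the SDK's `MetricData` or `MetricExporter`. It shows what a wrapping exporter observes if it keeps references after export returns while the producer reuses one mutable object.

```java
import java.util.ArrayList;
import java.util.List;

final class ReuseHazardDemo {
  // Hypothetical stand-in for a metric point; NOT the SDK's MetricData.
  static final class Point {
    long value;
  }

  // Simulates a wrapping exporter that keeps references after export returns.
  static final List<Point> retained = new ArrayList<>();

  static void export(List<Point> batch) {
    retained.addAll(batch); // unsafe if the producer reuses these objects
  }

  // Two collections that reuse one mutable point (REUSABLE_DATA-like behavior).
  static long firstRetainedValueWithReuse() {
    retained.clear();
    Point shared = new Point();
    shared.value = 1;
    export(List.of(shared));
    shared.value = 2; // the next collection overwrites the same instance
    export(List.of(shared));
    return retained.get(0).value; // the value from the first export is gone
  }

  // Two collections that allocate a fresh point each time (IMMUTABLE_DATA-like behavior).
  static long firstRetainedValueWithFreshObjects() {
    retained.clear();
    for (long v = 1; v <= 2; v++) {
      Point p = new Point();
      p.value = v;
      export(List.of(p));
    }
    return retained.get(0).value; // the first export's value is preserved
  }

  public static void main(String[] args) {
    System.out.println(firstRetainedValueWithReuse()); // prints 2
    System.out.println(firstRetainedValueWithFreshObjects()); // prints 1
  }
}
```

A wrapper that needs the data after export returns would have to copy it first, or the memory mode would have to be set to IMMUTABLE_DATA, which is the escape hatch described above.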

@jack-berg jack-berg requested a review from a team May 21, 2024 20:22

codecov bot commented May 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.85%. Comparing base (c71c4d9) to head (b923351).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6469      +/-   ##
============================================
- Coverage     90.86%   90.85%   -0.01%     
+ Complexity     6169     6154      -15     
============================================
  Files           678      675       -3     
  Lines         18507    18454      -53     
  Branches       1818     1813       -5     
============================================
- Hits          16816    16766      -50     
+ Misses         1154     1151       -3     
  Partials        537      537              

Contributor

@jkwatson jkwatson left a comment

thanks!

* <p>>When memory mode is {@link MemoryMode#REUSABLE_DATA}, serialization is optimized to reduce
* memory allocation.
*/
public OtlpHttpLogRecordExporterBuilder setMemoryMode(MemoryMode memoryMode) {
Contributor

Do I understand correctly that this is part of public api of a stable module and once we have added this method we can't easily get rid of it? If so then perhaps it would be better to not tie this to implementation details like immutable or reusable data, but rather something abstract like minimize allocations and maximize throughput (I guess this would only make sense if the one that allocates more has a bit better performance). That way the behavior of this method could more easily accommodate future changes which could, for example, include deleting one of the memory modes.

Member Author

> Do I understand correctly that this is part of public api of a stable module and once we have added this method we can't easily get rid of it?

We can't get rid of it without a major version bump, which we don't plan on doing anytime soon.

> perhaps it would be better to not tie this to implementation details like immutable or reusable data, but rather something abstract like minimize allocations and maximize throughput (I guess this would only make sense if the one that allocates more has a bit better performance). That way the behavior of this method could more easily accommodate future changes which could, for example, include deleting one of the memory modes.

The MemoryMode enum is also part of the stable API at this point, so we can't delete the memory modes. In hindsight it might have been preferable to choose a name like MemoryMode.LOW_ALLOCATION instead of MemoryMode.REUSABLE_DATA. LOW_ALLOCATION describes the intended outcome (more user facing), whereas REUSABLE_DATA describes what is happening under the covers. The flip side of this is that REUSABLE_DATA communicates some important information to MetricReader / MetricExporter implementations: we're going to reuse MetricData classes so they won't function right after the CompletableResultCode from MetricExporter.export() resolves. So there are benefits both to naming for the outcome and to naming for how the outcome is accomplished.

REUSABLE_DATA doesn't describe what's happening with the serializers as well as it describes what's happening with the metrics SDK, where it was originally introduced, but it's still somewhat accurate. One thing that sticks out is that with the metrics SDK, implementers of MetricReader / MetricExporter have to be aware of the semantics of REUSABLE_DATA. With serializers, there's no impact on user semantics. The only indication a user can see that something has changed is a shift in CPU / memory behavior.

Overall, I think it's preferable to have one memory mode configuration concept, even if the words we chose for that concept (IMMUTABLE_DATA and REUSABLE_DATA) don't perfectly describe all the places we make that configurable.
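
As a rough illustration of the serializer point above (reuse changes allocation behavior but not what the user observes), here is a minimal sketch. It is not the exporter's actual serializer: `SerializerReuseDemo`, the 8-byte encoding, and both method names are assumptions made up for the example. One path allocates a new buffer per call, the other reuses a scratch buffer, and both emit identical bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

final class SerializerReuseDemo {
  // Allocating path: a new buffer for every value (IMMUTABLE_DATA-like).
  static byte[] serializeFresh(long value) {
    ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
    buf.putLong(value);
    return buf.array();
  }

  // Scratch buffer reused across calls (REUSABLE_DATA-like).
  private final ByteBuffer scratch = ByteBuffer.allocate(Long.BYTES);

  // Reusing path: overwrite the scratch buffer, then copy its contents to the sink.
  void serializeReusing(long value, ByteArrayOutputStream sink) {
    scratch.clear();
    scratch.putLong(value);
    sink.write(scratch.array(), 0, scratch.position());
  }

  // Returns true when both paths produce byte-for-byte identical output.
  static boolean sameBytes(long... values) {
    SerializerReuseDemo reusing = new SerializerReuseDemo();
    ByteArrayOutputStream fresh = new ByteArrayOutputStream();
    ByteArrayOutputStream reused = new ByteArrayOutputStream();
    for (long v : values) {
      byte[] bytes = serializeFresh(v);
      fresh.write(bytes, 0, bytes.length);
      reusing.serializeReusing(v, reused);
    }
    return Arrays.equals(fresh.toByteArray(), reused.toByteArray());
  }

  public static void main(String[] args) {
    System.out.println(sameBytes(1L, 42L, -7L)); // prints true
  }
}
```

The caller only ever sees the bytes written to the sink, so the choice of path is invisible apart from its allocation profile.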

Member

> The flip side of this is that REUSABLE_DATA communicates some important information to MetricReader / MetricExporter implementations: we're going to reuse MetricData classes so they won't function right after the CompletableResultCode from MetricExporter.export() resolves

👍

@jack-berg jack-berg merged commit f579061 into open-telemetry:main May 30, 2024
18 checks passed
@jkwatson jkwatson mentioned this pull request Jun 14, 2024