Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PoC] Change Processor.OnEmit to accept a record pointer #5469

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

pellared
Copy link
Member

@pellared pellared commented Jun 3, 2024

Fixes #5219

Related to open-telemetry/opentelemetry-specification#4065

Alternative proposal: #5470

Changes Processor.OnEmit to:

OnEmit(ctx context.Context, record *Record) error

Pros:

  • compliant with the specification
  • design similar to span processors

Cons:

  • worse performance (see benchmark results below), not possible to have zero-allocations
  • design different from slog
  • less concurrent safe (see comments below)

Processor benchmark results:

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                    │   old.txt    │                new.txt                │
                                    │    sec/op    │    sec/op      vs base                │
BatchProcessorOnEmit-16               369.9n ± 13%    397.8n ± 12%        ~ (p=0.063 n=10)
Processor/Simple-16                   807.7n ± 13%   1096.5n ± 14%  +35.76% (p=0.000 n=10)
Processor/Batch-16                    1.225µ ±  7%    1.637µ ± 12%  +33.69% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16    744.3n ±  7%   1041.5n ± 13%  +39.94% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     1.219µ ±  6%    1.535µ ±  4%  +25.93% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   827.8n ±  3%   1127.5n ± 10%  +36.20% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    1.292µ ± 13%    1.599µ ±  4%  +23.68% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16       400.4n ±  9%    396.0n ±  9%        ~ (p=0.813 n=10)
LoggerNewRecord/10_attributes-16      1.907µ ± 12%    1.737µ ±  6%   -8.91% (p=0.002 n=10)
SimpleProcessorOnEmit-16              251.4n ± 23%    218.5n ±  7%  -13.07% (p=0.001 n=10)
geomean                               764.4n          889.2n        +16.32%

                        │     old.txt     │                new.txt                │
                        │       B/s       │      B/s       vs base                │
BatchProcessorOnEmit-16   1093.06Mi ± 11%   76.74Mi ± 11%  -92.98% (p=0.000 n=10)

                                    │   old.txt    │                new.txt                 │
                                    │     B/op     │    B/op      vs base                   │
BatchProcessorOnEmit-16               0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/Simple-16                   417.0 ± 0%      834.0 ± 0%  +100.00% (p=0.000 n=10)
Processor/Batch-16                    627.0 ± 3%     1039.0 ± 2%   +65.71% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16    417.0 ± 0%      834.0 ± 0%  +100.00% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     630.0 ± 2%     1048.0 ± 1%   +66.35% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   465.0 ± 0%      882.0 ± 0%   +89.68% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    649.5 ± 2%     1070.0 ± 2%   +64.74% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16       0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16      610.0 ± 0%      610.0 ± 0%         ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16              416.0 ± 0%      416.0 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                          ²                 +42.47%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │   old.txt    │                new.txt                │
                                    │  allocs/op   │ allocs/op   vs base                   │
BatchProcessorOnEmit-16               0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/Simple-16                   1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/Batch-16                    0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)
Processor/ModifyTimestampSimple-16    1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   2.000 ± 0%     3.000 ± 0%   +50.00% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16       0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16      4.000 ± 0%     4.000 ± 0%         ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16              1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                          ²               ?                       ²
¹ all samples are equal
² summaries must be >0 to compute geomean

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                    │   old.txt    │               new.txt               │
                                    │    sec/op    │   sec/op     vs base                │
Processor/Simple-16                   536.0n ± 10%   751.2n ± 9%  +40.14% (p=0.000 n=10)
Processor/Batch-16                    1.068µ ±  3%   1.296µ ± 5%  +21.35% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16    557.8n ±  5%   827.2n ± 8%  +48.31% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     1.028µ ±  2%   1.318µ ± 6%  +28.22% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   615.2n ±  2%   887.1n ± 3%  +44.19% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    1.054µ ±  3%   1.406µ ± 6%  +33.33% (p=0.000 n=10)
geomean                               772.7n         1.048µ       +35.60%

                                    │  old.txt   │               new.txt                │
                                    │    B/op    │    B/op      vs base                 │
Processor/Simple-16                   416.0 ± 0%    834.0 ± 0%  +100.48% (p=0.000 n=10)
Processor/Batch-16                    620.5 ± 2%   1049.5 ± 1%   +69.14% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16    417.0 ± 0%    834.0 ± 0%  +100.00% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     617.5 ± 2%   1047.5 ± 1%   +69.64% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   465.0 ± 0%    882.0 ± 0%   +89.68% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    642.0 ± 2%   1085.0 ± 1%   +69.00% (p=0.000 n=10)
geomean                               520.3         949.3        +82.44%

                                    │   old.txt    │               new.txt               │
                                    │  allocs/op   │ allocs/op   vs base                 │
Processor/Simple-16                   1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/Batch-16                    0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)
Processor/ModifyTimestampSimple-16    1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   2.000 ± 0%     3.000 ± 0%   +50.00% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
geomean                                          ¹   1.698       ?
¹ summaries must be >0 to compute geomean
Copy link

codecov bot commented Jun 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.6%. Comparing base (4c3ecab) to head (d72d990).
Report is 2 commits behind head on main.

Current head d72d990 differs from pull request most recent head ac568ef

Please upload reports for the commit ac568ef to get more accurate results.

Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #5469   +/-   ##
=====================================
  Coverage   84.6%   84.6%           
=====================================
  Files        267     267           
  Lines      17825   17825           
=====================================
+ Hits       15084   15086    +2     
+ Misses      2428    2426    -2     
  Partials     313     313           
Files Coverage Δ
sdk/log/batch.go 100.0% <100.0%> (ø)
sdk/log/logger.go 100.0% <100.0%> (ø)
sdk/log/simple.go 100.0% <100.0%> (ø)

... and 1 file with indirect coverage changes

@pellared pellared changed the title [WIP] [PoC] Change Processor.OnEmit to accept a record pointer [PoC] Change Processor.OnEmit to accept a record pointer Jun 3, 2024
@pellared pellared changed the title [PoC] Change Processor.OnEmit to accept a record pointer ] Change Processor.OnEmit to accept a record pointer Jun 3, 2024
@pellared pellared changed the title ] Change Processor.OnEmit to accept a record pointer Change Processor.OnEmit to accept a record pointer Jun 3, 2024
Copy link
Contributor

@MrAlias MrAlias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Record type is not concurrent safe. This new pattern almost guarantees concurrent access to the same record. This seems like a design issue that needs to be addressed before this can be accepted.

@pellared
Copy link
Member Author

pellared commented Jun 3, 2024

This new pattern almost guarantees concurrent access to the same record.

Could you please elaborate on it?

I added a following comment for Processor.Emit:

Implementations must not modify the record after OnEmit returns.

Maybe I could change it to:

Implementations must not modify the record asynchronously as [Record] is not concurrent safe.

?

EDIT: I have updated the comment to:

Implementations may synchronously modify the record so that the changes are visible in the next registered processor.
Implementations must not modify the record asynchronously as [Record] is not concurrent safe.

Notice that the specification does not require log records to be concurrent safe.
Moreover, from open-telemetry/opentelemetry-specification#4065 (comment):

"A LogRecordProcessor may freely modify logRecord for the duration of the OnEmit call. If logRecord is needed after OnEmit returns (i.e. for asynchronous processing) only reads are permitted" (see spec). The intent of this line is clarify that log processors can only mutate log records synchronously, since if they hold onto the record and mutate asynchronously, the mutation may conflict with one from a subsequent processor. See the open-telemetry/opentelemetry-specification#2955 for more context, which was resolved by this statement.

The log records in other languages are not concurrent safe.

@pellared pellared requested a review from MrAlias June 3, 2024 16:01
@MrAlias
Copy link
Contributor

MrAlias commented Jun 3, 2024

I added a following comment for Processor.Emit:

Implementations must not modify the record after OnEmit returns.

Commenting something like this is not going to be a solution. There are going to be many user use cases that will violate this and having a comment is not an adequate design choice. The point of passing a reference is use the side-effects of modifying the passed record.

Additional to users use-case, we by default provide a batch processor. What happens when an processor modifies a record it was passed to a batch processor?

@pellared
Copy link
Member Author

pellared commented Jun 3, 2024

Additional to users use-case, we by default provide a batch processor. What happens when an processor modifies a record it was passed to a batch processor?

That is a very good use case.

I think the solution could be adding a sync.Mutex to Record and then we would need to always use *Record instead of Record, right?

@MrAlias
Copy link
Contributor

MrAlias commented Jun 3, 2024

I think the solution could be adding a sync.Mutex to Record and then we would need to always use *Record instead of Record, right?

If we want to go with this design, then yeah that sounds correct.

This will come with the issue that the Record would have a copy-lock issue that no use could copy the record once it is used.

@pellared
Copy link
Member Author

pellared commented Jun 3, 2024

Additional to users use-case, we by default provide a batch processor. What happens when an processor modifies a record it was passed to a batch processor?

We could also create a Clone of the record in batch span processor so that the further processors will not modify the record.

I will create one PR which does that and a separate which adds a mutex to the log record.

@MrAlias
Copy link
Contributor

MrAlias commented Jun 3, 2024

Additional to users use-case, we by default provide a batch processor. What happens when an processor modifies a record it was passed to a batch processor?

We could also create a Clone of the record in batch span processor so that the further processors will not modify the record.

I will create one PR which does that and a separate which adds a mutex to the log record.

Like I was saying, that is our implementation. We do not control all the possible user implementations. We should provide a design that works easily and intuitively for those as well.

@pellared
Copy link
Member Author

pellared commented Jun 3, 2024

Commenting something like this is not going to be a solution. There are going to be many user use cases that will violate this and having a comment is not an adequate design choice.

I just want to notice that this is a problem that we already have as we have the following comments:

Before modifying a Record, the implementation must use Record.Clone to create a copy that shares no state with the original.

I think that only making the log record concurrent safe will get rid of this problem. I am afraid of the performance consequences of adding a sync.Mutex to each log record, but maybe these are misplaced fears.

@MrAlias
Copy link
Contributor

MrAlias commented Jun 3, 2024

I think that only making the log record concurrent safe will get rid of this problem. I am afraid of the performance consequences of adding a sync.Mutex to each log record, but maybe these are misplaced fears.

FWIW, adding the mutex has small overhead. It is just its use that will have computational performance impact.

@pellared pellared changed the title Change Processor.OnEmit to accept a record pointer [PoC] Change Processor.OnEmit to accept a record pointer Jun 4, 2024
@pellared pellared marked this pull request as draft June 4, 2024 13:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Log Records passed to log processor OnEmit is non-mutable
2 participants