[Logs SDK] Modify LogRecord to use Vector instead of OrderMap for attributes #1142

lalitb · 2023-07-05T23:04:05Z

Changes

This PR further validates the benchmark introduced in #1121 by changing LogRecord to store attributes in a Vec instead of an OrderMap/IndexMap. The benchmark emits a tokio-tracing log to the user_events exporter.
As an example, to emit an error log with the attributes below:

error!(
    event_name = "my-event-name",
    event_id = 20,
    user_name = "otel user",
    user_email = "otel@opentelemetry.io",
    login_success = false
)

With OrderMap:
It takes ~1000 ns to emit a single event.
With Vector:
It takes ~700 ns to emit a single event.

The user_events tracepoint was disabled, so no actual export was happening. This means most of the above overhead was coming from the logs SDK (and not the exporter).

The user_events exporter iterates over the attributes to serialize them, and does not need to look up any particular key.
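To illustrate the iterate-only access pattern described above, here is a minimal sketch of serializing attributes held in a Vec with a single linear pass. The `Value` enum, the `serialize_attributes` function, and the `key=value;` output format are hypothetical stand-ins for illustration; they are not the SDK's `AnyValue` type or the user_events exporter's actual encoding.

```rust
// Hypothetical stand-in for the SDK's attribute value type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Bool(bool),
}

// Serialize attributes in insertion order with one linear pass.
// No key lookup happens here, so a plain slice of pairs is sufficient.
fn serialize_attributes(attrs: &[(String, Value)]) -> String {
    let mut out = String::new();
    for (i, (key, value)) in attrs.iter().enumerate() {
        if i > 0 {
            out.push(';');
        }
        out.push_str(key);
        out.push('=');
        match value {
            Value::Str(s) => out.push_str(s),
            Value::Int(n) => out.push_str(&n.to_string()),
            Value::Bool(b) => out.push_str(&b.to_string()),
        }
    }
    out
}
```

Because the function only walks the slice, the choice between a map and a vector affects it only through construction and iteration cost, not through lookup cost.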

These changes come at the following cost:

If the number of attributes is large and an exporter needs to look up a particular attribute directly, the operation is faster with OrderMap than with Vector. I feel this use case is somewhat rare.
Duplicate key detection needs to be done by the application or instrumentation library before writing attributes.
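For the first cost above, an exporter that does need a specific attribute would fall back to a linear scan. The sketch below assumes attributes are stored as `(String, V)` pairs; `find_attribute` is a hypothetical helper, not part of the SDK.

```rust
// Linear-time lookup over a slice of (key, value) pairs.
// Returns the first value whose key matches, or None if the key is absent.
fn find_attribute<'a, V>(attrs: &'a [(String, V)], key: &str) -> Option<&'a V> {
    attrs
        .iter()
        .find(|(k, _)| k.as_str() == key)
        .map(|(_, v)| v)
}
```

For a handful of attributes the scan touches only a few contiguous entries, which is why the O(n) cost is expected to matter only for large attribute sets.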

This PR is raised to discuss whether these changes would be good for the log signal, or whether there is a better suggestion. Another option I considered is to have the SDK support both data structures and let the user select which one to use. Something like this (not tested):

use std::collections::HashMap;
use std::hash::Hash;

enum AttributeData<K, V> {
    HashStore(HashMap<K, V>),
    VectorStore(Vec<(K, V)>),
}

impl<K: Eq + Hash, V> AttributeData<K, V> {
    // Yield borrowed (key, value) pairs regardless of the backing store.
    fn iterator(&self) -> Box<dyn Iterator<Item = (&K, &V)> + '_> {
        match self {
            AttributeData::HashStore(index_map) => Box::new(index_map.iter()),
            AttributeData::VectorStore(vector) => {
                Box::new(vector.iter().map(|(k, v)| (k, v)))
            }
        }
    }

    // A map replaces the value of an existing key; a vector always appends.
    fn insert(&mut self, key: K, value: V) {
        match self {
            AttributeData::HashStore(index_map) => {
                index_map.insert(key, value);
            }
            AttributeData::VectorStore(vector) => {
                vector.push((key, value));
            }
        }
    }
}

pub struct LogRecord {
    // existing fields ..

    /// Additional attributes associated with this record
    pub attributes: Option<AttributeData<Key, AnyValue>>,
}

Merge requirement checklist

CONTRIBUTING guidelines followed
Unit tests added/updated (if applicable)
Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
Changes in public API reviewed (if applicable)

djc

Why was an OrderMap chosen in the first place? Do we do any lookups/deduplication on this data?

shaun-cox · 2023-07-06T13:57:36Z

Why was an OrderMap chosen in the first place? Do we do any lookups/deduplication on this data?

I think it was a result of #794

lalitb · 2023-07-06T21:16:06Z

Do we do any lookups/deduplication on this data?

There are no lookups in the logs SDK. Lookups could be part of external exporters, but there shouldn't be a significant difference in cost, O(n) vs O(1), for a small number of attributes.
With this change, deduplication becomes the responsibility of the instrumented library/application, which needs to ensure that all attribute keys are unique.
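As a rough sketch of what application-side deduplication could look like before attributes are handed to the SDK, the helper below keeps one entry per key. The name `dedup_last_wins` and the policy (a later value overwrites an earlier one, while the key keeps its first position) are assumptions made for illustration; the PR does not prescribe a policy.

```rust
use std::collections::HashMap;

// Keep one entry per key. A later value overwrites an earlier one,
// and the key stays at the position of its first occurrence.
fn dedup_last_wins(attrs: Vec<(String, String)>) -> Vec<(String, String)> {
    let mut index_of: HashMap<String, usize> = HashMap::new();
    let mut out: Vec<(String, String)> = Vec::with_capacity(attrs.len());
    for (key, value) in attrs {
        match index_of.get(&key).copied() {
            // Key seen before: overwrite the stored value in place.
            Some(i) => out[i].1 = value,
            // New key: remember its position and append it.
            None => {
                index_of.insert(key.clone(), out.len());
                out.push((key, value));
            }
        }
    }
    out
}
```

This pays the hashing cost once, at the call site that may produce duplicates, rather than on every record the SDK handles.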

codecov · 2023-07-10T03:51:09Z

Codecov Report

Patch coverage: 28.5% and project coverage change: -0.6 ⚠️

Comparison is base (3209577) 49.8% compared to head (6093b5b) 49.3%.

❗ Current head 6093b5b differs from pull request most recent head 5554e19. Consider uploading reports for the commit 5554e19 to get more accurate results

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #1142     +/-   ##
=======================================
- Coverage   49.8%   49.3%   -0.6%     
=======================================
  Files        171     175      +4     
  Lines      20171   20464    +293     
=======================================
+ Hits       10061   10101     +40     
- Misses     10110   10363    +253

Impacted Files	Coverage Δ
opentelemetry-api/src/global/trace.rs	`30.6% <0.0%> (+2.6%)`	⬆️
opentelemetry-api/src/logs/record.rs	`0.0% <0.0%> (ø)`
opentelemetry-api/src/trace/noop.rs	`54.8% <0.0%> (+2.5%)`	⬆️
opentelemetry-api/src/trace/tracer_provider.rs	`42.1% <0.0%> (-57.9%)`	⬇️
opentelemetry-appender-tracing/src/layer.rs	`0.0% <0.0%> (ø)`
...elemetry-contrib/src/trace/exporter/jaeger_json.rs	`0.0% <ø> (ø)`
opentelemetry-jaeger/src/exporter/config/agent.rs	`32.6% <0.0%> (ø)`
opentelemetry-jaeger/src/exporter/mod.rs	`57.5% <ø> (ø)`
opentelemetry-jaeger/src/exporter/uploader.rs	`18.1% <ø> (ø)`
opentelemetry-otlp/src/metric.rs	`0.0% <0.0%> (ø)`
... and 9 more

... and 12 files with indirect coverage changes

opentelemetry-api/src/logs/record.rs

use vector for attributes

f72fb9e

lalitb requested a review from a team July 5, 2023 23:04

djc approved these changes Jul 6, 2023

shaun-cox approved these changes Jul 6, 2023

lint errors

edd61b1

TommyCpp reviewed Jul 10, 2023

Review comment on opentelemetry-api/src/logs/record.rs (resolved)

lalitb added 2 commits July 10, 2023 00:20

add comment

6093b5b

fix lint

5554e19

lalitb changed the title ~~To discuss - [Logs SDK] Modify LogRecord to use Vector instead of OrderMap for attributes~~ [Logs SDK] Modify LogRecord to use Vector instead of OrderMap for attributes Jul 10, 2023

TommyCpp merged commit 1f1a4fe into open-telemetry:main Jul 11, 2023

cijothomas mentioned this pull request Oct 3, 2023

SpanAttribute key deduplication #1284

Closed

cijothomas mentioned this pull request Oct 11, 2023

SpanAttributes modified to use Vec instead of OrderMap/EvictedHashMap #1293

Merged

cijothomas mentioned this pull request Nov 18, 2023

Improve cost of creating AttributeSets #1379

Merged

cijothomas mentioned this pull request Mar 12, 2024

logs: Allow duplicate keys open-telemetry/opentelemetry-specification#3931

Closed

MrAlias mentioned this pull request Mar 12, 2024

KeyValueList and log attributes with duplicate keys as undefined behavior open-telemetry/opentelemetry-proto#533

Closed

pellared mentioned this pull request Mar 13, 2024

Change map to key-value pair collection in Logs Data Model open-telemetry/opentelemetry-specification#3938

Closed
