Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is Name field necessary in Log Data Model #2074

Closed
tigrannajaryan opened this issue Oct 28, 2021 · 40 comments · Fixed by #2182 or #2271
Closed

Is Name field necessary in Log Data Model #2074

tigrannajaryan opened this issue Oct 28, 2021 · 40 comments · Fixed by #2182 or #2271
Assignees
Labels
release:required-logdatamodel-ga Required for declaring log data model stable spec:logs Related to the specification/logs directory

Comments

@tigrannajaryan
Copy link
Member

The log data model currently has a Name field.

It is not entirely clear why this field is needed.

Some of the example mappings provided use it this way:

  • Syslog - MSGID. Could be an Attribute["syslog.msgid"] instead to match how PROCID is handled.
  • SignalFx Events - EventType. Could be Attributes["com.splunk.signalfx.event_type"] to match how Category is handled.
  • CloudTrail Log Event - errorCode. Unclear where we put this. Attributes["cloudtrail.error_code"]?
  • Google Cloud Logging - log_name. This appears to be incorrectly designed, see Google Cloud Logging specification: log_name should not map to Name #1808

In most of these cases it appears that it could be placed in Attributes. The unclear one is CloudTrail Log Event and probably Google Cloud Logging.

Can we come up with clear justifications of why the Name field must exist and why for example Body is not enough?

@jack-berg
Copy link
Member

jack-berg commented Oct 28, 2021

If logs do end being the data type to represent events as well, then I would prefer having a name field as a differentiator.

New Relic supports the MELT data types: Metrics, Events, Logs, and Traces. The characteristics of traces and metrics are distinct enough that they obviously require data types in OpenTelemetry. Logs and Events are much closer. In New Relic's log data model, logs don't require any fields, but generally have a timestamp, a message, and attributes. In New Relic's event data model, events have a required eventType field, a timestamp, and a bag of attributes. Event's eventType field, and log's message field are really the most distinct difference between them. But behind the scenes logs and events are treated quite differently.

I've been working off the assumption that OpenTelemetry won't have an event data type, and that logs would cover both use cases. And so in order to allow our users to be able to use OpenTelemetry to ingest New Relic events, we could use the presence of the name field, which describes something close to the New Relic log eventType field, to dictate whether the OpenTelemetry log represents a New Relic log or a New Relic event.

Its possible to differentiate between New Relic logs and New Relic events with an attribute, but I like the idea of it being a top level field in order to make it more obvious: an opentelemetry log with a name field is treated as a new relic event; other logs are new relic logs.

@jkwatson
Copy link
Contributor

The name would also be quite useful for modelling for RUM events, which look a lot like logs, but would benefit from having a name as well. Is this better named type for these use-cases, @jack-berg ?

@jack-berg
Copy link
Member

I think type is better than name.

Good call out on the RUM events. I haven't been following those discussions closely, but is the group leaning towards using logs to represent stuff?

@jkwatson
Copy link
Contributor

I think type is better than name.

Good call out on the RUM events. I haven't been following those discussions closely, but is the group leaning towards using logs to represent stuff?

Yes, we are. Point-in-time events (which look pretty much identical to NR's events) are very commonly needed in RUM instrumentation, and logs seems to be a good match. The only thing that is missing is a notion of causality. In tracing, you can have span parenting, but in logs, you only have the loose association via attributes, and logs can't parent other logs (as they don't have ids).

@tigrannajaryan
Copy link
Member Author

In New Relic's log data model, logs don't require any fields, but generally have a timestamp, a message, and attributes. In New Relic's event data model, events have a required eventType field, a timestamp, and a bag of attributes.

@jack-berg it would be very useful if you could add the mapping from New Relic to Otel Log Data to example mappings. It would help with data model discussions. We need more examples to be able to better generalize the data model to cover real life use cases.

I've been working off the assumption that OpenTelemetry won't have an event data type, and that logs would cover both use cases.

Yes, definitely. This was discussed and accepted as the Otel's stance on log and events.

@tigrannajaryan
Copy link
Member Author

I think we need to decide on 2 things:

  1. Is Name or Type a well understood thing? The fact that we are not sure about how to call this does not add confidence that we know what it is meant to represent.
    To help answer this let me re-frame the question. If the Collector receives data from Syslog and sends to New Relic is it expected that Syslog's MSGID field will be populated in the New Relic's eventType field? Or if we receive CloudTrail Log Event and send to SignalFx should CloudTrail errorCode be copied to SignalFx EventType ? Because that's what will happen if we keep this a top-level field with the current mappings. Somehow, this doesn't feel quite right to me.

  2. Assuming it is a well-understood thing should it be a top-level field or it is sufficient to record it as an attribute. If it is rare (but still well understood) then we can define a semantic convention to record it in the attributes. This reasoning was primarily used in the data model design.

Note that not having a Name field does not prevent RUM for example to define what it believes is the "event type" and standardize it as a semantic convention. Similarly New Relic can do that for their events. Again, one of the important questions we need to answer is whether that's the same "event type", i.e. should RUM "event type" go into New Relic as the eventType field?

@jack-berg
Copy link
Member

@jack-berg it would be very useful if you could add the mapping from New Relic to Otel Log Data

Will do 👍 .

Is Name or Type a well understood thing? If the Collector receives data from Syslog and sends to New Relic is it expected that Syslog's MSGID field will be populated in the New Relic's eventType field?

I agree that sounds a bit strange. But is there good reason for the Syslog MSGID field mapping to the name field? Could it not just as easily go in attributes, eg Attributes["syslog.msgid"]?

Is Name or Type a well understood thing?

If name or type is going to be a top level field and be useful for New Relic's purposes, it should be defined as a low cardinality categorization of the log. I can imagine this type of concept would be useful for RUM events as well.

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Nov 3, 2021

On the Name usage - isn't this the place where logger name goes? E.g. value used in LogManager.getLogger(MyClassName.class)

@jkwatson
Copy link
Contributor

jkwatson commented Nov 3, 2021

On the Name usage - isn't this the place where logger name goes? E.g. value used in LogManager.getLogger(MyClassName.class)

I would think that would go into the Instrumentation Library 'name', rather than the individual log entry 'name.

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Nov 3, 2021

On the Name usage - isn't this the place where logger name goes? E.g. value used in LogManager.getLogger(MyClassName.class)

I would think that would go into the Instrumentation Library 'name', rather than the individual log entry 'name.

I wonder if that is not too fine granular though. Also, in certain cases, this includes more information, such as line number:

2021-11-03T18:27:34.964+0100	info	service/collector.go:174	Applying configuration...
2021-11-03T18:27:34.965+0100	info	builder/exporters_builder.go:259	Exporter was built.	{"kind": "exporter", "name": "otlphttp"}
2021-11-03T18:27:34.965+0100	info	memorylimiterprocessor/memorylimiter.go:106	Memory limiter configured	{"kind": "processor", "name": "memory_limiter", "limit_mib": 25000, "spike_limit_mib": 5000, "check_interval": 1}

I think the candidates for name (or Instrumentation Library name) could be service/collector.go:174 (the caller) and so on (BTW in Log Data Model the caller is listed as being stored in attribute with name TBD)

EDIT: I forgot about the idea of having LoggerName which would fit the above use case nicely

@tigrannajaryan tigrannajaryan added the release:required-logdatamodel-ga Required for declaring log data model stable label Nov 4, 2021
@scheler
Copy link
Contributor

scheler commented Nov 9, 2021

Note that not having a Name field does not prevent RUM for example to define what it believes is the "event type" and standardize it as a semantic convention. Similarly New Relic can do that for their events.

Events must have a mandatory name/type to make sense, so if we remove Name field we will be changing it from being mandatory to optional to now hidden. While we can make the Name field mandatory in the API (say: public void addEvent(String name, Attributes attributes) we will need further conventions and validations to ensure there is no other attribute named Name.

Again, one of the important questions we need to answer is whether that's the same "event type", i.e. should RUM "event type" go into New Relic as the eventType field?

I think, yes. @jack-berg could you confirm that the Name field can represent event type that maps to New Relic's eventType field?

@jack-berg
Copy link
Member

@jack-berg could you confirm that the Name field can represent event type that maps to New Relic's eventType field?

In the case of RUM event names, yes, since I believe RUM event names to be a low cardinality categorization.

@tigrannajaryan
Copy link
Member Author

EDIT: I forgot about the idea of having LoggerName which would fit the above use case nicely

Yes, that's a separate issue. Let's not conflate with this one.

@djaglowski
Copy link
Member

This was discussed in the Log SIG today. It was agreed that the field should remain. A clarification that cardinality should be limited will be proposed to the data model by @jack-berg. This issue is expected to be closed once that language is added.

tigrannajaryan pushed a commit that referenced this issue Dec 8, 2021
Clarifies that log name field is a low cardinality event classifier. 

Resolves #2074.
@jack-berg
Copy link
Member

@tigrannajaryan after further consideration with some folk at New Relic, I think we should re-open this issue. TL;DR is that I think we should actually remove name altogether.

I thought it would be useful to have name as a low cardinality event type. New Relic supports the notion of custom events, and the inclusion of name would be a signal that you want this log treated as a custom event. Part of the problem is that the definition we landed on isn't clear on how backends should interpret name's inclusion:

Short low cardinality event type that does not contain varyingparts. Name describes what happened (e.g. "ProcessStarted"). Recommended to be no longer than 50 characters. Typically used for filtering and grouping purposes in backends. This field is optional.

Looking at the example mappings in the appendix, Syslog sets name to be MSGID. Does New Relic or other vendors want to treat syslogs differently than other logs? Probably not. Same argument applies to windows event log, which maps EventId to name. Without clearly defining when exactly you should include name, it might be included randomly and backends keying off it might misinterpret user intent.

The other problem is that using the value of a field as a marker to backends seems to be inconsistent with conventions in this project. More often, OpenTelemetry uses the presence of an attribute key as a marker. Let's take the RUM events use case as an example. Would it be better to say that name=BrowserWindowLoad, or Attributes["browser.event"]=WindowLoad?

I think the attributes solution is stronger because it encodes information in a way that's easier to operate on: A backend can use the presence of an attribute called browser.event as a marker that special processing should be performed. And the actual name of the event (WindowLoad) isn't burdened with the browser prefix.

To support our New Relic custom event use case, we might introduce our own New Relic semantic convention that says if you include Attributes["newrelic.event_type"]=Foo, that we'll treat the record as a custom event of type Foo. This provides a much clearer signal from the user about how they want the data to be processed than name currently does.

So in conclusion, I believe that both the RUM event use case and the New Relic custom event use case are better served with attributes. Because those were the use cases that motivated the retention of name, I think we should remove it.

/cc @alanwest @martinkuba

@scheler
Copy link
Contributor

scheler commented Dec 17, 2021

@jack-berg Custom events should be officially supported by Otel via "generic events" and so there should be an attribute name for this purpose that does not include vendor name.

@tigrannajaryan
Copy link
Member Author

@jkwatson you were arguing in favour of keeping the Name field. Can you reply to #2074 (comment)

@djaglowski
Copy link
Member

@jack-berg's perspective makes sense to me.

Moreover, I'd argue we haven't successfully established a clear argument that this should be a top-level field. The data model provides the following guidance on determining whether or not a top-level field is justified:

When designing this data model we followed the following reasoning to make a decision about when to use a top-level named field:

The field needs to be either mandatory for all records or be frequently present in well-known log and event formats (such as Timestamp) or is expected to be often present in log records in upcoming logging systems (such as TraceId).

The field’s semantics must be the same for all known log and event formats and can be mapped directly and unambiguously to this data model.

A case could be made that the first condition is satisfied, but I think the second condition is a stretch.

To Tigran's original point, there are several different ways in which the field would be used, and in many cases it will not be used at all. Semantic conventions would appear to be a better fit here, allowing each use case its own expectations for the information that might have been placed in this field.

Finally, if we're unsure whether to keep the field, I think we should remove it now. An additive change later is far preferable to supporting a field we no longer want to have.

@jkwatson
Copy link
Contributor

My opinion: If we're going to use the log data model for modeling "events", then the name field (or equivalent) is necessary to identify the type of event. Yes, we could use an attribute for this, but that will mean a much more annoying experience for people who want to distinguish normal logs from events, IMO.

I don't feel super strongly about this, but having a really solid event API for OpenTelemetry would, I think, be a great thing, and if it's one thing events have, it's a name.

@alanwest
Copy link
Member

Yes, we could use an attribute for this, but that will mean a much more annoying experience for people who want to distinguish normal logs from events, IMO.

Can you elaborate a little more? My gut was to think this at first too, but I'm becoming less convinced. What's the difference if I were to inspect the value of the name field on a log record versus the presence of an attribute to distinguish things from what I might consider a normal log?

Also, I share the concern of others that the name field is currently being used for entirely disparate things. e.g. the Syslog MSGID and a notion of an "event type" seem very different to me.

@scheler
Copy link
Contributor

scheler commented Dec 20, 2021

@alanwest the backend processing is different for logs and events and we need a simple way to distinguish between the two. If we are using an attribute for naming events, we could designate a specific attribute key, say "event_name", to identify the log_record as being an event. However, @jack-berg also talked about using the full event name as the attribute key - that makes it hard to distinguish events from logs.

Note that having the name field will not help the above if Syslog uses this field. Since we are talking about removing the field entirely, why not repurpose it solely for use with events?

Otherwise, we will have to designate another attribute 'record_type' to distinguish between logs and events.

@tigrannajaryan
Copy link
Member Author

the backend processing is different for logs and events and we need a simple way to distinguish between the two

@scheler @jkwatson According to Otel spec there is no difference between logs and events. They are totally synonymous words for the exact same concept. It is fine if you want to have this distinction in your backend, but there should not be an expectation that they are going to be conceptually different from Otel perspective, with a specific top-level field allowing to distinguish logs from events. If this is what you expect to get from the Name field then this is a totally wrong expectation. Don't use the presence of the Name field as an indicator that the particular LogRecord is an "event" (whatever you expect that to mean).

If you need some LogRecord to be categorized as an "event in Backend Foo World" then define a semantic convention for that and make sure to include that attribute in the LogRecord.

@tigrannajaryan
Copy link
Member Author

the name field (or equivalent) is necessary to identify the type of event

@jkwatson Can you show some examples of "type of event"? Is this going to be a value from a finite enumeration defined in Otel?

@jkwatson
Copy link
Contributor

the name field (or equivalent) is necessary to identify the type of event

@jkwatson Can you show some examples of "type of event"? Is this going to be a value from a finite enumeration defined in Otel?

It will not be a finite enumeration, no. There will be some standard event names, but this is also something that users can create themselves.

If the logs data model doesn't have a way to identify events vs normal logs, perhaps the logs are not the right way to model events.

@djaglowski
Copy link
Member

If the logs data model doesn't have a way to identify events vs normal logs, perhaps the logs are not the right way to model events.

It's not clear to me how the Name field is used to differentiate between normal logs and events. If we keep the field, are logs allowed to use it? Aren't we then just talking about establishing semantic conventions that apply to this one particular field?

@tigrannajaryan
Copy link
Member Author

If the logs data model doesn't have a way to identify events vs normal logs, perhaps the logs are not the right way to model events.

@jkwatson Logs and Events are the same thing in OpenTelemetry spec. Please read this https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/overview.md#events-and-logs

It seems like you have some other concept of "event" in your mind. It would be useful to explain what you want to model in more details and clarify how it is different from the current definition of the Event/Log Record in Otel.

@jkwatson
Copy link
Contributor

There is a semantic difference between application logs and user initiated events in the RUM use case. Back ends/analysis tools need to treat them separately. They will need to be routed differently to different places.

I don't know that the name field is useful for this (it probably isn't). But there needs to be some efficient way for that routing to occur and if there isn't, the RUM event will indeed need to be a new signal, which I believe no one wants at this point.

@tigrannajaryan
Copy link
Member Author

@jkwatson can we record the RUM event name in a LogRecord attribute rum.event_name or similar? Routing logic can then lookup this attribute and send it to RUM destination if the attribute is present. Generally, rum.* can be also used to record other RUM-specific information (if there is any), so all information can be grouped under one namespace.

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Dec 21, 2021

Also, if we want to discern between different types of records (such as events vs logs), perhaps those could be grouped at the InstrumentationLibraryLogs level? E.g. could one anticipate a specific InstrumentationLibrary.Name in case of RUM events?

@scheler
Copy link
Contributor

scheler commented Dec 21, 2021

Some products may want to make a distinction between Events collected from certain sources and Logs collected from other sources. OpenTelemetry believes that there is nothing inherently different between log records and events from data modeling perspective, the differences are in the sources themselves. Thus where it matters the products should make that distinction based on the source of the data rather than attempt to arbitrarily categorize the data as events vs logs.

@tigrannajaryan the above section from the otel spec is questionable. Are we sure the same resource cannot emit both logs and events? This sounds restrictive. We are talking only of events in the context of RUM currently but why exclude the possibility of logs in future?

can we record the RUM event name in a LogRecord attribute rum.event_name or similar?

This would work. If we are alluding to using attribute to distinguish between events and logs, then the next thing to discuss would be whether to use a fixed attribute key for all events (rum.event_name) or use a broad event category for the attribute key (rum.browser.event).

@jkwatson
Copy link
Contributor

To summarize extensive discussion at the spec SIG meeting today:

  1. The client side telemetry group believes that some sort of Event structure is very important in modeling the telemetry from mobile apps, web apps, etc. These events are primarily a name with an optional set of attributes, along with a timestamp.
  2. Unfortunately, the log data model doesn't provide a good way to model the causality from event -> spans/metrics/events.
  3. In order to model the causality, the client telemetry group will discuss the possibility of using zero-duration spans as a better way to model client side Events. If that is adopted, that group will probably not be concerned with the name field in the log data model, and will probably propose another way to distinguish zero-duration spans ("events") from standard spans [one possible proposal is a new span Kind of EVENT which might lead to confusion with log-oriented events].

@djaglowski
Copy link
Member

  1. In order to model the causality, the client telemetry group will discuss the possibility of using zero-duration spans as a better way to model client side Events. If that is adopted, that group will probably not be concerned with the name field in the log data model, and will probably propose another way to distinguish zero-duration spans ("events") from standard spans [one possible proposal is a new span Kind of EVENT which might lead to confusion with log-oriented events].

@jkwatson Is there any update on this from the client telemetry group?

@jkwatson
Copy link
Contributor

  1. In order to model the causality, the client telemetry group will discuss the possibility of using zero-duration spans as a better way to model client side Events. If that is adopted, that group will probably not be concerned with the name field in the log data model, and will probably propose another way to distinguish zero-duration spans ("events") from standard spans [one possible proposal is a new span Kind of EVENT which might lead to confusion with log-oriented events].

@jkwatson Is there any update on this from the client telemetry group?

We're still discussing it. Things got a bit slowed down by the holidays, but we are working on it, leaning toward using zero-duration spans currently.

@jack-berg
Copy link
Member

@jkwatson - we discussed this at the Log SIG today and there doesn't seem to be strong reason to keep the field:

  • The client SIG is leaning towards using zero-duration spans to satisfy their use case.
  • If the client SIG chooses to use logs instead, they can still use attributes with semantic conventions to achieve the desired effect. I've argued attributes are a better approach because they allow you to use the attribute key to "namespace" the type of event and allow backends to conditionally change their processing logic. You counter that having an event API would be nice, and events definitely include a name. But it doesn't seem wise to keep a field around based on a speculative API, since it could be the case that if / when an event API comes about, events may have both a type and a namespace.
  • Dropping name is low risk since if we eventually decide its valuable (e.g. to accommodate an event API), we can add it back with a backwards compatible change.
  • Finally, as argued by @djaglowski, the data model requires that top level field "semantics must be the same for all known log and event formats and can be mapped directly and unambiguously to this data model". Without having these well defined semantics, name fails to meet the criteria for a top level field.

@jjharr
Copy link

jjharr commented Mar 3, 2022

I sincerely apologize for being late to comment on this issue, and I rarely comment on forums. But I feel compelled to complain about this, futile as it might be. There are really good reasons NOT to make this change:

  1. While I realize that OpenTelemetry is not 1.0 yet, there are actually a lot of people using Open Telemetry. And this change hard-breaks all their code, for no reason than "we're not sure that people will need it".

  2. The original intent of the field was as a low cardinality field. This was mentioned in the discussions above, but not really addressed adequately. My company, and some of our very large customers, use it in exactly that way. They keep a Grafana dashboard that is a histogram of error types. It is an extremely important dashboard. You can only create a histogram like that if you have a low cardinality field. This is a very broad use case. Yes, you can theoretically do that if you just keep it as an attribute, but it increases processing overhead, which matters at the volumes we deal with.

  3. Finally, and by far most importantly in the long term, this change completely undermines good semantic logging patterns. We just barely, after great effort, trained our developers to use fixed strings (instead of interpolated strings) as log messages and augment those fixed strings with context-specific attributes. That pattern makes log searchability vastly better. Demoting name to an attribute completely undermines that pattern. I cannot over-emphasize how important this is. Pretty much every logging API in the world allows the developer to add a single message, and if this message is not cardinality constrained, then developers will just use interpolated strings like they always have. Nobody is going to enter a high-cardinality value and then a low cardinality one for attributes. So by demoting name to attributes, all the benefits of low-cardinality have been discarded. Since Open Telemetry is on track to become a de-facto logging standard, it is my view that by making this change, Open Telemetry is losing the opportunity to meaningfully, and positively, change how zillions of developers in next many years do their logging.

Please reconsider this change.

@tigrannajaryan
Copy link
Member Author

@jjharr it is way too late to reconsider removal of the Name field. It is done for all intents and purposes, removed from the specification, deprecated in the protocol definition, deleted from the implementation in the Collector.

We can consider adding the Name field back. I am open to discuss it. It will be a backward compatible, additive change that we can consider.

Since you feel strong about it I suggest that you open a new issue in this repository and include the justification that you posted in your comment (and any other additional justifications that you can think of) and we can take it from there. It would be great if you could also join Otel Spec SIG meetings and Otel Log SIG meetings to advocate for this.

@epsteina16
Copy link

I'd like to echo @jjharr comment.

This issue just came up for me as well since we have implemented a new API that customers are using. Our API is based off of the OpenTelemetry log data model and we've placed a significant focus on the Name field.

There are two arguments made in this discussion that I would like to challenge:

Dropping name is low risk since if we eventually decide its valuable (e.g. to accommodate an event API), we can add it back with a backwards compatible change.

I understand that OpenTelemetry is in it's early stages but we really need to consider backwards compatibility. Once you have others working off of this specification a change like this IS high risk. This is a breaking change for us.

Finally, as #2074 (comment) by @djaglowski, the data model requires that top level field "semantics must be the same for all known log and event formats and can be mapped directly and unambiguously to this data model". Without having these well defined semantics, name fails to meet the criteria for a top level field.

I completely agree that it is important for existing log formats to map well into the log data model. This excludes the category of new applications that can be written with better logging practices in mind. I strongly believe the Name field should be recommended for use by new logging APIs and new applications because of the low cardinality aspect. Just because other log formats don't have a "Name" equivalent doesn't mean it should be dropped from the log data model. It can just be Optional and left out of example mappings.

@jjharr
Copy link

jjharr commented Mar 3, 2022

@tigrannajaryan Thanks for the guidance. I'll open a new issue, and I will also try to start attending those meetings.

@tigrannajaryan
Copy link
Member Author

@jjharr when you open the issue please post the link in this thread too.

@epsteina16 please also add your supporting comments to the issue once it is opened.

@jjharr
Copy link

jjharr commented Mar 3, 2022

Related discussion in issue 2398

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release:required-logdatamodel-ga Required for declaring log data model stable spec:logs Related to the specification/logs directory
Projects
None yet
10 participants