
Sink level metric for end to end latency #3494

Closed
graytaylor0 opened this issue Oct 12, 2023 · 10 comments · Fixed by #3546
Labels
enhancement New feature or request
@graytaylor0
Member

Is your feature request related to a problem? Please describe.
Users of Data Prepper are not given an accurate look at the time it takes for Events to be processed through a Data Prepper pipeline, from the time the Event was received by the source, to the time that it is successfully sent to the sink.

Additionally, users would like to know the delay from a timestamp that is in the Events already and get an idea of the end to end latency from a source in an external system to Data Prepper to being successfully sent to the sink.

Describe the solution you'd like
A sink-level metric named pipelineEventLatency (ideally the full metric name would include a plugin-id, but since we don't have one, it will have to be aggregated for now across sinks of the same type in the same sub-pipeline). This metric will track the time between when the Event was created (the from_time_received timestamp in the EventMetadata) and the time that the Event was successfully sent to the sink. The metric will be calculated by the releaseEvent call made by the sink to acknowledge that it was sent successfully to the sink.
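A minimal sketch of the proposed pipelineEventLatency calculation, assuming the sink has access to the Event's timeReceived at release time. The class and method names here are illustrative stand-ins, not Data Prepper's actual API:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: pipelineEventLatency is the span from when the source
// received the Event to when the sink successfully sent it.
class PipelineLatencySketch {
    static Duration pipelineEventLatency(final Instant timeReceived, final Instant releasedAt) {
        // Both instants are UTC, so a simple Duration.between suffices.
        return Duration.between(timeReceived, releasedAt);
    }
}
```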

Additionally, for the use case of tracking latency for an external source from a timestamp in the Event, we will add another parameter to the common SinkContext for end_to_end_latency_metric_start_time_key (any way to shorten this name?) which can take the name of a key in the Event to use as the starting time for the latency calculation. My thought is that this would result in a second metric, endToEndEventLatency, which will track the latency from the user's external source timestamp to the time that the Event was successfully received by the sink.
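A sketch of what endToEndEventLatency could look like, assuming (as discussed below) that the configured key holds an ISO_8601 string. The names are illustrative only:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: endToEndEventLatency measures from a timestamp
// embedded in the Event (the external source's origination time) to the
// moment the sink successfully sends the Event.
class EndToEndLatencySketch {
    static Duration endToEndEventLatency(final String originationTimestamp, final Instant sentAt) {
        // The value at the configured key is assumed to be ISO_8601 (UTC).
        final Instant origination = Instant.parse(originationTimestamp);
        return Duration.between(origination, sentAt);
    }
}
```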

Additional context
Add any other context or screenshots about the feature request here.

@kkondaka
Collaborator

@graytaylor0 What would be the format of the time in the end_to_end_latency_metric_start_time_key field? Do we have to take a format string? If yes, does it mean we need another field end_to_end_latency_metric_start_time_format?

@graytaylor0
Member Author

graytaylor0 commented Oct 23, 2023

for the first iteration I think we could just require it be an ISO_8601 string (which can be converted by date processor)

@dlvenable
Member

The metric will be calculated by the releaseEvent call made by the sink to acknowledge that it was sent successfully to the sink.

I agree with this approach. However, we don't currently always use the EventHandle. We will need to make some changes in Data Prepper core to ensure that every Event has an EventHandle, and then update all of the sinks to call release without a null check on event.getEventHandle().

we will add another parameter to the common SinkContext for end_to_end_latency_metric_start_time_key (any way to shorten this name?)

This property (end_to_end_latency_metric_start_time_key) is more related to the source and the overall model. So I'm not sure configuring it on the sink is ideal. You'd actually have to duplicate the same value for all sinks. Perhaps this could be a configuration on the pipeline itself.

A few alternate name ideas:

  • source_origination_datetime_key
  • source_origination_timestamp_key
  • origination_datetime_key
  • origination_timestamp_key

for the first iteration I think we could just require it be an ISO_8601 string (which can be converted by date processor)

I agree. We need to improve our date processor and then maybe we can support other formats using that.

@kkondaka
Collaborator

Thanks @graytaylor0 @dlvenable. I like origination_timestamp_key. Will use that. So the value we expect at origination_timestamp_key is in ISO_8601 format, which is UTC. Java's Instant.now() also returns UTC time, so we can use the simple Duration.between() API to get the latency.

@dlvenable regarding creating EventHandle for every event, I am thinking of doing it in the Sink when we stop using event and start using eventHandle. Or do you think we just create it in the JacksonEvent construction time with timeReceived from event metadata passed to it?

@dlvenable
Member

I am thinking of doing it in the Sink when we stop using event and start using eventHandle.

I think we should update data-prepper-core to guarantee an EventHandle for each Event. It should also allow us to avoid conditions like these:

if (event.getEventHandle() != null) {
    bufferedEventHandles.add(event.getEventHandle());
}

@dlvenable dlvenable added this to the v2.6 milestone Oct 24, 2023
@kkondaka
Collaborator

@dlvenable Makes sense. I will create it by default in every JacksonEvent constructor.
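The idea above can be sketched as follows. These are simplified stand-ins for Data Prepper's real types, just to show that constructing the handle in the Event constructor makes the null check unnecessary:

```java
import java.time.Instant;

// Hypothetical, simplified EventHandle carrying the received time.
class EventHandleSketch {
    final Instant timeReceived;
    EventHandleSketch(final Instant timeReceived) {
        this.timeReceived = timeReceived;
    }
}

// Sketch of a JacksonEvent-like class that always creates a handle,
// so sinks can call getEventHandle() without guarding against null.
class JacksonEventSketch {
    private final EventHandleSketch eventHandle;

    JacksonEventSketch(final Instant timeReceived) {
        // Always construct a handle; getEventHandle() is never null.
        this.eventHandle = new EventHandleSketch(timeReceived);
    }

    EventHandleSketch getEventHandle() {
        return eventHandle;
    }
}
```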

@graytaylor0
Member Author

graytaylor0 commented Oct 24, 2023

You'd actually have to duplicate the same value for all sinks. Perhaps this could be a configuration on the pipeline itself.

This was my initial thought too, but you do lose some granularity on what is skewing this metric when it's at the pipeline level. I guess we have sink metrics that would help show which sink is slower vs. faster. @dlvenable Are you saying just a single metric even for multiple sub-pipelines? (pipeline level instead of sub-pipeline level)

@dlvenable
Member

@graytaylor0 , I'm suggesting that the customer configure the key to use as the "origination" time at the pipeline level. All sinks would then calculate the time based on the value in the Event for that key and the time it is sent to the sink. So each sink may have different metric values, but the origination time is consistently configured.

@kkondaka
Collaborator

OK, I am going to add this option to DataPrepperConfiguration.java. What about internal latency, is that always turned on?

@kkondaka
Collaborator

That was not correct; I think the suggestion was to add it to PipelineModel.java. But I think it needs to be at the source level, because we may have multiple sources in the future and the "key" may not be present in the data received by all sources. Also, Events are typically created in a source, and the origination timestamp key would be populated in each event/eventHandle there.
