Send metrics from oldest to newest, always #5633

Closed
PierreF opened this issue Mar 26, 2019 · 8 comments · Fixed by #7814
Labels: area/agent, feature request, rfc

Comments
@PierreF
Contributor

PierreF commented Mar 26, 2019

Feature Request

Proposal:

Telegraf should always send metric points from oldest to newest, both during "normal" operation and while catching up on a backlog when an output becomes slow.

Current behavior:

Since #5287, when the buffer is not empty and contains more than one gather cycle, metrics are sent from newest to oldest.

Desired behavior:

Batch() should return the oldest metrics first, and metrics within the batch should be in ascending time order (oldest to newest).

We should still drop the oldest metrics if the buffer becomes full, so #5194 remains fixed.
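
As a minimal sketch of the desired behavior (hypothetical types and names, not Telegraf's actual buffer code): Add() drops the oldest metric on overflow, and Batch() removes and returns metrics oldest first.

    package main

    import "fmt"

    type Metric struct {
        Name string
        Time int64 // unix seconds, assumed to follow add order
    }

    // Buffer is a bounded FIFO of metrics.
    type Buffer struct {
        cap   int
        items []Metric
    }

    // Add appends a metric, dropping the oldest one when the buffer is full,
    // so the "drop oldest on overflow" behavior of #5194 is preserved.
    func (b *Buffer) Add(m Metric) {
        if len(b.items) == b.cap {
            b.items = b.items[1:]
        }
        b.items = append(b.items, m)
    }

    // Batch removes and returns up to n metrics, oldest first.
    func (b *Buffer) Batch(n int) []Metric {
        if n > len(b.items) {
            n = len(b.items)
        }
        out := b.items[:n]
        b.items = b.items[n:]
        return out
    }

    func main() {
        b := &Buffer{cap: 3}
        for i := 1; i <= 4; i++ {
            b.Add(Metric{Name: "cpu", Time: int64(i)})
        }
        // Metric 1 was dropped on overflow; the batch is oldest-to-newest.
        fmt.Println(b.Batch(2)) // [{cpu 2} {cpu 3}]
    }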

Use case:

Currently, metric points are not in a consistent order. When the buffer is empty, the order is oldest to newest, but once the buffer starts to fill, batches will be in the opposite order (newest to oldest).

In practice, the buffer will often contain a few metrics from the previous gather cycle, so this can happen even when the output is not down.

In the end, this means the output cannot rely on metrics being ordered, which was (mostly) true before #5287.
Having metrics ordered makes it easy to apply transformations on the fly (e.g. taking the difference between two points to compute a rate) and generally makes working with time series easier.
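
As an illustration of that kind of on-the-fly transformation, here is a small sketch (not from Telegraf) that derives a per-second rate from consecutive counter points; it only gives sensible results when the points arrive oldest to newest.

    package main

    import "fmt"

    type Point struct {
        Time  int64   // unix seconds
        Value float64 // monotonically increasing counter
    }

    // rates computes the per-second rate between each pair of consecutive
    // points, assuming the slice is ordered oldest to newest.
    func rates(points []Point) []float64 {
        var out []float64
        for i := 1; i < len(points); i++ {
            dt := float64(points[i].Time - points[i-1].Time)
            dv := points[i].Value - points[i-1].Value
            out = append(out, dv/dt)
        }
        return out
    }

    func main() {
        ordered := []Point{{0, 100}, {10, 150}, {20, 250}}
        fmt.Println(rates(ordered)) // [5 10]; reversed input would yield negative rates
    }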

If this change seems reasonable, I can open a PR.

@danielnelson
Contributor

I would like the batch to be old to new, but I disagree that points should always be sent oldest to newest when catching up. If you have downtime, it seems to me that you would want to fill in the latest data immediately and then process the backlog. That way you do not need to wait for current data.

I do think there is an argument for processing the backlog in either order: you might want more recent data first because it's more temporally relevant, or you might want it in order for some of the reasons you mentioned.

This is particularly true with certain outputs; some would prefer old-to-new ordering. For example, the prometheus output and the stackdriver output would always want data in add order. The current ordering is the cause of #5598.

I also think some inputs would prefer their data be sent in order; for example, when reading from a queue consumer input, old to new seems to make the most sense. I haven't thought about any solutions for this; normally with these plugins you should set a lower batch size and limit the input to the size of the batch using max_undelivered_messages.

What I have been thinking of doing for 1.11 is to provide old-to-new ordering within the batch, plus agent-level and per-output control over the backlog order. Ordering in all cases means "add order", not timestamp order. I have some design bits worked out on paper for a rewrite of the metric buffer code, but that's as far as I have gotten so far.

Currently, metric points are not in a consistent order. When the buffer is empty, the order is oldest to newest, but once the buffer starts to fill, batches will be in the opposite order (newest to oldest).

Right now they should always be consistently newest to oldest.
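
For illustration only, here is a rough sketch of the backlog-order idea described above (hypothetical names, not the actual metric buffer rewrite): each batch stays in add order internally, while a separate setting controls which end of the backlog it is drained from.

    package main

    import "fmt"

    type BacklogOrder int

    const (
        OldestFirst BacklogOrder = iota // process the backlog in add order
        NewestFirst                     // send the most recent data first
    )

    type Buffer struct {
        order   BacklogOrder
        backlog []int // metrics in add order; ints stand in for metrics
    }

    // Batch returns up to n metrics. The batch itself is always in add order;
    // the order setting only decides which end of the backlog it comes from.
    func (b *Buffer) Batch(n int) []int {
        if n > len(b.backlog) {
            n = len(b.backlog)
        }
        var out []int
        if b.order == OldestFirst {
            out = b.backlog[:n]
            b.backlog = b.backlog[n:]
        } else {
            out = b.backlog[len(b.backlog)-n:]
            b.backlog = b.backlog[:len(b.backlog)-n]
        }
        return out
    }

    func main() {
        b := &Buffer{order: NewestFirst, backlog: []int{1, 2, 3, 4, 5, 6}}
        fmt.Println(b.Batch(3)) // [4 5 6]: latest data first, still old-to-new inside the batch
        fmt.Println(b.Batch(3)) // [1 2 3]: then the older backlog
    }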

@danielnelson danielnelson added feature request Requests for new plugin and for new features to existing plugins rfc Request for comment - larger issues that require clarification labels Mar 26, 2019
@danielnelson danielnelson added this to the 1.11.0 milestone Mar 26, 2019
@goller goller removed the bug unexpected problem or unintended behavior label Apr 1, 2019
@danielnelson danielnelson self-assigned this May 28, 2019
@danielnelson danielnelson modified the milestones: 1.11.0, 1.12.0 May 28, 2019
@danielnelson
Contributor

@PierreF Sorry, I'm going to have to push this to 1.12, I haven't been able to complete the work yet. Still very high on my list of tasks though.

@Legogris
Contributor

Legogris commented Jun 5, 2019

Even with the recent-ish changes to the Stackdriver output, we are still seeing regular out-of-order errors:

Jun 04 20:22:40  telegraf[23336]: 2019-06-04T20:22:40Z E! [outputs.stackdriver] unable to write to Stackdriver: rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: Field timeSeries[140] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.: timeSeries[140]; Field timeSeries[77] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.: timeSeries[77]
Jun 04 20:22:40  telegraf[23336]: 2019-06-04T20:22:40Z E! [agent] Error writing to output [stackdriver]: rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: Field timeSeries[140] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.: timeSeries[140]; Field timeSeries[77] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.: timeSeries[77]
Jun 05 01:22:40  telegraf[23336]: 2019-06-05T01:22:40Z E! [outputs.stackdriver] unable to write to Stackdriver: rpc error: code = Unavailable desc = transport is closing
Jun 05 01:22:40  telegraf[23336]: 2019-06-05T01:22:40Z E! [agent] Error writing to output [stackdriver]: rpc error: code = Unavailable desc = transport is closing
Jun 05 02:22:40  telegraf[23336]: 2019-06-05T02:22:40Z E! [outputs.stackdriver] unable to write to Stackdriver: rpc error: code = Unavailable desc = transport is closing
Jun 05 02:22:40  telegraf[23336]: 2019-06-05T02:22:40Z E! [agent] Error writing to output [stackdriver]: rpc error: code = Unavailable desc = transport is closing
Jun 05 04:22:40  telegraf[23336]: 2019-06-05T04:22:40Z E! [outputs.stackdriver] unable to write to Stackdriver: rpc error: code = Unavailable desc = transport is closing
Jun 05 04:22:40  telegraf[23336]: 2019-06-05T04:22:40Z E! [agent] Error writing to output [stackdriver]: rpc error: code = Unavailable desc = transport is closing
Jun 05 05:22:40  telegraf[23336]: 2019-06-05T05:22:40Z E! [outputs.stackdriver] unable to write to Stackdriver: rpc error: code = Unavailable desc = transport is closing
Jun 05 05:22:40  telegraf[23336]: 2019-06-05T05:22:40Z E! [agent] Error writing to output [stackdriver]: rpc error: code = Unavailable desc = transport is closing

Interestingly enough, these errors often seem to occur at exactly 1-hour intervals. Config as below.

    interval = "1m"
    round_interval = true
    flush_interval = "40s"

When this happens, metrics end up being >5 min late, resulting in triggered alarms and other shenanigans.

@danielnelson Do you think this comes from the same issue or should I open a new one?

@danielnelson
Contributor

Can you open a new issue? This looks like something else.

@Legogris
Contributor

Legogris commented Jun 6, 2019

Can you open a new issue? This looks like something else.

#5963

@russorat russorat modified the milestones: 1.12.0, 1.13.0 Aug 12, 2019
@sjwang90 sjwang90 modified the milestones: 1.13.0, 1.14.0 Nov 15, 2019
@PhRX

PhRX commented Dec 11, 2019

The old-to-new ordering is crucial in some cases. For example, InfluxDB guarantees that when multiple points are written for the same timestamp, the last value written is the one that remains.

When an Influx output is coupled to a Kafka input through Telegraf, and the Kafka input contains aggregations - which can have multiple values for the same timestamp, one for each "recalculation" as new data is processed for the aggregation time window - the ordering must be preserved, or the end result in Influx is uncertain.

#6784 seems to correspond to this bug.
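
A tiny sketch of the last-write-wins rule described above; the map only simulates InfluxDB's behavior for a single series, it is not InfluxDB code.

    package main

    import "fmt"

    type write struct {
        ts  int64   // point timestamp
        val float64 // aggregated value for that timestamp
    }

    // lastWriteWins keeps, per timestamp, whichever value was written last.
    func lastWriteWins(writes []write) map[int64]float64 {
        stored := map[int64]float64{}
        for _, w := range writes {
            stored[w.ts] = w.val
        }
        return stored
    }

    func main() {
        // Two "recalculations" of the same aggregation window (same timestamp).
        inOrder := []write{{100, 5.0}, {100, 7.0}}
        reversed := []write{{100, 7.0}, {100, 5.0}}
        fmt.Println(lastWriteWins(inOrder))  // map[100:7] - final recalculation wins
        fmt.Println(lastWriteWins(reversed)) // map[100:5] - reordering changes the stored result
    }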

@zak-pawel
Collaborator

I have a problem with the newest-to-oldest ordering that occurs when the buffer is not empty and contains more than one gather cycle, in a TICK Stack setup.

My scenario:
I send telemetry data to an MQTT broker and consume it with the MQTT Consumer input plugin in Telegraf. The MQTT broker is configured so that messages are delivered to the consumer in a particular order (timestamp ascending). I use a TICKscript to create a stateDuration-based metric when a metric value exceeds a defined threshold. During "normal" operation everything works like a charm - the metric is added properly to Influx.
The problem starts when (from time to time) I need to send 10k-15k metrics to a single MQTT topic at once. The MQTT broker delivers all of them to the consumer in the preserved order, but the changes mentioned above then insert the whole batch into Influx in the opposite order. My TICKscript then starts to calculate and insert stateDuration-based metrics beginning from the newest ones, and they look like this:

time                 state_duration value
----                 -------------- -----
2020-03-26T23:00:00Z -249.5         68
2020-03-26T23:00:30Z -249           68
2020-03-26T23:01:00Z -248.5         67
2020-03-26T23:01:30Z -248           66
2020-03-26T23:02:00Z -247.5         66
2020-03-26T23:02:30Z -247           67
...
2020-03-27T03:07:30Z -2             72
2020-03-27T03:08:00Z -1.5           72
2020-03-27T03:08:30Z -1             72
2020-03-27T03:09:00Z -0.5           72
2020-03-27T03:09:30Z 0              71
2020-03-27T03:10:00Z 0.5            71
2020-03-27T03:10:30Z 1              72
2020-03-27T03:11:00Z 1.5            72
2020-03-27T03:11:30Z 2              72

The newest metric in the batch (for which the state duration started counting) has a state_duration equal to 0, and the oldest metric's state_duration is -249.5 (this value is computed here in the Kapacitor code). I'd expect the oldest metric's state_duration to be 0 and the newest metric's to be 249.5.

It'd be great to have a configuration option in Telegraf to choose between oldest-to-newest and newest-to-oldest ordering.

Is there any known workaround or set of configuration options to achieve oldest-to-newest order (other than downgrading Telegraf to 1.9.2)?
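
For reference, a minimal sketch of a stateDuration-style calculation (not Kapacitor's actual implementation), showing why it only produces the expected 0 to 249.5 progression when points arrive oldest to newest.

    package main

    import "fmt"

    type Point struct {
        Time  float64 // minutes since some epoch
        Value float64
    }

    // stateDurations returns, for each point, how long (in minutes) the value
    // has continuously exceeded threshold, assuming ascending time order.
    func stateDurations(points []Point, threshold float64) []float64 {
        out := make([]float64, len(points))
        start := -1 // index where the current "exceeding" streak began
        for i, p := range points {
            if p.Value <= threshold {
                start = -1
                out[i] = 0
                continue
            }
            if start == -1 {
                start = i
            }
            out[i] = p.Time - points[start].Time
        }
        return out
    }

    func main() {
        pts := []Point{{0, 68}, {0.5, 68}, {1, 67}, {1.5, 66}}
        // With ordered input the duration grows over time: [0 0.5 1 1.5].
        // Fed newest-first, the same logic produces negative values,
        // much like the table above.
        fmt.Println(stateDurations(pts, 60))
    }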

@danielnelson danielnelson modified the milestones: 1.15.0, 1.16.0 Jun 26, 2020
@danielnelson
Copy link
Contributor

@PierreF Sorry, I haven't been able to do this like I thought I would. If your offer to open a PR to swap the order of the batches still stands, so that batches are in ascending order, I'd definitely take you up on it.
