Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[statsd] Convert counters to floats #1978

Closed
kostasb opened this issue Nov 1, 2016 · 14 comments · Fixed by #15371
Closed

[statsd] Convert counters to floats #1978

kostasb opened this issue Nov 1, 2016 · 14 comments · Fixed by #15371
Labels
area/statsd feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin help wanted Request for community participation, code, contribution size/m 2-4 day effort

Comments

@kostasb
Copy link

kostasb commented Nov 1, 2016

Proposal:

Emit statsd counter values as float instead of int in Line protocol.

Current behavior:

Counters are emitted as integers while gauges are emitted as floats.
This causes a type conflict in InfluxDB when trying to store a counter and a gauge in the same field. It also makes it impossible to send floating increments to counters.

Desired behavior:

Convert statsd counters from ints to floats in Line Protocol.

Because this is a breaking change, it makes sense to make it configurable in the statsd plugin level, with default value to false:

# WARNING: This is a breaking change that will default to true in a future release.
# float_counters = false

@sparrc

@sparrc sparrc added this to the 1.2.0 milestone Nov 2, 2016
@sparrc sparrc added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Nov 3, 2016
@sparrc sparrc modified the milestones: 1.2.0, 1.3.0 Jan 10, 2017
@danielnelson danielnelson modified the milestones: 1.3.0, 1.4.0 Apr 20, 2017
@rijnhard
Copy link

rijnhard commented May 11, 2017

we took a few weeks to debug this issue. It was brutal.
where the documentation clearly states:

This also allows for mixed types in a single line: foo:1|c:200|ms

in practice if you write gauges & counters & timers to the same measurement it causes the data type
conflicts in influx.

Off the top of my head, with no idea of whats best these are the possible solutions:
using example foo:1|c:200|ms

  • add the metric_type as a suffix to the measurement name e.g. foo_counter value = 1(integer)
    • what about backwards compatibility?
  • cast everything to float and leave it at that. so any statsd metric will be a float in influx.
    • this will probably cause a lot of existing tables to suddenly start dropping writes and probably not be the safest option
  • write different metric types to different fields (since fields aren't indexed it suits this)
    • but I'm confused by the field data type documentation, whether a measurement must have the same data type for all fields in a measurement (which would render this option impossible) or not.

@danielnelson
Copy link
Contributor

A measurement can have fields with different data types, but the type must remain the same for each field.

The example metric foo:1|c:200|ms should be fine, since all of the fields on the single measurement are independent:

foo,metric_type=timing count=1i,90_percentile=1,mean=1,stddev=0,upper=1,lower=1 1494526954000000000
foo,metric_type=counter value=1i 1494526954000000000

But if you send a gauge and counter it is a problem foo:2|g:1|c:

foo,metric_type=gauge value=2 1494527106000000000
foo,metric_type=counter value=1i 1494527106000000000

Since these have the same field value, the second reading will overwrite the first, but since they type has changed you will get an error.

The proposal above will have minimal impact since it is opt in, but it may be difficult to transition to floats if you have existing data, and you still overwrite the data point by mistake.

I think ideally the metric would look something like this:

foo count=1
foo gauge=2
foo ms=1 90_percentile=1,mean=1,stddev=0,upper=1,lower=1

I think statsd is technically supposed to be ints, though the spec is not very tight, so it either makes sense to send in floats or have it be configurable by metric type.

@jryberg
Copy link

jryberg commented Jul 7, 2017

Was this ever resolved? I have some issues where some metrics from statsd are handled as int in some cases but mostly as float (for the same metric). This causing issues since I'm loosing stats.

@danielnelson
Copy link
Contributor

@jryberg It has not been resolved yet. The best way to deal with this currently is not to send any stats that map to the same measurement but have different types.

@skyrocknroll
Copy link

We encountered this issue. @sparrc Your attention would to this would be lot more helpful. I can raise an pull request if you are ok with the @kostasb approach.

@sparrc
Copy link
Contributor

sparrc commented Jul 10, 2017

I'm not the maintainer of this project anymore, please ping @danielnelson instead

@danielnelson
Copy link
Contributor

@skyrocknroll If you would like to implement the solution kostasb suggested we can merge it, but don't add the bit about changing the type in future versions, I'm not sure if this is the right approach.

I think the best solution for storing different metric types with the same measurement is to rename the default field names. To do this, we need make a larger change like what I proposed above, perhaps introducing a statsd_format_version=2 option.

@jryberg I should have said this in my last comment, but you can use the template option to set the field names to avoid conflicts, this will allow you to have a measurement with multiple fields.

@thomas-tomlinson
Copy link

what about an option to name the field for counters and gauges? this could default to "value" like it does today but then would allow for a blanket override that works for the various combinations in place today. For example
counter_field = value
gauge_field = gauge_value

@bryanspears
Copy link

Sending a float value counter is cast to an int. This does not match the statsd spec.

foo:123.45|c is stored in influx as 123

Is there a work around for "Float value counters being cast to integer" issue?

@danielnelson
Copy link
Contributor

I don't believe there is a way to workaround the integer counters.

@bryanspears
Copy link

Thanks for the quick reply, @danatinflux

The change and PR seem like they would be straightforward, as the code currently forces int casting, however some comments here mention it would be a breaking change? Not sure how that's the case.

@danielnelson
Copy link
Contributor

In InfluxDB it isn't easy to switch from a integer field type to a float field type, it can only be done at a shard boundary which come around only every several days. So we would need this to be an opt-in feature, at least until the next major release (2.x).

@bryanspears
Copy link

Would it be safe and sane to always cast and store it as a float then? Assuming we wrap this change with opt-in flag.

@danielnelson
Copy link
Contributor

Yes I think so

@sspaink sspaink added help wanted Request for community participation, code, contribution size/m 2-4 day effort labels Oct 26, 2022
powersj added a commit to powersj/telegraf that referenced this issue May 16, 2024
By default counters are stored as ints. However, if a user is using a
value as both a float or a int it can lead to type conflicts. This
maintains the interal representation as an int to avoid messing with the
caching code, but allows the counter to be reported as a float when a
metric is reported.

fixes: influxdata#1978
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/statsd feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin help wanted Request for community participation, code, contribution size/m 2-4 day effort
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants