Feature/add datadog #915

mstoykov · 2019-01-31T13:34:38Z

Fixes on top of #893

There is no reason for them to be separate


codecov-io · 2019-01-31T14:21:59Z

Codecov Report

Merging #915 into master will increase coverage by 0.56%.
The diff coverage is 89.81%.

@@            Coverage Diff             @@
##           master     #915      +/-   ##
==========================================
+ Coverage   69.82%   70.38%   +0.56%     
==========================================
  Files         112      118       +6     
  Lines        8823     9088     +265     
==========================================
+ Hits         6161     6397     +236     
- Misses       2261     2285      +24     
- Partials      401      406       +5

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| cmd/collectors.go | `0% <0%> (ø)` | ⬆️ |
| stats/datadog/collector.go | `100% <100%> (ø)` | |
| stats/statsd/common/api.go | `100% <100%> (ø)` | |
| stats/statsd/collector.go | `100% <100%> (ø)` | |
| lib/options.go | `91.81% <100%> (+0.45%)` | ⬆️ |
| cmd/config.go | `41.05% <100%> (+1.26%)` | ⬆️ |
| stats/statsd/common/config.go | `100% <100%> (ø)` | |
| stats/statsd/common/collector.go | `84.53% <84.53%> (ø)` | |
| stats/statsd/common/testutil/test_helper.go | `97.95% <97.95%> (ø)` | |
| core/engine.go | `92.99% <0%> (-0.94%)` | ⬇️ |
| ... and 6 more | | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 45f9913...1c93908.

na--

Various minor and not-so-minor issues noted inline. There is also a generic issue that this code doesn't have even a single test...


na-- · 2019-02-01T05:37:15Z

release notes/upcoming.md

+Both are very similar but DataDog has a concept of tags. By default both send on `localhost:8125` and currently only UDP is supported as transport.
+In order to change this you can use the `K6_DATADOG_ADDR` or `K6_STATSD_ADDR` env variable which has to be in the format of `address:port`.
+The new outputs also can add a `namespace` which a prefix before all the samples with `K6_DATADOG_NAMESPACE` or `K6_STATSD_NAMESPACE` respectively. By default the value is `k6.`, notice the dot at the end.
+In the case of DataDog there is an additional configuration `K6_DATADOG_TAG_WHITELIST` which by default is equal to `status, method`. This is a comma separated list of tags that should be sent to DataDog. All other tags that k6 emits are discarded. This is done because DataDog does indexing on top of the tags and some highly variable tags like `vu` and `iter` will lead to problems with the service.


Not sure we should limit the tags only to status and method - all of those tags (with the possible exception of url and maybe name) seem like good candidates to also be included

@robingustafsson Any opinion on this? I am agnostic; I just want to note that people do complain about Datadog when there are a lot of tags. We probably won't hit that, given that we wouldn't have all that many, but still.

I think we can start with this small set of whitelisted tags, and if it turns out users need more by default we can expand it later. The only other tags that I would consider useful are url and name, which @na-- thinks should be excluded 😄. Given the kind of product that Datadog offers (being a long time customer), I don't believe anyone would really use it as a general purpose load testing result analysis tool. I can see how you'd use it to store monitoring type results, but then you'd also likely set up custom metrics to track rather than just the default metrics.

@robingustafsson I don't think we should exclude both url and name, only url, since they are either the same or the user explicitly set name because url was too dynamic.
To reduce confusion, I think that the default value for TagWhitelist should be proto, subproto, status, method, name, group, check, error, tls_version (i.e. the DefaultSystemTagList, but without any highly variable things, and maybe with error_code instead of error once that's merged).

@na-- Ah ok, I misread you. I'm not sure we need such an expansive tag whitelist to start with for Datadog, but I'm not strongly opposed to it either 🙂 Yes, error_code would be preferred over error, once available, if you ask me.

Hmm something else that I realized - I'm not sure if a whitelist is actually the best approach here. It will work for system tags, but any custom user tags will be silently filtered, which isn't the best UX. If I, as a user, explicitly set a certain tag value on an HTTP request, I usually want it and can depend on it later for filtering the data...

In the InfluxDB collector, this is handled by transforming any highly variable tags into (non-indexable) fields. This is great, since user-supplied data won't be lost, while at the same time the InfluxDB server won't be overwhelmed with indexing highly-variable data.

I'm not very familiar with Datadog, but a quick check doesn't reveal any similar functionality there... So if no such functionality exists in Datadog, I wonder whether a blacklist (to block the most highly-variable tags like vu, iter, url, and error (once error_code is merged)) will be more user-friendly than the current whitelist. Or whether we need such an option at all, considering that users can just set the emitted tags directly via the global --system-tags option?
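To make the whitelist versus blacklist trade-off in this thread concrete, here is a minimal Go sketch. It is not the code from this PR: `tagSet`, `parseTagSet`, `filterWhitelist`, and `filterBlacklist` are hypothetical names, and the `team` tag stands in for any custom user tag.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tagSet is a hypothetical set of tag names; it is not k6's own TagSet type.
type tagSet map[string]bool

// parseTagSet turns a comma-separated list such as "status, method" into a set.
func parseTagSet(list string) tagSet {
	set := tagSet{}
	for _, name := range strings.Split(list, ",") {
		if name = strings.TrimSpace(name); name != "" {
			set[name] = true
		}
	}
	return set
}

// filterWhitelist keeps only the tags named in allowed.
// Custom user tags that are not listed are silently dropped.
func filterWhitelist(tags map[string]string, allowed tagSet) []string {
	var out []string
	for name, value := range tags {
		if allowed[name] {
			out = append(out, name+":"+value)
		}
	}
	sort.Strings(out) // map iteration order is random, so sort for stable output
	return out
}

// filterBlacklist drops only the tags named in blocked.
// Custom user tags pass through unless they are explicitly listed.
func filterBlacklist(tags map[string]string, blocked tagSet) []string {
	var out []string
	for name, value := range tags {
		if !blocked[name] {
			out = append(out, name+":"+value)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	tags := map[string]string{
		"status": "200", "method": "GET", "vu": "17", "iter": "3",
		"team": "checkout", // hypothetical custom tag set by the user
	}
	fmt.Println(filterWhitelist(tags, parseTagSet("status, method")))
	// prints [method:GET status:200]
	fmt.Println(filterBlacklist(tags, parseTagSet("vu,iter,url")))
	// prints [method:GET status:200 team:checkout]
}
```

With the whitelist the custom `team` tag disappears without any notice, which is the UX concern raised above; with the blacklist it survives while `vu` and `iter` are still dropped.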


na--

About unit tests: not sure how much we can test this, but we should at least have some... even on that tag filtering logic


na-- · 2019-02-01T12:47:17Z

stats/statsd/common/collector.go

+	for _, entry := range data {
+		if err := c.dispatch(entry); err != nil {
+			// No need to return error if just one metric didn't go through
+			c.logger.WithError(err).Warnf("Error while sending metric %s", entry.Metric)


This seems like it could be very spammy in some situations... It's better to collect the number of errors and display only a single warning message like "we couldn't dispatch X out of Y metrics to datadog" or something like that. And maybe the individual errors could still be debug messages, so they're not shown by default, but users can see them with verbose mode enabled.

I think that this will lead to people wondering where their metrics are and it will make it harder to debug. If people complain that it is way too spammy in situations which are okay to not be spammy we can always make it less spammy

Come on, this line can generate literally thousands of warning messages - not very useful. I'm not saying to suppress those errors, just to aggregate them and say "1234 metrics weren't sent" every 1 second, instead of 1234 messages every second...

My argument is that knowing what didn't get sent may help you debug it ... hiding the reason why we couldn't send something and just telling you we couldn't will lead to questions of "why do I have no metrics?" and "what does 'we couldn't send your metrics' mean?".

If someone is continuously getting 1234 messages saying that we could not send the data they told us to send, the error message might help them debug it. In case someone doesn't care that they can't send the majority of their metrics ... maybe they should not ask us to do it?

Read my original comment again - I don't want to totally suppress the messages, just aggregate them by default. If they run k6 with verbose mode enabled, users will see the original error messages, however many there are. We can even suggest turning on the verbose mode in the aggregated error message. Or we can show the first 5 or 10 error messages and just mention that there are 1229 more 😄

I'm fine with a lot of different variants, as long as we don't bury the user in thousands of (likely similar) error messages on the console every second by default, which is terrible UX.


na-- · 2019-02-01T13:45:31Z

release notes/upcoming.md

@@ -8,6 +8,18 @@ You can now specify a file for all things logged by `console.log` to get written

 Thanks to @cheesedosa for both proposing and implementing this!

+### New result outputs: statsd and DataDog (#915)


Suggestion for slightly expanded release notes, with some minor typos and formatting fixes:

### New result outputs: StatsD and Datadog (#915)

You can now output any metrics k6 collects to StatsD or Datadog by running `k6 run --out statsd script.js` or `k6 run --out datadog script.js` respectively. Both are very similar, but Datadog has a concept of metric tags, the key-value metadata pairs that will allow you to distinguish between requests for different URLs, response statuses, different groups, etc.

Some details:

- By default both outputs send metrics to a local agent listening on `localhost:8125` (currently only UDP is supported as a transport). You can change this address via the `K6_DATADOG_ADDR` or `K6_STATSD_ADDR` environment variables, by setting their values in the format of `address:port`.
- The new outputs also support adding a `namespace` - a prefix before all the metric names. You can set it via the `K6_DATADOG_NAMESPACE` or `K6_STATSD_NAMESPACE` environment variables respectively. Its default value is `k6.` - notice the dot at the end.
- You can configure how often data batches are sent via the `K6_STATSD_PUSH_INTERVAL` / `K6_DATADOG_PUSH_INTERVAL` environment variables. The default value is `1s`.
- Another performance tweak can be done by changing the default buffer size of 20 through `K6_STATSD_BUFFER_SIZE` / `K6_DATADOG_BUFFER_SIZE`.
- In the case of Datadog, there is an additional configuration `K6_DATADOG_TAG_WHITELIST`, which by default is equal to `status,method,group`. This is a comma separated list of tags that should be sent to Datadog. All other metric tags that k6 emits are discarded. This is done because Datadog does indexing on top of the tags and some highly variable tags like `vu` and `iter` will lead to problems with the service.

And I still think we should change the default value to be the same as InfluxDB's (i.e. all default system tags, minus the ones marked as tagsAsFields, which are the most variable ones).

I would like @robingustafsson to weigh in on this. I personally have no real opinion, apart from the fact that at some point @ivoreis decided those are fine. I don't know if they've discussed it off GitHub, for example.


na--

LGTM

ivoreis and others added 6 commits January 10, 2019 19:52

Add datadog / statsd integration

e35e515

fixup! Add datadog / statsd integration

1ff30ed

statsd/datadog: combine addr and port together

2f3a932

There is no reason for them to be separate

statsd/datadog: refactoring and making tag whitelisting faster

03e262a

statsd/datadog: add namespace for statsd as well

b9fadaf

statsd/datadog: set namespace by default to 'k6.'

6aa196d

mstoykov requested review from robingustafsson and na-- January 31, 2019 13:34

Fix using old name of Threshold.LastFailed

b1bd6ef

mstoykov commented Jan 31, 2019


mstoykov added 3 commits January 31, 2019 16:00

statsd/datadog: Remove unused threshold

ce3e7e6

statsd/datadog: use string concat instead of Sprintf

d796a0d

statsd/datadog: shrink and simplify internal sample struct

3b19b49

Add release note for the new datadog and statsd collectors

493022a

na-- requested changes Feb 1, 2019


mstoykov added 6 commits February 1, 2019 10:27

statsd/datadog: update year in copyright

b6fb058

statsd/datadog: refactor sample struct to be smaller

4eb763c

statsd/datadog: remove summary sending

cb3a384

statsd/datadog: move client making to the initalization of the collector

7730884

stasd/datadog: Remove unneeded types and some connection making refactor

47750bb

statsd/datadog: Make push interval configurable

30b28fa

This was referenced Feb 1, 2019

Add datadog / statsd integration #893

Closed

Return summary sending in datadog collector #916

Closed

mstoykov added 6 commits February 1, 2019 11:47

statsd/datadog: Add/Fix error logging

a746bff

statsd/datadog: More refactoring - fixing visibility and refactor err…

abf2381

…or logging

statsd/datadog: Fix comments and better naming

1cd780b

statsd/datadog: stop using envconfig for setting default values

8af8dc3

Unmarshall TagSet from a list of tags fix #768

884ac1e

statsd/datadog: update the release notes

338ec03

na-- requested changes Feb 1, 2019


na-- reviewed Feb 1, 2019


statsd/datadog: change json names to camelCase

d3db1e9

na-- mentioned this pull request Feb 1, 2019

Configuration issues #883

Open

mstoykov added 4 commits February 1, 2019 15:57

statsd/datadog: fix not being able to distinguish between the default…

3082d67

… config and config with the same values

statsd/datadog: update the release notes

5d644b5

statsd/datadog: fix TagWhitelist default value ... quotation marks ...

268e594

Add test for TagSet's UnmarhalText

e38b216

mstoykov force-pushed the feature/addDatadog branch from 07ea5da to e38b216 Compare February 4, 2019 08:17

mstoykov added 4 commits February 4, 2019 10:33

statsd/datadog: print less by default when metrics coulnd't be send

96801f3

statsd/datadog: fix error starting with capital letter

84e7813

statsd/datadog: fix merging base statsd config in case of datadog

2828701

statsd/datadog: add base datadog test

c0a3b1f

mstoykov force-pushed the feature/addDatadog branch from 62dbecc to c0a3b1f Compare February 7, 2019 08:23

statsd/datadog: add test for checks

6cfe4d8

mstoykov force-pushed the feature/addDatadog branch from 32187b3 to 6cfe4d8 Compare February 7, 2019 09:37

mstoykov added 2 commits February 7, 2019 11:56

stasd/datadog: add stupid tests

d4cce2e

statsd/datadog: refactor tests

5244715

mstoykov force-pushed the feature/addDatadog branch from 6c83b5a to 5244715 Compare February 7, 2019 11:57

mstoykov added 2 commits February 12, 2019 14:46

Remove vim swap file

3d9b426

statsd/datadog: Change datadog's whitelist to blacklist and don't bla…

4c5e049

…cklist anything

na-- approved these changes Feb 13, 2019


Merge remote-tracking branch 'origin/master' into feature/addDatadog

1c93908

mstoykov merged commit 8bcf39a into master Feb 13, 2019

mstoykov deleted the feature/addDatadog branch February 13, 2019 14:49

na-- mentioned this pull request Aug 12, 2019

K6_SYSTEM_TAGS is returning an error when set #768

Closed

ppcano mentioned this pull request Apr 21, 2020

Datadog integration: send the checks metric. #1403

Open
