Support non-indexable high-cardinality metric tags / metadata #2584

na-- · 2022-07-04T11:53:06Z

Currently, the k6 system metric tags are a mix of high-cardinality ones (url, error, vu and iter) and relatively low-cardinality ones (everything else). This has already been a problem multiple times and has forced us to adopt some hacks to work around the issues, e.g. these workarounds in the built-in outputs: 1, 2, 3). On top of that, even the current high-cardinality tags are going be a big problem for the upcoming time series changes (#2580), but we'll need to add even more and higher cardinality tags for things like tracing (#2128), to store the trace and span IDs. Finally, k6 has no information whether user-defined custom tags have high or low cardinality.

So, because of all of these reasons, we'll need to have some way to differentiate between two distinct types of metric sample tags/metadata:

low-cardinality tags that can be used for indexing... so, in the time series mentioned above, in sub-metric thresholds, in certain outputs that do aggregation based on tags, etc.
and tags with high cardinality, containing metadata that's often unique for every measurement / data point

Different outputs will be free to treat these high-cardinality tags/metadata however they want - they can discard it (probably the behavior of outputs like statsd, datadod, prometheus, etc. that work with time series), sample it (e.g. to export some tracing information), transform it (e.g. in influxdb this will correspond to data point fields) or output it directly like they do to tags (e.g. json, csv and other similar outputs can just write it to the disk, since we don't do any indexing or aggregation, just dump raw data). This will finally allow us to deprecate the hacks we currently have.

The exact API changes on how we'll add this for custom tags in k6 are still under consideration, but the currently favored approach is to extend the tags API to support both strings and objects, something like this:

http.get('http://httpbin.test.k6.io/', {
  tags: {
    my_tag: {
      value: "I'm a a non-indexable tag",
      index: false
    },
    more_tag: "backwards-compatible indexable tag",
  }
);

Unfortunately, we will probably have to do a minor breaking change in the k6 core and transform some of the currently problematic (because of their high cardinality) k6 system tags like vu, iter, url into non-indexable ones by default... So thresholds based on them will no longer work. Warnings about this very likely breaking change were added to k6 v0.39.0 (#2583) for scripts that have thresholds on them, and we mention them in the v0.39.0 release notes (#2581).

The text was updated successfully, but these errors were encountered:

na-- · 2022-08-18T15:15:06Z

While working on the latest iteration of this, I realized a few issues with the plan to make url an non-indexed tag... 😞 However, I think I have an easy to implement suggestion for a better alternative 🤞

Here are the problems I saw:

The biggest problem is the k6 cloud actually expects a url tag and we can't unilaterally remove this from k6 😞 and probably k6 users who have already configured other external outputs
The second biggest problem is that we don't emit a name tag for websocket metrics, just a url one, in either the current k6/ws code or in the new xk6-websockets one 😞 so we can't unilaterally say that the url is now going to be non-indexed metadata for all metrics...
The third biggest problem is a UX one, setting a threshold on http_req_duration{url:whatever} is kind of nice and logical, whereas name is a k6 term that people first need to be taught about...
Finally, while the k6/net/grpc API has a name tag, its url tag is actually pretty low-cardinality, so it's not a problem for atlas 🎉

So, my proposal to fix all of these issues is this:

keep url as an normal (index-able) tag, but always set it equal to the value of the name tag, if name was specified (which is basically what we currently do in the cloud output)
if name was not specified, then keep the current behavior and set both url and name to the cleaned-up URL
so, the url and name tag values will always be identical, but users will be able to avoid memory runoff effects from high-cardinality metrics by manually setting the name tag
however, if the URL value is different from the name tag, then we will save the raw high-cardinality URL value in a new non-indexed raw_url metadata entry, so it's still accessible to users that want it (e.g. in the JSON output)

This will leave us with (mostly) sensible defaults and with an easy way for users to avoid high-cardinality metrics. It will also minimize the disruption to users who already have thresholds on url sub-metrics, only some of them will need to adjust them.

The code for this even looks relatively clean 😅

k6/lib/netext/httpext/transport.go

Lines 84 to 108 in 6ed8668

    
           enabledTags := t.state.Options.SystemTags 
        
           cleanURL := URL{u: unfReq.request.URL, URL: unfReq.request.URL.String()}.Clean() 
        
           // After k6 v0.41.0, the `name` and `url` tags  have the exact same values: 
        
           nameTagValue, nameTagManuallySet := ctm.Tags.Get(metrics.TagName.String()) 
        
           if !nameTagManuallySet { 
        
           	// If the user *didn't* manually set a `name` tag value and didn't use 
        
           	// the http.url`` template literal helper to have k6 automatically set 
        
           	// it (see `lib/netext/httpext.MakeRequest()`), we will use the cleaned 
        
           	// URL value as the value of both `name` and `url` tags. 
        
           	ctm.SetSystemTagOrMetaIfEnabled(enabledTags, metrics.TagName, cleanURL) 
        
           	ctm.SetSystemTagOrMetaIfEnabled(enabledTags, metrics.TagURL, cleanURL) 
        
           } else { 
        
           	// However, if the user set the `name` tag value somehow, we will use 
        
           	// whatever they set as the value of the `url` tags too, to prevent 
        
           	// high-cardinality values in the indexed tags. 
        
           	ctm.SetSystemTagOrMetaIfEnabled(enabledTags, metrics.TagURL, nameTagValue) 
        
           	// However, if the actual clean-URL value would have been different than 
        
           	// the `name` value, we will also attach it as the `raw_url` non-indexed 
        
           	// tag (i.e. add it to the Sample.Metadata). 
        
           	if nameTagValue != cleanURL { 
        
           		ctm.SetSystemTagOrMetaIfEnabled(enabledTags, metrics.TagRawURL, cleanURL) 
        
           	} 
        
           }

And if everyone agrees with this approach, we should probably merge a small PR before k6 v0.40.0 is released that rolls back these parts of #2583:

k6/metrics/engine/engine.go

Lines 93 to 99 in b89a3aa

    
           if _, ok := sm.Tags.Get("url"); ok { 
        
           	me.logger.Warnf("Thresholds like '%s', based on the high-cardinality 'url' metric tag, "+ 
        
           		"are deprecated and will not be supported in future k6 releases. "+ 
        
           		"To prevent breaking changes and reduce bugs, use the 'name' metric tag instead, see"+ 
        
           		"URL grouping (https://k6.io/docs/using-k6/http-requests/#url-grouping) for more information.", name, 
        
           	) 
        
           }

k6/cmd/integration_test.go

Lines 318 to 320 in b89a3aa

    
           assert.True(t, testutils.LogContains(logs, logrus.WarnLevel, 
        
           	"Thresholds like 'http_req_duration{url:https://test.k6.io}', based on the high-cardinality 'url' metric tag, are deprecated", 
        
           ))

(edit: created PR for v0.40.0 to remove the warning: #2655)

mstoykov · 2022-08-18T15:20:05Z

Can you remember the reasons we were particularly against this in the original discussion?

na-- · 2022-08-18T15:21:05Z

No... 😞 I tried to find some discussion on the topic, but I don't think we delved that much into it, we mostly discussed other details.

This was mistakenly added in k6 v0.39.0 (#2583), but it turns out we probably don't need it (see #2584 (comment))

codebien · 2022-08-18T16:43:03Z

I'm not getting when the url as an indexable tag will be useful per-se, all the cases seem covered by the name tag and when it will be different we will move to Metadata.
For the WebSocket case, I guess is it just a feature that we can add for consistency? Regarding the thresholds, I guess in that case the user would set the name so I'm not sure the UX is better because the threshold will use the low-cardinality value so the name.

na-- · 2022-08-18T18:33:09Z

To put it another way - with #2584 (comment), we get all of the benefits of #2584 (comment), but with a much smaller breaking change.

We get a way to have low-cardinality URLs, users just have to set the name tag or use the http.url helper. But we don't break the majority of existing tests that might use a threshold or sub-metric with the url tag. If a script has just a handful of URLs, you don't need to mess with name, it should just work as it is with url, since it will be low-cardinality. And if it isn't, users will have an easy way to fix it with name.

So, only the people with many URLs or with dynamic URLs need to bother with name, everyone else is fine and doesn't need to change anything. And most people with such scripts probably already use name, so they also don't have to do anything. Even more importantly, the k6 cloud continues to work exactly as before, no changes needed there before we can merge #2654.

In summary, it seems to me that it's a much smaller breaking change with basically the same benefits.

I'm not sure the UX is better

Setting a threshold on http_req_duration{url:https://test.k6.io/whatever} seems like a much more understandable thing for a beginner k6 user than setting a threshold on http_req_duration{name:https://test.k6.io/whatever}. URL is a valid concept outside of k6, while a "name" is not. And for the majority of scripts, this will probably be fine, since they don't have thousands of URLs. The rest will need to understand URL grouping and use name, but this is basically already the case if you want to have a threshold on them... The only change is that now the url tag will always be equal to name, instead of just some of the time.

For the WebSocket case, I guess is it just a feature that we can add for consistency?

Sure, I am not against adding it, but we don't have it yet. So, unless we add it right now and release it in v0.40.0, in both websocket APIs, there will be no transitional k6 version that has both name and url websocket tags for people to gracefully migrate. But it's a moot point, since with my suggestion we don't need such a version 😅

This was mistakenly added in k6 v0.39.0 (#2583), but it turns out we probably don't need it (see #2584 (comment))

na-- · 2022-10-06T10:33:43Z

After some lengthy internal discussions, we didn't reach consensus if the JS APIs from #2654 are the best one for the job. I'll try to do a longer public summary later, but suffice it to say that there were some strong and valid internal objections when it came to its UX... 😅

However, the need to support some sort of high-cardinality metrics metadata is still there and needs to be handled, so we decided to do these things for now:

Split apart the url == name tag fixes that reduce the cardinality of the url tag from Support non-indexable high-cardinality metric tags/metadata #2654 and merge them separately. This was already done. First as a PR (atlas: url tag value always equal to the name tag #2703) on top of the atlas branch (Refactor metric samples to use Atlas for tags and have time series #2594). Merging it there allowed us to merge the atlas branch in master without a worry that the k6 built-in system tags will cause high cardinality issues.
Spit apart other mostly refactoring and validation parts from Support non-indexable high-cardinality metric tags/metadata #2654, like ApplyCustomUserTags(), but without any Metadata or JS API changes, and merge them separately in master.
Make another PR that adds support for Metadata in every metrics.Sample Go struct, but without any direct access to it from the JS APIs. So, no custom user-specified metadata, for now. However, this Metadata map[string]string value will be usable by the k6 core code (e.g. for things like iter, vu and maybe raw_url) and by xk6 extensions!
(Everything ⬆️ would hopefully make it in k6 v0.41.0 at the end of the month, everything below would be nice to have, but unlikely to make it in time.)
Iterate over possible JS APIs and try to deliver a more user-friendly and less error-prone API than non-indexable tags, i.e. something better than tags: { foo: {value: 'bar', index: false}}. We have some ideas we'll iterate on internally and share in the issue later, but it's likely we'll start with some k6/experimental stand-alone API, instead of trying to thread it in every JS HTTP/WS/gRPC/etc. call.

na-- added performance refactor breaking change for PRs that need to be mentioned in the breaking changes section of the release notes labels Jul 4, 2022

na-- added this to the v0.40.0 milestone Jul 4, 2022

na-- self-assigned this Jul 4, 2022

na-- mentioned this issue Jul 4, 2022

Add a deprecation warning for thresholds on tags we'll make non-indexable #2583

Merged

na-- mentioned this issue Aug 1, 2022

Refactor metric samples to use Atlas for tags and have time series #2594

Merged

na-- mentioned this issue Aug 18, 2022

Always throw exception deliberately for invalid vu.tags values in k6/execution #2653

Closed

na-- mentioned this issue Aug 18, 2022

Support non-indexable high-cardinality metric tags/metadata #2654

Closed

na-- added a commit that referenced this issue Aug 18, 2022

Remove the warning for thresholds on url sub-metrics

e5b1d64

This was mistakenly added in k6 v0.39.0 (#2583), but it turns out we probably don't need it (see #2584 (comment))

na-- mentioned this issue Aug 18, 2022

Remove the warning for thresholds on url sub-metrics #2655

Merged

na-- modified the milestones: v0.40.0, v0.41.0 Aug 19, 2022

na-- added a commit that referenced this issue Aug 19, 2022

Remove the warning for thresholds on url sub-metrics

e09bb87

This was mistakenly added in k6 v0.39.0 (#2583), but it turns out we probably don't need it (see #2584 (comment))

na-- mentioned this issue Aug 26, 2022

K6 outputting duplicate metrics #2662

Closed

na-- mentioned this issue Sep 27, 2022

Add support for starting traces and saving their metadata as non-indexed tags grafana/k6-jslib-httpx#9

Open

codebien mentioned this issue Sep 30, 2022

atlas: url tag value always equal to the name tag #2703

Merged

codebien mentioned this issue Oct 13, 2022

metrics.Sample.Metadata: Data structure for high cardinality tags #2726

Merged

codebien closed this as completed in #2726 Oct 17, 2022

na-- mentioned this issue Nov 10, 2022

JS API for user-specified high-cardinality metrics metadata #2766

Closed

TomRiddle01 mentioned this issue Mar 1, 2023

Option to use separate name and url tags (Possibly via raw_url) #2950

Open

codebien mentioned this issue Jan 17, 2024

Documentation about name and url out of sync with implementation grafana/k6-docs#1481

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support non-indexable high-cardinality metric tags / metadata #2584

Support non-indexable high-cardinality metric tags / metadata #2584

na-- commented Jul 4, 2022

na-- commented Aug 18, 2022 •

edited

Loading

mstoykov commented Aug 18, 2022

na-- commented Aug 18, 2022

codebien commented Aug 18, 2022

na-- commented Aug 18, 2022 •

edited

Loading

na-- commented Oct 6, 2022

Support non-indexable high-cardinality metric tags / metadata #2584

Support non-indexable high-cardinality metric tags / metadata #2584

Comments

na-- commented Jul 4, 2022

na-- commented Aug 18, 2022 • edited Loading

mstoykov commented Aug 18, 2022

na-- commented Aug 18, 2022

codebien commented Aug 18, 2022

na-- commented Aug 18, 2022 • edited Loading

na-- commented Oct 6, 2022

na-- commented Aug 18, 2022 •

edited

Loading

na-- commented Aug 18, 2022 •

edited

Loading