Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Autocomplete for tag keys and tag values #779

Merged
merged 99 commits into from
Dec 19, 2017
Merged

Autocomplete for tag keys and tag values #779

merged 99 commits into from
Dec 19, 2017

Conversation

replay
Copy link
Contributor

@replay replay commented Dec 7, 2017

This implements the autocomplete endpoints for tags and tag values.

In order to do this more efficiently the whole query execution has been changed from the previous "evaluate everything one-by-one and then consolidate" to something that's more like multi-threaded map/reduce, which can also be aborted in the case of tag autocomplete queries once a sufficient number of results has been collected.

Especially in the case of tag autocomplete this brings some pretty significant speed improvements.
This benchmark queries for the tag prefix di with expressions "metric=~.*_time$", "direction!=~re", "host=~host9[0-9]0" on an index with 1.68 million entries:

# Before abortable multi-threaded map/reduce
BenchmarkTagQueryKeysByPrefixExpressions-8   	      50	 200014598 ns/op	36032354 B/op	  602109 allocs/op
# After
BenchmarkTagQueryKeysByPrefixExpressions-8   	   10000	    923016 ns/op	  262924 B/op	    1960 allocs/op

api/graphite.go Outdated
func (s *Server) graphiteAutoCompleteTags(ctx *middleware.Context, request models.GraphiteAutoCompleteTags) {
tags, err := s.clusterAutoCompleteTags(ctx.Req.Context(), ctx.OrgId, request.TagPrefix, request.Expr, request.From, request.Limit)
if err != nil {
response.Write(ctx, response.WrapError(err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to use the same JSON response format as graphite {"error":"<message>"}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@replay replay force-pushed the autocomplete branch 3 times, most recently from 5d5030b to 29fce0c Compare December 8, 2017 12:34
Tag string `json:"tag"`
Expr []string `json:"expressions"`
From int64 `json:"from"`
Limit uint16 `json:"limit"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe these default to 100 in graphite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in graphite the default is 100 if no limit value is given. If the given limit value is 0 then this will be accepted as a valid value.
In Macaron's request binding I can't see an easy way to make it behave exactly the same way because if no value is given it will be set to 0 and there's no None. But I cannot think of a case where somebody would actually want the limit to be 0 so I'd suggest that we'll just update 0 to a configurable default value (like in graphite).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use binding:"Default(100)"?

Copy link
Contributor Author

@replay replay Dec 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, thx i've missed that somehow. Although with that the problem is that afaict we can't make the default value configurable anymore.
The solution I was looking at was via the Validate() method, but in that method the information whether a field was not set or if it was set to 0 is already lost.
Not sure what the preferred behavior is in that situation, I think making the default configurable is good and we want to keep that.

}
}
res = append(res, tag)
if uint16(len(res)) >= limit {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this can return a single result for a limit of 0 (which is the current default value for limit). I'm not sure that limit of 0 is really valid.

Copy link
Contributor Author

@replay replay Dec 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the above changes a limit of 0 isn't valid anymore

// otherwise, the generation of the result set is much simpler
if len(expressions) > 0 {
if len(valPrefix) > 0 {
expressions = append(expressions, tag+"^="+valPrefix)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "=~"?

Copy link
Contributor

@DanCech DanCech Dec 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a special prefix operator that MT knows about, just used for this stuff right now but I might add it to graphite also

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, gotcha. Well, this doesn't work on my local build, but changing it to "=~" does. I'll figure that out real quick.


// testByPrefix filters a given metric by matching prefixes against the values
// of a specific tag
func (q *TagQuery) testByPrefix(def *idx.Archive, exprs []kv) bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't handle the name special case like testbyMatch does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that should be better: 544a435

}

// testByTagPrefix filters a given metric by matching prefixes against its tags
func (q *TagQuery) testByTagPrefix(def *idx.Archive) bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name special case?

idx/idx.go Outdated
@@ -147,6 +147,9 @@ type MetricIndex interface {
// LastUpdate time is >= the given value.
Tags(int, string, int64) ([]string, error)

AutoCompleteTags(int, string, []string, int64, uint) ([]string, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you document these functions here? (I know you documented the memory implementations, but typically the docs people see when navigating the code is the interface methods)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure 818e0b2

@replay replay force-pushed the autocomplete branch 2 times, most recently from 9d1f11f to 8b78830 Compare December 15, 2017 07:57
@@ -148,6 +148,8 @@ log-min-dur = 5min
time-zone = local
# maximum number of concurrent threads for fetching data on the local node. Each thread handles a single series.
get-targets-concurrency = 20
# what the default limit for tagdb query results should be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"what the... should be" is redundant.
and if this setting is only a default, we should document what may override it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idx/idx.go Outdated
@@ -147,6 +147,17 @@ type MetricIndex interface {
// LastUpdate time is >= the given value.
Tags(int, string, int64) ([]string, error)

// AutoCompleteTags is used to generate a list of possible values that could
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/is used to generate/generates/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idx/idx.go Outdated
// narrow down the result set.
AutoCompleteTags(int, string, []string, int64, uint) ([]string, error)

// AutoCompleteTagValues is used to generate a list of possible values that could
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idx/idx.go Outdated
@@ -147,6 +147,17 @@ type MetricIndex interface {
// LastUpdate time is >= the given value.
Tags(int, string, int64) ([]string, error)

// AutoCompleteTags is used to generate a list of possible values that could
// complete a given prefix. It also accepts additional conditions to further
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"values that could complete a given prefix" this is confusing because AFAICT we're not talking about tag values, but keys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, what kind of conditions? expressions such as those accepted by FindByTag ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated: 9d217cd


type IndexAutoCompleteTagValues struct {
OrgId int `json:"orgId" binding:"Required"`
ValPrefix string `json:"valPrefix"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the type already makes it explicit that we're looking for values, so this field can be simplified to "prefix". same remark for IndexAutoCompleteTags

Copy link
Contributor Author

@replay replay Dec 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -4,6 +4,9 @@ import (
"github.com/grafana/metrictank/idx"
)

//go:generate msgp
type StringList []string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this custom type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we need to be able to json marshal/unmarshal a slice of strings. Is there a better way to do that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean msgp? or json and also msgp? not sure, i thought it should just work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, sorry, yes msgp

// tagPrefix: the string to be completed
// expressions: tagdb expressions in the same format as graphite uses
// from: only tags will be returned that have at least one metric
// with a LastUpdate above from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>= from

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for the other function

// consecutive queries and the limit is applied after sorting
//
func (m *MemoryIdx) AutoCompleteTags(orgId int, tagPrefix string, expressions []string, from int64, limit uint) ([]string, error) {
res := make([]string, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var res []string is equivalent but saves needless allocation. same for the other function below.

if len(expressions) > 0 {
// incorporate the tag prefix into the tag query expressions
if len(tagPrefix) > 0 {
expressions = append(expressions, fmt.Sprintf("__tag^=%s", tagPrefix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be simpler and more performant by just doing "__tag^=" + tagPrefix


tagsSorted := make([]string, 0, len(tags))
for tag := range tags {
if len(tagPrefix) > 0 && (len(tagPrefix) > len(tag) || tag[:len(tagPrefix)] != tagPrefix) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can replace this entire condition with if !strings.HasPrefix(tag, tagPrefix) {
tagPrefix being "" doesn't need anything special, it just works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, that works 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, that works 👍. I've used that in a ton of places now: d75149a

if len(valPrefix) > 0 {
expressions = append(expressions, tag+"^="+valPrefix)
} else {
// if no value prefix has been specified we still require that at
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be "at least" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes :)


res = make([]string, 0, len(vals))
for val := range vals {
if len(valPrefix) > 0 && (len(valPrefix) > len(val) || val[:len(valPrefix)] != valPrefix) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can use strings.HasPrefix here I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep: d75149a

MATCH_TAG // __tag=~ relies on special key __tag
NOT_MATCH // !=~
PREFIX // ^= exact prefix, not regex
PREFIX_TAG // __tag^= exact prefix with tag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are all these exactly the same as in graphite? we should document somewhere which operators we support and what the various tricks are (e.g. tag!= doing a check that the tag is non zero afaik). this is probably not the right place. maybe docs/http-api.md is, or we can also just refer to the graphite docs if we're 100% compatible.
we also currently don't really document the kind of regular patterns (non-tag) a user can use. not sure if we want to maintain all that ourself though, i would rather justpoint to graphite docs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this should be the documentation used, because it's the reference implementation as well: http://graphite.readthedocs.io/en/latest/tags.html

Copy link
Contributor Author

@replay replay Dec 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not all those operations are supported by graphite. some of them were only implemented because we need them internally to satisfy queries like for example by tag prefix (for tag autocomplete). if a user figures out how to use them that's fine, but they are not intended to be used by users and hence not documented in the reference documentation ^^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so in tag_query.go lets link to that URL and also clearly mark the non-standard ones.
looking at docs/http-api.md again, it's not the right place because it just talks about the different user accessible http endpoints, such as /metrics/find and /render; but doesn't go into much more detail regarding graphite processing functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

// getMaxTagCount calculates the maximum number of results a tag query could
// possibly return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is called cardinality

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really? i thought cardinality is how many unique values each tag has. but this function returns the max number of tags

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can have cardinality of data and cardinality of a resultset, both simply convey the number of distinct values

Copy link
Contributor Author

@replay replay Dec 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tagPrefix string // only used for auto complete of tags to match exact prefix

index TagIndex
byId map[string]*idx.Archive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when i see attributes for a type, i assume the attributes are scoped to the lifetime of the type. e.g. the fields are set when an instance is created, or very shortly after.
but index and byId are only initialized when Run() or RunGetTags() are called. this warrants a comment (or unless these fields can be set at creation time, that would be even a bit cleaner imho)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we'd set the index during instantiation of the query we'd already need to acquire the read lock then for this bit: https://github.com/grafana/metrictank/blob/master/idx/memory/memory.go#L591
Currently when the query gets instantiated we verify that it is valid, if it is not valid we return an error without ever acquiring the index lock. That would not be possible then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fine. but let's add a comment than explaining whose responsibility is it to set those fields (seems Run and RunGetTags set it)

resultSet = q.getInitialByEqual(index, q.equal[0])
q.equal = q.equal[1:]
// Run executes the tag query on the given index and returns a list of ids
func (q *TagQuery) Run(index TagIndex, byId map[string]*idx.Archive) TagIDs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the TagIDs type is confusing. it's not a list but a set. and it's not id's of tags but id's of metrics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does that mean i should change it? I guess i could rename it to something like IdSet, but it seems more consistent like this: https://github.com/grafana/metrictank/blob/master/idx/memory/memory.go#L63-L65

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that, or MetricIDs I suppose. I like IdSet

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k, renamed it to IdSet

// defined in the query expressions. then they will extract the tags of
// those that satisfy all conditions and push them into tagCh.
// when a worker completes it pushes an empty item into the completeCh to
// signal its completion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no need for the completeCh stuff. a more common pattern to do this is:

go func() {
q.wg.Wait()
close(tagCh)
}()

then just do a for range over tagCh to consume everything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, will implement that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is much better: 8661c47

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, beautiful :)

// always anchor all regular expressions at the beginning
if (e.operator == MATCH || e.operator == NOT_MATCH) && e.value[0] != byte('^') {
// always anchor all regular expressions at the beginning if they do not start with ^
if (e.operator == MATCH || e.operator == NOT_MATCH || e.operator == MATCH_TAG) && e.value[0] != 94 {
Copy link
Contributor

@Dieterbe Dieterbe Dec 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just use '^' instead of 94.
you can compile the below 2 programs with go build -gcflags=-S and see that the assembly is identical.

~/t/rune ❯❯❯ cat 1/main.go                                                                                                                                                                   ⏎
package main

import "fmt"

func main() {
	a := "^abc"
	fmt.Println(a[0] == 94)
}
~/t/rune ❯❯❯ cat 2/main.go
package main

import "fmt"

func main() {
	a := "^abc"
	fmt.Println(a[0] == '^')
}

the same comment goes for a few other places in the code where we use numbers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, i can also change the switch / case in parseExpression() then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dieterbe
Copy link
Contributor

I need to review further, but another thing i noticed is that getInitial* functions all do a chan send for every value that should be considered, which seems a bit expensive. especially since idCh is an unbuffered channel. buffering it could probably bring a perf gain.

@replay
Copy link
Contributor Author

replay commented Dec 18, 2017

@Dieterbe seems reasonable. Is there a good way to determine what buffer size would make sense? I guess just benchmark and see what performs the best, right?

@Dieterbe
Copy link
Contributor

yep, use your best judgment + benchmarks on realistic queries/cases

@Dieterbe
Copy link
Contributor

i'm just running some final benchmarks and if all looks good, will merge

@Dieterbe Dieterbe merged commit 7b99d38 into master Dec 19, 2017
@Dieterbe Dieterbe deleted the autocomplete branch September 18, 2018 08:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants