Support for `groupByTags` and `aliasByTags` #780

shanson7 · 2017-12-08T20:02:42Z

Alright, this is the last PR that I have staged for tag function support.

Also snuck in is a change to the precision of the returned datapoints, to better match Graphite's.

Add support for groupByTags and aliasByTags

replay · 2017-12-11T06:12:12Z

expr/func_groupbytags.go

+}
+
+func (s *FuncGroupByTags) Context(context Context) Context {
+	// set this? context.consol = consolidation.FromConsolidateBy(s.aggregator)


I think this should be answered before merging the PR

Agreed, but I'm not quite sure how to tell. I think the answer is 'no' after looking through graphite-web code.

replay · 2017-12-11T06:46:55Z

expr/func_groupbytags.go

+		groups[key] = append(groups[key], serie)
+	}
+
+	aggFunc := getCrossSeriesAggFunc(s.aggregator)


is anything verifying that the given aggregator exists? if it doesn't, getCrossSeriesAggFunc() would return nil, which would then get called further down

Nope, I'll add a check before doing any real work to return an error early.
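A minimal sketch of the early check discussed above. The names `aggFunc`, `lookupAgg` and `aggregate` are hypothetical stand-ins, not the actual metrictank code; the only thing taken from the thread is that the lookup returns nil for an unknown aggregator.

```go
package main

import (
	"errors"
	"fmt"
)

// aggFunc is a hypothetical stand-in for a cross-series aggregation function.
type aggFunc func(vals []float64) float64

// lookupAgg returns nil for unknown names, which is the behaviour the review
// comment attributes to getCrossSeriesAggFunc.
func lookupAgg(name string) aggFunc {
	switch name {
	case "sum":
		return func(vals []float64) float64 {
			total := 0.0
			for _, v := range vals {
				total += v
			}
			return total
		}
	}
	return nil
}

// aggregate rejects an unknown aggregator before doing any real work,
// instead of calling a nil function later on.
func aggregate(name string, vals []float64) (float64, error) {
	fn := lookupAgg(name)
	if fn == nil {
		return 0, errors.New("Invalid aggregation func: " + name)
	}
	return fn(vals), nil
}

func main() {
	v, err := aggregate("sum", []float64{1, 2, 3})
	fmt.Println(v, err)
	_, err = aggregate("bogus", nil)
	fmt.Println(err)
}
```

Without the nil check, the second call would panic when the nil function value is invoked.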

replay · 2017-12-14T14:00:32Z

expr/func_groupbytags.go

+	output := make([]models.Series, 0, len(groups))
+
+	// Now, for each key perform the requested aggregation
+	for name, groupSeries := range groups {


i think if len(groupSeries) == 1 the aggregation here could probably be skipped, right?

Probably. Let me look into whether it would complicate the series add code.

replay · 2017-12-14T14:02:36Z

expr/func_groupbytags.go

+
+		tags["name"] = tagSplits[0]
+
+		if len(tagSplits) > 1 {


i think this actually can't happen, because if a series has no tags it cannot get in here. but i guess it's still better to have that check for safety

It could be just grouping by the name tag, which is handled differently.

DanCech · 2017-12-14T14:38:32Z

expr/func_groupbytags.go

+			tagSplits = tagSplits[1:]
+		}
+
+		for _, split := range tagSplits {


this should be `for _, split := range tagSplits[1:] {` and the whole `if` block above should be removed. Right now it'll fail because it'll try to split the name on `=`.

Good catch. I'll add a test and the fix.
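For illustration, a small sketch of the parsing approach suggested in this thread, assuming targets of the form `name;tag1=val1;tag2=val2`. The helper `parseTags` is hypothetical and only shows why the loop should start at `tagSplits[1:]`.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTags splits a target of the form "name;k1=v1;k2=v2" into a tag map.
// The first segment is the metric name and is never split on "=".
func parseTags(target string) map[string]string {
	tagSplits := strings.Split(target, ";")
	tags := make(map[string]string, len(tagSplits))
	tags["name"] = tagSplits[0]

	// Start at index 1 so the name is not treated as a key=value pair.
	// When there are no tags, tagSplits[1:] is empty and the loop is a no-op.
	for _, split := range tagSplits[1:] {
		pair := strings.SplitN(split, "=", 2)
		if len(pair) != 2 {
			continue // this sketch silently skips malformed segments
		}
		tags[pair[0]] = pair[1]
	}
	return tags
}

func main() {
	fmt.Println(parseTags("cpu.load;dc=us;host=a"))
	fmt.Println(parseTags("cpu.load"))
}
```

Because slicing from index 1 already handles the untagged case, no separate `len(tagSplits) > 1` branch is needed.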

replay · 2017-12-14T14:57:44Z

Apart from some minor comments this looks good to me

Dieterbe · 2017-12-14T20:19:21Z

expr/expr.go

@@ -146,6 +156,16 @@ func (e expr) consumeBasicArg(pos int, exp Arg) (int, error) {
 			return 0, ErrBadArgumentStr{"string", string(got.etype)}
 		}
 		*v.val = got.bool
+	case ArgStringsOrInts:
+		// special case! consume all subsequent args (if any) in args that will also yield a string


why do ArgStrings and ArgStringsOrInts look different from the other multiple-arg cases? eg they don't validate the current element. do they allow there to be 0 args?

Graphite allows aliasByNode with no trailing args (sets target to ""). I didn't particularly intend for this case. I'm open to making sure there is at least one valid argument following.

hmm the graphite docs for both aliasByNode and aliasByTags both state "one or more"
http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.aliasByNode
let's check in with @DanCech to see if either graphite's docs should say "zero or more", or graphite implementation should change to require at least one.

yeah we likely just need to add a validation check in graphite. that said, anyone calling alias functions without any specifications deserves what they get.

@DanCech so to be clear, you're saying we should stick with "one or more" and can validate as such, as the fact that zero or more is currently allowed (for both aliasByNode and aliasByTags) is not something we actually want / should rely on / should implement in MT ?

Right now it should follow graphite. If we want to change the graphite behavior we can talk about that (PRs appreciated!)

The MT parser doesn't allow 0 arguments for this parameter. I think this is fine, but it doesn't match graphite. I'd rather not muck with the parser to allow 0 args if we don't think that's the right thing to do.

Ok, so this function won't even get called if there are 0 arguments. ArgStrings and ArgStringsOrInts can both be 0-length if the wrong type is encountered (e.g. groupByTags(...,"sum",1,"tag2") would stop at 1 and not encounter "tag2"). I think we should change the assumption so that all subsequent args must be the right type, and throw an error otherwise. That seems like the correct behavior.

Interestingly, this would go against what ArgInts does today, but nothing seems to use that.
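The stricter behavior proposed here could look roughly like the sketch below. The `arg` type and `consumeStrings` helper are hypothetical simplifications and do not reflect the real `expr` parser types; the point is only that a wrong-typed trailing argument produces an error instead of silently ending the list.

```go
package main

import (
	"errors"
	"fmt"
)

// arg is a hypothetical, simplified parsed argument: either a string or an int.
type arg struct {
	isString bool
	str      string
	num      int
}

// consumeStrings reads every argument from pos to the end and requires each
// one to be a string. It errors on the first non-string instead of stopping
// early, and it also rejects an empty trailing list.
func consumeStrings(args []arg, pos int) ([]string, error) {
	if pos >= len(args) {
		return nil, errors.New("expected at least one string argument")
	}
	out := make([]string, 0, len(args)-pos)
	for i := pos; i < len(args); i++ {
		if !args[i].isString {
			return nil, fmt.Errorf("argument %d: expected string, got int %d", i, args[i].num)
		}
		out = append(out, args[i].str)
	}
	return out, nil
}

func main() {
	// Roughly groupByTags(..., "sum", "tag1", "tag2") after the series argument.
	good := []arg{{true, "sum", 0}, {true, "tag1", 0}, {true, "tag2", 0}}
	fmt.Println(consumeStrings(good, 1))

	// Roughly groupByTags(..., "sum", 1, "tag2"): the int is now an error.
	bad := []arg{{true, "sum", 0}, {false, "", 1}, {true, "tag2", 0}}
	fmt.Println(consumeStrings(bad, 1))
}
```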

Dieterbe · 2017-12-14T20:21:52Z

expr/func_groupbytags.go

+
+	aggFunc := getCrossSeriesAggFunc(s.aggregator)
+	if aggFunc == nil {
+		return nil, errors.New("Invalid aggregation func: " + s.aggregator)


can't we do this via a validator?

Dieterbe · 2017-12-14T20:22:00Z

expr/func_groupbytags.go

+	}
+
+	if len(s.tags) == 0 {
+		return nil, errors.New("No tags specified")


can't we do this via a validator?

For the ArgStrings I run the validators on each element, not the end result. Which would make more sense?

validators on each element is definitely how it's applied for the other multi-types as well, and it makes sense because a validator is defined to take an *expr which means we can use the same Validator type for any of the types in expr/types.go

type Validator func(e *expr) error

perhaps we can add an additional validator attribute that will validate the entire []string slice (once all elements have been populated), but then I don't think we can still use the Validator type since the argument wouldn't be an *expr anymore.

So, the parser actually returns a 400 if there are no tags or the tags weren't strings. So this check is likely redundant, since it is already enforced at the parse level.

Dieterbe · 2017-12-14T20:26:19Z

expr/func_groupbytags.go

+	for _, serie := range series {
+		name := strings.SplitN(serie.Target, ";", 2)[0]
+
+		buffer.Reset()


hooray for buffer reuse!

Dieterbe · 2017-12-14T20:28:13Z

expr/func_groupbytags.go

+		if len(groupSeries) == 1 {
+			out = groupSeries[0].Datapoints
+		} else {
+			out = pointSlicePool.Get().([]schema.Point)


every time you get a []schema.Point out of pointSlicePool, you must also store it in cache, so that it can be restored into the pool at the end (see Plan.Clean in plan.go and also the NOTES file in the expr dir)

Ah, yes. I suppose I will need to refactor the logic a little to avoid adding it to the cache if I don't need the pool.
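A rough sketch of the pooling rule described above, using `sync.Pool` from the standard library. The `tracker` type is a hypothetical stand-in for the plan's cache, and the single-series shortcut shown is only valid for an aggregation like sum, where one input series is its own result. All series in a group are assumed to have the same length.

```go
package main

import (
	"fmt"
	"sync"
)

type point struct {
	Val float64
	Ts  uint32
}

var pointSlicePool = sync.Pool{
	New: func() interface{} { return make([]point, 0, 8) },
}

// tracker is a hypothetical stand-in for the cache: it remembers every slice
// taken from the pool so that clean can return them all at the end.
type tracker struct {
	taken [][]point
}

// sumGroup sums the series in a group point by point.
func sumGroup(group [][]point, t *tracker) []point {
	if len(group) == 1 {
		// Nothing was taken from the pool, so nothing is tracked.
		return group[0]
	}
	out := pointSlicePool.Get().([]point)[:0]
	for i := range group[0] {
		p := point{Ts: group[0][i].Ts}
		for _, s := range group {
			p.Val += s[i].Val
		}
		out = append(out, p)
	}
	// Track the final slice (append may have reallocated it).
	t.taken = append(t.taken, out)
	return out
}

// clean returns every tracked slice to the pool.
func (t *tracker) clean() {
	for _, s := range t.taken {
		pointSlicePool.Put(s[:0])
	}
	t.taken = nil
}

func main() {
	var t tracker
	a := []point{{1, 10}, {2, 20}}
	b := []point{{3, 10}, {4, 20}}
	fmt.Println(sumGroup([][]point{a, b}, &t), len(t.taken))
	fmt.Println(sumGroup([][]point{a}, &t), len(t.taken))
	t.clean()
}
```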

Dieterbe · 2017-12-14T20:30:08Z

expr/func_groupbytags_test.go

+	schema "gopkg.in/raintank/schema.v1"
+)
+
+func getModel(name string, data []schema.Point) models.Series {


can't we rely on Series.SetTags here ?

Yep. This predated that function, good catch.

Done in 448b4d9

Dieterbe · 2017-12-14T20:32:37Z

expr/func_groupbytags_test.go

+		t.Fatalf("case %q: err should not be nil but was", "TestnoTags")
+	}
+}
+


for the erroring cases, wouldn't it make sense to rely on testGroupByTags also? (just extend it to check for the error?)

Dieterbe · 2017-12-14T20:35:25Z

expr/func_groupbytags_test.go

+		}
+
+		testGroupByTags("AllAggregators:"+agg.name, in, out, agg.name, []string{"tag1", "name"}, t)
+	}


Dieterbe · 2017-12-14T20:36:47Z

expr/func_groupbytags_test.go

+	})
+	sort.Slice(out, func(i, j int) bool {
+		return out[i].Target < out[j].Target
+	})


is the order of got predictable (e.g. sorted)? does graphite sort it? if so, could we just pass in the correct out so that neither requires sorting?

In my implementation, it's definitely not sorted (it's just in whatever order the map puts it). I'm not sure if graphite sorts it. I guess it would sort by target if anything? But if graphite does this, I imagine it would do it after functions complete. It wouldn't make sense to sort it in the middle of processing, when an alias function could come in and require re-sorting.

right, makes sense
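To illustrate why the test sorts before comparing: Go does not define the iteration order of a map, so output built by ranging over the groups map can come back in any order. The sketch below uses a hypothetical, pared-down `series` type with only a `Target` field.

```go
package main

import (
	"fmt"
	"sort"
)

// series is a hypothetical, pared-down stand-in for models.Series.
type series struct {
	Target string
}

// groupTargets builds one output series per group. The result order follows
// map iteration order, which Go leaves unspecified.
func groupTargets(groups map[string][]float64) []series {
	out := make([]series, 0, len(groups))
	for name := range groups {
		out = append(out, series{Target: name})
	}
	return out
}

// sortByTarget puts series in a deterministic order for comparison in tests.
func sortByTarget(s []series) {
	sort.Slice(s, func(i, j int) bool { return s[i].Target < s[j].Target })
}

func main() {
	got := groupTargets(map[string][]float64{"b": nil, "a": nil, "c": nil})
	sortByTarget(got)
	fmt.Println(got)
}
```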

Dieterbe · 2017-12-14T20:40:06Z

expr/seriesaggregators.go

+			Val: math.NaN(),
+		}
+
+		if !math.IsNaN(maxes[i].Val) {


is this check needed? seems like the subtraction would do the right thing in all cases. https://play.golang.org/p/hiROasfqyq

Apparently not! Cool.

Dieterbe · 2017-12-14T20:44:15Z

expr/seriesaggregator_test.go

 	for j, p := range got {
 		bothNaN := math.IsNaN(p.Val) && math.IsNaN(out[j].Val)
-		if (bothNaN || p.Val == out[j].Val) && p.Ts == out[j].Ts {
+		if (bothNaN || p.Val == out[j].Val || math.Abs(p.Val-out[j].Val) < EPSILON) && p.Ts == out[j].Ts {


interesting. have you seen falsely triggering test failures due to rounding errors?

Ah, I was, but the issue was in my math for the test data. I never went back and removed this.

my solution to that problem has been to pick different test values that don't trigger this problem :p rounding is to be expected so I think it's ok to use such data that we can ignore it.

Dieterbe

wow sean. what a piece of work you're delivering here. overall looks very good, i just have some minor comments here and there. PS: seems like in your last few pr's you've had to dive deeper into the expr parsing stuff, function arg validation etc, how does it look? (it's too complicated for my taste but i'm not sure how to refactor it)

shanson7 · 2017-12-18T16:31:15Z

Parsing code tends to be complicated, especially when there are a variety of styles to support (like kwargs, variadic arguments, etc). I think that this implementation isn't that bad, and makes it pretty easy to add new functions.

shanson7 · 2017-12-27T16:59:20Z

@Dieterbe - Are there any open questions other than the ArgStrings[OrInts] thing?

Dieterbe · 2017-12-28T10:59:58Z

expr/func_groupbytags_test.go

+		if expectedErr == nil {
+			t.Fatalf("case %q: expected no error but got %q", name, err)
+		} else if err == nil || err.Error() != expectedErr.Error() {
+			t.Fatalf("case %q: err %q but expected %q", name, err, expectedErr)


for consistency, expected goes first, then got.

Dieterbe · 2017-12-28T11:01:07Z

expr/func_groupbytags_test.go

-	}
-}
-
-func TestInvalidAggregator(t *testing.T) {


why is this one removed?

The check was moved into the parse validator per your suggestion. There is no longer any checking in Exec for agg validity.

I could add it back for robustness/sanity.

we could add it back, and adjust testGroupByTags to test more of the aggregator lifecycle rather than just exec, but that probably becomes too much hassle.
let's not worry about it for now, maybe later we can come up with an elegant way to test the validators (ideally in a more general way), but maybe we don't have to, cause it's also fairly obvious stuff.

Dieterbe · 2017-12-28T11:04:11Z

expr/func_groupbytags.go

+		}
+
+		if len(groupSeries) == 1 {
+			newSeries.Datapoints = groupSeries[0].Datapoints


I think this isn't always correct. eg when aggFunc is stdev. the functions for which it is true can optimize this case themselves, no need to have this branching in each caller.

Good catch.

Dieterbe · 2017-12-28T11:06:36Z

expr/func_groupbytags.go

+	}
+
+	if len(series) <= 1 {
+		return series, nil


I think this isn't always correct. eg when aggFunc is stdev.
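To make the objection concrete: the standard deviation across a single series is 0 at every point, not the series itself, so returning the input unchanged gives the wrong answer. The sketch below assumes a population standard deviation, equal-length series, and no NaN handling; `stddevAcross` is a hypothetical helper, not the aggregator from this PR.

```go
package main

import (
	"fmt"
	"math"
)

// stddevAcross computes the population standard deviation across all series
// at each index. It assumes equal-length series and ignores NaN handling.
func stddevAcross(in [][]float64) []float64 {
	if len(in) == 0 {
		return nil
	}
	out := make([]float64, len(in[0]))
	for i := range out {
		mean := 0.0
		for _, s := range in {
			mean += s[i]
		}
		mean /= float64(len(in))

		variance := 0.0
		for _, s := range in {
			d := s[i] - mean
			variance += d * d
		}
		out[i] = math.Sqrt(variance / float64(len(in)))
	}
	return out
}

func main() {
	// A single series: every point has zero deviation from itself.
	fmt.Println(stddevAcross([][]float64{{1, 2, 3}}))
	// Two series for comparison.
	fmt.Println(stddevAcross([][]float64{{1, 3}, {3, 5}}))
}
```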

shanson7 · 2017-12-28T15:09:22Z

@Dieterbe - with the new SeriesAggregators I could add to funcs support for:
diffSeries
medianSeries
multiplySeries
stddevSeries
rangeOfSeries

Would you like that as part of this PR?

Dieterbe · 2017-12-28T15:54:22Z

Would you like that as part of this PR?

seems like a new one would be better

Dieterbe · 2017-12-28T20:28:59Z

expr/func_groupbytags_test.go

@@ -211,7 +211,7 @@ func testGroupByTags(name string, in []models.Series, out []models.Series, agg s
 		if expectedErr == nil {
 			t.Fatalf("case %q: expected no error but got %q", name, err)
 		} else if err == nil || err.Error() != expectedErr.Error() {
-			t.Fatalf("case %q: err %q but expected %q", name, err, expectedErr)
+			t.Fatalf("case %q: expected error %q but got %q", name, err, expectedErr)


don't forget to switch the vars

shanson7 and others added 8 commits December 5, 2017 11:23

Add groupByTags

4900828

Add another test for groupByTags

18c562b

Add aliasByTag as alias to aliasByNode

165fec4

Add more seriesAggregators to match graphite support

141b42c

Make groupByTags stable

6b6b789

Make datapoints values as precise as they need to be

e18bb2d

Merge pull request #6 from bloomberg/groupByTags

9bc33fc

Add support for groupByTags and aliasByTags

Fix test case

9f8b75d

replay reviewed Dec 11, 2017

shanson7 added 2 commits December 11, 2017 12:31

Add check for invalid aggregator, tests for error cases

c1e54cf

Add more aggregation types to the 'TestAllAggregators' test

81372f0

replay reviewed Dec 14, 2017

DanCech reviewed Dec 14, 2017

shanson7 added 2 commits December 14, 2017 10:04

Fix parse logic for grouping by just name tag

e901381

Small optimization for if a single series is grouped

e28faa3

Dieterbe reviewed Dec 14, 2017

Dieterbe self-requested a review December 14, 2017 20:45

Dieterbe suggested changes Dec 14, 2017

shanson7 added 4 commits December 18, 2017 11:31

Remove unnecessary epsilon check

37a9016

Remove unnecessary isNaN check

c871253

Support error tests in testGroupByTags

448b4d9

Move aggfunc check to validator

bed5580

Dieterbe reviewed Dec 28, 2017

Fix single series bug

a86cc10

shanson7 added 4 commits December 28, 2017 13:03

Fix case where multi-value expr types were returning one past where they should

98075ab

Add some tests to plan_test

e3e9354

Use SetTags in groupByTags

c20a355

Put expected first in test error message

ea88818

Dieterbe reviewed Dec 28, 2017

Fix order of errors in test output

7b80086

Dieterbe merged commit 545fe5f into grafana:master Dec 28, 2017

Aergonus deleted the feature_groupByTags branch January 22, 2018 14:46
