Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Support for groupByTags and aliasByTags #780

Merged
merged 22 commits into from
Dec 28, 2017

Conversation

shanson7
Copy link
Collaborator

@shanson7 shanson7 commented Dec 8, 2017

Alright, this is the last PR that I have staged for tag function support.

Also snuck in was a change to the precision of the returned datapoints to better match graphite's.

}

func (s *FuncGroupByTags) Context(context Context) Context {
// set this? context.consol = consolidation.FromConsolidateBy(s.aggregator)
Copy link
Contributor

@replay replay Dec 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be answered before merging the PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but I'm not quite sure how to tell. I think the answer is 'no' after looking through graphite-web code.

groups[key] = append(groups[key], serie)
}

aggFunc := getCrossSeriesAggFunc(s.aggregator)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is anything verifying that this given aggregator exists? if it doesn't getCrossSeriesAggFunc() would return nil which would then get called further down

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I'll add a check before doing any real work to return and error early.

output := make([]models.Series, 0, len(groups))

// Now, for each key perform the requested aggregation
for name, groupSeries := range groups {
Copy link
Contributor

@replay replay Dec 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think if len(groupSeries) == 1 the aggregation here could probably be skipped, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. Let me look into if it would complicate the series add code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


tags["name"] = tagSplits[0]

if len(tagSplits) > 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this actually can't happen, because if a series has no tags it cannot get in-here. but i guess it's still better to have that check for safety

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be just grouping by the name tag, which is handled differently.

tagSplits = tagSplits[1:]
}

for _, split := range tagSplits {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be for _, split := range tagSplits[1:] { and the whole if block above should be removed. Right now it'll fail because it'll try to split the name on =.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I'll add a test and the fix.

@replay
Copy link
Contributor

replay commented Dec 14, 2017

Apart from some minor comments this looks good to me

expr/expr.go Outdated
@@ -146,6 +156,16 @@ func (e expr) consumeBasicArg(pos int, exp Arg) (int, error) {
return 0, ErrBadArgumentStr{"string", string(got.etype)}
}
*v.val = got.bool
case ArgStringsOrInts:
// special case! consume all subsequent args (if any) in args that will also yield a string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do ArgStrings and ArgStringsOrInts look different from the other multiple-arg cases? eg they don't validate the current element. do they allow there to be 0 args?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graphite allows aliasByNodes with no trailing args (sets target to ""). I didn't particular intend for this case. I'm open to making sure there is at least one valid argument following.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm the graphite docs for both aliasByNode and aliasByTags both state "one or more"
http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.aliasByNode
let's check in with @DanCech to see if either graphite's docs should say "zero or more", or graphite implementation should change to require at least one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we likely just need to add a validation check in graphite. that said, anyone calling alias functions without any specifications deserves what they get.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DanCech so to be clear, you're saying we should stick with "one or more" and can validate as such, as the fact that zero or more is currently allowed (for both aliasByNode and aliasByTags) is not something we actually want / should rely on / should implement in MT ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it should follow graphite. If we want to change the graphite behavior we can talk about that (PRs appreciated!)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MT parser doesn't allow 0 arguments for this parameter. I think this is fine, but it doesn't match graphite. I'd rather not muck with the parser to allow 0 args if we don't think that's the right thing to do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so this function won't even get called if there are 0 arguments. ArgStrings and ArgStringsOrInts can both be 0-length if the wrong type is encountered (e.g. groupByTags(...,"sum",1,"tag2") would stop at 1 and not encounter "tag2". I think maybe changing the assumption to be that all subsequent args must be the right type or throw the error. That seems like the correct behavior.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, this would go against what ArgInts does today, but nothing seems to use that.


aggFunc := getCrossSeriesAggFunc(s.aggregator)
if aggFunc == nil {
return nil, errors.New("Invalid aggregation func: " + s.aggregator)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we do this via a validator?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

}

if len(s.tags) == 0 {
return nil, errors.New("No tags specified")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we do this via a validator?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the ArgStrings I run the validators on each element, not the end result. Which would make more sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validators on each element is definitely how it's applied for the other multi-types as well, and it makes sense because a validator is defined to take an *expr which means we can use the same Validator type for any of the types in expr/types.go

type Validator func(e *expr) error

perhaps we can add an additional validator attribute that will validate the entire []strings slice (once all elements have been populated), but then I don't think we can still use the Validator type since the argument wouldn't be an *expr anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the parser actually returns a 400 if there are no tags/the tags weren't strings. So, it's likely that this check is redundant from the parse level.

for _, serie := range series {
name := strings.SplitN(serie.Target, ";", 2)[0]

buffer.Reset()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hooray for buffer reuse!

if len(groupSeries) == 1 {
out = groupSeries[0].Datapoints
} else {
out = pointSlicePool.Get().([]schema.Point)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every time you get a []schema.Point out of pointSlicePool, you must also store it in cache, so that it can be restored into the pool at the end (see Plan.Clean in plan.go and also the NOTES file in the expr dir)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. I suppose I will need to refactor the logic a little to avoid adding it to the cache if I don't need the pool.

schema "gopkg.in/raintank/schema.v1"
)

func getModel(name string, data []schema.Point) models.Series {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we rely on Series.SetTags here ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. This predated that function, good catch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 448b4d9

t.Fatalf("case %q: err should not be nil but was", "TestnoTags")
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the erroring cases, wouldn't it make sense to rely on testGroupByTags also? (just extend it to check for the error?)

}

testGroupByTags("AllAggregators:"+agg.name, in, out, agg.name, []string{"tag1", "name"}, t)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow. 🥇

})
sort.Slice(out, func(i, j int) bool {
return out[i].Target < out[j].Target
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the order of got predictable ? (e.g sorted) ? does graphite sort it? if so, could we just pass in the correct out so that neither requires sorting?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my implementation, it's definitely not sorted (it's just in whatever order the map puts it). I'm not sure if graphite sorts it. I guess it would sort by target if anything? But if graphite does this, I imagine it would do it after functions complete. It wouldn't make sense to sort it in the middle of processing, when an alias function could come in and require re-sorting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, makes sense

Val: math.NaN(),
}

if !math.IsNaN(maxes[i].Val) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this check needed? seems like the subtraction would do the right thing in all cases. https://play.golang.org/p/hiROasfqyq

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently not! Cool.

for j, p := range got {
bothNaN := math.IsNaN(p.Val) && math.IsNaN(out[j].Val)
if (bothNaN || p.Val == out[j].Val) && p.Ts == out[j].Ts {
if (bothNaN || p.Val == out[j].Val || math.Abs(p.Val-out[j].Val) < EPSILON) && p.Ts == out[j].Ts {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting. have you seen falsely triggering test failures due to rounding errors?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was, but the issue was in my math for the test data. I never went back and removed this.

Copy link
Contributor

@Dieterbe Dieterbe Dec 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my solution to that problem has been to pick different test values that don't trigger this problem :p rounding is to be expected so I think it's ok to use such data that we can ignore it.

@Dieterbe Dieterbe self-requested a review December 14, 2017 20:45
Copy link
Contributor

@Dieterbe Dieterbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow sean. what a piece of work you're delivering here. overall looks very good, i just have some minor comments here and there. PS: seems like in your last few pr's you've had to dive deeper into the expr parsing stuff, function arg validation etc, how does it look? (it's too complicated for my taste but i'm not sure how to refactor it)

@shanson7
Copy link
Collaborator Author

Parsing code tends to be complicated, especially when there are a variety of style to support (like kwargs, variadic arguments, etc). I think that this implementation isn't that bad, and makes it pretty easy to add new functions.

@shanson7
Copy link
Collaborator Author

@Dieterbe - Are there any open questions other than the ArgStrings[OrInts] thing?

if expectedErr == nil {
t.Fatalf("case %q: expected no error but got %q", name, err)
} else if err == nil || err.Error() != expectedErr.Error() {
t.Fatalf("case %q: err %q but expected %q", name, err, expectedErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for consistency, expected goes first, then got.

}
}

func TestInvalidAggregator(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this one removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check was moved into the parse validator per your suggestion. There is no longer any checking in Exec for agg validity.

I could add it back for robustness/sanity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could add it back, and adjust testGroupByTags to test more of the aggregator lifecycle rather than just exec, but that probably becomes too much hassle.
let's not worry about it for now, maybe later we can come up with an elegant way to test the validors (ideally in a more general way), but maybe we don't have to, cause it's also fairly obvious stuff.

}

if len(groupSeries) == 1 {
newSeries.Datapoints = groupSeries[0].Datapoints
Copy link
Contributor

@Dieterbe Dieterbe Dec 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't always correct. eg when aggFunc is stdev. the functions for which is true can optimize this case, no need to have this branching in each caller.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

}

if len(series) <= 1 {
return series, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't always correct. eg when aggFunc is stdev.

@shanson7
Copy link
Collaborator Author

@Dieterbe - with the new SeriesAggregators I could add to funcs support for:
diffSeries
medianSeries
multiplySeries
stddevSeries
rangeOfSeries

Would you like that as part of this PR?

@Dieterbe
Copy link
Contributor

Would you like that as part of this PR?

seems like a new one would be better

@@ -211,7 +211,7 @@ func testGroupByTags(name string, in []models.Series, out []models.Series, agg s
if expectedErr == nil {
t.Fatalf("case %q: expected no error but got %q", name, err)
} else if err == nil || err.Error() != expectedErr.Error() {
t.Fatalf("case %q: err %q but expected %q", name, err, expectedErr)
t.Fatalf("case %q: expected error %q but got %q", name, err, expectedErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't forget to switch the vars

@Dieterbe Dieterbe merged commit 545fe5f into grafana:master Dec 28, 2017
@Aergonus Aergonus deleted the feature_groupByTags branch January 22, 2018 14:46
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants