Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Fix 323 #333

Merged
merged 8 commits into from
Nov 2, 2016
Merged

Fix 323 #333

merged 8 commits into from
Nov 2, 2016

Conversation

Dieterbe
Copy link
Contributor

@Dieterbe Dieterbe commented Oct 3, 2016

No description provided.

this results in:

--- FAIL: TestGetTarget (0.00s)
	dataprocessor_test.go:489: case "off-by-5" req "quantization moves points 55s later" from 3011 to 3121 - exp: [{3005 3060} {3065 3120}] - got [{NaN 3060} {3065 3120}]
FAIL
exit status 1
FAIL	github.com/raintank/metrictank	0.029s
when we quantize, sometimes we need an earlier point to move to the
right (into the requested range), but the point we need fell before
the requested range.
So we just fetch `interval-1` more when needed to make sure we have the
right point in case this happens.
We can just extend the use of the aggSpan (now interval) argument of
getSeries to always refer to the interval of the requested archive
(whether aggregated or not)
// so that if we have quantize later (in fix()) we have a valid first point.
// (at this point we can't tell whether the raw series will be quantized or not)
if consolidator == consolidation.None {
fromUnix = fromUnix - interval + 1
Copy link
Member

@woodsaj woodsaj Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesnt seem right to me.

assume my interval is 10, and my request is from 15 to 35. Then i would expect quantized values at 20 and 30.
With this change, you are now requesting between 6 and 35.
If the raw values are offset by between 1 and 6, then things will work, as the adjusted from time will only match points between 11 and 16 inclusive, which all get quantized to 20
but if the offset is 7,8,9 or 0 the request will match points at 7,8,9,10 which get quantized to 10

I think what you need to do is adjust both fromUnix and ToUnix to be on the appropriate quantized boundaries.
eg. the fromUnix=10, toUnix=30 so that any points that match are quantized to either 20 or 30

Raw quantized expected match this commit match
5 10 no no
6 10 no no
7 10 no yes
8 10 no yes
9 10 no yes
10 10 no yes
11 20 yes yes
12 20 yes yes
. . . .
19 20 yes yes
20 20 yes yes
21 30 yes yes
. . . .
29 30 yes yes
30 30 yes yes
31 40 no yes
32 40 no yes

So if there are raw values recorded at 9, 19, 29, 39, our query of 15-35 should return the values at 19 and 29. However as can be seen from the table above, values at 9, 19 and 29 are returned when using the logic in this commit.

if the raw values are at 4, 14, 24, 34 then our query should return values at 14 and 24. However this commit would result in the values 14, 24 and 34 being returned.

Both of these outcomes are clearly wrong as a query for a 20second range can't possibly include 3 values that are 10seconds apart.

@woodsaj woodsaj assigned Dieterbe and unassigned woodsaj Oct 5, 2016
while we now use half the amount of lines, we actually test
all of the possible scenarios for a given interval.
with this approach we make more precise adjustments to from/to
which has two advantages:

1) we now query a slightly narrower timerange, which can be faster
2) historically, fix() has always been needed after getSeries().
   the previous patch relied on this and also requires it in order to:
   A provide the correct amount of points to the user
     (fix removes unneeded points out of the range)
   B provide the user points with clean, quantized timestamps that
     actually lie in the requested range

   With the new patch, we only require calling fix() after getSeries()
   for B, not for A.  So it's still a requirement, and we still have to
   honor it.

Technically, if you only wanted to query for points between a given
range and did not care about B, we don't need to call fix to
"pretty up" the output, and could bypass fix() and be more efficient.
Maybe later we can make it optional.
Instead of having the somewhat awkward requirement that
data returned by getSeries() needs an additional pass through fix(),
just have getSeries take care of it.

An additional benefit is that it simplifies the code.
@Dieterbe
Copy link
Contributor Author

@woodsaj I implemented the improved solution, added more complete unit tests and also made getSeries itself call fix() which removes any confusion about getSeries() output requiring an additional fix() step.

Copy link
Member

@woodsaj woodsaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great.

now that getSeries calls fix() itself, and by calling getTarget with
archive 0 and aggNum 1, it basically just calls getSeries anyway,
which is the thing we wanted to test in the first place,
we can just call getSeries()
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants