This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Fix #454 #455

Merged
merged 8 commits into from
Jan 9, 2017

Conversation

Contributor

@replay replay commented Jan 8, 2017

I've gone through the Add/Delete/Search methods of the chunk cache one more time, with the goal of making sure that there can't be a situation where the returned `from` is > `until`, as described in #454 .
Since I don't have enough debug data to be sure what exactly happened in #454, I can't reproduce the problem exactly, and hence I can't be certain this really fixes it.

While going through the code I also found a short list of other optimizations:

  • If the cache search returns a result where from == until, we don't need to hit Cassandra. This situation can occur if the chunks are not span aware and there is a disconnect in the cache-internal linked list despite all queried chunks being present (for example, if insertions happened in a weird order).
  • If the cache is queried with from == until, we can save a few ops by not even looking up whether the specified metric is present.
  • The debugMetrics method can reuse the already present keys slice; we don't need to rebuild it.
  • Saving the allocation of some variables that are not necessary.
  • When adding a chunk to the cache without specifying the ts of the previous chunk (which happens if this is the first chunk of that insertion), the Add() method should try to figure out whether the previous chunk is present, and if it is, connect them in the internal linked list. This only works reliably if either the chunks are span aware or the span has not changed for > 2 * chunkSpan.
  • Adding more tests for chunk insertion scenarios.
  • The chunk cache used to keep track of the oldest and newest present chunk per metric, but these variables are not used at all, so we can drop them.
  • We keep regenerating the sorted list of keys in CCacheMetric; let's rather keep it until it changes.
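Several of the points above hinge on the cache-internal linked list of chunks. As a minimal sketch (field names beyond `chunks` are assumptions for illustration, not the actual metrictank code), a "disconnect" means two chunks are both present in the map but their Prev/Next pointers don't link them:

```go
package main

import "fmt"

// CCacheChunk and CCacheMetric are hypothetical simplifications of the
// structures discussed in this PR; only `chunks` appears in the diff.
type CCacheChunk struct {
	Ts   uint32 // start timestamp of this chunk
	Prev uint32 // ts of the previous chunk, 0 if unknown (a "disconnect")
	Next uint32 // ts of the next chunk, 0 if unknown
}

type CCacheMetric struct {
	// points at cached data chunks, indexed by their start timestamp
	chunks map[uint32]*CCacheChunk
}

func main() {
	m := CCacheMetric{chunks: map[uint32]*CCacheChunk{}}
	// two chunks inserted "in a weird order", without linking:
	m.chunks[600] = &CCacheChunk{Ts: 600}
	m.chunks[1200] = &CCacheChunk{Ts: 1200}
	// both queried chunks are present, yet Prev is 0: a disconnect in the
	// linked list, which is how a search can return from == until
	fmt.Println(m.chunks[1200].Prev == 0)
}
```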

if from == until there is no point in even searching for the specified
metric, saving a few ops
// points at cached data chunks, indexed by their according time stamp
chunks map[uint32]*CCacheChunk

// instead of sorting the keys at each search we can save some ops by
// keeping the list until a chunk gets added or deleted
Contributor

this comment describes the change from previous code to new code, for which you should use the commit message. (which you already do). you should only mention here something like "maintain sorted list of chunk timestamps" or something which is more descriptive of the purpose of the variable.


if ts <= from {
break
}
Contributor

Why this change? is it related to setting prev on cache add? if yes, how so? if not, then what is this? is this a bugfix? what's the bug?

Contributor Author

That's mentioned in the last sentence of the commit message. Moving that up fixed a bug which made searchBackward() seek one chunk too far: ecf02ee

Contributor Author

I can well imagine that this is the reason for the error that @woodsaj saw: if there is a disconnect in the linked list, the search seeks to the disconnect from both sides, but the backward seek would go one chunk too far, leading to the Cassandra query receiving a `to` that's before the `from`.

if !res.Complete {
t.Fatalf("complete is expected to be false")
t.Fatalf("complete is expected to be true")
Contributor

what is going on here? this test has become identical to TestSearchFromBeginningComplete ? we're not testing for incomplete ends I think.

Contributor Author

Good catch. I was only looking at this test and didn't pay attention to the one directly above!

// currently we rely on cassandra returning results in order
go s.Cache.Add(key, prevts, itgen)
prevts = itgen.Ts()
iters = append(iters, *it)
Contributor

this commit doesn't seem to have anything to do with what it claims to be about.
the actual fix somehow snuck into "figure out previous chunk on cache add", and this change is functionally identical to the previous code.

protip: use `git add -p` to stage the right hunks into the right commit, and `git rebase -i` to make adjustments if needed.
in a few minutes you can set up the history in such a way that I can save many more minutes during the review process, and you'll also save minutes for anyone in the future trying to make sense of what happened to this code.

Contributor Author

cleaned up the history

in a scenario where a metric keeps getting queried for the last few
seconds (less than one chunk span) this chunk would get added into the
cache without the previous chunk being known. with this change the
`Add()` method uses the already existing `seekDesc()` method to check if
the previous chunk is in the cache and if so they get connected
accordingly.
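
The linking step this commit message describes can be sketched as follows (a hypothetical simplification: the real code uses the existing `seekDesc()` method, and all names and the fixed chunkSpan here are assumptions):

```go
package main

import "fmt"

// Hypothetical simplification of the chunk linked list.
type CCacheChunk struct {
	Ts   uint32
	Prev uint32
	Next uint32
}

type CCacheMetric struct {
	chunks    map[uint32]*CCacheChunk
	chunkSpan uint32 // only reliable if chunks are span aware / span stable
}

// Add inserts a chunk; when the caller doesn't know the previous chunk
// (prev == 0), it looks for one already in the cache and links the two.
func (m *CCacheMetric) Add(prev, ts uint32) {
	c := &CCacheChunk{Ts: ts, Prev: prev}
	if prev == 0 {
		// look one chunkSpan back for an already-cached previous chunk
		if p, ok := m.chunks[ts-m.chunkSpan]; ok {
			c.Prev = p.Ts
			p.Next = ts
		}
	} else if p, ok := m.chunks[prev]; ok {
		p.Next = ts
	}
	m.chunks[ts] = c
}

func main() {
	m := &CCacheMetric{chunks: map[uint32]*CCacheChunk{}, chunkSpan: 600}
	m.Add(0, 600)  // first chunk of the insertion: no known prev
	m.Add(0, 1200) // prev still not given, but Add() finds chunk 600
	fmt.Println(m.chunks[1200].Prev, m.chunks[600].Next)
}
```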

the `searchBackward()` method gets changed to not search further than
the given `from`.
we're regenerating the sorted list of keys on each cache search. that's
not necessary, we might as well just keep the list until the next
modification (Add/Del) happens
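A minimal sketch of that caching scheme, under assumed names (the real CCacheMetric differs): keep the sorted slice alongside the map and invalidate it on every modification, so searches between modifications pay no sort cost.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical simplification: only the keys-caching idea is shown.
type CCacheMetric struct {
	chunks map[uint32]struct{}
	keys   []uint32 // sorted chunk timestamps; nil means "needs rebuild"
}

func (m *CCacheMetric) Add(ts uint32) {
	m.chunks[ts] = struct{}{}
	m.keys = nil // invalidate the cached list on modification
}

// sortedTs rebuilds the sorted key list only if an Add/Del invalidated it.
func (m *CCacheMetric) sortedTs() []uint32 {
	if m.keys == nil {
		for ts := range m.chunks {
			m.keys = append(m.keys, ts)
		}
		sort.Slice(m.keys, func(i, j int) bool { return m.keys[i] < m.keys[j] })
	}
	return m.keys
}

func main() {
	m := &CCacheMetric{chunks: map[uint32]struct{}{}}
	m.Add(1200)
	m.Add(600)
	fmt.Println(m.sortedTs()) // sorted once here
	fmt.Println(m.sortedTs()) // reuses the cached slice, no re-sort
}
```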