This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Fix #454 #455

Merged
merged 8 commits into from
Jan 9, 2017

Conversation

Contributor

@replay replay commented Jan 8, 2017

I've gone through the Add/Delete/Search methods of the chunk cache one more time, with the goal of making sure that there can't be a situation where the returned `from` is > `until`, as described in #454 .
Since I don't have enough debug data to be sure what exactly happened in #454, I can't reproduce the problem exactly, and hence I can't be certain this really fixes it.

While going through the code I also found a short list of other optimizations:

  • If the cache search returns a result where from == until, we don't need to hit Cassandra. This situation can occur if the chunks are not span aware and there is a disconnect in the cache-internal linked list despite all queried chunks being present (for example, if insertions happened in a weird order).
  • If the cache is queried with from == until, we can save a few ops by not even looking up whether the specified metric is present.
  • The debugMetrics method can reuse the already present keys slice; we don't need to rebuild it.
  • Saving the allocation of some variables that are not necessary.
  • When adding a chunk to the cache without specifying the ts of the previous chunk (which happens if this is the first chunk of that insertion), the Add() method should try to figure out whether the previous chunk is present, and if it is, connect them in the internal linked list. This only works reliably if either the chunks are span aware or the span has not changed for > 2 * chunkSpan.
  • Adding more tests for chunk insertion scenarios.
  • The chunk cache used to keep track of the oldest and newest present chunk per metric, but these variables are not used at all, so we can drop them.
  • We keep regenerating the sorted list of keys in CCacheMetric; let's rather keep it until it changes.
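Several of the points above hinge on the cache-internal linked list of chunks. As a minimal sketch (field names beyond `chunks` are assumptions for illustration, not the actual metrictank code), a "disconnect" means two chunks are both present in the map but their Prev/Next pointers don't link them:

```go
package main

import "fmt"

// CCacheChunk and CCacheMetric are hypothetical simplifications of the
// structures discussed in this PR; only `chunks` appears in the diff.
type CCacheChunk struct {
	Ts   uint32 // start timestamp of this chunk
	Prev uint32 // ts of the previous chunk, 0 if unknown (a "disconnect")
	Next uint32 // ts of the next chunk, 0 if unknown
}

type CCacheMetric struct {
	// points at cached data chunks, indexed by their start timestamp
	chunks map[uint32]*CCacheChunk
}

func main() {
	m := CCacheMetric{chunks: map[uint32]*CCacheChunk{}}
	// two chunks inserted "in a weird order", without linking:
	m.chunks[600] = &CCacheChunk{Ts: 600}
	m.chunks[1200] = &CCacheChunk{Ts: 1200}
	// both queried chunks are present, yet Prev is 0: a disconnect in the
	// linked list, which is how a search can return from == until
	fmt.Println(m.chunks[1200].Prev == 0)
}
```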

if from == until there is no point in even searching for the specified
metric, saving a few ops
// points at cached data chunks, indexed by their according time stamp
chunks map[uint32]*CCacheChunk

// instead of sorting the keys at each search we can save some ops by
// keeping the list until a chunk gets added or deleted
Contributor

this comment describes the change from previous code to new code, for which you should use the commit message. (which you already do). you should only mention here something like "maintain sorted list of chunk timestamps" or something which is more descriptive of the purpose of the variable.


if ts <= from {
break
}
Contributor

Why this change? is it related to setting prev on cache add? if yes, how so? if not, then what is this? is this a bugfix? what's the bug?

Contributor Author

That's mentioned in the last sentence of the commit message. Moving that up fixed a bug which made searchBackward() seek one chunk too far: ecf02ee

Contributor Author

I can well imagine that this is the reason for the error that @woodsaj saw: if there is a disconnect in the linked list, the search seeks to the disconnect from both sides, but the backward seek would go one chunk too far, leading to the Cassandra query receiving a `to` that's before the `from`.

if !res.Complete {
t.Fatalf("complete is expected to be false")
t.Fatalf("complete is expected to be true")
Contributor

what is going on here? this test has become identical to TestSearchFromBeginningComplete ? we're not testing for incomplete ends I think.

Contributor Author

Good catch. I was only looking at this test and didn't pay attention to the one directly above!

// currently we rely on cassandra returning results in order
go s.Cache.Add(key, prevts, itgen)
prevts = itgen.Ts()
iters = append(iters, *it)
Contributor

this commit doesn't seem to have anything to do with what it claims to be about.
the actual fix somehow snuck into "figure out previous chunk on cache add", and this change is functionally identical to the previous code.

protip: use `git add -p` to stage the right hunks into the right commit, and `git rebase -i` to make adjustments if needed.
in a few minutes you can set up the history in such a way that I can save many more minutes during the review process, and you'll also save minutes for anyone in the future trying to make sense of what happened to this code.

Contributor Author

cleaned up the history

in a scenario where a metric keeps getting queried for the last few
seconds (less than one chunk span) this chunk would get added into the
cache without the previous chunk being known. with this change the
`Add()` method uses the already existing `seekDesc()` method to check if
the previous chunk is in the cache and if so they get connected
accordingly.
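
The linking step this commit message describes can be sketched as follows (a hypothetical simplification: the real code uses the existing `seekDesc()` method, and all names and the fixed chunkSpan here are assumptions):

```go
package main

import "fmt"

// Hypothetical simplification of the chunk linked list.
type CCacheChunk struct {
	Ts   uint32
	Prev uint32
	Next uint32
}

type CCacheMetric struct {
	chunks    map[uint32]*CCacheChunk
	chunkSpan uint32 // only reliable if chunks are span aware / span stable
}

// Add inserts a chunk; when the caller doesn't know the previous chunk
// (prev == 0), it looks for one already in the cache and links the two.
func (m *CCacheMetric) Add(prev, ts uint32) {
	c := &CCacheChunk{Ts: ts, Prev: prev}
	if prev == 0 {
		// look one chunkSpan back for an already-cached previous chunk
		if p, ok := m.chunks[ts-m.chunkSpan]; ok {
			c.Prev = p.Ts
			p.Next = ts
		}
	} else if p, ok := m.chunks[prev]; ok {
		p.Next = ts
	}
	m.chunks[ts] = c
}

func main() {
	m := &CCacheMetric{chunks: map[uint32]*CCacheChunk{}, chunkSpan: 600}
	m.Add(0, 600)  // first chunk of the insertion: no known prev
	m.Add(0, 1200) // prev still not given, but Add() finds chunk 600
	fmt.Println(m.chunks[1200].Prev, m.chunks[600].Next)
}
```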

the `searchBackward()` method gets changed to not search further than
the given `from`.
we're regenerating the sorted list of keys on each cache search. that's
not necessary, we might as well just keep the list until the next
modification (Add/Del) happens
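A minimal sketch of that caching scheme, under assumed names (the real CCacheMetric differs): keep the sorted slice alongside the map and invalidate it on every modification, so searches between modifications pay no sort cost.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical simplification: only the keys-caching idea is shown.
type CCacheMetric struct {
	chunks map[uint32]struct{}
	keys   []uint32 // sorted chunk timestamps; nil means "needs rebuild"
}

func (m *CCacheMetric) Add(ts uint32) {
	m.chunks[ts] = struct{}{}
	m.keys = nil // invalidate the cached list on modification
}

// sortedTs rebuilds the sorted key list only if an Add/Del invalidated it.
func (m *CCacheMetric) sortedTs() []uint32 {
	if m.keys == nil {
		for ts := range m.chunks {
			m.keys = append(m.keys, ts)
		}
		sort.Slice(m.keys, func(i, j int) bool { return m.keys[i] < m.keys[j] })
	}
	return m.keys
}

func main() {
	m := &CCacheMetric{chunks: map[uint32]struct{}{}}
	m.Add(1200)
	m.Add(600)
	fmt.Println(m.sortedTs()) // sorted once here
	fmt.Println(m.sortedTs()) // reuses the cached slice, no re-sort
}
```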