Chunk cache #417

replay · 2016-12-07T14:57:59Z

Replaces PR #416

Dieterbe · 2016-12-07T17:02:46Z

iter/itergen.go

+func (ig *IterGen) Get() (*Iter, error) {
+	it, err := tsz.NewIterator(ig.b)
+	if err != nil {
+		log.Error(3, "failed to unpack cassandra payload. %s", err)


logging should be left up to the caller. inside low-level structures such as IterGen we can't assume any particular logging approach

Agree, I'll just return the error I got from tsz.NewIterator.
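
A minimal sketch of the change agreed on here: Get hands the decoder error back and the caller decides whether and how to log it. `newIterator` is a hypothetical stand-in for tsz.NewIterator so that the snippet compiles on its own, and the `Iter` type is reduced to a placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// Iter is a placeholder; the real type wraps a go-tsz iterator.
type Iter struct{ b []byte }

// newIterator is a hypothetical stand-in for tsz.NewIterator.
func newIterator(b []byte) (*Iter, error) {
	if len(b) == 0 {
		return nil, errors.New("empty payload")
	}
	return &Iter{b: b}, nil
}

type IterGen struct {
	b  []byte
	ts uint32
}

// Get returns the decode error unchanged and does no logging itself.
func (ig *IterGen) Get() (*Iter, error) {
	it, err := newIterator(ig.b)
	if err != nil {
		return nil, err
	}
	return it, nil
}

func main() {
	ig := &IterGen{}
	if _, err := ig.Get(); err != nil {
		// The caller picks the logging approach.
		fmt.Println("failed to unpack payload:", err)
	}
}
```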

Dieterbe · 2016-12-07T17:04:33Z

iter/itergen.go

+type IterGen struct {
+	b   []byte
+	ts  uint32
+	len uint64


instead of having a len attribute and expecting callers of NewGen to provide it, can't we simply have the Length() method return len(ig.b) ?

totally, i'm annoyed I didn't see that.
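
A short sketch of that suggestion, assuming NewGen keeps its two remaining arguments and that Length keeps the uint64 type of the removed field (both are assumptions, not taken from the final code):

```go
package main

import "fmt"

type IterGen struct {
	b  []byte
	ts uint32
}

// NewGen no longer needs a length argument.
func NewGen(b []byte, ts uint32) IterGen {
	return IterGen{b: b, ts: ts}
}

// Length derives the size from the payload, so it cannot drift out of sync.
func (ig IterGen) Length() uint64 {
	return uint64(len(ig.b))
}

func main() {
	ig := NewGen([]byte{0x01, 0x02, 0x03}, 1481130000)
	fmt.Println(ig.Length())
}
```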

Dieterbe · 2016-12-07T17:12:05Z

iter/iter.go

@@ -7,12 +7,10 @@ import (
 // Iter is a simple wrapper around a tsz.Iter to make debug logging easier


this comment is no longer true. the purpose is now simply to abstract away go-tsz

Dieterbe · 2016-12-07T17:24:25Z

mdata/cache/chunk_cache.go

+	itergen iter.IterGen
+}
+
+type MetricCache struct {


this name is a bit confusing. we need a more specific name, perhaps ByMetric (can't come up with anything better for now). or maybe we call this one ChunkCache and what's now ChunkCache could be GlobalChunkCache, or something

I agree, I'm just not sure what to rename it to. I think the name ChunkCache for the outermost struct still makes sense, because it is theoretically possible to have more than one of them (though I wouldn't know why).
How about I rename MetricCache to ChunkCacheMetric? I know you don't like long names, but I think it's logical because it makes clear that this is a more narrowly scoped piece that belongs to ChunkCache. To make the names shorter I could also just rename ChunkCache -> CCache & ChunkCacheMetric -> CCacheMetric.

Or, like we have Aggmetric and Aggmetrics, we could do ChunkCache and ChunkCaches

So how about this:

- CCache for the whole cache struct
- CCacheMetric for each of the metrics inside the CCache
- CCacheChunk for each of the chunks inside a CCacheMetric

That seems quite consistent, not too long, easy to type, easy to understand. Do you think so?

Dieterbe · 2016-12-07T17:27:06Z

mdata/cache/chunk_cache.go

+}
+
+// this assumes we have a lock
+func (cc CacheChunk) SetNext(next uint32) {


lowercase the method, e.g. setNext, this way it can't be called by code in other packages
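
To make the naming proposal and the unexported setter concrete, here is a rough sketch of how the three types could nest. All field names except `metricCache` are hypothetical, and the pointer receiver on setNext is an assumption made so that the assignment is visible to the caller.

```go
package main

import (
	"fmt"
	"sync"
)

// CCacheChunk is one cached chunk of a metric.
type CCacheChunk struct {
	ts   uint32 // start timestamp of this chunk
	next uint32 // start timestamp of the following chunk, 0 if unknown
}

// setNext is unexported, so code in other packages cannot call it.
// The caller is expected to hold the lock of the owning CCacheMetric.
func (cc *CCacheChunk) setNext(next uint32) {
	cc.next = next
}

// CCacheMetric holds the cached chunks of a single metric.
type CCacheMetric struct {
	sync.RWMutex
	chunks map[uint32]*CCacheChunk
}

// CCache is the whole cache, with one CCacheMetric per metric key.
type CCache struct {
	sync.RWMutex
	metricCache map[string]*CCacheMetric
}

func main() {
	cc := &CCacheChunk{ts: 600}
	m := &CCacheMetric{chunks: map[uint32]*CCacheChunk{cc.ts: cc}}
	c := &CCache{metricCache: map[string]*CCacheMetric{"some.metric": m}}

	m.Lock()
	cc.setNext(1200)
	m.Unlock()

	fmt.Println(c.metricCache["some.metric"].chunks[600].next)
}
```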

replay · 2016-12-08T03:53:20Z

I don't want to force push because I don't like changing the past, so I'll just change everything we discussed in the comments and push the changes as a new commit. Once we both agree that this is ready to merge I'll squash them into one.

Dieterbe · 2016-12-09T10:44:09Z

mdata/store_cassandra.go

-				return iters, err
-			}
-			iters = append(iters, iter.New(it, true))
+			itergens = append(itergens, iter.NewGen(b[1:], uint32(ts)))


I think we should pass the chunks as-is to the itergens, i.e. don't remove the first byte, which holds the chunk format. and I think we should leave it up to Iter to inspect the format and call into go-tsz when the format is chunk.FormatStandardGoTsz

ok sure. that would probably mean that it makes more sense to move the error definition for errUnknownChunkFormat into iter/itergen.go as well.

why not iter.go?

Well, they are both in the same namespace. But the reason why I would put it into itergen.go is that that's where the check happens whether the format is ok or not, right?
The check should be in itergen.go, in the initializer NewGen, because that's also where the iterator gets instantiated, which then gets passed to iter.Iter.

i personally would put all tsz and chunk format things in iter.go, but if you feel differently about it, then do it your way; it doesn't matter much
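
Wherever the check ends up living, the idea can be sketched as follows: keep the leading format byte in the stored bytes and only strip it at decode time. The `Format` type, the numeric value of `FormatStandardGoTsz`, the `payload` helper and the empty-chunk error are assumptions for illustration; only the names `FormatStandardGoTsz` and `errUnknownChunkFormat` come from the discussion.

```go
package main

import (
	"errors"
	"fmt"
)

// Format identifies the encoding stored in the first byte of a chunk.
type Format uint8

// FormatStandardGoTsz reuses the name from the discussion; the value is assumed.
const FormatStandardGoTsz Format = 0

var errUnknownChunkFormat = errors.New("unrecognized chunk format")

// payload inspects the format byte and returns the bytes that would be
// handed to the go-tsz decoder. The input slice keeps its format byte.
func payload(b []byte) ([]byte, error) {
	if len(b) == 0 {
		return nil, errors.New("empty chunk")
	}
	switch Format(b[0]) {
	case FormatStandardGoTsz:
		return b[1:], nil
	default:
		return nil, errUnknownChunkFormat
	}
}

func main() {
	for _, b := range [][]byte{{0, 0xAA, 0xBB}, {7, 0xAA}} {
		p, err := payload(b)
		fmt.Println(len(p), err)
	}
}
```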

Dieterbe · 2016-12-09T11:01:41Z

mdata/cache/chunk_cache.go

+		c.Unlock()
+	}
+
+	c.RLock()


c is already rlocked here.

Dieterbe · 2016-12-09T11:04:46Z

mdata/cache/chunk_cache.go

+	if _, ok := c.metricCache[metric]; !ok {
+		c.RUnlock()
+
+		c.Lock()


isn't this a race condition? in between the RUnlock and the Lock another thread could have added the CCacheMetric and perhaps even added chunks to it.
Probably better to just do the ok check and the adding to the map in 1 critical section

You're totally right. I should make these locks less granular
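
A sketch of the single-critical-section version, assuming the cache embeds a sync.RWMutex and keeps a `metricCache` map as in the diff. `getOrCreate` and the contents of `CCacheMetric` are hypothetical. Because the lookup and the insert happen under the same write lock, no other goroutine can add the entry in between.

```go
package main

import (
	"fmt"
	"sync"
)

type CCacheMetric struct {
	chunks map[uint32][]byte
}

type CCache struct {
	sync.RWMutex
	metricCache map[string]*CCacheMetric
}

func NewCCache() *CCache {
	return &CCache{metricCache: make(map[string]*CCacheMetric)}
}

// getOrCreate does the existence check and the insert under one write lock.
func (c *CCache) getOrCreate(metric string) *CCacheMetric {
	c.Lock()
	defer c.Unlock()
	m, ok := c.metricCache[metric]
	if !ok {
		m = &CCacheMetric{chunks: make(map[uint32][]byte)}
		c.metricCache[metric] = m
	}
	return m
}

func main() {
	c := NewCCache()
	results := make([]*CCacheMetric, 8)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = c.getOrCreate("some.metric")
		}(i)
	}
	wg.Wait()

	same := true
	for _, r := range results {
		if r != results[0] {
			same = false
		}
	}
	fmt.Println("all goroutines share one entry:", same)
}
```

The cost is that every lookup takes the write lock. If that turns out to matter, a common alternative is to check under RLock first and then re-check after acquiring the write lock.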

Dieterbe · 2016-12-19T08:59:34Z

metrictank.go

@@ -384,6 +385,8 @@ func main() {
 	/***********************************
 		Initialize our MetricIdx
 	***********************************/
+	cache := cache.NewCCache()
+	metrics = mdata.NewAggMetrics(store, chunkSpan, numChunks, chunkMaxStale, metricMaxStale, ttl, gcInterval, finalSettings)


this doesn't look right. why is the diff adding a NewAggMetrics() line but not removing one?

right, i think i messed up a rebase there. fixing

Dieterbe · 2016-12-19T09:11:36Z

mdata/store_cassandra.go

 			if err != nil {
-				log.Error(3, "failed to unpack cassandra payload. %s", err)


why do we not log the error we get back anymore?

Dieterbe · 2017-01-04T22:07:17Z

in 4cb3845 you re-introduced a bug that you had already fixed: we now add to the cache again before checking that a chunk is not corrupt
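
The intended ordering can be sketched like this: validate first, add to the cache only on success. The `validate` method, the `cache` type and `addIfValid` are hypothetical stand-ins, not the code from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// IterGen is a placeholder for the type returned by the store.
type IterGen struct {
	b  []byte
	ts uint32
}

// validate is a hypothetical corruption check; the real check would
// try to build an iterator from the payload.
func (ig IterGen) validate() error {
	if len(ig.b) == 0 {
		return errors.New("corrupt chunk: empty payload")
	}
	return nil
}

// cache is a minimal stand-in for the chunk cache.
type cache struct {
	chunks map[uint32]IterGen
}

// addIfValid checks the itergen first and caches it only when the check passes.
func (c *cache) addIfValid(ig IterGen) error {
	if err := ig.validate(); err != nil {
		return err
	}
	c.chunks[ig.ts] = ig
	return nil
}

func main() {
	c := &cache{chunks: make(map[uint32]IterGen)}
	fmt.Println(c.addIfValid(IterGen{ts: 600}))
	fmt.Println(c.addIfValid(IterGen{b: []byte{1}, ts: 1200}))
	fmt.Println(len(c.chunks))
}
```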

this makes it a bit more clear that "endTs" is really the ts that is the beginning of the next chunk. If we used continuous math (and/or if everyone thought of our timestamps as continuous) then we could have used the same name. However, since our timestamps are discrete and at second resolution, it's more clear this way (the end of a chunk is one second prior to the t0 of the next).
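
A small worked example of that distinction, assuming a chunk span in seconds. The helper names `nextT0` and `lastTs` are hypothetical.

```go
package main

import "fmt"

// nextT0 returns the start timestamp of the following chunk.
func nextT0(t0, span uint32) uint32 { return t0 + span }

// lastTs returns the last second that still belongs to the chunk starting at t0.
func lastTs(t0, span uint32) uint32 { return nextT0(t0, span) - 1 }

func main() {
	var t0, span uint32 = 1200, 600
	fmt.Println(nextT0(t0, span), lastTs(t0, span)) // prints "1800 1799"
}
```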

replay changed the title ~~Chunk cache~~ [WIP] Chunk cache Dec 7, 2016

Dieterbe reviewed Dec 7, 2016

replay force-pushed the chunk-cache branch 3 times, most recently from 2a9fc82 to 02b8447 Compare December 8, 2016 05:05

Dieterbe reviewed Dec 9, 2016

Dieterbe reviewed Dec 9, 2016

replay force-pushed the chunk-cache branch 7 times, most recently from 415e538 to 67fc9dc Compare December 14, 2016 09:32

Dieterbe self-assigned this Dec 15, 2016

Dieterbe added the this week label Dec 15, 2016

Dieterbe added this to the hosted-metrics-alpha milestone Dec 15, 2016

replay force-pushed the chunk-cache branch 5 times, most recently from ea2a343 to 3752791 Compare December 19, 2016 07:13

Dieterbe reviewed Dec 19, 2016

replay force-pushed the chunk-cache branch from 0df6d75 to 3844698 Compare January 4, 2017 21:21

Dieterbe force-pushed the chunk-cache branch from 3844698 to f0fee96 Compare January 4, 2017 21:26

replay and others added 21 commits January 5, 2017 00:08

add dataprocessor tests

a9f9b75

Adds a test for getSeriesCached

a1dabf5

also fixes a bug that has been found by the new test added in this commit

add comments to mock store

33aa06a

add more comments

4119685

comment fix

79a8164

comment fix

a0f5e15

add make target for builds with -race

37f22ba

adds a new target to the Makefile which calls build.sh and build_tools.sh with -race. this makes them build with the race flag.

fix unlock/yield order, because defers get executed LIFO!

9460d78

Directly use FlatAccnt implementation instead of interface

1761226

Directly use CCache implementation instead of abstract interface

d67ff1f

fix naming of variables in tests to be more explanatory

e12fba0

docs

7aef999

do chunk cache stats the new way

7f34a97

refactor getSeries test and small fixes

bbfa75f

- improves the getSeries test to go through various combinations of datasources
- moves cache and cache accounting into interfaces
- adds an EndTs() method to chunk.IterGen
- adds Stop() method to accounting and cache to stop their goroutines

remove legacy stuff

e2cfafc

verify chunk cache hits in getSeries() test

92eed67

fix an error i already fixed & reintroduced

eb4e8ff

if an itergen that has been returned by the store is invalid, then do not add it to the cache

simplify

26041f6

add interesting case

138c7b4

changing prints to log statements

f3f60df

Dieterbe force-pushed the chunk-cache branch from 3dccd18 to f3f60df Compare January 4, 2017 23:11

Dieterbe approved these changes Jan 4, 2017

Dieterbe merged commit a2679fd into master Jan 4, 2017

Dieterbe removed the in progress label Jan 4, 2017

Dieterbe mentioned this pull request Jan 4, 2017

facilitating cache warmup / caching cassandra data #401

Closed

Dieterbe deleted the chunk-cache branch December 15, 2017 22:05
