WIP Graphitey consolidation step 2 and 3 #557
chunkMinTs := now - (now % ms.chunkSpan) - uint32(ms.chunkMaxStale)
metricMinTs := now - (now % ms.chunkSpan) - uint32(ms.metricMaxStale)
chunkMinTs := now - uint32(ms.chunkMaxStale)
metricMinTs := now - uint32(ms.metricMaxStale)
I don't recall why we had to subtract the chunkspan for these calculations. If we still have to do this, it would be trickier, as the chunkspan is specific to the metric that we'll check.
The old code calculated the last chunk T0 that is older than chunkMaxStale. I believe this is because the logic in AggMetric.GC() compared the chunkMinTs and metricMinTs to the chunk.T0, but as we now compare against chunk.LastWrite, this new way should be fine.
The question this raises, though, is: should our maxStale settings be per retention policy? If a MT service is receiving some metrics every 10 seconds and some every few hours, then applying the same maxStale settings to both is not ideal.
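To make the difference concrete, here is a minimal sketch contrasting the old T0-aligned cutoff with the new LastWrite-based one. The function names are hypothetical and the parameters stand in for ms.chunkSpan / ms.chunkMaxStale; this is not metrictank's actual code.

```go
package main

import "fmt"

// Old: snap "now" down to the start of the current chunk before
// subtracting the staleness window, so the cutoff is a chunk T0.
func oldChunkMinTs(now, chunkSpan, chunkMaxStale uint32) uint32 {
	return now - (now % chunkSpan) - chunkMaxStale
}

// New: subtract the staleness window directly, since GC now compares
// against chunk.LastWrite instead of chunk.T0.
func newChunkMinTs(now, chunkMaxStale uint32) uint32 {
	return now - chunkMaxStale
}

func main() {
	now := uint32(1000)
	fmt.Println(oldChunkMinTs(now, 600, 300)) // 1000 - 400 - 300 = 300
	fmt.Println(newChunkMinTs(now, 300))      // 1000 - 300 = 700
}
```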
Force-pushed from e40a6a0 to 05555be
todo:
Force-pushed from 85f424b to b68228a
Force-pushed from b68228a to 0da8774
TODO: manual tests:
Note that whisper files and structures are untouched; we only use this extended format for metrictank. Also: on failure to parse, return an error.
ChunkSpan, NumChunks and Ready fields are now supported, also optionally specified via the config file
so we can efficiently reference a schema or aggregation definition also make WhisperAggregation.Match public
because we want to work with them in metrictank
break
}
}

if req.Archive == -1 {
return nil, errUnSatisfiable
do we really want the whole http request to end up in an error if one of the reqs was not satisfiable? couldn't we just skip that one and continue with the others?
IMHO yes. This is such a rare case (basically all archives have ready=false). It's okay to make this problem very clear so that the admin can fix it. Once we have grafana/grafana#6448 we could make this a bit nicer (it would allow us to show a warning/error and still return the data we can).
api/query_engine.go (outdated):
// let's see first if we can deliver it via lower-res rollup archives, if we have any
retentions := mdata.GetRetentions(req.SchemaI)
for i, ret := range retentions[req.Archive+1:] {
If !ret.Ready, then the retention will not be useful here, so we might as well do if !ret.Ready { continue } and skip the potentially unnecessary uint32(ret.SecondsPerPoint()). Then it can be removed from the condition 2 lines down.
So you'd add another if clause? I don't see how that's so much better.
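For comparison, a sketch of what the suggested early continue would look like. The Retention type and pickArchive helper here are simplified stand-ins for illustration, not metrictank's actual code:

```go
package main

import "fmt"

// Retention is a hypothetical stand-in for the real retention type.
type Retention struct {
	secondsPerPoint int
	Ready           bool
}

func (r Retention) SecondsPerPoint() int { return r.secondsPerPoint }

// pickArchive skips archives that are not ready up front, so the
// interval conversion (and the ready check in the condition two lines
// down) only happens for usable archives.
func pickArchive(retentions []Retention, interval uint32) int {
	for i, ret := range retentions {
		if !ret.Ready {
			continue // not queryable, don't even compute its interval
		}
		archInterval := uint32(ret.SecondsPerPoint())
		if interval == archInterval {
			return i // exact match: cheaper than runtime consolidation
		}
	}
	return -1
}

func main() {
	rets := []Retention{{10, false}, {60, true}, {600, true}}
	fmt.Println(pickArchive(rets, 60)) // 1
}
```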
api/query_engine.go (outdated):
if interval == archInterval && ret.Ready {
// we're in luck. this will be more efficient than runtime consolidation
req.Archive = req.Archive + 1 + i
req.ArchInterval = uint32(ret.SecondsPerPoint())
We already did that conversion, so this could be archInterval.
@@ -67,15 +67,18 @@ func (in DefaultHandler) Process(metric *schema.MetricData, partition int32) {
return
}

schemaI, _ := mdata.MatchSchema(metric.Name)
If I see that right, this means that every time a metric datapoint arrives we're going to do a regex pattern match at https://github.com/raintank/metrictank/blob/0da87741ab2039f5c3ff2988eb2c74c5bd5b87eb/vendor/github.com/lomik/go-carbon/persister/whisper_schema.go#L87
I'm wondering if it wouldn't be more efficient to associate these schemaI and aggI with the Nodes in idx/memory/memory.go, and then this regex match would only be necessary once per metric, when a new metric gets added to the index.
2 pattern matches actually: one for schema, one for aggregation.
That's a good idea. So basically, instead of passing the schemaI and aggI into AddOrUpdate, AddOrUpdate would return those values instead?
That seems like a good idea to me. For now I've purposely left all the matching obviously unoptimized, to get an idea of how it would perform, but you're right that it'll likely have quite the impact. I still have to measure it though.
I think it would make sense to have idx.AddOrUpdate() return idx.Node
> I think it would make sense to have idx.AddOrUpdate() return idx.Node

In the case where the metric already exists, we can currently just check for the def in DefById via a simple id lookup, and return if it exists. To support what you suggest, I think we would either have to do a lookup in the tree in that case, or somehow link to the idx.Node, with extra data in DefsById or something.
I think it's not the index's concern that the caller of the index would like to know these properties about an entry when it tries to add a new entry; I'd rather keep the index focused on its task of being an index rather than trying to accommodate this.
The current caching mechanism seems to work well enough.
Overall looking really good to me.
I would rather just pull out all that code and have our own mini library (based on that patched code) for parsing the config files, and none of the other stuff that that library comes with (e.g. the whisper-specific bits). But that's lower prio.
Force-pushed from 0da8774 to 486d645
@replay see the new commits.
Force-pushed from 0842aa1 to 72f4cf4
1) use storage-schemas.conf and storage-aggregation.conf to configure numchunks, chunkspans, ttls and aggregations. For carbon, also to find out the raw interval. For other input plugins we only honor the interval specified in the data. Note: persist messages now need to mention the metric names, in case metrics need to be created on the other end.
2) support configurable retention schemes based on patterns, like graphite. Adjust alignRequests accordingly.
3) support configurable aggregation bands to control which rollups get created per metric pattern.
4) add back 'last' roll-up aggregation and runtime consolidation. It was removed in #142 / 2be3546 based on a discussion in https://github.com/raintank/strategy/issues/11, on the premise that storing this extra series for all our metrics wasn't worth it, and it's not queryable via consolidateBy, which only does sum, avg, min and max. However:
1) now aggregations are configurable: one only enables specific aggregation bands on an as-needed basis.
2) graphite supports last rollups, so we must too.
3) for certain types of data (e.g. counters) it's simply the best approach (better than all the rest).
4) storage-level aggregation is not as tied to consolidateBy() as we originally made it out to be, so that point doesn't really apply anymore.
5) we also restore the 'last' runtime consolidation code; even though we will no longer do runtime consolidation in MT for now, the upcoming normalization uses the same code, and we should use 'last' for normalization if the rollup was 'last'. (see next commit for more info)
"runtime consolidation" used to be controlled by consolidateBy and affected consolidation after api processing by (and in) graphite, as well as before (when normalizing data before processing). Now these two are distinct:
- only the post-function processing should be affected by consolidateBy, and should be referred to as "runtime consolidation". Since this code is currently in graphite, metrictank shouldn't really do anything. In particular, we still parse out consolidateBy(..., "func") since that's what graphite-metrictank sends, but we ignore it.
- the only runtime consolidation metrictank currently does is normalizing archives (making sure they are in the same resolution) before feeding them into the api processing pipeline. To avoid confusion this is now called normalizing, akin to `normalize` in graphite's functions.py. This is an extension of the consolidation mechanism used to create the rollup archive. This used to be controlled by consolidateBy, but not anymore. Now:
* the archive being read from is whatever is the primary (first) aggregationMethod defined in storage-aggregation.conf.
* the consolidation method for normalization is the same (always).
Both are controlled via the consolidation property of Req. Later we can extend this to choose the method via queries, but it'll have to use a mechanism separate from consolidateBy.
Before this, all the regex matching dominated the cpu profile. With this, cpu usage reduced by easily 5x. Though we still have (flat / cumulative):

70ms 0.86% 69.79% 770ms 9.49% github.com/raintank/metrictank/mdata/matchcache.(*Cache).Get

due to the map locking.
We could further optimize this, probably, by changing the idx.AddOrUpdate signature to return SchemaI and AggI, instead of requiring them as input, as @replay suggested. This way we only have to match if it wasn't in the index already. However, this requires more intensive changes to the index than I'm comfortable with right now (DefById only has the metricdef, not the properties; we could add them, but then we need to adjust how we work with DefById everywhere, and do we still need to store the properties in the tree, etc). I'd rather re-address this when the need is clearer and we have time to give it the attention it deserves.
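A minimal sketch of the matchcache idea (names and signatures are illustrative, not the real matchcache API): memoize the expensive pattern match per metric name behind a single lock, which is also where the profiled contention comes from. Sharding the map would reduce that contention.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache memoizes the result of an expensive pattern match
// (schema/aggregation lookup) per metric name, so regexes only run on
// the first datapoint of each series.
type Cache struct {
	sync.Mutex
	data map[string]uint16
}

func NewCache() *Cache {
	return &Cache{data: make(map[string]uint16)}
}

// Get returns the cached index for key, calling match only on a miss.
// The single lock here is what shows up in the cpu profile above.
func (c *Cache) Get(key string, match func(string) uint16) uint16 {
	c.Lock()
	defer c.Unlock()
	if v, ok := c.data[key]; ok {
		return v
	}
	v := match(key)
	c.data[key] = v
	return v
}

func main() {
	calls := 0
	match := func(name string) uint16 { calls++; return 7 }
	c := NewCache()
	c.Get("some.metric", match)
	c.Get("some.metric", match)
	fmt.Println(calls) // 1: the second lookup hit the cache
}
```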
on my laptop:

MASTER:
BenchmarkProcess-8   2000000   718 ns/op
PASS
ok  github.com/raintank/metrictank/input  2.796s

HEAD: go test -run='^$' -bench=.
BenchmarkProcessUniqueMetrics-8   1000000   2388 ns/op
BenchmarkProcessSameMetric-8      2000000    885 ns/op
PASS
ok  github.com/raintank/metrictank/input  6.081s

So we're a bit slower, but carbon input should be a lot faster (for which we don't have benchmarks) since it used to do regex matching all the time.
Force-pushed from 6056bda to 3a6a85e
Manual testing setup
storage-schemas.conf:
storage-aggregation.conf:
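The config files themselves were in collapsed sections that didn't survive here. A graphite-style pair along these lines (all patterns and values hypothetical, purely to illustrate the format exercised by the new per-pattern retention and aggregation code) would look like:

```ini
# storage-schemas.conf (hypothetical values)
[high_res]
pattern = ^servers\..*\.cpu\.
retentions = 1s:1d,10s:30d,10m:1y

[default]
pattern = .*
retentions = 10s:6h

# storage-aggregation.conf (hypothetical values)
[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min

[default]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
```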
Results verified:
TODO:
currently working on this:
will either try to get something like this working:
or will use mt-store-cat, maybe.
Much cleaner code, if I may say so. Also: consistently apply a simpler "add default to catch all remaining metrics" default mechanism.
This reverts commit 3bae47f. They consume about 1% CPU (flat and cumulative) under heavy ingest load, and they basically just confirm it works exactly like it should. But the commit is here in case someone ever wants to re-apply it.
Force-pushed from 3a6a85e to 8ddd548
Prints how many chunks per series in each table, with TTLs; optionally filter down by metric prefix.
Below are the different series in each table, with TTL in hours (rounded) and how many chunks each has.
One thing I should definitely still test is the same thing as above, but with metrics that fall into the default schema/agg settings.
I can confirm: when I remove the default rules from the storage-aggregation.conf and storage-schemas.conf posted above, it relies on the defaults built into MT: minutely points for 7 days, for all.
item.XFilesFactor, err = strconv.ParseFloat(s.ValueOf("xFilesFactor"), 64)
if err != nil {
return Aggregations{}, fmt.Errorf("[%s]: failed to parse xFilesFactor %q: %s", item.Name, s.ValueOf("xFilesFactor"), err.Error())
Why not return result here instead of instantiating a new Aggregations{}?
I could go either way on this. This code is a bit more explicit that it's an empty value, but you're right, we could reuse result. I don't see a strong reason to prefer either way.
methodStrs := strings.Split(aggregationMethodStr, ",")
for _, methodStr := range methodStrs {
switch methodStr {
case "average", "avg":
Is avg actually an allowed value in whisper? I can't see it here: https://github.com/graphite-project/whisper/blob/f7e05cef81279bde327d8f1501acbf9caf730102/whisper.py#L515
Good point. It's not documented in the config documentation either (http://graphite.readthedocs.io/en/latest/config-carbon.html#storage-aggregation-conf); so I guess metrictank will accept it because it is nice to its users.
return fmt.Errorf("lower resolution retentions must be evenly divisible by higher resolution retentions (%d does not divide by %d)", ret.SecondsPerPoint, prev.SecondsPerPoint)
}

if prev.MaxRetention() >= ret.MaxRetention() {
Shouldn't that be > instead of >=, according to rule 4?
No? Rule 4 says "Lower precision archives must cover larger time intervals than higher precision archives." We assert this by erroring out if the lower precision archive (ret) is smaller than or equal to the higher precision archive (prev).
oh right, all good
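The two checks under discussion can be sketched as a standalone validation. The Retention type here is a simplified stand-in (MaxRetention as secondsPerPoint * numberOfPoints, as in whisper), not metrictank's actual type:

```go
package main

import "fmt"

// Retention is a hypothetical stand-in for the real retention type.
type Retention struct {
	SecondsPerPoint int
	NumberOfPoints  int
}

// MaxRetention is the total timespan covered by the archive.
func (r Retention) MaxRetention() int { return r.SecondsPerPoint * r.NumberOfPoints }

// validate checks that each coarser archive is evenly divisible by the
// finer one, and covers a strictly larger interval (rule 4), which is
// why equal coverage (>=) is rejected.
func validate(retentions []Retention) error {
	for i := 1; i < len(retentions); i++ {
		prev, ret := retentions[i-1], retentions[i]
		if ret.SecondsPerPoint%prev.SecondsPerPoint != 0 {
			return fmt.Errorf("%d does not divide by %d", ret.SecondsPerPoint, prev.SecondsPerPoint)
		}
		if prev.MaxRetention() >= ret.MaxRetention() {
			return fmt.Errorf("archive %d must cover a larger interval than archive %d", i, i-1)
		}
	}
	return nil
}

func main() {
	ok := []Retention{{10, 8640}, {60, 10080}, {600, 52560}} // 1d, 7d, 1y
	fmt.Println(validate(ok) == nil)
	// equal coverage: 10s*60 = 600s vs 60s*10 = 600s, rejected by >=
	bad := []Retention{{10, 60}, {60, 10}}
	fmt.Println(validate(bad) != nil)
}
```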
return item.val
}

func (m *Cache) maintain() {
Wouldn't it maybe make sense to have some way to shut this down? Otherwise, when there are unit tests that initialize many instances of Cache, we'll end up with tons of maintain() goroutines, which also cause the cache not to get freed even though it might already not be used anymore, because it's still referenced by that goroutine.
Could be a simple flag that's checked in the for loop, which makes it return if true.
diff := int64(m.expireAfter.Seconds())
for now := range ticker.C {
nowUnix := now.Unix()
m.Lock()
So this keeps the lock until the for loop has iterated over every single entry in m.data. Are you sure that's not going to lead to long blocks if there's a really large number of metrics? Even though it's only on the ingestion path, it would be quite easy to split it up and check something like len(m.data) / 1000 items every m.cleanInterval / 1000 seconds.
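The suggested batched cleanup could look roughly like this (an illustrative sketch with hypothetical names, not the actual implementation): snapshot the keys once, then process them in small batches, releasing the lock between batches so ingestion is never blocked for long.

```go
package main

import (
	"fmt"
	"sync"
)

type cache struct {
	sync.Mutex
	data map[string]int64 // metric name -> last-seen unix timestamp
}

// cleanupChunked deletes entries older than cutoff in batches, instead
// of holding the lock across a scan of the whole map. A small sleep
// between batches would further spread the work over the clean interval.
func (c *cache) cleanupChunked(cutoff int64, batch int) {
	c.Lock()
	keys := make([]string, 0, len(c.data))
	for k := range c.data {
		keys = append(keys, k)
	}
	c.Unlock()

	for start := 0; start < len(keys); start += batch {
		end := start + batch
		if end > len(keys) {
			end = len(keys)
		}
		c.Lock()
		for _, k := range keys[start:end] {
			// re-check under the lock: the entry may have been
			// refreshed or removed since the snapshot
			if ts, ok := c.data[k]; ok && ts < cutoff {
				delete(c.data, k)
			}
		}
		c.Unlock()
	}
}

func main() {
	c := &cache{data: map[string]int64{"a": 1, "b": 5, "c": 9}}
	c.cleanupChunked(6, 2)
	fmt.Println(len(c.data)) // 1: only "c" survives
}
```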
I see a few problems with this implementation.
So my recommendation is to:
Change the idx.MetricIndex interface to
in MemoryId, change
As metricDefs/metricData are added to the index, look up the SchemaId and AggId. When the schemaId or AggId are needed, they can be fetched from the index efficiently with
Also please
We should then also update the idx.Node type I think, its Defs property should be a
So with that in mind, I'll implement those changes.
Closing in favor of #570
see #463 (comment)
Mostly working on step 3 first. Need to find an elegant way to make the retentions available for every req in alignRequests. Perhaps the index could store them for each idx.Node (i.e. do all the matching when ingesting new points into the index), or the query engine could keep a cache of paths -> retention scheme (i.e. do the matching when data is requested).