
WIP Graphitey consolidation step 2 and 3 #557

Closed
wants to merge 25 commits

Conversation

Contributor

@Dieterbe Dieterbe commented Feb 28, 2017

see #463 (comment)
Mostly working on step 3 first. I need to find an elegant way to make the retentions for every req available in alignRequests. Perhaps the index could store them for each idx.Node (i.e. do all the matching when ingesting new points into the index), or the query engine could keep a cache of path -> retention scheme (do the matching when data is requested).

// old:
chunkMinTs := now - (now % ms.chunkSpan) - uint32(ms.chunkMaxStale)
metricMinTs := now - (now % ms.chunkSpan) - uint32(ms.metricMaxStale)
// new:
chunkMinTs := now - uint32(ms.chunkMaxStale)
metricMinTs := now - uint32(ms.metricMaxStale)
Contributor Author

I don't recall why we had to subtract the chunkspan in these calculations. If we still have to, this would be trickier, as the chunkspan is specific to the metric that we'll check.

Member

The old code calculated the last chunk T0 that is older than chunkMaxStale. I believe this is because the logic in AggMetric.GC() compared the chunkMinTs and metricMinTs to the chunk.T0; but since we now compare against chunk.LastWrite, this new way should be fine.

The question this raises, though, is whether our maxStale settings should be per retention policy. If an MT service is receiving some metrics every 10 seconds and some every few hours, then applying the same maxStale settings to both is not ideal.
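A minimal sketch of the check being discussed (not the actual AggMetric.GC code; names are illustrative): with the old cutoffs, the chunkSpan subtraction aligned the cutoff to a chunk boundary because it was compared against chunk.T0, whereas comparing against the last write needs no alignment.

package main

import "fmt"

// isStale sketches the new check: a chunk is stale if nothing was written
// to it within the last maxStale seconds. The old code snapped the cutoff
// to a chunk boundary (now - now%chunkSpan - maxStale) because it compared
// against chunk.T0 rather than a last-write timestamp.
func isStale(lastWrite, now, maxStale uint32) bool {
	return lastWrite < now-maxStale
}

func main() {
	now := uint32(10000)
	fmt.Println(isStale(9000, now, 500)) // true: no write in the last 500s
	fmt.Println(isStale(9800, now, 500)) // false: written recently
}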


Dieterbe commented Mar 1, 2017

todo:

  • awareness of which consolidation funcs are available for which series, handle that properly; set the consolidation function correctly
  • fix unit tests
  • extend alignRequests for multiple rules
  • test in docker
  • currently the schemas & aggregation rules stuff is split between mdata (reads the config files, holds the structures) and the idx (tracks which rule applies for each metric). Some mdata notifiers need to do idx lookups, and the idx needs to do schema rule lookups when loading a persisted index. It seems a bit too messy and I would like to clean it up more, but I'm not sure I can make it more elegant
  • also, the library we use isn't too clean to integrate with; we might want to just bring in the code we need and clean it up

@Dieterbe Dieterbe force-pushed the graphitey-consolidation branch 7 times, most recently from 85f424b to b68228a on March 6, 2017 at 22:35

Dieterbe commented Mar 6, 2017

@woodsaj @replay PTAL
I still have to iron out some of the tests, but I think some feedback at this point would be very useful. Thanks.


Dieterbe commented Mar 7, 2017

TODO: manual tests:

  • verify it creates the right archives
  • verify it reads from the right one
  • test normalizing
  • verify/query TTL from cassandra

note that whisper files and structures are untouched. We only use
this extended format for metrictank.

also: if parsing fails, return an error
ChunkSpan, NumChunks and Ready fields are now supported,
and can optionally be specified via the config file
so we can efficiently reference a schema or aggregation definition
also make WhisperAggregation.Match public,
because we want to work with these in metrictank
			break
		}
	}

}
if req.Archive == -1 {
	return nil, errUnSatisfiable
Contributor

Do we really want the whole http request to end up in an error if one of the reqs was not satisfiable? Couldn't we just skip that one and continue with the others?

Contributor Author

IMHO yes. This is such a rare case (basically all archives having ready=false) that it's okay to make the problem very clear, so that the admin can fix it. Once we have grafana/grafana#6448 we could make this a bit nicer (it would allow us to show a warning/error and still return the data we can).


// let's see first if we can deliver it via lower-res rollup archives, if we have any
retentions := mdata.GetRetentions(req.SchemaI)
for i, ret := range retentions[req.Archive+1:] {
Contributor

@replay replay Mar 8, 2017


If !ret.Ready, then the retention will not be useful here, so we might as well do if !ret.Ready { continue } and skip the potentially unnecessary uint32(ret.SecondsPerPoint()). Then it can be removed from the condition two lines down.

Contributor Author

So you'd add another if clause? I don't see how that's much better.

if interval == archInterval && ret.Ready {
	// we're in luck. this will be more efficient than runtime consolidation
	req.Archive = req.Archive + 1 + i
	req.ArchInterval = uint32(ret.SecondsPerPoint())
Contributor

we already did that conversion, so this could be archInterval
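For reference, a minimal sketch of the loop from the hunks above with both suggestions applied (skip non-ready retentions before any conversion, and reuse the converted archInterval); the Req and Retention types here are simplified stand-ins, not the real metrictank types:

package main

type Retention struct {
	secondsPerPoint int
	Ready           bool
}

func (r Retention) SecondsPerPoint() int { return r.secondsPerPoint }

type Req struct {
	Archive      int
	ArchInterval uint32
}

// pickRollup walks the lower-res rollup archives and picks the first
// ready one whose interval matches exactly.
func pickRollup(req *Req, retentions []Retention, interval uint32) {
	for i, ret := range retentions[req.Archive+1:] {
		if !ret.Ready {
			continue // not queryable yet; skip before converting SecondsPerPoint
		}
		archInterval := uint32(ret.SecondsPerPoint())
		if interval == archInterval {
			// we're in luck. this will be more efficient than runtime consolidation
			req.Archive = req.Archive + 1 + i
			req.ArchInterval = archInterval
			break
		}
	}
}

func main() {}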

@@ -67,15 +67,18 @@ func (in DefaultHandler) Process(metric *schema.MetricData, partition int32) {
return
}

schemaI, _ := mdata.MatchSchema(metric.Name)
Contributor

@replay replay Mar 8, 2017


If I see that right, this means that every time a metric datapoint arrives, we're going to do a regex pattern match at https://github.com/raintank/metrictank/blob/0da87741ab2039f5c3ff2988eb2c74c5bd5b87eb/vendor/github.com/lomik/go-carbon/persister/whisper_schema.go#L87
I'm wondering if it wouldn't be more efficient to associate these schemaI and aggI with the Nodes in idx/memory/memory.go; then this regex match would only be necessary once per metric, when a new metric gets added to the index.

Contributor Author

2 pattern matches, actually: one for the schema, one for the aggregation.
That's a good idea. So basically, instead of passing the schemaI and aggI into AddOrUpdate, AddOrUpdate would return those values instead?
That seems like a good idea to me. For now I've purposely left all the matching obviously unoptimized, to get an idea of how it would perform, but you're right that it'll likely have quite an impact. I still have to measure it, though.

Member

I think it would make sense to have idx.AddOrUpdate() return an idx.Node.

Contributor Author

> I think it would make sense to have idx.AddOrUpdate() return an idx.Node.

In the case where the metric already exists, we can currently just check for the def in DefById via a simple id lookup, and return if it exists. To support what you suggest, I think we would either have to do a lookup in the tree in that case, or somehow link to the idx.Node, with extra data in DefById or something.
I think it's not the index's concern that the caller of the index wants to know these properties about an entry when it tries to add a new one; I'd rather keep the index focused on its task of being an index than accommodate this.
The current caching mechanism seems to work well enough.


replay commented Mar 8, 2017

Overall looking really good to me.
Since there are quite a few modifications in the vendored repo github.com/lomik/go-whisper, it might make sense to fork it and maintain our own fork.


Dieterbe commented Mar 8, 2017

> Since there are quite a few modifications in the vendored repo github.com/lomik/go-whisper, it might make sense to fork it and maintain our own fork.

I would rather just pull all that code out into our own mini library (based on that patched code) for parsing the config files, without all the other stuff that library comes with (e.g. the whisper-specific bits). But that's lower priority.

@Dieterbe Dieterbe mentioned this pull request Mar 8, 2017

Dieterbe commented Mar 9, 2017

(CPU profile screenshot: regex-slow)
OK, so it is confirmed: the carbon input plugin spends the majority of its time, and DefaultHandler.Process the vast majority of its time, doing regex matching. @replay, your suggestion would help for DefaultHandler but not for the carbon plugin: the carbon plugin needs to do the schema match to know the interval, so it can generate the id which it needs to query the index. The carbon plugin could potentially call idx.Find("metric.name"), but idx.Find does more than what's needed for this case, so I would rather not do that for every point that comes in. Perhaps a map to cache match results would be best; see the sketch below.
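A minimal sketch of such a map-based match cache (illustrative; the matchcache later added in this branch also expires old entries, see the maintain() discussion further down):

package main

import (
	"fmt"
	"regexp"
	"sync"
)

// MatchCache memoizes name -> schema index matches, so the regex pattern
// match only runs once per unique metric name.
type MatchCache struct {
	sync.Mutex
	data map[string]uint16
}

// Get returns the cached match for key, computing it via match on a miss.
func (c *MatchCache) Get(key string, match func(string) uint16) uint16 {
	c.Lock()
	defer c.Unlock()
	if v, ok := c.data[key]; ok {
		return v
	}
	v := match(key)
	c.data[key] = v
	return v
}

func main() {
	pattern := regexp.MustCompile(`^agg\.`) // stand-in for a schema pattern
	match := func(name string) uint16 {
		if pattern.MatchString(name) {
			return 1
		}
		return 0
	}
	c := &MatchCache{data: make(map[string]uint16)}
	fmt.Println(c.Get("agg.foo", match)) // 1: regex ran
	fmt.Println(c.Get("agg.foo", match)) // 1: served from the cache
}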


Dieterbe commented Mar 9, 2017

@replay see the new commits.

1) use storage-schemas.conf and storage-aggregation.conf
to configure numchunks, chunkspans, ttls and aggregations.
For carbon, we also use them to find out the raw interval.
For other input plugins we only honor the interval specified in the data.

Note:
- persist messages now need to mention the metric names, in case
  metrics need to be created on the other end.

2) support configurable retention schemes based on patterns, like
graphite. adjust alignRequests accordingly

3) support configurable aggregation bands to control which rollups
get created per metric pattern.

4) add back 'last' roll-up aggregation and runtime consolidation

It was removed in #142
 / 2be3546,
based on a discussion in https://github.com/raintank/strategy/issues/11,
on the premise that storing this extra series for all our metrics wasn't
worth it, and that it's not queryable via consolidateBy, which only does
sum, avg, min and max.

However:
1) aggregations are now configurable:
   one only enables specific aggregation bands on an as-needed basis.
2) graphite supports last rollups, so we must too.
3) for certain types of data (e.g. counters) it's
   simply the best approach (better than all the rest).
4) storage-level aggregation is not as tied to consolidateBy() as we
   originally made it out to be, so that point doesn't really apply
   anymore.
5) we also restore the 'last' runtime consolidation code,
   even though we will no longer do runtime consolidation in MT for now;
   the upcoming normalization uses the same code,
   and we should use 'last' for normalization if the rollup was 'last'.
   (see next commit for more info)
"runtime consolidation" used to be controlled by consolidateBy and
affect consolidation after api processing by (and in) graphite,
as well as before (when normalizing data before processing).
Now these two are distinct:

- only the post-function processing should be affected by consolidateBy,
  and should be referred to as "runtime consolidation".
  Since this code is currently in graphite, metrictank shouldn't really
  do anything.  In particular we still parse out
  consolidateBy(..., "func") since that's what graphite-metrictank sends,
  but we ignore it

- the only runtime consolidation metrictank currently does, is normalizing
  archives (making sure they are in the same resolution) before feeding
  them into the api processing pipeline. To avoid confusion this is now
  called normalizing, akin to `normalize` in graphite's functions.py
  This is an extension of the consolidation mechanism used to create the rollup archive.
  This used to be controlled by consolidateBy, but not anymore. Now:
  * the archive being read from is whathever is the primary (first)
  aggregationMethod defined in storage-aggregation.conf .
  * consolidation method for normalization is the same (always).
  Both are controlled via the consolidation property of Req.
  Later we can extend this to choose the method via queries, but it'll
  have to use a mechanism separate of consolidateBy.
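A toy sketch of the normalization described above (not the actual metrictank code): consolidate the higher-resolution series down to the coarser interval with the series' own consolidation function, before any further processing.

package main

import "fmt"

// normalize consolidates points into buckets of `ratio` using fn, bringing
// a fine-resolution series down to a coarser one.
func normalize(points []float64, ratio int, fn func([]float64) float64) []float64 {
	out := make([]float64, 0, (len(points)+ratio-1)/ratio)
	for i := 0; i < len(points); i += ratio {
		end := i + ratio
		if end > len(points) {
			end = len(points)
		}
		out = append(out, fn(points[i:end]))
	}
	return out
}

func maxOf(vals []float64) float64 {
	m := vals[0]
	for _, v := range vals[1:] {
		if v > m {
			m = v
		}
	}
	return m
}

func main() {
	raw := []float64{1, 5, 2, 8, 3, 9} // 1s data
	// normalize to 2s using 'max', as if the primary aggregationMethod
	// in storage-aggregation.conf were max
	fmt.Println(normalize(raw, 2, maxOf)) // [5 8 9]
}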
Before this, regex matching dominated the cpu profile.
With this, cpu usage is easily reduced 5x.
Though we still have:

      flat  flat%   sum%        cum   cum%
      70ms  0.86%  69.79%     770ms  9.49%  github.com/raintank/metrictank/mdata/matchcache.(*Cache).Get

due to the map locking.

We could probably further optimize this by changing the
idx.AddOrUpdate signature to return SchemaI and AggI, instead of
requiring them as input, as @replay suggested.
That way we only have to match if it wasn't already in the index.
However, this requires more intrusive changes to the index than
I'm comfortable with right now (DefById only has the metricdef, not
the properties; we could add them, but then we need to adjust how we
work with DefById everywhere, and do we still need to store the
properties in the tree, etc).
I'd rather re-address this when the need is clearer and we have time
to give it the attention it deserves.
on my laptop:

MASTER:
BenchmarkProcess-8     2000000         718 ns/op
PASS
ok    github.com/raintank/metrictank/input  2.796s

HEAD:
go test -run='^$' -bench=.
BenchmarkProcessUniqueMetrics-8      1000000        2388 ns/op
BenchmarkProcessSameMetric-8         2000000         885 ns/op
PASS
ok    github.com/raintank/metrictank/input  6.081s

So we're a bit slower, but carbon input should be a lot faster
(we don't have benchmarks for it), since it used to do regex matching
all the time.
@Dieterbe Dieterbe force-pushed the graphitey-consolidation branch 2 times, most recently from 6056bda to 3a6a85e on March 9, 2017 at 21:52

Dieterbe commented Mar 9, 2017

manual testing

setup

  • using this tool, I write some fake data into MT with specific series names that will match all the different settings below:
    raintank/fakemetrics@589b6cd
  • I then run MT (docker-dev stack) with these configs:

storage-schemas.conf:

# note: ttl 1d -> metrics_16, 2d -> metrics_32, 3d/4d -> metrics_64. 7d (the default auto-added by MT) -> metrics_128, but we don't add anything to that table

[agg-max]
pattern = agg.*max$
retentions = 1s:1d:10min:7,5s:4d:10min:2

[agg-min]
pattern = agg.*min$
retentions = 1s:1d:10min:7,5s:4d:10min:2

[agg-all]
pattern = agg.*all$
retentions = 1s:1d:10min:7,10s:3d:10min:2

[agg-default]
pattern = .*agg.*
retentions = 1s:1d:10min:7,10s:2d:10min:2

[default]
pattern = .*
retentions = 1s:1d:10min:7
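(For reference: each retention here uses the extended interval:ttl:chunkspan:numchunks[:ready] format from the commits above, so 1s:1d:10min:7 should read as raw 1-second points with a 1-day TTL, stored in 10-minute chunks, with 7 chunks kept in memory.)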

storage-aggregation.conf:

[min]
pattern = .*min$
xFilesFactor = 0.1
aggregationMethod = min

[max]
pattern = .*max$
xFilesFactor = 0.1
aggregationMethod = max

[all]
pattern = .*all$
xFilesFactor = 0.1
aggregationMethod = avg,min,max,last,sum

[default]
pattern = .*
xFilesFactor = 0.1
aggregationMethod = avg,min,max
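(Note: the comma-separated aggregationMethod lists, like the avg,min,max,last,sum in [all], are the metrictank extension being exercised here; stock graphite's storage-aggregation.conf takes a single method.)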

results

verified:

  • reads from the correct default aggregation (look at data from a few days ago; current data looks identical because it's raw)
    fakemetrics.agg.all -> averaged
    fakemetrics.agg.default -> averaged
    fakemetrics.agg.min -> minned
    fakemetrics.agg.max -> maxxed

  • note: we can also see fakemetrics.raw.* data for many days back, but it is in fact raw, due to how we apply TTLs (even old data, from more than a few days ago, gets a TTL of 1d when we insert it now)

  • note: when plotting fakemetrics.raw.* by themselves, all the points are identical across the different series, both for zoomed-in views where we just see the raw secondly data, and zoomed-out views where graphite applies the same runtime consolidation to all of them (if needed), on top of the raw data of course

  • However, when we plot both fakemetrics.raw.* and fakemetrics.agg.* and look at old data (not the raw 1s range where they are all the same), the data becomes 10-secondly, and the raw metrics also get the correct normalization functions applied. This was a pleasant, somewhat surprising find: while those series have no rollups defined, when they need to be normalized (bringing series to the same resolution: the secondly raw data and the 10s rolled-up data), it happens using the correct functions (i.e. as if we had the proper rolled-up series).
    Although I noticed the values are not the same as for the agg metrics. One explanation is that rollup series include exactly the points within clean 10s boundaries, whereas runtime normalization aggregates unaligned; but it should still only do 10s at each step, so I don't understand yet why there's a difference of 11 between the min and max.

(screenshot: hmmmmfoo)

  • perf looks good (400kHz on my laptop via carbon, no http workload -> profile looks good, though about 10% is spent in matchcache.Get)

  • I ran select count(*) from metric_16; likewise _32 and _64. Specifically, _128 (created due to the default 7d retention policy) remains empty, because all series have proper schemas that match something more specific.

TODO:

  • verify correct retentions applied
  • verify it creates the right archives using the correct rollup functions.
  • verify/query TTL from cassandra

currently working on this:

./mt-index-cat -prefix fakemetrics -max-age 3d cass '{{.Id}} {{.Name}}'
1.0b06d3c572f05abc92401944f8ee3759 fakemetrics.agg.min -> 5s for 4d min
1.236a99835b46cf4e2ecc0d912488421c fakemetrics.agg.max -> 5s for 4d max
1.37cf8e3731ee4c79063c1d55280d1bbe fakemetrics.agg.all -> 10s for 3d avg,min,max,last,sum
1.de816e43b0c3555eff9157fe0f026b12 fakemetrics.agg.default -> 10s for 2d avg,min,max
1.44df58d8b4cc9400cc59ffa5261ab5c6 fakemetrics.raw.all -> none
1.77c8c77afa22b67ef5b700c2a2b88d5f fakemetrics.raw.min -> none
1.b3889b3258d8c489b2953f2f2faa53eb fakemetrics.raw.default -> none 
1.dc37a8981996b54a8684c6ed6f79e632 fakemetrics.raw.max -> none

will either try to get something like this working
(I need to make the ttl less precise, e.g. in hours, so I can more easily get summary stats using uniq; I was having some trouble with a '^M' (carriage return) char making it into $ttl, which the tr -d '\r' below strips out).
Once I get this loop working, I should be able to confirm all the remaining todos.

docker exec -it $(dockernametoid cassandra) cqlsh -e "select key, ttl(data) from metrictank.metric_16" --no-color \
  | egrep '1.0b06d3c572f05abc92401944f8ee3759|1.236a99835b46cf4e2ecc0d912488421c|1.37cf8e3731ee4c79063c1d55280d1bbe|1.de816e43b0c3555eff9157fe0f026b12|1.44df58d8b4cc9400cc59ffa5261ab5c6|1.77c8c77afa22b67ef5b700c2a2b88d5f|1.b3889b3258d8c489b2953f2f2faa53eb|1.dc37a8981996b54a8684c6ed6f79e632' \
  | sed 's/[[:blank:]]//g' | tr '|' ' ' | tr -d '\r' \
  | while read key ttl; do echo $key $((ttl/3600)); done

or will use mt-store-cat, maybe.

much cleaner code, if I may say so.
also: consistently apply a simpler "add a default to catch all remaining
metrics" mechanism.
This reverts commit 3bae47f.

They consume about 1% cpu (flat and cumul) under heavy ingest load
and they basically just confirm it works exactly like it should.

But the commit is here in case someone ever wants to re-apply it.
prints how many chunks per series are in each table, with TTLs;
optionally filter down by metric prefix
@Dieterbe
Contributor Author

Below are the different series in each table, with TTL in hours (rounded) and how many chunks each has.
The latter number is not very important; loading charts shows that the data is there. This is mostly to confirm that all the right aggregations and TTLs are applied per the settings above.
Looks correct to me :) This completes the last todos, though I will think a bit more about how we can test further (e.g. where to apply unit tests).
I'm also not sure where table metric_0 is coming from. I'll look into that too.

./mt-store-cat -cassandra-keyspace metrictank normal full 3600 fake
# Looking for these metrics:
1.0b06d3c572f05abc92401944f8ee3759 fakemetrics.agg.min
1.236a99835b46cf4e2ecc0d912488421c fakemetrics.agg.max
1.37cf8e3731ee4c79063c1d55280d1bbe fakemetrics.agg.all
1.44df58d8b4cc9400cc59ffa5261ab5c6 fakemetrics.raw.all
1.77c8c77afa22b67ef5b700c2a2b88d5f fakemetrics.raw.min
1.b3889b3258d8c489b2953f2f2faa53eb fakemetrics.raw.default
1.dc37a8981996b54a8684c6ed6f79e632 fakemetrics.raw.max
1.de816e43b0c3555eff9157fe0f026b12 fakemetrics.agg.default
# Keyspace "metrictank" contents:
## Table metric_0
## Table metric_16
1.77c8c77afa22b67ef5b700c2a2b88d5f_615 23 863
1.dc37a8981996b54a8684c6ed6f79e632_615 23 863
1.37cf8e3731ee4c79063c1d55280d1bbe_615 23 863
1.b3889b3258d8c489b2953f2f2faa53eb_615 23 863
1.de816e43b0c3555eff9157fe0f026b12_615 23 863
1.0b06d3c572f05abc92401944f8ee3759_615 23 863
1.236a99835b46cf4e2ecc0d912488421c_615 23 863
1.44df58d8b4cc9400cc59ffa5261ab5c6_615 23 863
## Table metric_32
1.de816e43b0c3555eff9157fe0f026b12_min_10_615 47 863
1.de816e43b0c3555eff9157fe0f026b12_max_10_615 47 863
1.de816e43b0c3555eff9157fe0f026b12_sum_10_615 47 863
1.de816e43b0c3555eff9157fe0f026b12_cnt_10_615 47 863
## Table metric_64
1.0b06d3c572f05abc92401944f8ee3759_min_5_615 95 863
1.236a99835b46cf4e2ecc0d912488421c_max_5_615 95 863
1.37cf8e3731ee4c79063c1d55280d1bbe_sum_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_max_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_min_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_cnt_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_lst_10_615 71 863

@Dieterbe
Contributor Author

One thing I should definitely still test is the same thing as above (^^), but with metrics that fall into the default schema/agg settings.


Dieterbe commented Mar 12, 2017

I can confirm: when I remove the default rules from the storage-aggregation.conf and storage-schemas.conf posted above, it relies on the defaults built into MT: minutely points for 7 days for all raw.* metrics, and only avg aggregation for those metrics that get a rollup schema applied but use the default aggregation, which is 1.de816e43b0c3555eff9157fe0f026b12 fakemetrics.agg.default

./mt-store-cat -cassandra-keyspace metrictank normal full 3600 fake
# Looking for these metrics:
1.0b06d3c572f05abc92401944f8ee3759 fakemetrics.agg.min
1.236a99835b46cf4e2ecc0d912488421c fakemetrics.agg.max
1.37cf8e3731ee4c79063c1d55280d1bbe fakemetrics.agg.all
1.4884e7c2e831082a184e5d52ae4fbd73 fakemetrics.raw.default
1.4f3a41215ef4c81b44aae23ce6557e97 fakemetrics.raw.max
1.5b5d1932b770fafe2e7a1d67947cb7bb fakemetrics.raw.all
1.79c6422293c1d7a1458d7fe0199b52d0 fakemetrics.raw.min
1.de816e43b0c3555eff9157fe0f026b12 fakemetrics.agg.default
# Keyspace "metrictank" contents:
## Table metric_0
## Table metric_16
1.37cf8e3731ee4c79063c1d55280d1bbe_615 23 863
1.de816e43b0c3555eff9157fe0f026b12_615 23 863
1.0b06d3c572f05abc92401944f8ee3759_615 23 863
1.236a99835b46cf4e2ecc0d912488421c_615 23 863
## Table metric_32
1.de816e43b0c3555eff9157fe0f026b12_sum_10_615 47 863
1.de816e43b0c3555eff9157fe0f026b12_cnt_10_615 47 863
## Table metric_64
1.0b06d3c572f05abc92401944f8ee3759_min_5_615 95 863
1.236a99835b46cf4e2ecc0d912488421c_max_5_615 95 863
1.37cf8e3731ee4c79063c1d55280d1bbe_sum_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_max_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_min_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_cnt_10_615 71 863
1.37cf8e3731ee4c79063c1d55280d1bbe_lst_10_615 71 863
## Table metric_128
1.5b5d1932b770fafe2e7a1d67947cb7bb_615 167 71
1.4f3a41215ef4c81b44aae23ce6557e97_615 167 71
1.79c6422293c1d7a1458d7fe0199b52d0_615 167 71
1.4884e7c2e831082a184e5d52ae4fbd73_615 167 71

@Dieterbe Dieterbe requested a review from woodsaj March 12, 2017 20:03
@Dieterbe
Contributor Author

@woodsaj @replay can we merge?


item.XFilesFactor, err = strconv.ParseFloat(s.ValueOf("xFilesFactor"), 64)
if err != nil {
	return Aggregations{}, fmt.Errorf("[%s]: failed to parse xFilesFactor %q: %s", item.Name, s.ValueOf("xFilesFactor"), err.Error())
Contributor

@replay replay Mar 14, 2017


Why not return result here instead of instantiating a new Aggregations{}?

Contributor Author

I could go either way on this. This code is a bit more explicit that it's an empty value, but you're right that we could reuse result. I don't see a strong reason to prefer either way.

methodStrs := strings.Split(aggregationMethodStr, ",")
for _, methodStr := range methodStrs {
	switch methodStr {
	case "average", "avg":
Contributor


Contributor Author

Good point. It's not documented in the config documentation either (http://graphite.readthedocs.io/en/latest/config-carbon.html#storage-aggregation-conf), so I guess metrictank will accept it because it is nice to its users.

return fmt.Errorf("lower resolution retentions must be evenly divisible by higher resolution retentions (%d does not divide by %d)", ret.SecondsPerPoint, prev.SecondsPerPoint)
}

if prev.MaxRetention() >= ret.MaxRetention() {
Contributor

Shouldn't that be > instead of >=, according to rule 4?

Contributor Author

@Dieterbe Dieterbe Mar 15, 2017


No? Rule 4 says "Lower precision archives must cover larger time intervals than higher precision archives." We assert this by erroring out if the lower precision archive (ret) covers an interval smaller than or equal to that of the higher precision archive (prev). For example, retentions = 1s:1d,10s:1d would fail here, since the lower precision archive doesn't cover a larger interval.

Contributor

oh right, all good

return item.val
}

func (m *Cache) maintain() {
Contributor

@replay replay Mar 14, 2017


Wouldn't it maybe make sense to have some way to shut this down? Otherwise, when unit tests initialize many instances of Cache, we'll end up with tons of maintain() goroutines, which also keep the caches from being freed even when they're no longer used, because they're still referenced by those goroutines.
It could be a simple flag that's checked in the for loop, which makes it return if true. A sketch of one option follows below.
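A minimal sketch of one way to do that, using a stop channel rather than a boolean flag (all field names beyond the ones visible in the hunks are assumptions):

package matchcache

import (
	"sync"
	"time"
)

type entry struct {
	val  uint16
	seen int64
}

// Cache with a stoppable maintain loop: closing the stop channel makes
// the goroutine return, so tests can create and free many Caches.
type Cache struct {
	sync.Mutex
	data          map[string]entry
	expireAfter   time.Duration
	cleanInterval time.Duration
	stop          chan struct{}
}

func (m *Cache) maintain() {
	ticker := time.NewTicker(m.cleanInterval)
	defer ticker.Stop()
	diff := int64(m.expireAfter.Seconds())
	for {
		select {
		case <-m.stop:
			return // shut down; the Cache can now be garbage collected
		case now := <-ticker.C:
			nowUnix := now.Unix()
			m.Lock()
			for key, item := range m.data {
				if item.seen < nowUnix-diff {
					delete(m.data, key)
				}
			}
			m.Unlock()
		}
	}
}

// Stop terminates the maintain goroutine.
func (m *Cache) Stop() { close(m.stop) }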

diff := int64(m.expireAfter.Seconds())
for now := range ticker.C {
	nowUnix := now.Unix()
	m.Lock()
Contributor

@replay replay Mar 14, 2017


So this keeps the lock held until the for loop has iterated over every single entry in m.data. Are you sure that's not going to lead to long blocking when there's a really large number of metrics? Even though it's only on the ingestion path, it would be quite easy to split it up and check something like len(m.data) / 1000 items every m.cleanInterval / 1000 seconds; see the sketch below.
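A sketch of bounding the lock hold time as suggested (building on the Cache sketch above; Go map iteration can't be paused across unlocks, hence the key snapshot):

// pruneBatched expires entries in small batches, re-acquiring the lock per
// batch so ingestion is never blocked for long. Pacing the batches over
// the clean interval is omitted for brevity.
func (m *Cache) pruneBatched(nowUnix int64, batch int) {
	m.Lock()
	keys := make([]string, 0, len(m.data))
	for k := range m.data {
		keys = append(keys, k)
	}
	m.Unlock()

	diff := int64(m.expireAfter.Seconds())
	for i := 0; i < len(keys); i += batch {
		end := i + batch
		if end > len(keys) {
			end = len(keys)
		}
		m.Lock()
		for _, k := range keys[i:end] {
			if item, ok := m.data[k]; ok && item.seen < nowUnix-diff {
				delete(m.data, k)
			}
		}
		m.Unlock()
	}
}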


woodsaj commented Mar 15, 2017

I see a few problems with this implementation.

  1. the schemaId and aggId are stored per series name, but as we support changing the interval that a series is sent at, we also need to support different retention policies based on the interval. E.g. if I have a series some.metric.1 being sent every minute with a retention of "1m:1w,10m:5week,1h:1y", and I then decide to send the metric every second, the existing retention will store the raw 1-second data for 1 week. So I probably want a new retention of "1s:1d,1m:1w,10m:5w,1h:1y".

  2. by using the matchcache we no longer have a single source of truth for the properties of a metric. I think this is a bad idea; we should just store all properties of a metric in the metric index. The retention policy is a property of the metric. Storing the schemaId and aggId with the def will also allow you to return them from idx.Get().

  3. the schemaId and aggId don't get set when a new leaf node is added under an existing branch node. https://github.com/raintank/metrictank/blob/14d0c7137f506a2501c458f6f947be1e993c5b71/idx/memory/memory.go#L205-L211

So my recommendation is to:

  1. define an idx.IndexEntry struct that uses an embedded schema.MetricDefinition:

type IndexEntry struct {
	schema.MetricDefinition
	SchemaId uint16
	AggId    uint16
}

Change the idx.MetricIndex interface to:

type MetricIndex interface {
	Init() error
	Stop()
	AddOrUpdate(*schema.MetricData, int32) error
	Get(string) (IndexEntry, bool)
	Delete(int, string) ([]IndexEntry, error)
	Find(int, string, int64) ([]Node, error)
	List(int) []IndexEntry
	Prune(int, time.Time) ([]IndexEntry, error)
}

In MemoryIdx, change:

type MemoryIdx struct {
	sync.RWMutex
	FailedAdds map[string]error // by metric id
	DefById    map[string]*idx.IndexEntry
	Tree       map[int]*Tree
}

As metricDefs/metricData are added to the index, look up the SchemaId and AggId.

When the schemaId or aggId are needed, they can be fetched from the index efficiently with metricIdx.Get(def.Id); or, if you don't know the def.Id and only know the series name (carbon input), you could use metricIdx.Find(orgId, seriesName, 0).


woodsaj commented Mar 15, 2017

Also please s/schemaI/schemaId/ and s/aggI/aggId/

@Dieterbe
Contributor Author

> only know the series name (carbon input) you could use metricIdx.Find(orgId, seriesName, 0)

We should then also update the idx.Node type, I think: its Defs property should be a []IndexEntry, not []schema.MetricDefinition.
I suspect metricIdx.Find will turn out to be too expensive to run for every point that comes in. The current Find function is optimized for "rich" queries that may have patterns and may want data from org -1; it does a bunch of allocations and processing.
But we could do it like that, and if the numbers confirm a bottleneck, it should be easy to add a new interface function that is a narrower version of Find: one that only does a direct lookup in the Tree, and only for the given org (see the sketch below).

So with that in mind, I'll implement those changes.
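A rough sketch of what that narrower lookup could look like (Tree.Items keyed by full path and Node.Defs holding IndexEntry values are assumptions based on this discussion, not the actual memory index internals):

// GetByPath does a direct tree lookup for one org, skipping the pattern
// matching, org -1 merging and allocations that Find does.
func (m *MemoryIdx) GetByPath(orgId int, path string) (idx.IndexEntry, bool) {
	m.RLock()
	defer m.RUnlock()
	tree, ok := m.Tree[orgId]
	if !ok {
		return idx.IndexEntry{}, false
	}
	node, ok := tree.Items[path]
	if !ok || len(node.Defs) == 0 {
		return idx.IndexEntry{}, false
	}
	return node.Defs[0], true
}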

@Dieterbe
Contributor Author

closing in favor of #570

@Dieterbe Dieterbe closed this Mar 17, 2017
@Dieterbe Dieterbe changed the title from "WIPGraphitey consolidation step 2 and 3" to "WIP Graphitey consolidation step 2 and 3" Apr 4, 2017
@Dieterbe Dieterbe deleted the graphitey-consolidation branch September 18, 2018 09:12