This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Releases: grafana/metrictank

don't use. use 0.8.3 instead

15 Mar 15:18
fb2a833

mostly bugfixes + promql, auto kafka offset, ...

  • fix tagdb routes #846
  • move backend stores into their own package #851
  • fix sortByName calls in dashboard 8ac5a28
  • Initial tag query performance improvements #848
  • offset 'auto' option to set kafka offset to retention - segmentSize in k8s, with fallback to oldest #850 , #861
  • mt-update-ttl multithreaded (much faster) + more #843
  • Added missing batch/agg functions #847
  • experimental promql support #858
  • Make sure rollup data is persisted before GC #840
  • Add support for graphite's summarize #837
  • fix: GC task too eagerly closes chunks. #844

bugfixes

01 Feb 08:21

index

  • split up tags and tree index internally #806
  • add clauses to detect nil Nodes in idx.Find #812

build

  • split up circleCI jobs into parallel workflows #810
  • build metrictank-gcr docker image automatically in MT repo + trigger extra qa checks in separate environment #831
  • mt binaries should be in /usr/bin not /usr/sbin #832

new graphite functions

  • diff, diffSeries, multiply, multiplySeries, rangeOf, rangeOfSeries, stddev, stddevSeries #824
  • exclude, grep #826
  • sortByName #827
  • divideSeriesLists #833
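As an illustration of what one of these functions computes, here is a minimal Python sketch of rangeOfSeries, assuming it follows the Graphite function of the same name (per-timestamp maximum minus minimum across the input series). The helper name `range_of_series` and the list-of-optional-floats series representation are hypothetical; this is not metrictank's Go implementation.

```python
from typing import List, Optional

# A series is modelled as a list of points aligned on the same timestamps;
# None stands for a missing (null) point. This representation is an assumption.
Series = List[Optional[float]]


def range_of_series(series_list: List[Series]) -> Series:
    """Return max - min across all series for each timestamp.

    A timestamp where every series is null yields None.
    """
    out: Series = []
    for points in zip(*series_list):
        present = [p for p in points if p is not None]
        out.append(max(present) - min(present) if present else None)
    return out


a = [1.0, 5.0, None]
b = [4.0, 2.0, None]
c = [2.0, None, None]
print(range_of_series([a, b, c]))  # [3.0, 3.0, None]
```

The sketch assumes all input series have the same length and step; real query engines normalize series before combining them.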

bugfixes

  • initialize logger properly, so you can see config error if that aborts startup. #825
  • fix input (particularly kafka-mdm) exit flow #748
  • Fix broken consolidation #836
  • Pass error back instead of crashing when non-configured, incorrect consolidation function used or unknown aggSpan in GetAggregated(). #839
  • consistent version string generation 99cf721
  • make logging work for upstart 0.6.5

Happy 2018!

30 Jan 10:35
773b420

breaking changes

tag index

MT now has an experimental tag-index built-in (compatible with graphite and we also aim to integrate with prometheus). This comes with an internal schema update. see #729 #749 , #750 , #755 , #759 , #762 , #774 , #779
for this we introduced two new config flags:

  • tag-support : whether to enable tag queries
  • match-cache-size : internal, can be mostly ignored.

note : tags in MetricDefinitions and MetricData are now validated for correctness irrespective of tag-support configuration. invalid incoming metrics are no longer ingested; for each one you will see an "in: Invalid metric" debug log message and an increment of the input.*.metric_invalid metrics.

metric names are now extended with tags, in the memory index, and in all query api output, potentially breaking dashboards (not in the persisted index)
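To make the dashboard impact concrete, here is a small sketch of how a name extended with tags might look. It assumes the Graphite tagged-series convention (name followed by `;key=value` pairs, sorted), which is an assumption about the exact output format; `name_with_tags` is a hypothetical helper, not a metrictank function.

```python
from typing import List


def name_with_tags(name: str, tags: List[str]) -> str:
    """Append "key=value" tags to a metric name, sorted, separated by ';'.

    Assumption: this mirrors Graphite's tagged series naming.
    """
    return ";".join([name] + sorted(tags))


print(name_with_tags("disk.used", ["host=web1", "dc=us-east"]))
# disk.used;dc=us-east;host=web1
```

A dashboard that matches on the exact series name `disk.used` (for example in an alias or override rule) would no longer match the extended name, which is the kind of breakage to look for after upgrading.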

cassandra

omit read requests when they are too old and when the read queue is full

(instead of the previous blocking behavior) #685
config:

  • cassandra-read-queue-size now defaults to 200000 instead of 100. update your value; otherwise you may see read requests dropped too eagerly
  • cassandra-omit-read-timeout (new) setting defaults to 60s

new metrics:

  • store.cassandra.omitted_old_reads
  • store.cassandra.read_queue_full
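The behavior change can be pictured as a bounded queue that rejects instead of blocking, plus an age check when a request is picked up. The following is a minimal sketch of that general technique only; the class and method names are hypothetical and metrictank's actual Go implementation may differ. The two counters are named after the new metrics listed above.

```python
import queue
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReadRequest:
    key: str
    # Time the request entered the queue (monotonic seconds).
    enqueued_at: float = field(default_factory=time.monotonic)


class ReadQueue:
    """Bounded read queue: drop when full, skip requests that waited too long."""

    def __init__(self, size: int, omit_timeout_s: float):
        self._q: "queue.Queue[ReadRequest]" = queue.Queue(maxsize=size)
        self.omit_timeout_s = omit_timeout_s
        self.read_queue_full = 0    # requests rejected because the queue was full
        self.omitted_old_reads = 0  # requests skipped because they were too old

    def submit(self, req: ReadRequest) -> bool:
        """Enqueue without blocking; return False if the request was dropped."""
        try:
            self._q.put_nowait(req)
            return True
        except queue.Full:
            self.read_queue_full += 1
            return False

    def next_fresh(self, now: Optional[float] = None) -> Optional[ReadRequest]:
        """Return the next request that is not too old, or None if the queue is empty."""
        while True:
            try:
                req = self._q.get_nowait()
            except queue.Empty:
                return None
            current = time.monotonic() if now is None else now
            if current - req.enqueued_at > self.omit_timeout_s:
                self.omitted_old_reads += 1
                continue  # the caller has likely given up; skip the read
            return req
```

With a small queue size, bursts of reads overflow quickly and get dropped, which is why the note above recommends revisiting a previously customized cassandra-read-queue-size.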

remove cassandra index pruning

until we figure out a better mechanism. disk usage may grow more if you have heavy churn #765 #800 #816

swim (memberlist) settings

we now allow tweaking many swim settings via a new swim config section, and also the bind-addr property has moved from cluster section to swim section
see #760 in particular ae3c5da and 81eea5f
also relevant is the new gossip-to-the-dead-time setting which can help with recovering from split brains.

non-breaking changes

stats, logging, instrumentation, profiling

  • add opentracing instrumentation using jaeger #709 , #713 , #715, #758 ,
    3057525, 736d2db, #732, ea9f56f
  • fix the cache bug oppressed stats #721
  • consistently report any recoverable runtime faults via metrics and logs. metrictank.stats.$environment.$instance.recovered_errors.*.*.* 760bd06
  • better mutex and block pprof endpoints #737
  • report accurate mean instead of an approximation #744

input plugins

  • carbon : like graphite, strip leading "." if any from metric key #694 . this fixes an index crash bug #668

index

  • find improvements #655
  • stop index pruning from locking the index during the entire operation, which was slowing down requests. #787
  • fix CassandraIdx.Init error handling 31990fb

tools

  • Whisper importer aggregate conversion and various improvements and fixes #712 , #720 , #743, #752, #793, #814
  • mt-replicator-via-tsdb : use the kafkaMdm input plugin to consume from kafka + various clustering changes related to it #723
  • mt-index-cat tags valid filter #817
  • mt-kafka-mdm-sniff-out-of-order improvements #754
  • mt-index-cat functions for showing age and rounding of durations #763
  • deprecate old mt-index tools and remove mt-replicator because it's not reliable. #783

http API

  • proper statuscode when render failed #718
  • add maxSeries #742
  • refactor aggregation function api #771
  • Support for groupByTags and aliasByTags #780
  • remove the adjustment of "from" for clustered requests, for more consistent output #767, and to fix the bug where "include old metricdefs up to 24h" prevents new higher-res data from becoming visible #380
  • properly propagate request cancellations through the cluster and cancel work-in progress. #728
  • graphite-compatible msgpack support #789
  • proxy /functions to graphite #815

dashboard

  • make the dashboard multi-instance capable + separate plots for partition lag vs persist partition lag #722
  • update dashboard 8f356d7

storage & chunk cache

  • make cassandra schema optional via cassandra-create-keyspace flag, useful when provisioning clusters. e9ad4d8 , e9ad4d8, ac401e7, fa88502, 8311277, d16ca0f
  • out of order chunks in chunkCache leading to all kinds of trouble #733
  • fix a config bug that was causing reorderbuffer not to activate #756
  • apply cassandra-timeout setting to insert queries #778
  • Clear cache api #555
  • make reorderbuffer and garbage collection work better together. #781 , fixes crash bug #776
  • update gocql a04083f #778

clustering

  • various cluster readiness / priority / initialisation fixes + new cluster dashboard #717
  • update memberlist to post v0.1.0 7f40597
  • chaos testing #760

meta

  • rebrand: raintank -> GrafanaLabs and repository move #738
  • contribution guidelines #740
  • circleci build optimizations and tweaks. f2df73a, e994819, #757, e833430
  • use dep instead of govendor 28c02a7 (via #760)
  • builds: go1.8 -> 1.9 #712
  • update dependencies leveldb #785, globalconf #786
  • developer documentation #805
  • reorganize the scripts, add various code quality and linting steps to CI, upgrade to circleci 2.0 #803

0.7.4 : point reordering and various improvements and tools

01 Nov 22:55

data server

  • support more from/to specifications, honor timezone controls #682
  • expr panic fix #676
  • allow target[]=foo specification #688
  • measure duration of plan.Run() #689
  • queries should fail when shards are missing. fixes #670. new config option for min-available-shard
  • fix clustered consolidateBy() requests #707
  • support from/to patterns for find query #708
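For the target[]=foo item above: some HTTP clients serialize repeated parameters with a `[]` suffix, so a render endpoint may receive `target`, `target[]`, or both. A minimal sketch of accepting either spelling, using only Python's standard library; `render_targets` is a hypothetical helper and not metrictank's handler code.

```python
from typing import List
from urllib.parse import parse_qs


def render_targets(query_string: str) -> List[str]:
    """Collect render targets given as target=... or target[]=... parameters."""
    params = parse_qs(query_string)
    # parse_qs percent-decodes keys, so "target%5B%5D" also arrives as "target[]".
    return params.get("target", []) + params.get("target[]", [])


print(render_targets("target[]=foo&target=bar&from=-1h"))  # ['bar', 'foo']
```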

cassandra

  • update TWCS settings #690
  • fix clustered scenario schema in cassandra doc (#679)
  • set gc_grace_seconds to the compaction_window_size #700
  • also track writeQueue len on puts. fix #701

index

  • Avoid unnecessary string allocations during memory-idx.Add #687 , #692

reorder buffer (new!)

  • reorder buffer to allow for some data to arrive out of order. #675
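The general idea of a reorder buffer can be sketched as follows: points are held for a fixed window, late points that still fall inside the window are slotted in, and points leave the buffer in timestamp order once the window has moved past them. This is a simplified illustration under assumed semantics (the `window` and `interval` parameters and the class itself are hypothetical), not the implementation from #675.

```python
from typing import Dict, List, Optional, Tuple


class ReorderBuffer:
    """Hold points for `window` intervals so late arrivals can be put in order."""

    def __init__(self, window: int, interval: int):
        self.window = window      # number of intervals to keep buffered
        self.interval = interval  # seconds between points
        self.pending: Dict[int, float] = {}
        self.newest: Optional[int] = None

    def add(self, ts: int, value: float) -> Tuple[bool, List[Tuple[int, float]]]:
        """Add a point; return (accepted, points flushed in timestamp order)."""
        span = self.window * self.interval
        if self.newest is not None and ts <= self.newest - span:
            return False, []  # too old: already outside the window
        self.pending[ts] = value
        if self.newest is None or ts > self.newest:
            self.newest = ts
        cutoff = self.newest - span
        flushed = sorted((t, v) for t, v in self.pending.items() if t <= cutoff)
        for t, _ in flushed:
            del self.pending[t]
        return True, flushed


rb = ReorderBuffer(window=3, interval=10)
rb.add(10, 1.0)
rb.add(30, 3.0)
rb.add(20, 2.0)          # out of order, but still inside the window
print(rb.add(50, 5.0))   # (True, [(10, 1.0), (20, 2.0)])
print(rb.add(10, 9.9))   # (False, [])
```

The trade-off in any such design is that a larger window tolerates later arrivals but delays when points become final and costs more memory per series.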

misc

  • remove usage reporting from MT #666
  • this might fix an OOM problem in cache.searchForward #696

clustering

  • Adds an endpoint to post cluster peers to #680

tools

  • new mt-replicator tool. consumes from local kafka cluster and publishes data to a remote tsdb-gw server. #645, #691
  • Improvements on whisper importer #704

bugfix release

29 Jun 16:59

clustering

  • fix partial data in clustered setups #652

built in graphite processing library

  • fix for hitcount #644
  • support the full char set that graphite supports for metric names #651
  • fix up some target naming #667
  • implement nudging similar to graphite #647

other

  • various small fixes to log messages

0.7.2

24 May 17:21

data server / api

  • support a minimal built-in graphite function processing api, proxying to graphite what we cannot do ourselves.
    currently supports alias, aliasByNode, aliasSub, avg, divideSeries, perSecond, scale, sum, transformNull and consolidateBy (which now intelligently controls runtime consolidation and archive selection for consolidated archives). #575, #623 #637 #640 #641 #643
  • allow metrictank to run as a graphite-web cluster node. by using graphite-web we get more stability and performance compared to graphite-api. #611 #616 #633
  • misc fixes for gzip middleware #619 #621

clustering

  • fix aggregation handling in clustered setups #602
  • fix cluster api errors not logging properly #603
  • configurable "drop first chunk" behavior for write nodes #632

index

  • make index deletes much faster #606, #609

tools etc

  • fix mt-kafka-mdm-sniff crash bug #612
  • add tool to see which metrics are out of order #626
  • default to cassandra tokenaware-hostpool-epsilon-greedy in all tools and when no config file used. #638

build and packaging

  • use go1.8.1 instead of go1.8rc2 #600
  • mark /etc/metrictank as config dir, fixes a bug where package upgrades would overwrite config files #627

also some docs updates

0.7.1

07 Apr 15:16

data server, index and instrumentation

  • be more graphitey in terms of consolidation, and configuration of schemas and retentions. #534 , #570
  • Improve http stats + implement gzip responses #548
  • drastic improvements to how we maintain the index in cassandra (later replaced: #558, #560 , #566) . disable cass updates (for secondary nodes) #565. fix lastSave vs lastUpdate discrepancy #574
  • support msgp output for render responses #562
  • instrument how many series are being filtered from responses #551
  • fix deleting from mixed leaf/branch nodes from index #554
  • replace request limit options with new hard/soft point limits #577
  • support multiple raw intervals per storage schema #588
  • remove seenAfter offset (24h buffer), which sometimes made old series linger annoyingly long (e.g. an old resolution still showing up after a different resolution had been enabled) #595

clustering

  • remove state transfer between nodes #535
  • support configuring http client timeout for inter-cluster requests #542
  • introduce priority system for controlling which instance satisfies requests #541, #546
  • ensure nodes can only connect to clusters with the same clusterName #597

tools

  • make mt-store-cat more flexible and powerful. #590
  • fix mt-kafka-mdm-sniff garbled output. #524
  • add tool for migrating index from one cass cluster to another #527
  • Add tools for cross-cluster split-metric migrations #529
  • add filter options to mt-kafka-mdm-sniff #531
  • mt-replicator improvements #523
  • add mt-update-ttl tool #515
  • whisper import tools #533
  • add tools to explain schemas and aggregations. #591, 1f7160b

docs

  • add an FAQ to docs. #517
  • various doc updates

builds

  • Include tools in all packages / docker image #585
  • set proper exit code when tool building fails #593

0.7.0

07 Feb 19:43

This overview does not cover all changes, just the major ones.
Similarly, there may be more PRs involved in a given feature besides the ones mentioned.

clustering

  • implement sharded cluster using partitioning #400 #472
  • gossip based peer discovery #459
  • process old metricPersist messages before consuming metrics. this makes it easier to safely run statically configured clusters. #452 #485
  • fix metricpersist handling for aggregated metrics #507
  • make secondary nodes also GC #269

storage

  • new chunk format that contains chunkspan. integrates seamlessly with older chunks. #418
  • chunk cache as a more effective way for metrictank to cache hot in-memory data, complements the ringbuffers which can now retain less data #417 #455 #461
  • use different tables based on TTL. #382 #484
  • set default cassandra keyspace to "metrictank" #460
  • update gocql

stats

  • new internal metrics system, replacing statsd[aemon], more performant, and re-organized metrics tree + new dashboard #384
  • instrument kafka-mdm metrics #448

build & deploy

  • golang version 1.6 -> 1.8rc2
  • CI: be more strict (go vet, gofmt -s, vendor health) #405
  • multiple docker stacks for some different scenarios (standard, dev, clustered, etc) with script to import appropriate dashboards etc. #413

index

  • remove branch vs leaf restriction. #490
  • various bugfixes. remove ES index as it was discouraged and not well maintained. better metrics (measure number of entries, split up updates vs adds, sync index ops vs background ops) #504

api

tools

metrictank now comes with helper tools:

inputs

  • carbon-in: better metrics2.0 support #491
  • update sarama kafka library, which fixes compression support #428

config & docs

  • various updates to configs docs
  • default to cassandra index. #411, #449
  • standardize on default raw chunkspan 10min and numchunks 7

0.6.0

18 Nov 07:02
  • adopt semver
  • removal of kafkamdam and nsq input plugin (though nsq as clustering bus is still supported)
  • move http api/listener configuration to new section
  • removal of /get endpoint which is no longer used
  • use /node instead of /api, contains more info
  • huge refactor of api code
  • big refactor of input plugin code (#376)
  • include notifier type in notifier metrics
  • switch default policy to tokenaware,hostpool-epsilon-greedy (#374)
  • optimize CI runs, add benchmarks to CI, optimize chunk totalPoints tracking (#385, #386)
  • support listening on https
  • idx fixes wrt persistence and memory structures getting out of sync when trying to add bad metrics (#398)

0.5.8

15 Nov 12:09
  • support environment variables for configuration, using MT_ prefix
  • fix data sometimes not being properly returned (#333)
  • support for cassandra SSL and auth (#367)
  • use instance-id for statsd metrics, not hostname (#368)
  • script for maintaining ES index (#369)
  • report "Cannot achieve consistency level" errors
  • track GC heap objects. (#340)
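
For the MT_ environment variable support listed above, the sketch below shows one plausible mapping from a config setting to its variable name. The rule used here (upper-case the setting, replace dashes with underscores, prepend the section name if any) is an assumption to verify against the config documentation for your version; `env_var_name` is a hypothetical helper, and the setting names are borrowed from other entries in these notes purely as examples.

```python
def env_var_name(setting: str, section: str = "") -> str:
    """Derive an MT_-prefixed environment variable name for a config setting.

    Assumed rule: MT_ + optional section + setting, upper-cased,
    with dashes replaced by underscores.
    """
    parts = [p for p in (section, setting) if p]
    return "MT_" + "_".join(parts).upper().replace("-", "_")


print(env_var_name("cassandra-read-queue-size"))  # MT_CASSANDRA_READ_QUEUE_SIZE
print(env_var_name("bind-addr", section="swim"))  # MT_SWIM_BIND_ADDR
```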