This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Optionally disable cassidx update #565

Merged: 4 commits, Mar 16, 2017

Conversation

replay
Contributor

@replay replay commented Mar 15, 2017

That's one part of #547

@Dieterbe
Contributor

Dieterbe commented Mar 16, 2017

If !updateCassIdx, there's no need to initialize the writeQueues and start the processWriteQueue routines. (Watch out: Stop() needs to be adjusted too, since closing a nil channel would cause a panic.)
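
A minimal sketch of the point above, assuming a simplified stand-in for CasIdx: the `index` type, its fields, and `newIndex` are hypothetical and not metrictank's actual code. It shows the queue and workers being created only when updates are enabled, and a Stop() that guards the close so a nil channel is never closed.

```go
package main

import (
	"fmt"
	"sync"
)

// index is a hypothetical, stripped-down stand-in for CasIdx.
type index struct {
	writeQueue chan string // stays nil when cassandra updates are disabled
	wg         sync.WaitGroup
	mu         sync.Mutex
	written    []string // records what the workers "wrote"
}

// newIndex only creates the queue and starts workers when updates are enabled.
func newIndex(updateCassIdx bool, numConns, queueSize int) *index {
	idx := &index{}
	if !updateCassIdx {
		return idx
	}
	idx.writeQueue = make(chan string, queueSize)
	for i := 0; i < numConns; i++ {
		idx.wg.Add(1)
		go idx.processWriteQueue()
	}
	return idx
}

// processWriteQueue drains the queue until it is closed.
func (idx *index) processWriteQueue() {
	defer idx.wg.Done()
	for name := range idx.writeQueue {
		idx.mu.Lock()
		idx.written = append(idx.written, name)
		idx.mu.Unlock()
	}
}

// Stop closes the queue only if it exists; close() on a nil channel panics.
func (idx *index) Stop() {
	if idx.writeQueue != nil {
		close(idx.writeQueue)
	}
	idx.wg.Wait()
}

func main() {
	readNode := newIndex(false, 2, 10)
	readNode.Stop() // no queue, no workers, no panic

	writeNode := newIndex(true, 2, 10)
	writeNode.writeQueue <- "some.id.of.a.metric.1"
	writeNode.Stop()
	fmt.Println(len(readNode.written), len(writeNode.written))
}
```

Checking the channel for nil keeps Stop() independent of the flag itself; checking the flag directly would work just as well.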

@@ -390,6 +393,9 @@ func (c *CasIdx) Delete(orgId int, pattern string) ([]schema.MetricDefinition, e
}

func (c *CasIdx) deleteDef(def *schema.MetricDefinition) error {
	if !updateCassIdx {
		return nil
	}
Contributor

I think checks like these belong in the callers (the public functions), not in the internal "worker" functions. This is also more efficient: in Prune, for example, we can simply skip the entire loop of deleteDef calls.
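
To illustrate the suggestion, here is a rough sketch with the check hoisted into the public function. The `index` and `def` types and the `deletes` counter are made up for the example; only the names `updateCassIdx`, `Prune`, and `deleteDef` come from the PR, and the real signatures differ.

```go
package main

import "fmt"

// updateCassIdx mirrors the package-level option discussed in this PR.
var updateCassIdx = false

// def is a hypothetical stand-in for schema.MetricDefinition.
type def struct{ ID string }

// index is a hypothetical stand-in for CasIdx; deletes counts worker calls.
type index struct{ deletes int }

// deleteDef is the internal worker: it no longer checks the flag itself.
func (idx *index) deleteDef(d def) error {
	idx.deletes++
	return nil
}

// Prune is the public entry point and checks the flag once, up front,
// so the whole loop is skipped when cassandra updates are disabled.
func (idx *index) Prune(stale []def) error {
	if !updateCassIdx {
		return nil
	}
	for _, d := range stale {
		if err := idx.deleteDef(d); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	idx := &index{}
	stale := []def{{ID: "a"}, {ID: "b"}}

	_ = idx.Prune(stale)
	fmt.Println("deletes with updates disabled:", idx.deletes)

	updateCassIdx = true
	_ = idx.Prune(stale)
	fmt.Println("deletes with updates enabled:", idx.deletes)
}
```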

Contributor

@Dieterbe Dieterbe left a comment

My comments aside, I think this looks good, and it will be very useful to deploy. I'm not sure the 2nd "remove duplication" commit is still worth it once this is rebased on top of the changes that already simplified the code a bit. Note that you also need to update metrictank-sample.ini and then run scripts/sync-configs.sh.

@replay
Contributor Author

replay commented Mar 16, 2017

Cool, I'll update that. As a general question: if timing stats are collected about how long it takes to update the index, and cassandra index updates are disabled, should these stats still be collected even though they now only cover the memory index updates? For example: https://github.com/raintank/metrictank/blob/a4e8ff546c1a0c74c09cd938d72523ae0f4306ba/idx/cassandra/cassandra.go#L391

I think they should. Obviously the values will drop by a lot once cassandra doesn't need to be updated, but the stats are separated per instance anyway, so they shouldn't skew the other results.

@replay replay force-pushed the optionally_disable_cassidx_update branch 2 times, most recently from fa12a97 to 3a49582 Compare March 16, 2017 17:41
@replay replay force-pushed the optionally_disable_cassidx_update branch from 3a49582 to 972cfd8 Compare March 16, 2017 18:12
@replay
Contributor Author

replay commented Mar 16, 2017

This is ready for review again.

@@ -271,6 +271,8 @@ write-queue-size = 100000
max-stale = 0
#Interval at which the index should be checked for stale series.
prune-interval = 3h
# disable cassandra index updates (for read nodes)
Contributor

this comment is the inverse of what the option actually does.

suggestion:
# synchronize index changes to cassandra. not all your nodes need to do this.

c.writeQueue <- writeReq{recvTime: time.Now(), def: def}
statAddDuration.Value(time.Since(pre))

return
Contributor

these two lines seem pointless

c.MemoryIdx.AddOrUpdateDef(def)
c.writeQueue <- writeReq{recvTime: time.Now(), def: def}
statUpdateDuration.Value(time.Since(pre))
updateIdx = true
Contributor

we can replace 3 lines with one, if we do updateIdx = (existing.LastUpdate < oldest.Unix())
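
As a quick illustration of that simplification: the original three lines are not shown in this thread, so the "verbose" form below is only an assumption about their shape. Both helper names are hypothetical, and `oldest` stands in for `oldest.Unix()`.

```go
package main

import "fmt"

// needsUpdateVerbose is a guess at the shape of the three-line version:
// start from false and flip the flag inside an if.
func needsUpdateVerbose(lastUpdate, oldest int64) bool {
	updateIdx := false
	if lastUpdate < oldest {
		updateIdx = true
	}
	return updateIdx
}

// needsUpdate assigns the comparison result directly, as suggested.
func needsUpdate(lastUpdate, oldest int64) bool {
	updateIdx := (lastUpdate < oldest)
	return updateIdx
}

func main() {
	// Compare both forms for a def older than, equal to, and newer than the cutoff.
	for _, lastUpdate := range []int64{90, 100, 110} {
		fmt.Println(lastUpdate, needsUpdateVerbose(lastUpdate, 100), needsUpdate(lastUpdate, 100))
	}
}
```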

go c.processWriteQueue()
if updateCassIdx {
for i := 0; i < numConns; i++ {
c.wg.Add(1)
Contributor

we can move this to before the loop and make it a single call: c.wg.Add(numConns)
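
A small sketch of that change in isolation; `startWorkers` and its counter are hypothetical and only there to make the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// startWorkers launches numConns goroutines, waits for them, and returns
// how many of them ran.
func startWorkers(numConns int) int64 {
	var wg sync.WaitGroup
	var ran int64

	// One call before the loop instead of wg.Add(1) in every iteration.
	wg.Add(numConns)
	for i := 0; i < numConns; i++ {
		go func() {
			defer wg.Done()
			atomic.AddInt64(&ran, 1)
		}()
	}
	wg.Wait()
	return ran
}

func main() {
	fmt.Println(startWorkers(10))
}
```

Either form is correct as long as Add happens before the goroutine that calls Done is started; the single call just keeps the loop body smaller.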

@replay
Contributor Author

replay commented Mar 16, 2017

Ok, updated everything as requested

@@ -92,7 +92,7 @@ func ConfigSetup() *flag.FlagSet {
 	casIdx.DurationVar(&timeout, "timeout", time.Second, "cassandra request timeout")
 	casIdx.IntVar(&numConns, "num-conns", 10, "number of concurrent connections to cassandra")
 	casIdx.IntVar(&writeQueueSize, "write-queue-size", 100000, "Max number of metricDefs allowed to be unwritten to cassandra")
-	casIdx.BoolVar(&updateCassIdx, "update-cassandra-index", true, "disable cassandra index updates (for read nodes)")
+	casIdx.BoolVar(&updateCassIdx, "update-cassandra-index", true, "synchronize index changes to cassandra. not all your nodes need to do this.")
 	casIdx.DurationVar(&updateInterval, "update-interval", time.Hour*3, "frequency at which we should update the metricDef lastUpdate field.")
Contributor

I think we should clarify, here and in the config comments and docs, that you can also get instant updates.
In that case, what value should the user specify? 0s?

Contributor Author

Yes, just specify 0s, or 0 with any other unit. OK, I'll add that.

Contributor Author

updated the description there and in the example configs.
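
A small sketch of why `0s` and a zero with any other unit behave the same, assuming the option ends up going through Go's standard `flag` duration parsing, as the `DurationVar` call in the diff suggests. The `parseUpdateInterval` helper and its one-option flag set are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseUpdateInterval parses a hypothetical flag set that only knows
// update-interval and returns the resulting duration.
func parseUpdateInterval(args []string) (time.Duration, error) {
	var updateInterval time.Duration
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.DurationVar(&updateInterval, "update-interval", 3*time.Hour,
		"frequency at which we should update the metricDef lastUpdate field, use 0s for instant updates")
	err := fs.Parse(args)
	return updateInterval, err
}

func main() {
	// Any zero duration, regardless of unit, parses to the same value.
	for _, v := range []string{"0s", "0h", "0ms", "4h"} {
		d, err := parseUpdateInterval([]string{"-update-interval=" + v})
		fmt.Printf("%s parsed as %v (instant updates: %t, err: %v)\n", v, d, d == 0, err)
	}
}
```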

@replay replay force-pushed the optionally_disable_cassidx_update branch from 7f9ac15 to b3541aa Compare March 16, 2017 21:14
@Dieterbe
Contributor

OK, if you can confirm '0s' works, then this LGTM.

@replay
Contributor Author

replay commented Mar 16, 2017

Looking good:

I modified idx/cassandra/cassandra.go as follows:

mst@ubuntu:~/go/src/github.com/raintank/metrictank$ git diff idx
diff --git a/idx/cassandra/cassandra.go b/idx/cassandra/cassandra.go
index 8c58537..1a63d1d 100644
--- a/idx/cassandra/cassandra.go
+++ b/idx/cassandra/cassandra.go
@@ -242,8 +242,10 @@ func (c *CasIdx) AddOrUpdate(data *schema.MetricData, partition int32) {
                if existing.Partition == partition {
                        var oldest time.Time
                        if updateInterval > 0 {
+                               fmt.Println(fmt.Sprintf("updating if update interval has been exceeded %s", data.Name))
                                oldest = time.Now().Add(-1 * updateInterval).Add(-1 * time.Duration(rand.Int63n(updateInterval.Nanoseconds()*int64(updateFuzzyness*100)/100)))
                        } else {
+                               fmt.Println(fmt.Sprintf("updating because update interval is 0 %s", data.Name))
                                oldest = time.Now()
                        }
                        updateIdx = (existing.LastUpdate < oldest.Unix())

Then I set update-interval = 0s:

mst@ubuntu:~/go/src/github.com/raintank/metrictank$ git diff scripts/config/metrictank-docker.ini
diff --git a/scripts/config/metrictank-docker.ini b/scripts/config/metrictank-docker.ini
index 9599279..6b8cee7 100644
--- a/scripts/config/metrictank-docker.ini
+++ b/scripts/config/metrictank-docker.ini
@@ -274,7 +274,7 @@ prune-interval = 3h
 # synchronize index changes to cassandra. not all your nodes need to do this.
 update-cassandra-index = true
 #frequency at which we should update the metricDef lastUpdate field, use 0s for instant updates
-update-interval = 4h
+update-interval = 0s
 #fuzzyness factor for update-interval. should be in the range 0 > fuzzyness <= 1. With an updateInterval of 4hours and fuzzyness of 0.5, metricDefs will be updated every 4-6hours.
 update-fuzzyness = 0.5
 # enable SSL connection to cassandra

Then I started MT and grepped its output for "some":

build/metrictank -config scripts/config/metrictank-docker.ini | grep some

Then I used fakemetrics to produce metrics for 10 keys:

~$ fakemetrics -flushPeriod 100 -orgs 1 -keys-per-org 10   -carbon-tcp-address 127.0.0.1:2003

And I'm continuously getting the following output from MT, filtered through grep:

mst@ubuntu:~/go/src/github.com/raintank/metrictank$ build/metrictank -config scripts/config/metrictank-docker.ini | grep some
2017/03/16 15:11:57 [I] Metrictank starting. Built from 0.7.0-114-gb3541aa - Go version go1.7
2017/03/16 15:11:57 [I] CLU Start: Starting cluster on 0.0.0.0:7946
2017/03/16 15:11:57 [I] CLU manager: Node ubuntu with address 127.0.0.1 has joined the cluster
2017/03/16 15:11:57 [I] initializing cassandra-idx. Hosts=127.0.0.1:9042
2017/03/16 15:11:57 [I] API Listening on: http://:6060/
2017/03/16 15:11:57 [I] cassandra-idx Rebuilding Memory Index from metricDefinitions in Cassandra
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx writeQueue handler started.
2017/03/16 15:11:57 [I] cassandra-idx Rebuilding Memory Index Complete. Imported 539. Took 9.469262ms
2017/03/16 15:11:57 [I] metricIndex initialized in 56.268525ms. starting data consumption
2017/03/16 15:11:57 [I] carbon-in: listening on :2003/tcp
2017/03/16 15:11:57 [I] CLU manager: Node ubuntu at 127.0.0.1 has been updated - {"name":"default","version":"0.7.0-114-gb3541aa","primary":true,"primaryChange":"2017-03-16T15:11:57.589988972-07:00","state":1,"priority":0,"started":"2017-03-16T15:11:57.588685675-07:00","stateChange":"2017-03-16T15:11:57.589989053-07:00","partitions":[0],"apiPort":6060,"apiScheme":"http","updated":"2017-03-16T15:11:57.700520943-07:00","remoteAddr":""}
2017/03/16 15:11:59 [I] stats now connected to localhost:2003
updating because update interval is 0 some.id.of.a.metric.1
updating because update interval is 0 some.id.of.a.metric.2
updating because update interval is 0 some.id.of.a.metric.3
updating because update interval is 0 some.id.of.a.metric.4
updating because update interval is 0 some.id.of.a.metric.5
updating because update interval is 0 some.id.of.a.metric.6
updating because update interval is 0 some.id.of.a.metric.7
updating because update interval is 0 some.id.of.a.metric.8
updating because update interval is 0 some.id.of.a.metric.9
updating because update interval is 0 some.id.of.a.metric.10
updating because update interval is 0 some.id.of.a.metric.1
updating because update interval is 0 some.id.of.a.metric.2
updating because update interval is 0 some.id.of.a.metric.3
updating because update interval is 0 some.id.of.a.metric.4
updating because update interval is 0 some.id.of.a.metric.5
updating because update interval is 0 some.id.of.a.metric.6
updating because update interval is 0 some.id.of.a.metric.7
updating because update interval is 0 some.id.of.a.metric.8
updating because update interval is 0 some.id.of.a.metric.9
updating because update interval is 0 some.id.of.a.metric.10
updating because update interval is 0 some.id.of.a.metric.1
updating because update interval is 0 some.id.of.a.metric.2
updating because update interval is 0 some.id.of.a.metric.3
updating because update interval is 0 some.id.of.a.metric.4
updating because update interval is 0 some.id.of.a.metric.5
updating because update interval is 0 some.id.of.a.metric.6
updating because update interval is 0 some.id.of.a.metric.7
updating because update interval is 0 some.id.of.a.metric.8
updating because update interval is 0 some.id.of.a.metric.9
updating because update interval is 0 some.id.of.a.metric.10

@replay replay changed the title [WIP] Optionally disable cassidx update Optionally disable cassidx update Mar 16, 2017
@replay replay merged commit 7ebbd51 into master Mar 16, 2017
@Dieterbe Dieterbe deleted the optionally_disable_cassidx_update branch September 18, 2018 09:00