Push chunks of hot metrics into cache when they get completed by AggMetric
#461
Conversation
Force-pushed AggMetric from f8c494f to a8106e4
Force-pushed AggMetric from b9497bc to af2942e
	testMetricPersistOptionalPrimary(t, true)
}

func TestMetricPersistNotBeingPrimary(t *testing.T) {
can just call this Secondary
@@ -6,6 +6,7 @@ import (

type Cache interface {
	Add(string, uint32, chunk.IterGen)
	CacheIfHot(string, uint32, *chunk.IterGen)
why a pointer to IterGen? this type is tiny and we can just copy it by value. semantically it also makes more sense to me: a pointer implies we may make changes to the IterGen.
You're right that it's tiny for a struct, but I'd guess it's still significantly bigger than a pointer, no? I can change it to passing by value anyway; it shouldn't really make a big difference.
it's ok that the type is larger than the pointer. the problem with pointers is more about the dereferencing overhead and cache misses causing stalls (I don't have evidence to back this up, more like a gut feeling)
ts := uint32(1000)
for i := uint32(0); i < chunkAddCount; i++ {
	agg.Add(ts, 1)
	ts = ts + chunkSpan
can do ts += chunkSpan
	timeout <- true
}

for i := uint32(0); i < chunkAddCount-1; i++ {
how about we create a mockcache, similar to devnullstore, maybe devnullcache, it could take care of all the counting and the timeouting.
for i := uint32(0); i < chunkAddCount-1; i++ {
	go oneSecTimeout()
	select {
	case <-timeout:
can just use https://golang.org/pkg/time/#After here instead of oneSecTimeout
}

// if the previous chunk is not cached we consider the metric not hot enough to cache this chunk
if met.lastTs() < itergen.Ts() {
this only works reliably for span aware chunks right? (which would be OK) but maybe worth mentioning.
@@ -52,6 +52,126 @@ func getConnectedChunks(metric string) *CCache {
	return cc
}

// test AddIfHot method without passing a previous timestamp on a hot metric
func TestAddIfHotWithoutPrevTsOnHotMetric(t *testing.T) {
these tests look sweet. I like how you test all 4 scenarios.
my main problem with this approach is the CacheCb type. instead of passing the CacheCb method around into AggMetric, AggMetrics, Aggregator, etc, why not just a reference to the actual cache? if the idea is to make it explicit that those things will only invoke a subset of cache's functionality, we should consider creating an extra interface like so:
any cache that implements our Cache interface - in particular our CCache - also implements the above interface, so it can be used. this would be similar to the go standard library, where you may pass around things like os.File - which implements io.Reader, io.Writer, io.Closer and a bunch more - but the functions that execute reads only declare the interface with the subset of the functions they need (e.g. https://golang.org/pkg/io/#ReadFull )
Actually I think in the current setup this change decreases the accuracy of the LRU eviction mechanism, because every time a chunk gets added to the cache it gets added to the front of the LRU. Imagine this case:
The underlying problem is that when these chunks get pushed into the cache, they get pushed to the front of the LRU, as if they had just been touched. So I think maybe they shouldn't be pushed to the front of the LRU, but to the end. What do you think about that?
let me know if i misunderstand.
IMHO this is acceptable / pretty normal behavior. In the normal case, if you run your cache very close to max size, it can happen that as you view "a lot of metrics in set B" and it auto-adds all those chunks to the cache, it evicts chunks of set A, even though you'd like A to remain cached. Your proposal sounds good though; I think the cache as is works well enough, but if it's easy to implement go for it. it should give an improvement in those rare cases.
Yeah, that's right. At the moment the rules are simple:
The second rule has the side effect that if i That stuff gets complicated to explain.
this alone is a strong enough reason to just leave it as is, I think. I would say let's just keep it simple and as-is. I think the eviction caused by AddIfHot (the first problem) is easy to reason about and kind of expected when you run a cache that close to its RAM limits.
@Dieterbe yeah i think you're right. I'd better keep it simple and leave it as it is until (if) what i described becomes a problem.
@Dieterbe regarding your comment with the
Force-pushed from 516f6b4 to ed346b2
@Dieterbe i changed the callback into a separate interface called
Force-pushed from f46ac80 to 6ad0080
@Dieterbe if you get a chance could you look at that again please? otherwise it will turn into rebase hell
- does aggmetric call the cache push callback every time a metric gets evicted from the ring buffer?
- does aggmetric add the chunk into the store if the node is primary?
- makes the devnullStore count how many times its Add() got called
- does aggmetric not add the chunk into the store if the node is not primary?
When AggMetric completes a chunk it will now get passed into a callback that refers to Cache.CacheIfHot. This happens in any case, no matter whether the node is cluster primary or not. Cache.CacheIfHot takes an IterGen that should be cached only if its metric is hot. In this context "hot" means that the previous chunk within that metric is currently cached.