-
Notifications
You must be signed in to change notification settings - Fork 105
Conversation
3f6d830
to
fbdb56e
Compare
"tags in main index" is quite vague. can you explain in detail what the intended goal is (afaik, change the paths of metrics in the memory index - but not the persistent index - to include the |
idx/memory/memory.go
Outdated
@@ -265,7 +265,8 @@ func (m *MemoryIdx) Load(defs []schema.MetricDefinition) int { | |||
} | |||
|
|||
func (m *MemoryIdx) add(def *schema.MetricDefinition) idx.Archive { | |||
path := def.Name | |||
path := def.FullNameWithTags() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems wasteful to save an extra copy of everything when realistically the Name
and Tags
could all reference the one string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very good idea, going to try that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good point, i wonder if it makes sense to use 1 backing string and either parse out substrings on-demand, or maintain a list of index integers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's true. Could trim off the half the slice size because we don't need to store a pointer, just a start and end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that if we would store name and tags in only one string and then keep the positions where name&tags start/end as integers, then the Name
could not be accessible as a simple property anymore unless we make the tags part of its value. We could easily add methods Name()
and Tags()
to extract those substrings, but that would then change the interface.
So I think probably the best would be to use one string that contains the full name, including tags, and then slice it up into the Name
and Tags
properties, even though that's resulting in more pointers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like that? or did i make it unnecessarily complicated?
raintank/schema@bc847ce
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then the Name could not be accessible as a simple property anymore unless we make the tags part of its value. We could easily add methods Name() and Tags() to extract those substrings, but that would then change the interface.
This is not a problem if we're confident that it is results in significant performance improvement.
That said i'm not convinced that at this point it's worth the time spend to figure out whether or not it would result in a significant performance improvement. We can always push that for later.
One thing I would add though is that if a function/method implementation is simple enough (e.g. nothing more than returning a property value or maybe a few lines more) than the compiler will just inline that function and there is no function call overhead. but it can only do this when the type is fixed at compile time (not an interface type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like that? or did i make it unnecessarily complicated?
raintank/schema@bc847ce
it looks more complicated because a var was renamed from buffer to idBuffer. if it kept the name it would be more easy to follow. otherwise it looks pretty good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dieterbe I updated it again and moved the deduplication into a separate function. That way it can also be used in other places, like for example after loading the definitions from the cassandra index, without side effects.
It seems like graphite.go::findTreeJson will need a little modifying as well. If the tags contain '.', it will mess up the result. |
c3c9d35
to
4711e3b
Compare
So, who is responsible for the |
is there still a use for the name attribute without tags? |
@Dieterbe from the top of my head i can't think of a reason why we still need the |
i don't understand. i'm basically trying to say why do we even need a new nameWithTags attribute, and then update code to use that attribute instead of name, we can just start storing the tags in the name string, which is what you seem to want. |
@Dieterbe yeah true, could do that. then we'd have to update carbon-relay-ng again to include the tags inside the |
yes. if the consensus is that in MT the tags should be in the name (which I don't have strong opinions about), then it should also be the case in other tools. at least that's my half-educated opinion. maybe @shanson7 or @DanCech have input. |
@Dieterbe - Are you recommending that the schema be changed to remove |
my 3 above comments are all about "why do we need 2 name attributes, one with and without tags encoded in it, it seems we only need the one that has the tags encoded in them" I don't think at this point we should remove the Tags attribute (the slice), we may refactor (optimize) our data structures down the road and a more efficient way to encode tags could be part of it, but i'm not worried about that just yet. to your point, it seems sensible to distinguish the datastructures for ingest and internal storage, currently we're fine with using the same for both and get by fine, though our mdm input format is definitely neither memory, nor disk, nor cpu friendly and is due for replacement. (see also #199) |
Ah, now I'm picking up what you're laying down. I will say that my local workaround while waiting on this PR did involve just encoding the tags into the |
after some slack discussion:
see also raintank/schema#12 |
idx/memory/memory.go
Outdated
@@ -64,6 +64,30 @@ type TagIDs map[idx.MetricID]struct{} // set of ids | |||
type TagValue map[string]TagIDs // value -> set of ids | |||
type TagIndex map[string]TagValue // key -> list of values | |||
|
|||
func (t *TagIndex) AddTagId(name, value string, id idx.MetricID) { | |||
ti := *t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if we don't do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ go build -ldflags '-X main.GitHash=0.7.4-313-g4aece03' -o /home/mst/documents/code/go/src/github.com/grafana/metrictank/scripts/../build/metrictank
# github.com/grafana/metrictank/idx/memory
idx/memory/memory.go:68:15: invalid operation: t[name] (type *TagIndex does not support indexing)
idx/memory/memory.go:69:4: invalid operation: t[name] (type *TagIndex does not support indexing)
idx/memory/memory.go:71:15: invalid operation: t[name] (type *TagIndex does not support indexing)
idx/memory/memory.go:72:4: invalid operation: t[name] (type *TagIndex does not support indexing)
idx/memory/memory.go:74:3: invalid operation: t[name] (type *TagIndex does not support indexing)
idx/memory/memory.go
Outdated
ti[name][value][id] = struct{}{} | ||
} | ||
|
||
func (t *TagIndex) DelTagId(name, value string, id idx.MetricID) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function (and the Add one) don't need to be exported I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 107aef7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few minor comments but looks good overall.
@Dieterbe fixed |
We need to make it possible to query tagged series. This PR makes the
;tag=value
suffix part of the name of the leaf node, so metrics can be queried by their full name such asa.b.c;a=a;b=b;c=c
.This is necessary because we need a way to uniquely identify metrics when querying them. When graphite receives a
seriesByTag()
query it will now do a/find
on the tag index of MT, which results in a list of full series names (including the tags), so they are uniquely identified. Then it queries all those returned series:Depends on raintank/schema#12