-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cleanup and adding support for sharding based on metric name #3170
cleanup and adding support for sharding based on metric name #3170
Conversation
plugins/outputs/kinesis/kinesis.go
Outdated
u := uuid.NewV4() | ||
return u.String() | ||
} | ||
if k.PartitionKey != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only part I'm not sure of, what does toml default too if a value isn't set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The struct will have the zero value of the type. For strings it will be "". I assume this is the reason why the current ordering for assigning the final partitionKey was chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening a pull request.
I'm not much of a Kinesis user, so could you explain why using the metric name is advantageous over using a random partition key?
One thing I'm concern about is that previously if no partition key was set and use_random_partitionkey was false, all metrics would be added to a single partition, which I believe would allow ordering when reading the metrics. This could still be done by simply setting the partition key to a non empty value, but would be a change in behavior.
plugins/outputs/kinesis/kinesis.go
Outdated
u := uuid.NewV4() | ||
return u.String() | ||
} | ||
if k.PartitionKey != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The struct will have the zero value of the type. For strings it will be "". I assume this is the reason why the current ordering for assigning the final partitionKey was chosen.
For our use case we have more metrics then we can write to a single shard without being throttled. By using the metric name as the partition key we can spread our metrics across multiple shards but since all metrics with the same name are still mapped to the same shard we can continue to guarantee ordering. |
That makes sense, though I'm still concerned about the change in behavior and also how complicated the different options are becoming, as you need to understand the fallback ordering between the options. I wonder if we need a more extensible way to specify partition keys, what do you think about this idea?: ## The partition key can be calculated using one of several methods:
##
## Use a static value for all writes:
# [outputs.kinesis.partition]
# method = "static"
# key = "howdy"
#
## Use a random partition key on each write:
# [outputs.kinesis.partition]
# method = "random"
#
## Use the measurement name as the partition key:
# [outputs.kinesis.partition]
# method = "measurement"
#
## Use the value of a tag for all writes, if the tag is not set the empty
## string will be used:
# [outputs.kinesis.partition]
# method = "tag"
# key = "host" If we did this, the current keys would need to be deprecated and they would only be used if there was no partition table specified. |
We could do that, it does mean a breaking config change but does make it much more flexible for the future. |
If we do this what is important is that when upgrading Telegraf the old config continues to work as before (bonus points if we add a warning on startup), and that newly created config files point people to use the new method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was fast! Here are a few things that need tweaked:
plugins/outputs/kinesis/kinesis.go
Outdated
@@ -159,10 +190,37 @@ func writekinesis(k *KinesisOutput, r []*kinesis.PutRecordsRequestEntry) time.Du | |||
if err != nil { | |||
log.Printf("E! kinesis: Unable to write to Kinesis : %+v \n", err.Error()) | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove
plugins/outputs/kinesis/kinesis.go
Outdated
## string will be used: | ||
# [[outputs.kinesis.partition]] | ||
# method = "tag" | ||
# key = "host" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use 2 space indent in the sample config.
plugins/outputs/kinesis/kinesis.go
Outdated
# | ||
## Use the value of a tag for all writes, if the tag is not set the empty | ||
## string will be used: | ||
# [[outputs.kinesis.partition]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make these all single bracket options so you can only set one.
fixing sample config
removing old doc
Required for all PRs: