Adding support for Kafka partition key (based on OTEL attribute value) #29433
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Hello @nirb-s1, from my understanding this sounds like a valid request. The exporter currently relies on the sarama package, which exposes this option. We'd simply add the specified partition key when marshalling messages. I don't have enough context to answer how failures should be handled, so I'll have to defer to others there. I also don't have a lot of Kafka experience in general, so the code owners may overrule what I've said here in case I've misunderstood something.
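A minimal sketch, assuming the exporter keeps using sarama, of how a configured resource attribute could be turned into the message key. The helper names `keyFromResource`, `buildMessage`, and the `attrKey` parameter are illustrative only, not the exporter's actual code:

```go
package kafkasketch

import (
	"github.com/IBM/sarama" // older exporter versions used github.com/Shopify/sarama
	"go.opentelemetry.io/collector/pdata/pcommon"
)

// keyFromResource looks up the configured attribute on the resource and
// returns its string form, or "" when the attribute is absent.
// Hypothetical helper for illustration only.
func keyFromResource(res pcommon.Resource, attrKey string) string {
	if v, ok := res.Attributes().Get(attrKey); ok {
		return v.AsString()
	}
	return ""
}

// buildMessage attaches the key to the produced message; sarama's default
// hash partitioner then routes identical keys to the same partition.
func buildMessage(topic string, payload []byte, key string) *sarama.ProducerMessage {
	msg := &sarama.ProducerMessage{
		Topic: topic,
		Value: sarama.ByteEncoder(payload),
	}
	if key != "" {
		msg.Key = sarama.StringEncoder(key)
	}
	return msg
}
```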
I'll have a crack at this. Will open up a PR soon.
thanks @VihasMakwana for helping here, this will be highly appreciated and can save us tons of complexity and cost handling de-duplication situations.
btw, another idea / ask related to this feature is to be able to use the metric timestamp in the partition key somehow, so we can use Kafka topic compaction
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
@VihasMakwana thanks for jumping on this, I see that it will be auto-closed if no one works on it, did you have a chance to check it?
Hi, sorry about it. |
**Description:** Add resource-attribute-based partitioning for OTLP metrics. In our backend we really need the ability to distribute metrics based on resource attributes. For this I added an additional flag to the configuration. Some code from trace partitioning by traceId is reused. Judging by the issues, this feature is anticipated by several more people.

**Link to tracking Issue:** [31675](#31675) Additionally this feature was mentioned in these issues: [29433](#29433), [30666](#30666)

**Testing:** Added tests for the hashing utility. Added tests for marshalling, asserting correct keys and the number of messages. Tested locally with host metrics and a chained OTLP metrics receiver.

**Documentation:** Changelog entry. The flag is added to the KafkaExporter doc.

---------

Co-authored-by: Curtis Robert <crobert@splunk.com>
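For illustration, a collector configuration enabling the flag added by that PR might look roughly like the snippet below; the broker address and topic name are placeholders, and the option name follows the `partition_metrics_by_resource_attributes` flag referenced later in this thread:

```yaml
exporters:
  kafka:
    brokers: ["localhost:9092"]   # placeholder broker
    topic: otlp_metrics           # placeholder topic
    # Hash the resource attributes into the Kafka message key so that
    # metrics from the same resource land on the same partition.
    partition_metrics_by_resource_attributes: true
```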
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
A question about this issue. The new attribute partition_metrics_by_resource_attributes still does not allow selecting a specific way of determining keys; it instead creates a hash based on the resource attributes. Are there plans to add the ability to define the exact way in which the key is derived from the message (like in the example provided in the original ask, with "os.hostname")?

The reason I need it is that in the project I'm working on, there is one place that receives info about some objects from various different sources (some of them don't use OTel at all, though the ones that don't use sarama directly) and then acts on the information it receives. However, that one place needs to be scaled into several pods, and each pod will have a consumer in a consumer group. We want to make sure that all messages about the same object come to the same pod. So we wanted all producers to send their messages with the object's name (a unique identifier) as the key, and then they would all go into the same partition (sarama's default partitioner is a deterministic hash based on the key). But the current implementation doesn't allow for that.

@open-telemetry/collector-contrib-triagers
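For context, a minimal sketch of the producer-side pattern described above (the object's name as the message key, with sarama's default hash partitioner doing the routing); the broker address, topic, and key values are placeholders:

```go
package main

import (
	"log"

	"github.com/IBM/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Producer.Return.Successes = true // required by SyncProducer
	// The default partitioner already hashes the key; setting it
	// explicitly here is only for clarity.
	cfg.Producer.Partitioner = sarama.NewHashPartitioner

	producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	// All messages keyed with the same object name go to the same partition,
	// and therefore to the same consumer in a consumer group.
	partition, offset, err := producer.SendMessage(&sarama.ProducerMessage{
		Topic: "objects",                        // placeholder topic
		Key:   sarama.StringEncoder("object-a"),  // object's unique name
		Value: sarama.StringEncoder(`{"state":"updated"}`),
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("delivered to partition %d at offset %d", partition, offset)
}
```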
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been closed as inactive because it has been stale for 120 days with no activity.
Component(s)
exporter/kafka
Is your feature request related to a problem? Please describe.
Currently the Kafka exporter doesn't provide a way to define a partition key at all, which means that messages aren't ordered in a Kafka topic if the topic includes multiple partitions.
It also means that if metrics are re-sent due to timeouts, we are required to handle duplications on the consumer side instead of using Kafka's topic compaction option, which is also based on the partition key.
Describe the solution you'd like
Define in the Kafka exporter configuration an attribute that will be used as the partition key; if not set, use the existing default behaviour.
Example:
Use the attribute os.hostname as the partition key; then all metrics for the same hostname will eventually arrive at the same consumer. A rough configuration sketch follows below.
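As an illustration only, such a configuration could look roughly like the snippet below; `partition_key_attribute` is a hypothetical option name invented for this example, not an existing exporter setting, and the broker and topic are placeholders:

```yaml
exporters:
  kafka:
    brokers: ["localhost:9092"]   # placeholder broker
    topic: otlp_metrics           # placeholder topic
    # Hypothetical option: use this resource attribute's value as the
    # Kafka message key, so all metrics with the same os.hostname share
    # a partition.
    partition_key_attribute: os.hostname
```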
Describe alternatives you've considered
Without such capabilities, we are required to build dedup capabilities on the consumer side.
Additional context
No response