kafka offset for destination and source #1036

Open
gabrieljones opened this issue Jun 16, 2023 · 9 comments

@gabrieljones

gabrieljones commented Jun 16, 2023

What are you trying to achieve?

The Kafka offset needs the same source and destination treatment as the partition:
See open-telemetry/opentelemetry-specification@52a3589#diff-bcd74adc2dce1c2c1808237660730119bae461b333c2adc7829d430e38adb15eR384-R385

What did you expect to see?

| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `messaging.kafka.destination.partition` | int | Partition the message is sent to. | `2` | Recommended |
| `messaging.kafka.source.partition` | int | Partition the message is received from. | `2` | Recommended |
| `messaging.kafka.destination.offset` | int | Offset of the message within the partition it is sent to. | `42` | Recommended |
| `messaging.kafka.source.offset` | int | Offset of the message within the partition it is received from. | `42` | Recommended |

Additional context.

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md#apache-kafka

Related:
open-telemetry/opentelemetry-specification#2957

@gabrieljones
Author

Looks like this example also needs to be brought in sync with the recent changes

@gabrieljones
Author

Is there documentation that covers whether it is ever appropriate to have a single Span that covers both a consume and a produce? Related to that, what if a single Span covers a microbatch that consumes and/or produces multiple messages?

@gabrieljones
Author

gabrieljones commented Jun 19, 2023

If I want to search for the produce and consume spans for a given offset and partition, I would need to do something like

(messaging.kafka.destination.partition == 2 || messaging.kafka.source.partition == 2) &&
(messaging.kafka.destination.offset == 42 || messaging.kafka.source.offset == 42)

If a use case dictates that a single Span should never cover both a consume and a produce, these could be changed back to

messaging.kafka.partition
messaging.kafka.offset

then the search to find both sides would be simplified to

messaging.kafka.partition == 2 && messaging.kafka.offset == 42

There are pros and cons to both approaches. Should Semantic Conventions support both use cases? Or does source/destination heavily outweigh the simplified search?
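
To make the comparison concrete, here is a minimal Python sketch of both searches. The span records are hypothetical plain dicts (not an OpenTelemetry SDK or backend API), and `match_split`, `match_unified`, and `to_unified` are illustrative helper names, not part of any convention.

```python
# Hypothetical span records: plain dicts mapping attribute name -> value.
spans = [
    {"name": "orders publish",
     "messaging.kafka.destination.partition": 2,
     "messaging.kafka.destination.offset": 42},
    {"name": "orders receive",
     "messaging.kafka.source.partition": 2,
     "messaging.kafka.source.offset": 42},
    {"name": "orders publish",
     "messaging.kafka.destination.partition": 2,
     "messaging.kafka.destination.offset": 43},
]


def match_split(span, partition, offset):
    """Source/destination naming: each side has to be checked separately."""
    partition_ok = (
        span.get("messaging.kafka.destination.partition") == partition
        or span.get("messaging.kafka.source.partition") == partition
    )
    offset_ok = (
        span.get("messaging.kafka.destination.offset") == offset
        or span.get("messaging.kafka.source.offset") == offset
    )
    return partition_ok and offset_ok


def to_unified(span):
    """Rename split attributes to messaging.kafka.partition / .offset."""
    return {
        key.replace(".destination.", ".").replace(".source.", "."): value
        for key, value in span.items()
    }


def match_unified(span, partition, offset):
    """Unified naming: one comparison per attribute."""
    return (
        span.get("messaging.kafka.partition") == partition
        and span.get("messaging.kafka.offset") == offset
    )


split_hits = [s["name"] for s in spans if match_split(s, 2, 42)]
unified_hits = [s["name"] for s in map(to_unified, spans) if match_unified(s, 2, 42)]
```

Both searches find the same produce and consume spans here; the difference is only in how many attribute names the query has to mention.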

@lmolkova
Contributor

lmolkova commented Jun 19, 2023

@gabrieljones we're unifying source and destination (by removing the source namespace) in #100. Would that resolve your concern?

@gabrieljones
Author

@lmolkova Is there any documentation that speaks to what happens if you have a span that covers both an inbound consume and an outbound publish? Or if such a span is an anti-pattern?

@gabrieljones
Author

gabrieljones commented Jun 30, 2023

offset should be adjacent to partition in the namespace.
As of #100 we have

| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `messaging.kafka.destination.partition` | int | Partition the message is sent to. | `2` | Recommended |
| `messaging.kafka.message.offset` | int | The offset of a record in the corresponding Kafka partition. | `42` | Recommended |

semantic_conventions/messaging.md#apache-kafka

key also is ... hmmm
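
As a rough sketch of what the search looks like with the attribute names as of #100, assuming both producer and consumer spans carry the same two attributes (the span records and the `kind` field below are hypothetical plain dicts, not a real backend's query API):

```python
PARTITION = "messaging.kafka.destination.partition"
OFFSET = "messaging.kafka.message.offset"

# Hypothetical span records for illustration only.
spans = [
    {"kind": "PRODUCER", PARTITION: 2, OFFSET: 42},
    {"kind": "CONSUMER", PARTITION: 2, OFFSET: 42},
    {"kind": "CONSUMER", PARTITION: 3, OFFSET: 42},
]


def find_spans(records, partition, offset):
    """Return every span that refers to the given partition and offset."""
    return [
        s for s in records
        if s.get(PARTITION) == partition and s.get(OFFSET) == offset
    ]


both_sides = find_spans(spans, 2, 42)
```

The two attributes live in different namespaces, but the query is still a single pair of comparisons.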

@lmolkova
Contributor

> @lmolkova Is there any documentation that speaks to what happens if you have a span that covers both an inbound consume and an outbound publish? Or if such a span is an anti-pattern?

You can find recommendations on span structure in https://github.com/open-telemetry/oteps/blob/main/text/trace/0220-messaging-semantic-conventions-span-structure.md

Having just one span describe two operations at once is not how you would typically trace things with OpenTelemetry.
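
One way to see the practical problem, sketched with hypothetical plain-dict span records (not the OpenTelemetry SDK) and assuming the attribute names as of #100: a span holds one value per attribute key, so a single span covering both a consume and a produce cannot record both offsets under the same name.

```python
def span_for(kind, partition, offset):
    """Build a hypothetical span record for one messaging operation."""
    return {
        "kind": kind,
        "messaging.kafka.destination.partition": partition,
        "messaging.kafka.message.offset": offset,
    }


# One span per operation: both offsets survive.
separate = [span_for("CONSUMER", 2, 42), span_for("PRODUCER", 0, 7)]

# One span for both operations: the second write overwrites the first.
combined = {}
combined.update(span_for("CONSUMER", 2, 42))
combined.update(span_for("PRODUCER", 0, 7))
```

In the `combined` record the consumed offset is gone, which is one reason to keep a span per operation.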

@pyohannes
Contributor

> offset should be adjacent to partition in the namespace.

I don't share that understanding. Attributes under messaging.kafka.destination.* and messaging.destination.* should uniquely define a destination.

We wouldn't talk about different destinations when having different offsets (otherwise every message would have a different destination), however, currently we treat different partitions (in the same topic) as different destinations.
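
A small sketch of that reading, assuming a destination's identity is exactly the set of attributes under those two prefixes. The helper `destination_identity` and the sample values are hypothetical, and `messaging.destination.name` is used here only as an example of a generic destination attribute:

```python
DESTINATION_PREFIXES = ("messaging.destination.", "messaging.kafka.destination.")


def destination_identity(attributes):
    """Collect only the attributes that describe the destination."""
    return tuple(sorted(
        (key, value)
        for key, value in attributes.items()
        if key.startswith(DESTINATION_PREFIXES)
    ))


message_a = {
    "messaging.destination.name": "orders",
    "messaging.kafka.destination.partition": 2,
    "messaging.kafka.message.offset": 42,
}
# Same topic and partition, different offset.
message_b = dict(message_a, **{"messaging.kafka.message.offset": 43})
# Same topic, different partition.
message_c = dict(message_a, **{"messaging.kafka.destination.partition": 3})

same_destination = destination_identity(message_a) == destination_identity(message_b)
different_destination = destination_identity(message_a) != destination_identity(message_c)
```

Under this assumption, moving the offset into the destination namespace would make `message_a` and `message_b` count as different destinations.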

@gabrieljones
Author

gabrieljones commented Sep 20, 2023

> > offset should be adjacent to partition in the namespace.
>
> I don't share that understanding. Attributes under messaging.kafka.destination.* and messaging.destination.* should uniquely define a destination.
>
> We wouldn't talk about different destinations when having different offsets (otherwise every message would have a different destination), however, currently we treat different partitions (in the same topic) as different destinations.

I suppose it is indeed the Kafka client that chooses the partition and the Kafka broker that reports back the offset, though partition selection can be completely hidden from the user. However, the offset only has meaning in the context of the selected partition. This contextual relationship led to my intuition that offset, partition, and key should be adjacent in the namespace. But I will concede that intuition alone is insufficient motivation.
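
A toy model of that division of labor, as a sketch only: `ToyBroker` and `choose_partition` are hypothetical names, the broker is an in-memory list per partition, and the CRC32-based partitioner is a stand-in for illustration, not the Kafka client's actual default partitioner.

```python
import zlib


class ToyBroker:
    """In-memory stand-in for a broker: one append-only log per partition."""

    def __init__(self, num_partitions):
        self.logs = [[] for _ in range(num_partitions)]

    def append(self, partition, value):
        """Store the value and report back its offset within that partition."""
        log = self.logs[partition]
        log.append(value)
        return len(log) - 1


def choose_partition(key, num_partitions):
    """Client-side choice: hash the key (CRC32 here, purely illustrative)."""
    return zlib.crc32(key) % num_partitions


broker = ToyBroker(num_partitions=3)
records = []
for key, value in [(b"user-1", "a"), (b"user-2", "b"), (b"user-1", "c")]:
    partition = choose_partition(key, 3)       # decided by the client
    offset = broker.append(partition, value)   # reported by the broker
    records.append((key, partition, offset))
```

The offset in each record only identifies a message together with its partition: `broker.logs[partition][offset]`.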

trask transferred this issue from open-telemetry/opentelemetry-specification May 14, 2024