Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-1444: Listener Observability Initial Commit #1500

Merged
merged 31 commits into from
Sep 19, 2022

Conversation

garyrussell
Copy link
Contributor

garyrussell referenced this pull request in garyrussell/spring-amqp Sep 7, 2022
- expand scope to include error handler: spring-projects#1287
- tracing can't be used with a batch listener (multiple messages in listener call)

@Override
public String getContextualName() {
return "RabbitListener Observation";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this in the PropagatingSenderTracingObservationHandler:

span.name(context.getContextualName() != null ? context.getContextualName() : context.getName());

so, the plain name (in your case spring.rabbit.listener) is not going to be visible on a span.
Or there is another logic in the super interface:

    /**
     * Get the span name.
     * @param context handler context
     * @return name for the span
     */
    default String getSpanName(T context) {
        String name = context.getName();
        if (StringUtils.isNotBlank(context.getContextualName())) {
            name = context.getContextualName();
        }
        return SpanNameUtil.toLowerHyphen(name);
    }

So, we need to consult with @marcingrzejszczak to be sure where to use one or another, or both.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard name is the name you'll eventually see as a metric name (e.g. metrics.rabbitmq.listener - if we want to align against sth like OpenTelemetry metrics https://opentelemetry.io/docs/reference/specification/metrics/semantic_conventions/instrumentation/kafka/ cc @shakuzen). The contextual name will be translated to a span name. That could be done also in accordance with OpenTelemetry trace semantic conventions (https://opentelemetry.io/docs/reference/specification/trace/semantic_conventions/messaging/#span-name) and could be set to <destination name> <operation name> (that's why you need to have information from the context to set the name - this is why we called it contextual name)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this sounds like that listenerId has to be present in this contextual name instead of the tag (or in addition to).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? For the standard name that would be literally metrics.rabbitmq.listener. For the contextual name you'd need to pass to the context the destination name (queue from rabbit template?) and operation name like (send / receive)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also see that there is a Description in those metrics. And we indeed add something like that to the Timer.
But I couldn't find anything similar in the Observation API.
I may guess that this contextualName may be mapper.
But all of these are just speculations from my ignorance.

(Sorry for hijacking this PR for not related discussion 😄 )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we do instrument at a different place I wonder if that will change anything for tools like Zipkin. What I mean is that if we have a RequestReply case, that implies the kind type to be CLIENT and SERVER. Zipkin never assumed that there will be a broker in between. For messaging theoretically you should always be using the SenderContext and ReceiverContext cause those use kind PRODUCER and CONSUMER that are desgined for messaging. I think we should NOT be using the RequestReply... versions cause we can break the latency visualization tools (I'm talking here about tracing only of course) cc @shakuzen @jonatan-ivanov

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also wanted to say that request-reply is an artificial feature for Messaging per se.
It is always a produce on one side and consumer on another. And then we got opposite direction for a reply process.
It might look strange for the first producer side when we block its observation waiting for a reply where ,probably, its consumer span going to look as a child...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so in fact the request-reply is a fire and forget but in the other direction. So indeed we shouldn't be using the RequestReply... context but the standard Sender... and Receiver... options

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am getting a bit confused - if the contextual name must be retrieved from the context, why is there a getter on the observation itself (where there is no context).

Does this commit get us closer to what you expect?

c7d1821

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contextual name should be applied on the Convention, you should not override any methods on the context itself. The idea behind ObservationConvention is that it contains all the information about naming and tagging (keyvalues). So that if users want to change these attributes they inject their own convention instead of chanhing anything in the production code.

* @since 3.0
*
*/
public class MessageSenderContext extends SenderContext<Message> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I call them the same way in Spring Integration as well.
I wonder if we should think about different names.
For example RabbitSenderContext 😄 or RabbitMessageSenderContext.
Just to avoid confusion on the end-user side.

Or... since it is apparently your Pr is going to be merged first, I probably need to rename them in Spring Integration... 😉


@Override
public boolean supportsContext(Context context) {
return context instanceof MessageReceiverContext;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this supposed to be a SenderContext instead?

@@ -6,7 +6,7 @@
</Console>
</Appenders>
<Loggers>
<Logger name="org.springframework.amqp.rabbit" level="info"/>
<Logger name="org.springframework.amqp.rabbit" level="debug"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we don't need this any more?
I mean you have checked logs locally - OK: no need to increase logging for main

build.gradle Outdated
@@ -398,6 +399,7 @@ project('spring-rabbit') {
testApi project(':spring-rabbit-junit')
testImplementation("com.willowtreeapps.assertk:assertk-jvm:$assertkVersion")
testImplementation "org.hibernate.validator:hibernate-validator:$hibernateValidationVersion"
testImplementation 'io.micrometer:micrometer-observation-test'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indention

// Check if commit needed
if (isChannelLocallyTransacted(channel)) {
// Transacted channel created by this template -> commit.
RabbitUtils.commitIfNecessary(channel);
}
}

protected void observeTheSend(Channel channel, Message message, boolean mandatory, String exch, String rKey,
Message messageToUse) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need two Message instances over here?
Why just one messageToUse from the previous modification is not enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor error.

observation = RabbitTemplateObservation.TEMPLATE_OBSERVATION.observation(registry,
new MessageSenderContext(message))
.lowCardinalityKeyValue(RabbitTemplateObservation.TemplateLowCardinalityTags.BEAN_NAME.asString(),
this.beanName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot this be a part of MessageSenderContext and populated transparently by that RabbitTemplateObservation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how to add key values to the context internally, addLowCardinalityKeyValue() is package-private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do it via the ...Convention - stay tuned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... Probably.
See DefaultApacheHttpClientObservationConvention for example.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, you should set all keyvalues via conventions.

@Override
public KeyValues getHighCardinalityKeyValues(RabbitMessageReceiverContext context) {
return this.highCardinality
.and(RabbitListenerObservation.ListenerLowCardinalityTags.LISTENER_ID.asString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to add this tag into high as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops; brain fade.

@Override
public KeyValues getHighCardinalityKeyValues(RabbitMessageSenderContext context) {
return this.highCardinality
.and(RabbitTemplateObservation.TemplateLowCardinalityTags.BEAN_NAME.asString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DITTO

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed this one, Gary.

@Override
public KeyValues getHighCardinalityKeyValues(RabbitMessageSenderContext context) {
return this.highCardinality
.and(RabbitTemplateObservation.TemplateLowCardinalityTags.BEAN_NAME.asString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed this one, Gary.

* @param lowCardinality the low cardinality {@link KeyValues}.
* @param highCardinality the high cardinality {@link KeyValues}.
*/
public RabbitListenerObservationConvention(@Nullable KeyValues lowCardinality,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you want to allow users to add key values like this? If they want to set their own key values, they would provide their own custom convention that could simply append their key values to whatever you're setting in your default observation convention implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.


@Override
public KeyValues getLowCardinalityKeyValues(RabbitMessageReceiverContext context) {
return this.lowCardinality

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many listener ids can we have in a single application?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually only a few; probably low teens at the most.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so that's indeed low-cardinality 👍


@Override
public String getContextualName() {
return "RabbitTemplate Observation";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above

* @param lowCardinality the low cardinality {@link KeyValues}.
* @param highCardinality the high cardinality {@link KeyValues}.
*/
public RabbitTemplateObservationConvention(@Nullable KeyValues lowCardinality,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above in terms of additional keyvalues

observation = RabbitListenerObservation.LISTENER_OBSERVATION.observation(registry,
new RabbitMessageReceiverContext((Message) data, getListenerId()))
new RabbitMessageReceiverContext(message, getListenerId(),
message.getMessageProperties().getConsumerQueue()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we just do this in the target RabbitMessageReceiverContext?
We already provide a Message over there...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's not available in getContextualName().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh; sorry - I see what you mean.


private boolean observationEnabled;

private RabbitTemplateObservationConvention observationConvention = new RabbitTemplateObservationConvention();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcingrzejszczak ,

is this what you have expected?

I thought the idea was just about a general ObservationConvention<RabbitMessageSenderContext> and that's it...

Copy link

@marcingrzejszczak marcingrzejszczak Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@artembilan artembilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
It feels a bit involved with all those extra layers of enums and conventions, but I'm still learning this stuff and cannot be treated as a genuine source of opinion. 😸

@marcingrzejszczak
Copy link

It feels a bit involved with all those extra layers of enums and conventions, but I'm still learning this stuff and cannot be treated as a genuine source of opinion. smile_cat

Enums here are like @Configuration for Spring. They separate what you want to have as observations, how they should be named and tagged from their actual usage. An additional advantage is that due to using enums you can automatically generate documentation of your observations https://micrometer.io/docs/tracing#_documentation_building

A convention is the way to actually apply the naming and tagging. The way we designed things allow you to easily locally or globally alter that behaviour.

}

@Override
public String getContextualName() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not set it here. You should set the contextual name in the RabbitListenerObservationConvention

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean that Context.getContextualName() has to be final, so we won't be able to override it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean you can override everything but it was never intended to be done like that. That's because users won't be able to easily customize the contextual name.


@Override
public String getContextualName() {
return this.destination + " send";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not set it here. You should set the contextual name in the RabbitTemplateObservationConvention

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this one is missed for a fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


@Override
public String getName() {
return "spring.rabbit.template";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold on. Then it means that my question about RabbitTemplateObservationConvention as a custom convention from end-user remains opened.
According that task feature we should have an extra interface and have it default impl for use-cases like this and contextual name.
The interface would be used as a point of customization from end-user, not a class as it is right now.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I modified my response here #1500 (comment)

* {@link ReceiverContext} for {@link Message}s.
*
* @author Gary Russell
* @since 2.8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.0

Copy link
Member

@artembilan artembilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


@Override
public TracingSetup[] getTracingSetup() {
return new TracingSetup[]{ TracingSetup.IN_MEMORY_BRAVE };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to explicitly set this - if you don't have Zipkin running or Wavefront configured they will be ignored. Why have you ignored the OTel tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks; that wasn't obvious from the docs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://micrometer.io/docs/tracing#_running_integration_tests

 -   by asserting spans that were stored without emitting them to a reporting system

-    against running Tanzu Observability by Wavefront instance (this option turns on when you have passed the Wavefront related configuration in the constructor - otherwise the test will be disabled)

-    against running Zipkin instance (this option turns on when Zipkin is running - otherwise the test will be disabled)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct to me:

Screen Shot 2022-09-15 at 9 14 32 AM

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So testq2 send is a child of a testq1 receive ? If that's the case then it looks good!

In some cases the name has / at the beginning - should it look like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test1 receiver forwards the message to test2 which is received by the second receiver.

Yes, with RabbitMQ, the sender doesn't know the destination queue; it sends to exchange/routingKey. In this case, I am sending to the default exchange "" with a routing key equal to the queue name (which is a default binding provided by rabbitmq).

The / is not present in the Kafka PR because the sender knows the destination topic name.

Copy link
Member

@artembilan artembilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably also need some docs for this new stuff.


@Override
public String getContextualName() {
return this.destination + " send";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this one is missed for a fix.

src/reference/asciidoc/amqp.adoc Outdated Show resolved Hide resolved
Co-authored-by: Artem Bilan <abilan@vmware.com>

@Override
public KeyValues getHighCardinalityKeyValues(RabbitMessageReceiverContext context) {
return KeyValues.empty();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here about default methods in the super interface.


@Override
public KeyValues getHighCardinalityKeyValues(RabbitMessageSenderContext context) {
return KeyValues.empty();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DITTO


@Override
public String asString() {
return "bean.name";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here about prefixing tags: spring-projects/spring-integration#3879 (review)


@Override
public String getName() {
return "spring.rabbit.template";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the name directly on the DocumentedObservation impl instead of context.
Not sure what is the proper way to go though...
@marcingrzejszczak ?

Thanks

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule of thumb should be

  • always try to use the names from a convention
  • if for some reason it doesn't make sense (you would have a lot of conventions cause you have a lot of contexts that don't differ too much) you can set one on a DocumentedObservation

@marcingrzejszczak
Copy link

Also we should double check that if we have 2 separate processes then the trace gets continued. It means that on the SenderContext you would need to set the remoteServiceName to be the name of the broker (e.g. rabbitmq). Then on the recipient side on the ReceiverContext you would need to se the remoteServiceName to be rabbitmq too. That way you will see in Zipkin in the dependency graph that application A is talking to a rabbitmq service that then talks to service B.

else {
observation = RabbitTemplateObservation.TEMPLATE_OBSERVATION.observation(this.observationConvention,
DefaultRabbitTemplateObservationConvention.INSTANCE,
new RabbitMessageSenderContext(message, this.beanName, exch + "/" + rKey), registry);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we took a Supplier<Context> here instead of Context where the supplier was only called if ObservationRegistry wasn't a no-op, and therefore the Context object wouldn't be allocated in the no-op case, would that simplify things for you? I think then you wouldn't need the if-else this is contained in, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me as well. And I probably agree with @marcingrzejszczak as well.
Only the problem that @garyrussell is off today and next Monday is release day.
I can fix this quickly myself, but we have to be sure that this one is good to go for merging.

Thank you!

Copy link
Contributor Author

@garyrussell garyrussell Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, we could just use a singleton for the No-op case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forget that, I shouldn’t comment from my iPad while not working. I see your point about them if/else.

we can go with this as is for Monday’s milestone and Polish later if you add the supplier variant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one, Gary, in the DocumentedObservation:

    default <T extends Observation.Context> Observation createNotStarted(
            @Nullable ObservationConvention<T> customConvention, @NonNull ObservationConvention<T> defaultConvention,
            @NonNull Supplier<T> contextSupplier, @NonNull ObservationRegistry registry) {
        if (registry.isNoop()) {
            return Observation.NOOP;
        }
        return observation(customConvention, defaultConvention, contextSupplier.get(), registry);
    }

I'm not sure why name is like that, but looks like it does exactly what is expected from us here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're working on unifying the names. Sorry about the commotion cc @ttddyy

@garyrussell
Copy link
Contributor Author

garyrussell commented Sep 17, 2022

@marcingrzejszczak I don't see a remoteServiceName on the ReceiverContext, only on the SenderContext.

Also, should the service name just be rabbitmq or should it include, for example, the ip address of the actual broker?

@garyrussell
Copy link
Contributor Author

Forget the second comment in my last post - it wouldn't work because, with a cluster, they might be connected to different nodes.

@marcingrzejszczak
Copy link

marcingrzejszczak commented Sep 19, 2022

@marcingrzejszczak I don't see a remoteServiceName on the ReceiverContext, only on the SenderContext.

That must be our mistake - I'll fix it (micrometer-metrics/micrometer#3419)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants