Add Prometheus metric for messages dropped by MQTT QoS 0 Queue #9080
Merged
ansd force-pushed the mqtt-qos0-queue-metrics branch 3 times, most recently from c5082ac to efccc69 (August 15, 2023 13:57)
Why:
A RabbitMQ operator should be able to see whether RabbitMQ drops MQTT QoS 0 messages due to overload protection. Dropped messages indicate that an MQTT subscriber does not consume fast enough.

How:
Use Prometheus global counters. There are 2 valid solutions:

1. Introduce a new metric called messages_dropped specifically for the rabbit_mqtt_qos0_queue type. This would work similarly to how streams extend the per protocol global counters, but requires extending the per protocol & queue type global counters for the MQTT QoS 0 queue type. The emitted metrics would look as follows:
```
rabbitmq_global_messages_dropped_total{protocol="mqtt310",queue_type="rabbit_mqtt_qos0_queue"} 0
rabbitmq_global_messages_dropped_total{protocol="mqtt311",queue_type="rabbit_mqtt_qos0_queue"} 0
rabbitmq_global_messages_dropped_total{protocol="mqtt50",queue_type="rabbit_mqtt_qos0_queue"} 0
```
2. Reuse the existing metric rabbitmq_global_messages_dead_lettered_maxlen_total

This commit goes for the 2nd approach because:
a) There is no need to add a new metric. Even though dead lettering is not supported for the MQTT QoS 0 queue type, this metric maps nicely to what happens: the queue drops messages because its max length (mqtt.mailbox_soft_limit) is exceeded with overflow behaviour drop-head. Furthermore, the label `dead_letter_strategy="disabled"` tells that dead lettering is not taking place from this queue type.
b) This metric allows supporting dead lettering for the MQTT QoS 0 queue type in the future.

The new dead lettering metrics look as follows:
```
rabbitmq_global_messages_dead_lettered_maxlen_total{queue_type="rabbit_classic_queue",dead_letter_strategy="at_most_once"} 0
rabbitmq_global_messages_dead_lettered_maxlen_total{queue_type="rabbit_classic_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_maxlen_total{queue_type="rabbit_mqtt_qos0_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_maxlen_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_most_once"} 0
rabbitmq_global_messages_dead_lettered_maxlen_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_expired_total{queue_type="rabbit_classic_queue",dead_letter_strategy="at_most_once"} 0
rabbitmq_global_messages_dead_lettered_expired_total{queue_type="rabbit_classic_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_expired_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_least_once"} 0
rabbitmq_global_messages_dead_lettered_expired_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_most_once"} 0
rabbitmq_global_messages_dead_lettered_expired_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_rejected_total{queue_type="rabbit_classic_queue",dead_letter_strategy="at_most_once"} 0
rabbitmq_global_messages_dead_lettered_rejected_total{queue_type="rabbit_classic_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_rejected_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_least_once"} 0
rabbitmq_global_messages_dead_lettered_rejected_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_most_once"} 0
rabbitmq_global_messages_dead_lettered_rejected_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_delivery_limit_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_least_once"} 0
rabbitmq_global_messages_dead_lettered_delivery_limit_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_most_once"} 0
rabbitmq_global_messages_dead_lettered_delivery_limit_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="disabled"} 0
rabbitmq_global_messages_dead_lettered_confirmed_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="at_least_once"} 0
```
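For operators who want to check the counter outside of a Prometheus server, the following is a minimal Python sketch (not part of this PR) that reads scraped exposition text and sums the maxlen counter for the MQTT QoS 0 queue type. The helper names `parse_samples` and `mqtt_qos0_dropped` are hypothetical, the sample values are made up, and the regular expressions are a simplification that assumes label values contain no escaped quotes or closing braces.

```python
import re

# One sample line of the Prometheus text format: name{label="value",...} number
# Simplified: assumes label values contain no escaped quotes or '}' characters.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (name, labels, value) per sample line, skipping blanks and comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def mqtt_qos0_dropped(text):
    """Sum the maxlen counter samples labelled with the MQTT QoS 0 queue type."""
    return sum(
        value
        for name, labels, value in parse_samples(text)
        if name == "rabbitmq_global_messages_dead_lettered_maxlen_total"
        and labels.get("queue_type") == "rabbit_mqtt_qos0_queue"
    )


# Illustrative scrape output; the counter values are invented for this example.
EXAMPLE = '''
rabbitmq_global_messages_dead_lettered_maxlen_total{queue_type="rabbit_classic_queue",dead_letter_strategy="disabled"} 4
rabbitmq_global_messages_dead_lettered_maxlen_total{queue_type="rabbit_mqtt_qos0_queue",dead_letter_strategy="disabled"} 7
rabbitmq_global_messages_dead_lettered_expired_total{queue_type="rabbit_quorum_queue",dead_letter_strategy="disabled"} 1
'''

if __name__ == "__main__":
    print(mqtt_qos0_dropped(EXAMPLE))
```

Because the value is a counter, a single reading says little on its own; comparing two readings taken some time apart shows whether MQTT QoS 0 messages are currently being dropped.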
ansd force-pushed the mqtt-qos0-queue-metrics branch from efccc69 to 0f5fe8f (August 15, 2023 14:06)
ansd added a commit to rabbitmq/rabbitmq-website that referenced this pull request (Aug 15, 2023)
The test case was flaky:
```
*** CT Error Notification 2023-08-15 14:25:51.016 ***
v5_SUITE:at_most_once_dead_letter_detect_cycle failed on line 871
Reason: {test_case_failed,Received unexpected message: {publish,#{client_pid => <0.227.0>,dup => false,
                  packet_id => 1,
                  payload => <<"at_most_once_dead_letter_detect_cycle">>,
                  properties => #{'Subscription-Identifier' => 10},
                  qos => 1,retain => false,
                  topic => <<"a/b">>,
                  via => #Port<0.76>}}}
```
michaelklishin approved these changes (Aug 16, 2023)
I agree that piggybacking on the existing metric makes more sense.
ansd added a commit that referenced this pull request (Aug 16, 2023)
Follow-up from #9080: Call rabbit_global_counters:init/1 just once.
ansd added a commit to rabbitmq/rabbitmq-website that referenced this pull request (Aug 16, 2023)
rabbitmq-ci pushed a commit to rabbitmq/rabbitmq-website-next that referenced this pull request (Aug 16, 2023)