Native MQTT #5895
Conversation
Previously the helper was not setting the virtual host limit correctly.
…set-vhost-limits Backport a CT broker helper fix from #5895
Previously the helper was not setting the virtual host limit correctly. (cherry picked from commit ca1c5ac)
… (backport #7295) (#7349)

* Fix all dependencies for the dialyzer

This is the latest commit in the series; it fixes (almost) all the problems with missing and circular dependencies for typing. The only unsolved problems are:

- `lg` dependency for `rabbit` - the problem is that it's the only dependency that contains a NIF, and there is no way to make dialyzer ignore it - it looks like the unknown check is not suppressable by dialyzer directives. In the future, making `lg` a proper dependency can be a good thing anyway.
- some missing Elixir functions in `rabbitmq_cli` (CSV, JSON and logging related).
- `eetcd` dependency for `rabbitmq_peer_discovery_etcd` - this one uses sub-directories in `src/`, which confuses dialyzer (or our bazel machinery is not able to properly handle it). I've tried the latest rules_erlang, which flattens the directory for .beam files, but it wasn't enough for dialyzer - it wasn't able to find the core erlang files. This is a niche plugin and an unusual dependency, so probably not worth investigating further.

(cherry picked from commit 949b535)
(cherry picked from commit 3a3ff30)

# Conflicts:
#	deps/rabbit/BUILD.bazel
#	deps/rabbit/src/rabbit_access_control.erl
#	deps/rabbit/src/rabbit_exchange.erl
#	deps/rabbit_common/src/rabbit_misc.erl
#	deps/rabbitmq_consistent_hash_exchange/BUILD.bazel
#	deps/rabbitmq_mqtt/BUILD.bazel

(cherry picked from commit 2ae27f2)

# Conflicts:
#	deps/rabbit_common/src/rabbit_misc.erl

* Resolve conflicts

(cherry picked from commit b205ac9)

# Conflicts:
#	deps/rabbit_common/src/rabbit_misc.erl

* Avoid using a type from rabbit in rabbit_common to avoid a dep cycle

(cherry picked from commit bca40c6)

* Resolved additional errors from merge

Leaving MQTT alone, as this branch does not contain #5895, which fixed a great many dialyzer warnings.

(cherry picked from commit 3f9e6f9)

* fixup merge artifacts
* Avoid referencing unexported types from jsx
* Additional dialyzer fixes

---------

Co-authored-by: Alexey Lebedeff <binarin@binarin.ru>
Co-authored-by: Michael Klishin <klishinm@vmware.com>
Co-authored-by: Rin Kuryloski <kuryloskip@vmware.com>
@ChunyiLyu would you help to provide a doc on how to get the MQTT connect/disconnect notifications? And when will you push the official Docker image to Docker Hub?
Hi @janiu-001,
There is a doc at https://www.rabbitmq.com/event-exchange.html. For each MQTT CONNECT / DISCONNECT, an event is emitted.
Yesterday, 3.12.0-beta.1 was released. You can try out Native MQTT by using the Docker image. We are looking forward to your feedback. Thanks!
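For reference, a minimal sketch (not from the thread; the queue name `mqtt-events` is arbitrary, and a running broker with the `rabbitmqadmin` tool is assumed) of wiring a queue to the event exchange so MQTT connect/disconnect events can be observed:

```shell
# Enable the event exchange plugin (publishes broker events to amq.rabbitmq.event).
rabbitmq-plugins enable rabbitmq_event_exchange

# Bind a queue to the event exchange; MQTT client connects/disconnects
# arrive with routing keys connection.created / connection.closed.
rabbitmqadmin declare queue name=mqtt-events
rabbitmqadmin declare binding source=amq.rabbitmq.event \
  destination=mqtt-events routing_key="connection.#"
```

Consuming from `mqtt-events` then shows one message per connection lifecycle event.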
@ansd Thanks very much for your response. I tried the image pivotalrabbitmq/rabbitmq:v3.12.0-beta.1-otp-max-bazel, and it looks like connect/disconnect worked. But it failed to load the definitions.json which creates the queues and bindings! It does work with rabbitmq:3.11-management (I could see all the queues defined in the definitions.json on the dashboard).
Thanks @janiu-001.
What is the error message? Are you trying to export queues created by the MQTT plugin in 3.11 and re-import those in 3.12? If you could provide step-by-step instructions that reproduce this issue and open a separate GitHub issue if you think something doesn't work as expected, that would be great!
With the 3.12 image, the queues do not appear; with the rabbitmq:3.11-management image, we could see the queues defined in the definitions.json.
@janiu-001 the best and fastest way to assist us is to attach your complete definitions file. Right now you're asking us to guess how to reproduce it - we know nothing about your queue definitions and other RabbitMQ metadata.
Sorry for the trouble. Since GitHub does not support uploading JSON files, I changed the suffix from json to text.
This belongs to a discussion. |
@michaelklishin that would be ideal but @janiu-001 responded to a closed PR 🤷‍♂️ I'm going to check this out today.
Yep, something is up with definitions import - #7532 |
@janiu-001 thanks for reporting the issue. In the future, please either start a discussion or file an issue rather than replying to a closed PR. That's not just Team RabbitMQ's preference but is pretty standard practice. Thanks!
@lukebakken yes, understood. Will follow your suggestion.
Concept
Until today, MQTT (like STOMP and AMQP 1.0) has been proxied via AMQP 0.9.1:
Pros:
Cons:
This PR implements Native MQTT. By "native" we mean that MQTT becomes a first-class protocol in RabbitMQ that does not get proxied via AMQP 0.9.1:
Pros:
Native MQTT is possible thanks to the queue_type interface introduced by @kjnilsson 2 years ago. The queue_type interface decouples the AMQP 0.9.1 channel from queues (classic queues, quorum queues, or streams) making it possible to have the MQTT connection process publish to (and receive from) queue processes directly (without the need to proxy via the AMQP 0.9.1 channel).
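The broker itself is written in Erlang, but the decoupling idea can be modeled in a few lines of Python. This is an illustrative sketch only: the class and method names below are stand-ins, not RabbitMQ's actual `rabbit_queue_type` callbacks.

```python
from abc import ABC, abstractmethod


class QueueType(ABC):
    """Stand-in for the queue_type interface: every queue implementation
    exposes the same delivery API, so a connection process can target any
    of them without going through an AMQP 0.9.1 channel."""

    @abstractmethod
    def deliver(self, message: str) -> None: ...


class ClassicQueue(QueueType):
    """Models a real queue process that stores messages."""

    def __init__(self) -> None:
        self.backlog: list[str] = []

    def deliver(self, message: str) -> None:
        self.backlog.append(message)  # message stored by the queue


class Qos0PseudoQueue(QueueType):
    """Models the rabbit_mqtt_qos0_queue idea: the 'queue' is just the
    subscriber connection's mailbox, with no intermediate queue process."""

    def __init__(self, subscriber_mailbox: list[str]) -> None:
        self.mailbox = subscriber_mailbox

    def deliver(self, message: str) -> None:
        self.mailbox.append(message)  # sent straight to the subscriber


def publish(targets: list[QueueType], message: str) -> None:
    """An MQTT connection process publishes through the shared interface."""
    for queue in targets:
        queue.deliver(message)


mailbox: list[str] = []
publish([ClassicQueue(), Qos0PseudoQueue(mailbox)], "sensor-reading")
print(mailbox)  # ['sensor-reading']
```

The point of the sketch: because both queue flavors implement one interface, the publishing connection neither knows nor cares whether a real queue process or another connection's mailbox sits on the receiving end.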
Result
The result of this PR (Native MQTT) is that resource usage drops drastically and that RabbitMQ is able to accept millions of MQTT connections.
Scaling tests can be found in https://github.com/rabbitmq/mqtt-testing (private repo).
Connecting in total 1 million "background" MQTT connections (i.e. neither publishing nor consuming, only sending PING packets) to a 3-node RabbitMQ cluster with small buffer sizes and management metrics disabled requires 97 GB of memory per node in RabbitMQ 3.11, and 7 GB of memory per node with this PR. This is a permanent memory saving of (90 GB × 3) 270 GB for the cluster and a memory reduction by a factor of 13. (For QoS 0-only subscribers using the new queue type - more details below - the memory reduction factor will be even much higher.)
93% of the memory required on `main` (or 3.11) is process memory, meaning memory required for Erlang processes. Further scaling tests have shown that with this PR, …
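As a sanity check, the arithmetic behind the figures above (97 GB vs. 7 GB per node, 3 nodes) works out as follows:

```python
nodes = 3
before_gb = 97  # memory per node on 3.11 with 1M background MQTT connections
after_gb = 7    # memory per node with this PR

saving_per_node = before_gb - after_gb      # 90 GB per node
cluster_saving = saving_per_node * nodes    # 270 GB for the cluster
reduction_factor = before_gb / after_gb     # ~13.9, i.e. roughly factor 13

print(cluster_saving, round(reduction_factor, 1))  # 270 13.9
```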
Implementation details
- Each incoming MQTT connection is now handled by a single Erlang process, removing the `rabbit_heartbeat`, heartbeat supervisor and `rabbit_mqtt_connection_sup` processes. The same is done for Web MQTT: there is now a single Erlang process per incoming Web MQTT connection.
- This PR adds a new queue type called `rabbit_mqtt_qos0_queue`. (We can rename it if we think we can re-use it somewhere else.) The idea is that this queue type is a "pseudo queue" or "virtual queue", as for example experimented with in https://github.com/rabbitmq/rabbitmq-server/tree/virtual-queue-type for direct-reply-to. The queue process is the receiving MQTT connection process; the queue is basically the Erlang mailbox of that process. It will be used for MQTT connections that connect with a clean session and subscribe with QoS 0. There is no point in forwarding a message to a real queue first. Instead, MQTT publishing connections send a message directly to the new queue type - that is, to the receiving MQTT connection process. The new queue type lives in `rabbitmq_mqtt`. Note that the `rabbit` app is not aware that this queue type exists.
- … `delegate` as done in classic queues. The new queue type comes behind the feature flag `rabbit_mqtt_qos0_queue`.
- … `mqtt_mailbox_soft_limit` AND the network to the MQTT client is congested, meaning the MQTT client app cannot consume messages fast enough.
- The Ra cluster `mqtt_node` is deleted (hidden behind the feature flag `delete_ra_cluster_mqtt_node`). Instead, MQTT client ID to Pid tracking is done via a local-only process group using module `pg` (thanks to @lhoguin for coming up with this idea). The Ra cluster's processes grew huge with many MQTT connections and became a bottleneck when mass-disconnecting clients. Local-only `pg` scales better. For the purpose of MQTT client ID tracking (i.e. disconnecting the old client if a new client with the same ID connects), we don't need the strong Raft consistency guarantees. This does mean, however, that - in the presence of lost messages or network partitions - it can happen that 2 clients connect with the same ID (which shouldn't be a huge problem, but does temporarily violate the protocol spec in the presence of such misconfigured clients and network issues within the broker).
- This PR adds a `shared_SUITE` that is shared between MQTT and Web MQTT. This introduces only a test dependency from `rabbitmq_mqtt` to `rabbitmq_web_mqtt`.
- The `emqttc` library code is deleted from the Web MQTT tests; instead, the `emqtt` web socket option is used.
- `rabbit_mqtt_reader` got converted from `gen_server2` to `gen_server` (thanks @lhoguin).
- Feature flags `stream_queue` and `classic_queue_type_delivery_support` (the latter introduced in v3.10.9 and v3.11.1) are required. (That's not a problem because "Feature flags: Make feature flags v2 required" #6810, present in 3.12, will already require upgrading to 3.11.x first.)

Limitations
The following limitations are already present today in v3.11, but are especially important to mention now that RabbitMQ can handle many more MQTT connections:

- … (`pg`?) Nodes might even crash.
- `management_agent.disable_metrics_collector = false` (the default) will double or triple memory usage. The management plugin is not designed to handle such a huge number of stats-emitting objects (connections / queues). Management metric collection should therefore be turned off and Prometheus should be used. We don't plan to improve management agent metrics in the future; Prometheus is the way to go.

Newly introduced limitations with this PR:
Future Work
(Prioritisation not yet decided.)
Thanks to @ChunyiLyu who equally contributed to this PR!