umqtt.robust dies when MQTT broker gets restarted #102

Open
phieber opened this issue Sep 14, 2016 · 16 comments

@phieber

phieber commented Sep 14, 2016

Hi,

The umqtt.robust module is pretty reliable, but I found a case in which it stops sending:

[Screenshot from 2016-09-14 13:42:10]

As soon as I restart the MQTT broker, I have to connect via WebREPL and press CTRL+C.

I use the following minimal test code:
https://github.com/phieber/uPython-ESP8266-01-umqtt

Can you reproduce this issue when restarting the broker?

br
Patrick

@phieber
Author

phieber commented Sep 14, 2016

In the screenshot above, I have restarted the broker after two successful MQTT publish messages (two temperature values in this case)

@Actpohomoc

I use this code to avoid the problem:

try:
    retries = 0
    while retries < 20:
        retries += 1
        client.check_msg()
        time.sleep(1)
except OSError:
    connect_and_subscribe()

def connect_and_subscribe():
    global client
    client = MQTTClient(CONFIG['client_id'], CONFIG['broker'], CONFIG['port'], "user", "pass", 120)
    client.set_callback(callback)
    client.connect(False)
    print("Conn to {}".format(CONFIG['broker']))
    client.subscribe(b"MBI/CURRENT_DATETIME")
    time.sleep(1)
    client.check_msg()
.....
So I always check for an OSError and reconnect to MQTT when one occurs.
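
The same idea can be written as a loop that keeps retrying instead of reconnecting only once. This is a minimal sketch, not the code from the comment above: `poll_with_reconnect` and `make_client` are hypothetical names, and `make_client` is assumed to return a client that is already connected and subscribed (for example, a function like `connect_and_subscribe()` changed to return the client).

```python
import time


def poll_with_reconnect(make_client, iterations, delay_s=1, sleep=time.sleep):
    """Poll for messages and rebuild the client whenever OSError is raised.

    make_client is a hypothetical zero-argument factory that returns a
    connected, subscribed client exposing check_msg().
    Returns how many times a client was (re)created.
    """
    client = None
    connects = 0
    for _ in range(iterations):
        try:
            if client is None:
                # First pass, or the previous pass failed: connect again.
                client = make_client()
                connects += 1
            client.check_msg()
        except OSError:
            # Broker unreachable or connection lost: retry on the next pass.
            client = None
        sleep(delay_s)
    return connects
```

Because the `try` sits inside the loop, a failed reconnect attempt is simply retried after the next delay rather than ending the loop.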

You need to upgrade umqtt.robust with this PR: [https://github.com//pull/117]

@craftyguy

Is there a better way to go about handling this situation where the broker 'disappears' and then 're-appears' at some later time? The current implementation of umqtt.robust is not robust at all, even with the reconnect. check_msg never works even though it seems the client reconnected.

Should the umqtt.robust object include a list of subscribed topics to auto-resubscribe in the reconnect() function?

@scargill

scargill commented Jun 1, 2017

As a MicroPython newbie, that last message worries me... I don't want to start coding in MicroPython on the basis of unreliable MQTT when we have reliable MQTT in C...

Can someone give an example of this "robust" code which WILL stay connected and which will resubscribe on reconnection, i.e. so that it just works in the background?

Is this possible?

@dpgeorge
Member

dpgeorge commented Jun 2, 2017

> Is there a better way to go about handling this situation where the broker 'disappears' and then 're-appears' at some later time?

Yes, the current implementation of umqtt.robust does not handle the case when the broker is restarted and forgets all of its state (at least the state related to your client).

> Should the umqtt.robust object include a list of subscribed topics to auto-resubscribe in the reconnect() function?

Perhaps. This is indeed how other libraries work (e.g. https://github.com/fusesource/mqtt-client). The robust MQTTClient class would need to override the "subscribe" method to record the topics (and qos), and then in the reconnect() method it would call subscribe() again after reconnecting.
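
A minimal sketch of that record-and-replay idea, not taken from any patch in this thread. To stay self-contained it wraps a client instead of subclassing MQTTClient; `ResubscribingClient` is a hypothetical name, and the wrapped client is only assumed to provide `connect(clean_session)` and `subscribe(topic, qos)`.

```python
class ResubscribingClient:
    """Hypothetical wrapper that remembers subscriptions across reconnects.

    Works with any client object that has connect(clean_session) and
    subscribe(topic, qos) methods.
    """

    def __init__(self, client):
        self.client = client
        self.topics = {}  # topic -> qos, recorded on every subscribe()

    def subscribe(self, topic, qos=0):
        # Record the subscription first so it can be replayed later.
        self.topics[topic] = qos
        self.client.subscribe(topic, qos)

    def reconnect(self):
        # With a clean session the broker keeps no subscriptions for us,
        # so every recorded topic is subscribed again after connecting.
        self.client.connect(True)
        for topic, qos in self.topics.items():
            self.client.subscribe(topic, qos)
```

Storing the topics in a dict means subscribing twice to the same topic keeps only the latest qos, so a reconnect does not send duplicate subscribe requests.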

@dpgeorge
Member

dpgeorge commented Jun 2, 2017

See #186 for a fix which will resubscribe to all existing topics if a reconnect is made.

@craftyguy

Damn, I was in the process of writing a fix for this. You win the day, sir!

@dpgeorge
Member

dpgeorge commented Jun 2, 2017

@craftyguy I'd be interested to see your solution. And also if you want to test my solution and give feedback that would be great.

@craftyguy

I literally started less than an hour ago, but my approach was pretty much the same as yours. Your solution looks to be more elegant/robust. I'll give this a shot possibly as early as tomorrow, since I'm tired of hacking together a more robust robust mqtt 😄

@craftyguy

@dpgeorge

I tried your patch (#186), and it doesn't seem to work with this simple test program:

import machine
from umqtt.robust import MQTTClient
import utime

MQTT_SERVER = '1.1.1.1'
IN = 'in'
OUT = 'out'

# mqtt subscription callback
def sub_cb(topic, msg):
    t = topic.decode('ASCII')
    m = msg.decode('ASCII')
    print("received new topic/msg: %s / %s" % (t, m))
    if t == IN:
        print("IN: %s" % m)

umqtt_client = MQTTClient("test_client", MQTT_SERVER)
umqtt_client.DEBUG = True
umqtt_client.set_callback(sub_cb)
umqtt_client.connect(clean_session=False)
umqtt_client.subscribe(IN)
print("Connected to MQTT broker: %s" % MQTT_SERVER)


def main():
    global umqtt_client
    while True:
        utime.sleep(1)
        umqtt_client.check_msg()
        umqtt_client.publish(OUT, b'hi!')

I should note that I am invoking the main() function here from main.py.

When I restart the mqtt broker, I get an mqtt: OSError(-1,) printed to console. I can see the client reconnects to the broker since there's a message in the broker log about this, but the client doesn't respond to messages published to the IN topic, nor does it publish anything else to OUT topic.

If my test is an invalid use of umqtt, please let me know since I am relatively new to using mqtt!

@dpgeorge
Member

dpgeorge commented Jun 6, 2017

@craftyguy for your example to work I think you need to connect with clean_session=True, because you'll be explicitly resubscribing upon reconnection.

@craftyguy

@dpgeorge I see, thank you for pointing that out. I also see the PR was merged :)
I will give it another try!

@dpgeorge
Member

dpgeorge commented Jun 7, 2017

> I also see the PR was merged

@craftyguy No it wasn't, so you'll need to pull the PR explicitly to test it.

@craftyguy

craftyguy commented Jun 7, 2017 via email

@curiouswala

curiouswala commented Jun 22, 2018

I tried the PR and it does recover from a broker restart, but when the broker loses power and comes back, it doesn't recover. It is completely reproducible: it happens every time I pull the power from my Raspberry Pi Zero running the mosquitto broker. Does anyone else see this behaviour?
The problem seems to be that umqtt.robust raises an error when I kill my MQTT broker from the terminal, but raises no error when the broker dies from a power outage.
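
One plausible explanation, which is an assumption and not confirmed in this thread, is that a broker that loses power never closes the TCP connection, so the client sees no error and has to detect the silence itself. The sketch below shows only the timing logic for such a check. `LivenessWatchdog` and its methods are hypothetical names, timestamps are plain seconds supplied by the caller, and how `note_rx()` gets called is left to the application (for example, from the subscription callback after publishing to a heartbeat topic the client itself subscribes to).

```python
class LivenessWatchdog:
    """Hypothetical helper that decides when a silent connection is dead."""

    def __init__(self, probe_every_s, dead_after_s, now=0):
        self.probe_every_s = probe_every_s
        self.dead_after_s = dead_after_s
        self.last_rx = now      # last time anything arrived from the broker
        self.last_probe = now   # last time a probe was requested

    def note_rx(self, now):
        # Call whenever a message from the broker has been received.
        self.last_rx = now

    def should_probe(self, now):
        # True at most once per probe interval; the caller then sends
        # something that the broker is expected to answer.
        if now - self.last_probe >= self.probe_every_s:
            self.last_probe = now
            return True
        return False

    def is_dead(self, now):
        # True once nothing has arrived for longer than dead_after_s;
        # the caller should then force a reconnect.
        return now - self.last_rx > self.dead_after_s
```

In a polling loop this would be used as: call `should_probe(now)` and send a probe when it returns true, call `note_rx(now)` from the message callback, and reconnect when `is_dead(now)` returns true.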

ian-llewellyn added a commit to ian-llewellyn/micropython-lib that referenced this issue Jul 18, 2022
After `reconnect()`, MQTTClient.socket is blocking by default. This
commit aims to fix that for when `check_msg()` sets the socket to
non-blocking. May fix errors reported in micropython#102 and micropython#192.

This fix:
* avoids using an additional instance attribute to record the intended
  state of the socket
* only adds two additional lines of code to one file in the codebase
* depends on socket's `__str__()` method to retrieve the current timeout
  value: `<socket state=0 timeout=-1 incoming=0 off=0>` - not ideal
ian-llewellyn added a commit to ian-llewellyn/micropython-lib that referenced this issue Jul 18, 2022
After `reconnect()`, MQTTClient.socket is blocking by default. This
commit aims to supplement that behaviour at times when `check_msg()`
sets the socket to non-blocking. It may fix errors reported in micropython#102 and
micropython#192.

This fix:
* avoids using an additional instance attribute to record the intended
  state of the socket
* only adds two additional lines of code to one file in the codebase
* depends on socket's `__str__()` method to retrieve the current timeout
  value: `<socket state=0 timeout=-1 incoming=0 off=0>` - not ideal
ian-llewellyn added a commit to ian-llewellyn/micropython-lib that referenced this issue Sep 27, 2022
After `reconnect()`, MQTTClient.socket is blocking by default. This
commit aims to supplement that behaviour at times when `check_msg()`
sets the socket to non-blocking. It may fix errors reported in micropython#102 and

This fix:
* avoids using an additional instance attribute to record the intended
  state of the socket
* only adds two additional lines of code to one file in the codebase
* depends on socket's `__str__()` method to retrieve the current timeout
  value: `<socket state=0 timeout=-1 incoming=0 off=0>` - not ideal
@jonnor

jonnor commented Aug 25, 2024

Is this still an issue on the latest MicroPython and umqtt.robust? If so, we need minimal example code showing how to reproduce it.
