Stack smashing #745
Comments
I don't think there is a difference in data written to persistence over the different versions, except for MQTT 3.1.1 vs MQTT 5. I have some tests written for that case, but there could be gaps.
In the fix for #815 I realized that this might have been caused by the WebSockets support, which altered the persisted data if it was going to be sent over WebSockets. I have changed that so it no longer happens, but we do need to check that reading persisted data works properly when using WebSockets, and that you can persist messages with a WebSocket connection and then resend them later over TCP, and vice versa - it shouldn't matter.
Test persistence left by a WebSockets connection and used for a non-WebSockets connection, and vice versa.
I will start testing this evening to see what I can find.
@fpagliughi did you manage to do any testing?
There is definitely an issue with different versions of the persistence file after an upgrade. I haven't tracked it down yet, but at the same time, I didn't consider this bad enough to prevent the Rust release from going forward. I doubt anyone really needs to keep their persistence directory intact through a library upgrade. But I would like to get to the bottom of it. (And apologies if it turns out to be something in the derived libs.)

BTW, it was a simple async TCP publisher example. But I also have another example app in the Rust library that just publishes lots of messages. Maybe 10 or 100 per second (configurable). That will run fine for days, doesn't seem to leak any memory, but then crashes out with a "stack smashing" error. I suspect this one is definitely on the Rust side of things, but will put together a comparable C example just to make sure.
Barring any other bugs I discover in the process of testing the persistence, the issue I fixed in 1.3.2 with persistence and WebSockets would have caused the persistence data to be unrecognized if it was written by 1.3.1 over WebSockets. That data would not be readable by 1.3.0 (no WebSockets) or 1.3.2 (WebSockets or not). But as I said, I'll be trying it out.
I face the same issue: a process that works for days and then gets a stack smashing error. Using gdb and a backtrace, the error involves the publish. Once the process crashes, relaunching it also smashes the stack immediately, in the persistence reading. Relaunching again, it works for further days/weeks. The process used is http://github.com/frett27/iotmonitor, with a static link to Paho MQTT. I hope to provide better feedback and analysis.
Yes, sorry I haven't followed up on this. We need a minimal example to show this, if it's still happening. Something that just publishes data in a fairly tight loop (like a few hundred or thousand messages per second). Leave it running for a day or two and see what happens. I'll try to get to it soon, but if anyone beats me to it, I won't complain. :-)
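A minimal sketch of such a reproducer, assuming a local broker at `tcp://localhost:1883` and placeholder client id/topic. It uses the synchronous MQTTClient API with the default file persistence and QoS 1, so every in-flight message goes through the persistence layer:

```c
#include <stdio.h>
#include <string.h>
#include "MQTTClient.h"

int main(void)
{
    MQTTClient client;
    MQTTClient_connectOptions conn_opts = MQTTClient_connectOptions_initializer;
    int rc;

    /* NULL persistence context selects the default file persistence. */
    if ((rc = MQTTClient_create(&client, "tcp://localhost:1883", "persistence-repro",
                                MQTTCLIENT_PERSISTENCE_DEFAULT, NULL)) != MQTTCLIENT_SUCCESS) {
        fprintf(stderr, "create failed, rc=%d\n", rc);
        return rc;
    }

    conn_opts.keepAliveInterval = 20;
    conn_opts.cleansession = 1;
    if ((rc = MQTTClient_connect(client, &conn_opts)) != MQTTCLIENT_SUCCESS) {
        fprintf(stderr, "connect failed, rc=%d\n", rc);
        return rc;
    }

    const char* payload = "hello";
    for (long i = 0; i < 100000000L; ++i) {   /* long enough to run for days */
        MQTTClient_deliveryToken token;
        rc = MQTTClient_publish(client, "test/persistence", (int)strlen(payload),
                                (void*)payload, 1 /* QoS 1 so it is persisted */,
                                0, &token);
        if (rc != MQTTCLIENT_SUCCESS)
            fprintf(stderr, "publish failed, rc=%d (i=%ld)\n", rc, i);
        MQTTClient_waitForCompletion(client, token, 10000L);
    }

    MQTTClient_disconnect(client, 10000);
    MQTTClient_destroy(&client);
    return 0;
}
```

An MQTTAsync variant would exercise the same persistence code; the synchronous loop just keeps the example short.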
It just occurred to me that maybe it's #571.
I just reproduced this symptom quickly by reducing the value of PERSISTENCE_MAX_KEY_LENGTH to a smaller value - so I think that confirms it's #571. I've added a fix to the develop branch. One way to avoid this at the moment is not to use persistence (for testing purposes).
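For reference, a minimal sketch of that workaround, assuming the asynchronous API and a placeholder server URI and client id:

```c
#include "MQTTAsync.h"

/* MQTTCLIENT_PERSISTENCE_NONE keeps message state in memory only, so the
 * on-disk persistence files are never read or written. */
int create_without_persistence(MQTTAsync* client)
{
    return MQTTAsync_create(client, "tcp://localhost:1883", "myclientid",
                            MQTTCLIENT_PERSISTENCE_NONE, NULL);
}

/* The synchronous API takes the same option:
 * MQTTClient_create(&client, uri, clientid, MQTTCLIENT_PERSISTENCE_NONE, NULL); */
```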
I reproduced the issue today; I don't know yet if it's related: Thread 2 "iotmonitor" received signal SIGABRT, Aborted. Relaunching the process: Thread 2 "iotmonitor" received signal SIGABRT, Aborted.
@frett27 yours is a different case from @fpagliughi's, as you are using the MQTTClient API rather than MQTTAsync. It will be the same effect for qentry_seqno at lines 665 and 668 of MQTTPersistence.c.
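For anyone trying to picture the symptom, here is a simplified, self-contained illustration of this class of bug (not the actual Paho code): a formatted key is written into a fixed-size stack buffer, and once the key grows longer than the buffer the write overruns it; on a typical build with the stack protector enabled, glibc then aborts with "stack smashing detected".

```c
#include <stdio.h>

#define PERSISTENCE_MAX_KEY_LENGTH 8   /* deliberately tiny for illustration */

void make_key(char* out, unsigned long seqno)
{
    /* sprintf does not know how big 'out' is: once the formatted key needs
     * more characters than the buffer holds, it writes past the end. */
    sprintf(out, "q%lu", seqno);
}

int main(void)
{
    char key[PERSISTENCE_MAX_KEY_LENGTH + 1];
    unsigned long seqno;

    for (seqno = 1; seqno <= 1000000000UL; seqno *= 10)
        make_key(key, seqno);   /* eventually overruns 'key' */

    printf("last key: %s\n", key);
    return 0;
}
```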
Thanks @icraggs, I just pulled the fix from develop and made the change for the MQTTClient API. I've kept gdb running to allow more investigation on my side. Thanks so much.
Feedback: this seems to fix the issue for me. My small server was launched a couple of days ago, and there has been no more stack smashing since correcting the MQTTClient API.
I've been having a recurring issue with a crash of my sample apps in Linux (Ubuntu 18.04 at the moment) where it complains about stack smashing.
I believe that it happens when I switch between different versions of the Paho C library during development, so I don't think it's a problem most users would commonly see.
I believe that it has something to do with the default file persistence used by the library. One guess may be that if the library tries to access the persistence file created from a different version of the lib, then maybe it gets some bad data and things go ugly. If that's the case, then perhaps the file persistence could use a version tag written to the file?
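For what it's worth, a rough sketch of that version-tag idea, assuming a simple magic/version header prepended to each persisted record (illustrative only, not part of the Paho persistence format):

```c
#include <stdio.h>
#include <string.h>

#define PERSIST_MAGIC   "PAHO"
#define PERSIST_VERSION 2u

struct persist_header {
    char     magic[4];
    unsigned version;
};

/* Prefix each record with the header identifying the format version. */
int write_record(FILE* f, const void* data, size_t len)
{
    struct persist_header hdr;
    memcpy(hdr.magic, PERSIST_MAGIC, 4);
    hdr.version = PERSIST_VERSION;
    if (fwrite(&hdr, sizeof hdr, 1, f) != 1)
        return -1;
    return fwrite(data, 1, len, f) == len ? 0 : -1;
}

/* On read, refuse (or migrate) anything written by a different version
 * instead of parsing it blindly. */
int check_record(FILE* f)
{
    struct persist_header hdr;
    if (fread(&hdr, sizeof hdr, 1, f) != 1)
        return -1;
    if (memcmp(hdr.magic, PERSIST_MAGIC, 4) != 0 || hdr.version != PERSIST_VERSION)
        return -1;   /* unknown format: skip or delete rather than misread it */
    return 0;
}
```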
Here's a stack dump when trying to publish a message:
Note that if I remove the persistence directory, 'myclientid-localhost-1883', then the app runs fine.