Segmentation fault in Socket_getReadySocket #696
Comments
From that gdb backtrace, it looks like you have multiple threads calling Socket_getReadySocket(). That's not intended to happen, and if true, is the source of the problem.
I have all the MQTT logic running in one thread in asynchronous mode, so the only other thread would be the helper thread that gets created. Perhaps the pasted trace wasn't clear, but I was just trying to show you the output of the commands I had typed into gdb.
I don't have a small program at the moment that I can put here. I was destroying the client mainly to change the client ID before reconnecting. Is there a way to do that without having to destroy it?
No, you can't change the client id without recreating. This was to minimise accidental errors with persistence, where if the client id is changed, any persisted data is invalidated.
One other thing I noticed was that every time I call MQTTClient_connect after a failed connection attempt, new file descriptors are being created. I'm using the websockets method to connect to the broker. I eventually get to a point where there are no more descriptors left for the rest of the system. Here's roughly what I'm doing: while (MQTTCLIENT_SUCCESS != MQTTClient_connect(client, &conn_opts)) Is there something I'm missing that's supposed to close all the opened descriptors?
I went back to version 1.3.0 and don't see this issue with the same code above. I debugged, and it looks like every time MQTTClient_connect is called a new socket is created and just left open. It seems like we should be reusing the previously opened socket until we time out or there's an error.
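For anyone trying to confirm a descriptor leak like the one described above, here is a rough sketch of how the open-descriptor count could be watched from inside the process between connect attempts. It is not part of the Paho API: the name count_open_fds and the 65536 scan cap are assumptions made for illustration, and it relies only on POSIX sysconf() and fcntl().

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not a Paho function): count the descriptors that are
 * currently open in this process by probing each descriptor number with
 * fcntl(F_GETFD), which fails with EBADF for numbers that are not open. */
int count_open_fds(void)
{
    long max_fd = sysconf(_SC_OPEN_MAX);
    if (max_fd < 0 || max_fd > 65536)
        max_fd = 65536; /* assumed cap; descriptors above it are not counted */

    int open_count = 0;
    for (int fd = 0; fd < (int)max_fd; ++fd) {
        if (fcntl(fd, F_GETFD) != -1)
            ++open_count;
    }
    return open_count;
}
```

Logging this value before and after each failed connect attempt should show whether the count keeps growing, which is what the behaviour reported here would look like.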
A connect will create a new socket each time. There might be different connect options which need to be applied in any case so that's unlikely to change. Out of interest, why do you need to change the client id on each connect? That's potentially going to leave more state around on the server.
What do you mean by different connect options? I'm not changing the client id on every connect. I run a loop similar to the one I mentioned above for an hour, and if it can't connect then I destroy the client, recreate it with a new ID, and try to connect again. I was doing that mainly to make sure there was nothing wrong on my end, in case the IOT platform was blocking the connection because of the client ID. I took your advice about not destroying it, so I'm just using the same client ID now, but that's when I noticed all the open descriptors.
Potentially fixed in the develop branch.
I'm getting a segmentation fault when it's trying to do the select in this function. This is for an embedded application where I attempt to connect to the broker, and if MQTTClient_connect returns an error then I wait 30 seconds, call MQTTClient_disconnect, then MQTTClient_destroy, recreate the client, and attempt to connect again.
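The reconnect cycle described above can be sketched roughly as follows. This is not the reporter's actual code: the reconnect_hooks struct, connect_with_recreate, and the fake_* stand-ins are all hypothetical names. In a real program the hooks would wrap MQTTClient_create, MQTTClient_connect, and MQTTClient_destroy; the disconnect step is left out here on the assumption that a failed connect leaves nothing to disconnect. The stand-in hooks simulate a broker that refuses the first two attempts.

```c
#include <stddef.h>

/* Hypothetical hooks; real ones would wrap the Paho client calls. */
typedef struct {
    void *(*create_client)(void *ctx);    /* returns a handle, or NULL on failure */
    int   (*try_connect)(void *client);   /* returns 0 on success */
    void  (*destroy_client)(void *client);
    void  (*pause_for)(unsigned seconds); /* e.g. a wrapper around sleep() */
} reconnect_hooks;

/* Create a client and try to connect; on failure destroy it, wait, and start
 * over with a fresh client. Returns the connected handle, or NULL once
 * max_cycles attempts have failed. Every failed client is destroyed exactly once. */
void *connect_with_recreate(const reconnect_hooks *h, void *ctx,
                            int max_cycles, unsigned delay_s)
{
    for (int cycle = 0; cycle < max_cycles; ++cycle) {
        void *client = h->create_client(ctx);
        if (client == NULL)
            return NULL;
        if (h->try_connect(client) == 0)
            return client;
        h->destroy_client(client);
        if (cycle + 1 < max_cycles)
            h->pause_for(delay_s);
    }
    return NULL;
}

/* Stand-in hooks: a fake broker that accepts only the third attempt. */
static int live_clients = 0;
static int attempts = 0;
static int dummy_client;

static void *fake_create(void *ctx) { (void)ctx; ++live_clients; return &dummy_client; }
static int fake_connect(void *client) { (void)client; return (++attempts >= 3) ? 0 : -1; }
static void fake_destroy(void *client) { (void)client; --live_clients; }
static void fake_wait(unsigned seconds) { (void)seconds; /* no real delay in the sketch */ }

static const reconnect_hooks fake_hooks = {
    fake_create, fake_connect, fake_destroy, fake_wait
};
```

Calling connect_with_recreate(&fake_hooks, NULL, 5, 30) with these stand-ins goes through three create/connect cycles and leaves one live client. Keeping the whole cycle in one function makes it easier to check that every created client is either returned or destroyed.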
I'm simulating a case where there isn't a good connection to the broker so it's continuously attempting to connect. Sometimes it may connect but it will frequently lose the connection and attempt to reconnect. I've attached the trace files and the gdb trace.
mqttlvl1.txt
mqttlvl3.txt
#0 0xfe373450 in ?? ()
(gdb) bt
#0 0xfe373450 in ?? ()
#1 0xfe373384 in ?? ()
#2 0x484ff5ac in Socket_getReadySocket (more_work=1218187716, tp=0x47fbeeb0, mutex=0x47fbef10) at /shared/build/test/LIBS/pahoMqttLocal/src/Socket.c:267
#3 0x00000000 in ?? ()
(gdb)
#0 0xfe373450 in ?? ()
#1 0xfe373384 in ?? ()
#2 0x484ff5ac in Socket_getReadySocket (more_work=1218187716, tp=0x47fbeeb0, mutex=0x47fbef10) at /shared/build/test/LIBS/pahoMqttLocal/src/Socket.c:267
#3 0x00000000 in ?? ()
(gdb) f 2
#2 0x484ff5ac in Socket_getReadySocket (more_work=1218187716, tp=0x47fbeeb0, mutex=0x47fbef10) at /shared/build/test/LIBS/pahoMqttLocal/src/Socket.c:267
267 rc = select(s.maxfdp1, &(s.rset), &pwset, NULL, &timeout);
(gdb) print s.maxfdp1
$1 = 0
(gdb) print s.rset
$2 = {fds_bits = {0, 0, 0, 0, 0, 0, 0, 0}}
(gdb) print pwset
$3 = {fds_bits = {1207692928, 0, 0, 0, 0, 0, 0, 0}}
(gdb) print timeout
$4 = {tv_sec = 2, tv_usec = 1218187716}
I also have other threads using sockets to send other types of data. I'm using libraries based on the latest changes in the develop branch where it's being prepped for release 1.3.1.