Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPI3 Disconnects #465

Open
ediogodias opened this issue Sep 18, 2016 · 25 comments
Open

RPI3 Disconnects #465

ediogodias opened this issue Sep 18, 2016 · 25 comments

Comments

@ediogodias
Copy link

ediogodias commented Sep 18, 2016

I tested a simple simple to connect to a BLE device, but i almost all the times receive a disconnect before connecting... Cant figure out why.

The same script on MacOS works as expected.

var noble   = require('noble');

noble.on('stateChange', function(state) {
  if (state === 'poweredOn') {
    noble.startScanning(['180d']);
  } else {
    noble.stopScanning();
  }
});


noble.on('discover', function(peripheral) {
  console.log(peripheral.advertisement.localName);

  peripheral.on('disconnect', function() {
    console.log("disconnected", peripheral.advertisement.localName)
  });


  peripheral.connect(function(error) {
    peripheral.discoverSomeServicesAndCharacteristics(['180d'], ['2a37'], function(error, services, characteristics){
      var service = services[0];
      var characteristic = characteristics[0];

      console.log(characteristic);
    });
  });

});
node index2.js 
Polar H7 B861C114
discovered Polar H7 B861C114
disconnected Polar H7 B861C114

Here the output of hcidump:

HCI sniffer - Bluetooth packet analyzer ver 5.23
device: hci0 snap_len: 1500 filter: 0xffffffff
2016-09-18 17:40:55.894473 > HCI Event: Command Complete (0x0e) plen 4
    Set Event Mask (0x03|0x0001) ncmd 1
    status 0x00
2016-09-18 17:40:55.895041 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Event Mask (0x08|0x0001) ncmd 1
    status 0x00
2016-09-18 17:40:55.895551 > HCI Event: Command Complete (0x0e) plen 12
    Read Local Version Information (0x04|0x0001) ncmd 1
    status 0x00
    HCI Version: 4.1 (0x7) HCI Revision: 0xb6
    LMP Version: 4.1 (0x7) LMP Subversion: 0x2209
    Manufacturer: Broadcom Corporation (15)
2016-09-18 17:40:55.895955 > HCI Event: Command Complete (0x0e) plen 4
    Write LE Host Supported (0x03|0x006d) ncmd 1
    00 
2016-09-18 17:40:55.896361 > HCI Event: Command Complete (0x0e) plen 6
    Read LE Host Supported (0x03|0x006c) ncmd 1
    00 01 00 
2016-09-18 17:40:55.896915 > HCI Event: Command Complete (0x0e) plen 10
    Read BD ADDR (0x04|0x0009) ncmd 1
    status 0x00 bdaddr B8:27:EB:07:A3:EF
2016-09-18 17:40:55.906495 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x0c
    Error: Command Disallowed
2016-09-18 17:40:55.906922 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Parameters (0x08|0x000b) ncmd 1
    status 0x00
2016-09-18 17:40:55.912280 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x00
2016-09-18 17:40:57.319102 > HCI Event: Command Status (0x0f) plen 4
    LE Create Connection (0x08|0x000d) status 0x00 ncmd 1
2016-09-18 17:40:57.771967 > HCI Event: Command Status (0x0f) plen 4
    LE Read Remote Used Features (0x08|0x0016) status 0x00 ncmd 1
2016-09-18 17:40:57.773726 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x00
2016-09-18 17:40:57.876261 > HCI Event: Disconn Complete (0x05) plen 4
    status 0x00 handle 64 reason 0x3e
    Reason: Connection Failed to be Established
2016-09-18 17:41:00.207559 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x0c
    Error: Command Disallowed
@sandeepmistry
Copy link
Collaborator

What's the value of the error arg in the connect callback?

From the HCI dump the connection is failing:

2016-09-18 17:40:57.319102 > HCI Event: Command Status (0x0f) plen 4
    LE Create Connection (0x08|0x000d) status 0x00 cmd 1

....

2016-09-18 17:40:57.876261 > HCI Event: Disconn Complete (0x05) plen 4
    status 0x00 handle 64 reason 0x3e
    Reason: Connection Failed to be Established

Also, can you run with sudo hcidump -t -x next time? Which version of noble are you using?

@TheCbac
Copy link

TheCbac commented Sep 30, 2016

I'm experiencing an issue on a Raspberry Pi 3 as well. When connecting to a peripheral about 50% of the time the device connects and then immediately disconnect. The other 50% the connection success with no problems. I'm running the exact same code on my Mac (OS X 10.11.4) and the connections succeed 100%

When the connection fails:


$ sudo hcidump -t -x
HCI sniffer - Bluetooth packet analyzer ver 5.23
device: hci0 snap_len: 1500 filter: 0xffffffff
2016-09-30 16:59:16.458302 < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2
    value 0x01 (scanning enabled)
    filter duplicates 0x01 (enabled)
2016-09-30 16:59:16.458837 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x00
2016-09-30 16:59:18.388824 > HCI Event: LE Meta Event (0x3e) plen 42
    LE Advertising Report
      ADV_IND - Connectable undirected advertising (0)
      bdaddr CB:AC:05:9E:71:9E (Public)
      Flags: 0x06
      Complete local name: 'Sense01'
      Unknown type 0x06 with 16 bytes data
      RSSI: -46
2016-09-30 16:59:18.418575 > HCI Event: LE Meta Event (0x3e) plen 21
    LE Advertising Report
      SCAN_RSP - Scan Response (4)
      bdaddr CB:AC:05:9E:71:9E (Public)
      Unknown type 0x1b with 7 bytes data
      RSSI: -62
2016-09-30 16:59:20.785048 < HCI Command: LE Create Connection (0x08|0x000d) plen 25
    bdaddr CB:AC:05:9E:71:9E type 0
    interval 96 window 48 initiator_filter 0
    own_bdaddr_type 0 min_interval 6 max_interval 12
    latency 0 supervision_to 200 min_ce 4 max_ce 6
2016-09-30 16:59:20.787814 > HCI Event: Command Status (0x0f) plen 4
    LE Create Connection (0x08|0x000d) status 0x00 ncmd 1
2016-09-30 16:59:20.800756 > HCI Event: LE Meta Event (0x3e) plen 19
    LE Connection Complete
      status 0x00 handle 64, role master
      bdaddr CB:AC:05:9E:71:9E (Public)
2016-09-30 16:59:20.800982 < HCI Command: LE Read Remote Used Features (0x08|0x0016) plen 2
  40 00 
2016-09-30 16:59:20.802029 > HCI Event: Command Status (0x0f) plen 4
    LE Read Remote Used Features (0x08|0x0016) status 0x00 ncmd 1
2016-09-30 16:59:20.802081 < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2
    value 0x00 (scanning disabled)
    filter duplicates 0x00 (disabled)
2016-09-30 16:59:20.803718 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x00
2016-09-30 16:59:20.904464 > HCI Event: LE Meta Event (0x3e) plen 12
    LE Read Remote Used Features Complete
      status 0x3e handle 64
      Error: Connection Failed to be Established
2016-09-30 16:59:20.904998 > HCI Event: Disconn Complete (0x05) plen 4
    status 0x00 handle 64 reason 0x3e
    Reason: Connection Failed to be Established
2016-09-30 16:59:20.970692 < ACL data: handle 64 flags 0x00 dlen 7
    ATT: MTU req (0x02)
      client rx mtu 256
2016-09-30 16:59:21.168153 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 64 packets 1
2016-09-30 16:59:36.440391 < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2
    value 0x00 (scanning disabled)
    filter duplicates 0x01 (enabled)
2016-09-30 16:59:36.441116 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x0c
    Error: Command Disallowed

When the connection succeeds:

$ sudo hcidump -t -x
HCI sniffer - Bluetooth packet analyzer ver 5.23
device: hci0 snap_len: 1500 filter: 0xffffffff
2016-09-30 17:02:59.439319 < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2
    value 0x01 (scanning enabled)
    filter duplicates 0x01 (enabled)
2016-09-30 17:02:59.440395 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x00
2016-09-30 17:03:01.022537 > HCI Event: LE Meta Event (0x3e) plen 42
    LE Advertising Report
      ADV_IND - Connectable undirected advertising (0)
      bdaddr CB:AC:05:9E:71:9E (Public)
      Flags: 0x06
      Complete local name: 'Sense01'
      Unknown type 0x06 with 16 bytes data
      RSSI: -48
2016-09-30 17:03:01.052313 > HCI Event: LE Meta Event (0x3e) plen 21
    LE Advertising Report
      SCAN_RSP - Scan Response (4)
      bdaddr CB:AC:05:9E:71:9E (Public)
      Unknown type 0x1b with 7 bytes data
      RSSI: -66
2016-09-30 17:03:02.536673 < HCI Command: LE Create Connection (0x08|0x000d) plen 25
    bdaddr CB:AC:05:9E:71:9E type 0
    interval 96 window 48 initiator_filter 0
    own_bdaddr_type 0 min_interval 6 max_interval 12
    latency 0 supervision_to 200 min_ce 4 max_ce 6
2016-09-30 17:03:02.539609 > HCI Event: Command Status (0x0f) plen 4
    LE Create Connection (0x08|0x000d) status 0x00 ncmd 1
2016-09-30 17:03:02.585033 > HCI Event: LE Meta Event (0x3e) plen 19
    LE Connection Complete
      status 0x00 handle 64, role master
      bdaddr CB:AC:05:9E:71:9E (Public)
2016-09-30 17:03:02.585227 < HCI Command: LE Read Remote Used Features (0x08|0x0016) plen 2
  40 00 
2016-09-30 17:03:02.586591 > HCI Event: Command Status (0x0f) plen 4
    LE Read Remote Used Features (0x08|0x0016) status 0x00 ncmd 1
2016-09-30 17:03:02.586642 < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2
    value 0x00 (scanning disabled)
    filter duplicates 0x00 (disabled)
2016-09-30 17:03:02.587406 > HCI Event: Command Complete (0x0e) plen 4
    LE Set Scan Enable (0x08|0x000c) ncmd 1
    status 0x00
2016-09-30 17:03:02.632572 > HCI Event: LE Meta Event (0x3e) plen 12
    LE Read Remote Used Features Complete
      status 0x00 handle 64
      Features: 0x11 0x00 0x00 0x00 0x00 0x00 0x00 0x00
2016-09-30 17:03:02.639655 < ACL data: handle 64 flags 0x00 dlen 7
    ATT: MTU req (0x02)
      client rx mtu 256
2016-09-30 17:03:02.662428 > ACL data: handle 64 flags 0x02 dlen 7
    ATT: MTU resp (0x03)
      server rx mtu 256
2016-09-30 17:03:02.792640 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 64 packets 1

@ediogodias
Copy link
Author

Sorry for not answering with the full log.. i digged a lot into this issue and my findings points towards node-bluetooth-hci-socket issues... When using gattool everything works as expected 100% of the time.

I played around the Hci.prototype.setSocketFilter that is executed, and this made things work a little bit in terms of connection, whenever i removed some of the masks used.

Another finding that i made, about behaviour, is that on MacOS if i lose connection of the bluetooth peripheral (because of range) and then the peripheral get in range again, i can listen to this and reconnect again. On the other hand, in raspberry, if i lose connection i will never receive callback for that peripheral again. The only option is to stop and start scanning again.

Its seems that this project, although all the credits and kudos for the work done, have really underterministic behaviours depending on the system it executes.

I moved away to python bindings for now..

@cscholze
Copy link

cscholze commented Oct 7, 2016

@ddxtreme which bindings did you use? Are they working? And are you still using the Noble library and using custom bindings?

@ediogodias
Copy link
Author

ediogodias commented Oct 7, 2016

@cscholze i stop using noble, i went using bluepy for python. As i didnt depend on nodejs and after several hours trying to understand the problem i switched as soon as my findings points towards node-bluetooth-hci-socket malfunction with RP3. As there isnt any other library for node for bluetooth, i switched.

Now i already overcome that issue with bluepy

@sandeepmistry
Copy link
Collaborator

@ddxtreme @TheCbac @cscholze

Can you please try out noble/node-bluetooth-hci-socket#39 and provide your feedback?

cd <some patch/noble
cd node_modules
rm -rf bluetooth-hci-socket
npm install sandeepmistry/node-bluetooth-hci-socket#rework-kernel-workarounds

Then try your noble app ...

@tengelmeier
Copy link

tengelmeier commented Oct 11, 2016

I've run into probably the same issue.

Here is what iot-sensor-tag/publish/sensor-tag.js outputs with the patches branch:

   iot-sensor-tag/publish $ sudo node sensor-tag.js 
   Device MAC Address: b827ebXXXXXX
   The device MAC address does not match the ID in the configuration file.
   Make sure the Sensor Tag is on!
   MQTT client connected to IBM IoT Cloud.
   Discovered device with UUID: bc6a29ab67ef  (cc2540)
   Connected To Sensor Tag
   noble warning: unknown peripheral bc6a29ab67ef
   noble: unknown peripheral null connected!
   noble: unknown peripheral null connected!
   noble: unknown peripheral null connected!
   [... repeated ...]

Sorry, I have very little Linux / Node / Raspberry knowledge - just wanted to push a bunch of humidity data in the cloud for analysis...

P.S.: it looks like hcidump is missing from my bluez installation?

@TheCbac
Copy link

TheCbac commented Oct 17, 2016

@sandeepmistry Thanks, I'll try this out later this week.

In the mean time, I noticed that all of the issues went away when switching to ethernet, instead of wifi and turning off wifi with:

$ sudo ifdown wlan0

@BasKiers
Copy link

BasKiers commented Oct 19, 2016

Hi,

I am experiencing the same problems referenced in this issue (and #453, #474, #480) as well.
At a random time the connection will immediately be disconnected after connecting.

This causes the next hci call to produce the following hci error requiring a hciconfig hci0 reset to get noble working again:

Bluetooth: hci0 link tx timeout
Bluetooth: hci0 killing stalled connection b1:d6:4b:0d:ac:e6

After a lot of debugging i've not come much closer to finding a fix. I tried to only call noble function sequentially and add a delay of 100ms between all calls that go to noble. This did not fix the issue but did cause less hci0 tx timeout errors which made it possible for my code to start a new connection that did not immediately disconnect (until hci0 does crash).

I've got a hcidump but it is an .cap file so you will need to open it in wireshark. The problem occurs at 183 sec of the dump. The calls that were done at that time were:

time        execution time   call
0248.3270 : 7                connect
0248.3380 : 110              discoverServices
0248.4600 : 125              discoverCharacteristics
0248.5910 : 1004             write
0253.6000 : 33               disconnect
0257.3440 : 65               connect
0257.4140 : timeout          discoverServices

I tested my code on linux kernel 4.1 and 4.7 and this did not resolve the issue.

I am using a BCM4330 based sdio module and used the wifi and bluetooth firmware that shipped with the hummingboard. I suspect this problem is firmware related since the raspberry pi 3 uses a similar module (BCM43430).
Maybe @arend has time to look whether this is firmware related?

Regards,
Bas

dump.cap.zip

Edit: I tried disabling wlan0 but this did not fix the issue for me.

@cscholze
Copy link

@TheCbac you said that using a wired connection got rid of the disconnect issue. Is that still the case? I tried it and it worked for a little bit, but then started disconnected again.

@sandeepmistry
Copy link
Collaborator

@BasKiers @cscholze what peripheral devices are you using? Maybe the noble connection parameter are used need to be adjusted?

@BasKiers have you tried to see if gatttool exhibits similar behaviour?

@TheCbac
Copy link

TheCbac commented Oct 23, 2016

@cscholze Since I switched to a wired connection I haven't experienced the issue at all. This is on multiple Pi's and over 10's of hours. The wired connection uses a static IP and wlan0 down.

@benjaminhon
Copy link

benjaminhon commented Oct 24, 2016

yes i am experiencing the same problem. I am using the Intel Edison.

about 50% of the time, when i connect, it immediately disconnects. And after i try about 2-5 times, it eventually connects. However this is causing some lag because of the retries.

Oct 24 06:28:50 Node0 node[9306]: <Buffer 00 00 c2 b4 cf f9 4f ec 4f 76 de 77 00 3f 00 02 00 90 00 5f>
Oct 24 06:28:50 Node0 node[9306]: <Buffer 00 00 c6 b4 c0 f9 40 ec 5f 7a de 72 00 40 00 c7 03 d8 00 26>
Oct 24 06:28:50 Node0 node[9306]: <Buffer 00 00 ca b4 c1 f9 40 ec 5f 80 de 72 00 48 ff a5 03 ca ff a6>
Oct 24 06:28:50 Node0 node[9306]: <Buffer 00 00 ce b4 c2 f9 40 ec 5f 80 de 74 00 4a 02 58 01 79 08 94>
Oct 24 06:28:50 Node0 node[9306]: disconnecting b4994c57a312
Oct 24 06:28:50 Node0 node[9306]: <Buffer 00 00 02 f3 84 f2 80 fc 5f 71 de 79 00 49 00 ec ff bf 02 ba>
Oct 24 06:28:50 Node0 node[9306]: disconnected from b4994c57a312
Oct 24 06:28:51 Node0 node[9306]: connecting to b4994c57a312
Oct 24 06:28:51 Node0 node[9306]: connected to b4994c57a312
Oct 24 06:28:51 Node0 node[9306]: disconnected from b4994c57a312  <---- disconnects immediately
Oct 24 06:28:52 Node0 node[9306]: connecting to b4994c57a312
Oct 24 06:28:53 Node0 node[9306]: connected to b4994c57a312
Oct 24 06:28:53 Node0 node[9306]: discovered fff0 fff1
Oct 24 06:28:53 Node0 node[9306]: subscribed to b4994c57a312 fff0 fff1
Oct 24 06:28:53 Node0 node[9306]: <Buffer 00 00 08 be c7 00 42 fd 6f 6b de 74 00 3a 00 0b 00 0b 00 2c>
Oct 24 06:28:53 Node0 node[9306]: <Buffer 00 00 0c be c8 00 42 fd 6f 68 de 79 00 3d 00 09 00 09 00 27>
Oct 24 06:28:53 Node0 node[9306]: <Buffer 00 00 00 be ca 00 42 fd 6f 6e de 74 00 3a 00 0a 00 0b 00 2a>
Oct 24 06:28:53 Node0 node[9306]: <Buffer 00 00 04 be cb 00 42 fd 6f 66 de 76 00 3a 00 07 00 0d 00 2b>

@tengelmeier
Copy link

tengelmeier commented Oct 24, 2016

I had an 100% reproducible immediate disconnect with the Jessie - supplied bluez 5.23 and Ti Sensortag
With bluez 5.42 I had 6h continuous uptime until an disconnect for a different reason...

@BasKiers
Copy link

For what it is worth I've got the debug log of when this issue occures:

Mon, 24 Oct 2016 14:58:12 GMT hci set scan enabled - writing: 010c20020001
Mon, 24 Oct 2016 14:58:12 GMT hci onSocketData: 040e04010c200c
Mon, 24 Oct 2016 14:58:12 GMT hci       event type = 4
Mon, 24 Oct 2016 14:58:12 GMT hci       sub event type = 14
Mon, 24 Oct 2016 14:58:12 GMT hci               cmd = 8204
Mon, 24 Oct 2016 14:58:12 GMT hci               status = 12
Mon, 24 Oct 2016 14:58:12 GMT hci               result = 
Mon, 24 Oct 2016 14:58:12 GMT hci create le conn - writing: 010d201960003000000068af644c99b40006000c000000c80004000600
Mon, 24 Oct 2016 14:58:12 GMT hci onSocketData: 040f0400010d20
Mon, 24 Oct 2016 14:58:12 GMT hci       event type = 4
Mon, 24 Oct 2016 14:58:12 GMT hci       sub event type = 15
Mon, 24 Oct 2016 14:58:12 GMT hci               status = 0
Mon, 24 Oct 2016 14:58:12 GMT hci               cmd = 8205
Mon, 24 Oct 2016 14:58:12 GMT hci onSocketData: 043e1301004000000068af644c99b40c000000c80005
Mon, 24 Oct 2016 14:58:12 GMT hci       event type = 4
Mon, 24 Oct 2016 14:58:12 GMT hci       sub event type = 62
Mon, 24 Oct 2016 14:58:12 GMT hci               LE meta event type = 1
Mon, 24 Oct 2016 14:58:12 GMT hci               LE meta event status = 0
Mon, 24 Oct 2016 14:58:12 GMT hci               LE meta event data = 4000000068af644c99b40c000000c80005
Mon, 24 Oct 2016 14:58:12 GMT hci                       handle = 64
Mon, 24 Oct 2016 14:58:12 GMT hci                       role = 0
Mon, 24 Oct 2016 14:58:12 GMT hci                       address type = public
Mon, 24 Oct 2016 14:58:12 GMT hci                       address = b4:99:4c:64:af:68
Mon, 24 Oct 2016 14:58:12 GMT hci                       interval = 15
Mon, 24 Oct 2016 14:58:12 GMT hci                       latency = 0
Mon, 24 Oct 2016 14:58:12 GMT hci                       supervision timeout = 2000
Mon, 24 Oct 2016 14:58:12 GMT hci                       master clock accuracy = 5
@@@@@@ LE CONN COMPLETE b4994c64af68
Mon, 24 Oct 2016 14:58:12 GMT att b4:99:4c:64:af:68: write: 020001
Mon, 24 Oct 2016 14:58:12 GMT hci write acl data pkt - writing: 024000070003000400020001
Mon, 24 Oct 2016 14:58:12 GMT hci onSocketData: 011620024000
Mon, 24 Oct 2016 14:58:12 GMT hci       event type = 1
Mon, 24 Oct 2016 14:58:12 GMT hci               cmd = 8214
Mon, 24 Oct 2016 14:58:12 GMT hci               data len = 2
Mon, 24 Oct 2016 14:58:12 GMT hci onSocketData: 040f0400011620
Mon, 24 Oct 2016 14:58:12 GMT hci       event type = 4
Mon, 24 Oct 2016 14:58:12 GMT hci       sub event type = 15
Mon, 24 Oct 2016 14:58:12 GMT hci               status = 0
Mon, 24 Oct 2016 14:58:12 GMT hci               cmd = 8214
Mon, 24 Oct 2016 14:58:12 GMT hci onSocketData: 043e0c043e40000100000000000000
Mon, 24 Oct 2016 14:58:12 GMT hci       event type = 4
Mon, 24 Oct 2016 14:58:12 GMT hci       sub event type = 62
Mon, 24 Oct 2016 14:58:12 GMT hci               LE meta event type = 4
Mon, 24 Oct 2016 14:58:12 GMT hci               LE meta event status = 62
Mon, 24 Oct 2016 14:58:12 GMT hci               LE meta event data = 40000100000000000000
Mon, 24 Oct 2016 14:58:12 GMT hci onSocketData: 0405040040003e
Mon, 24 Oct 2016 14:58:12 GMT hci       event type = 4
Mon, 24 Oct 2016 14:58:12 GMT hci       sub event type = 5
Mon, 24 Oct 2016 14:58:12 GMT hci               handle = 64
Mon, 24 Oct 2016 14:58:12 GMT hci               reason = 62
@@@@@@ LE DISCONN COMPLETE b4994c64af68
@@@@@CONNECTION CORRUPT DETECTED after  33ms

@cscholze
Copy link

cscholze commented Oct 25, 2016

@sandeepmistry We are using RPI3 for central and iOS devices for the peripherals. I still have not tested thoroughly with gatttool and will update you as soon as I do. Should be in the next day or two.

@BasKiers
Copy link

BasKiers commented Nov 7, 2016

Hi,

small update.
I've added the following measures and my ble connection is now much more stable, only occasional unexpected behaviour (corrupt connections) for which I have a hacky workaround aswell.

Things I did to try to improve stability:
Added a function queue to ensure only 1 call is done to noble at the same time.
This is done for all noble calls and I've added a hardcoded delay of 100ms between all calls (Will need to investigate if this delay can be removed and/or some noble calls can be done asynchronously).

Stop scanning before calling connect and disconnect

Added timeout logic to all noble calls, also moved the registering of the callback listener to my code because then I can remove the listener when a timeout occurs (before this would result in listener memory leak detected warnings)

Removing device from noble cache when connect returns an error or times out

Added logic to retry a call after it timed out (this does rarely fix the problem after the other fixes so I will probably remove this)

Use the kernel workarounds branch of node-bluetooth-hci-socket (https://github.com/sandeepmistry/node-bluetooth-hci-socket/tree/rework-kernel-workarounds). Although I do not have much test data to support it I do think this increased the stability for me. Maybe I can provide you with more definite data for this soon.

The problems I worked around are:
After a device get's lost and I clear it from noble (probably shouldn't do this but I want to be able to clear devices from noble when they are not available anymore) I am unable to find it again.
This was caused by the check if the advertisement message had type === 0x04.
Fixed this issue by setting process.env.NOBLE_REPORT_ALL_HCI_EVENTS and calling this.noble._bindings._hci.reset() before starting discovery (need to test if the latter is really necessary)

After connecting to a device sometimes a disconnect would immediately be called.
Fixed this by setting a timeout of 500 ms and if a disconnect event is received within this window I know the correction is corrupt. When this happens hci0 needs to be reset before I can get connections again so I disconnect from all devices, then call hciconfig hci0 reset from my code and then retry the connection.

I am currently testing the stability and fixing some bugs in my own code. My test setup consists of 2 Mipow Milight devices (one bulb and one candle) which I turn on and off in different intervals disconnecting and reconnecting between each write that has an interval higher than 3000 ms. After 10 minutes the program will disconnect and pause for 5 minutes and after that continue the loop. The longest stable test I've done till now (due to a bug in my code) was 8.5 hours and I am pretty confident that the noble part is stable. I do not use subscribe/unsubscribe since I think this will break when I do a hci reset so this is something I will need to test.

@Murvi
Copy link

Murvi commented Nov 7, 2016

Hi

Great update.

I created #487 as I am receiving disconnect supervisory timeouts.

Could you explain to me how you were able to increase the timeout value.

I having been trying to debug this over weekend without any joy as I am not getting or I don't think I am getting errors but the devices seem to be gracefully disconnecting due to the supervisory timeout value. This is strange as I getting this on two different devices at random intervals.

Would like to try and increase the supervisory timeout and test to see if this gives me more stable results.

@benjaminhon
Copy link

benjaminhon commented Nov 18, 2016

hmm so is there any resolution for this?

@sandeepmistry
Copy link
Collaborator

@benjaminhon could you please try #514?

@carlosamtech
Copy link

Hello Sandeep,
Thank you for a great set of libraries!
We have a related issue using noble-device that this patch has fixed. Are you planning to merge it to the master?
Running the following cycle, we never got connected to a peripheral two ice.….
discoverAll
stopDiscoverAll
connectAndSetUp
readStringCharacteristic
stopDiscoverAll
...
Thanks & Best

@cryptocake
Copy link

Not sure if this already known here, but RPi3 BL & Wifi are on the same chip. Using BL when connected to Wifi will interfere with Wifi, causing disconnects and eventually hangups. To solve this disable Wifi and go with a wired solution, or get an external BL adapter.
Just sharing this, just in case...

@Jaftem
Copy link

Jaftem commented Feb 6, 2018

Hi, I'm also dealing with many of the issues @BasKiers explained. I'm using a wifi dongle so that the RPi 3's wifi/bluetooth chip doesn't experience interference between the two services as per @cryptocake's most recent comment.

I have a crude workaround for this using @BasKiers solution to set a timeout that checks for an immediate disconnect. If this has happened, I manually force the peripheral to disconnect again. Can anyone explain what exactly is happening under the hood? The problem seems to occur more often when I advertise 10+ BLE signals from device B while device A holds a connection with the RPi3.

I have also noticed this immediate 'disconnect' is not a disconnect at all (this is why I force a peripheral.disconnect() upon checking if a 'disconnect' has occurred immediately after a connect). Both the pi and device tell me (using hcitool) that the connection is still on-going. Noble believes it has disconnected, tries to connect again, then fails with a 0x0c command disallowed status upon trying to retrieve characteristics.

@endrelovas
Copy link

I wish I read this thread weeks ago... not only random disconnects (both on Rpi3 and Rpi4) but there is data loss on notifications (on Rpi3, did not experience it on Rpi4) as well when I try to receive bulk data from a device.

@komorebi-san
Copy link

I am having the same issue on Ubuntu 18.04

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests