BGApiConnHandler thread exceptions, no way to catch and recover #2
Comments
Which pybgapi version do you use? |
I've worked around this here: etactica/silabs-pybgapi-fork@4712def. I'm using BGAPI 1.2.0 (imported from PyPI into our fork, where I've added this event wrapper). The target is a BGM220S, reporting bt_evt_system_boot(major=5, minor=1, patch=0, build=144, bootloader=0, hw=258, hash=3778289845). The XAPI files are from the Gecko 4.2.1 SDK package. I've had this problem with 4.2.0 as well; I'm not sure whether I was using the NCP code with 4.1.3. |
The versions seem to be correct. |
At the moment it's a BRD4184A rev A02 (BG22 Thunderboard), but it will eventually be a custom board, most likely with an MG24. |
Thank you! |
it's already running 1v4p9b113 |
I see. |
I'm trying to set up an example using your examples repo, but no luck yet. I see this, and also "'<class 'struct.error'>:unpack requires a buffer of 130 bytes [' File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner\n', ' File "/usr/lib/python3.10/site-packages/bgapi/bglib.py", line 186, in run\n', ' File "/usr/lib/python3.10/site-packages/bgapi/serdeser.py", line 304, in parse\n']'" errors, a lot more often when I'm running on a lower-power CPU. On my laptop, it's much rarer. The current workload is permanent scanning for legacy advertisements, a single periodic advertiser set, and a rolling loop of client connections to devices seen. |
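For anyone unfamiliar with that error class, here is a minimal sketch of how a short read produces it. This is not the bgapi parser; `parse_payload` and `try_parse` are hypothetical names used only for illustration. The point is that `struct.unpack` raises `struct.error` whenever the buffer is shorter than the format demands, which is what you would expect if bytes went missing on the serial link.

```python
import struct


def parse_payload(buf: bytes, payload_len: int) -> bytes:
    """Unpack a fixed-length payload (hypothetical helper, not bgapi code)."""
    # "<130s" means "exactly 130 raw bytes"; a shorter buffer raises struct.error.
    (payload,) = struct.unpack(f"<{payload_len}s", buf)
    return payload


def try_parse(buf: bytes, payload_len: int):
    """Same as parse_payload, but reports the failure instead of propagating it."""
    try:
        return parse_payload(buf, payload_len)
    except struct.error as exc:
        return f"parse failed: {exc}"
```

Calling `try_parse(bytes(100), 130)` returns a "parse failed" string, while the unguarded `parse_payload` would raise and, if called from a worker thread, end that thread.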
I've created an example, using the most minimal code I can manage, that demonstrates this: https://github.com/etactica/pybgapi-examples/tree/just-simple-crasher/example/bt_scanner It simply opens the NCP connection in an advertising-rich environment and scans for everything at once. This hangs with various serdes errors (depending on exactly where it has gotten lost). The hang is the problem: because only the internal bglib thread dies, the app stays running, so no standard process monitoring can catch this and restart it. Also, it happens very often. |
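One application-side stopgap for the "thread dies but the process lives on" hang is Python's standard `threading.excepthook` (available since Python 3.8). The sketch below is an assumption-laden illustration, not part of pybgapi: `install_fail_fast_hook` is a hypothetical helper, and it only works if the exception actually propagates out of the thread's `run()` uncaught. If the library catches and logs the exception internally, the hook never fires.

```python
import os
import sys
import threading
import traceback


def install_fail_fast_hook(exit_func=os._exit, exit_code=1):
    """Exit the whole process when any thread dies of an uncaught exception.

    Hypothetical helper: exit_func is injectable so it can be swapped out
    in tests. os._exit is the default because sys.exit() raised inside a
    hook running on a worker thread would not stop the process.
    """
    def hook(args):
        # args carries exc_type, exc_value, exc_traceback and thread.
        traceback.print_exception(args.exc_type, args.exc_value, args.exc_traceback)
        name = args.thread.name if args.thread is not None else "<unknown>"
        print(f"thread {name!r} died, terminating process", file=sys.stderr)
        exit_func(exit_code)

    threading.excepthook = hook
    return hook
```

Calling `install_fail_fast_hook()` once at startup, before opening the NCP connection, turns a silent hang into a non-zero exit that a process supervisor can see and restart.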
Also, as far as addressing the root cause goes, changing SL_NCP_EVT_BUF_SIZE up to 4096 has no impact on this that I can see, and I'm not sure what else to try in that regard. I'm currently running this on the MG24 Explorer, BRD2703A rev A02, but as mentioned, it was also on the BG22 explorer board. |
Thanks for the reproducer script!
|
Option 1 would be ideal, of course, but I'm not sure what avenue to even try; I'm not used to having a 115200 serial port lose data like this. I do agree it seems like the serial port is losing data. It seems to be CPU-related though: with all the debugging prints, it tops out at 100% and looks like it's switching cores as it's rescheduled. I tried extending the event buffer size in NCP, but that didn't help. What speed can I configure the J-Link to on these dev boards? I'd like to use 1 Mbit/s or more if I could.
I'm not sure option 3 would actually help if I'm having problems with the serial port losing data. I did look at that, as it seems essential if I want to use Bluetooth and Matter concurrently down the road anyway, and I can try packaging cpcd for this platform to try it out. |
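For reference, a rough back-of-the-envelope for what the baud rates mentioned above mean in bytes per second. This assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte), which is an assumption about the link rather than something stated in this thread; the function names are hypothetical.

```python
def bytes_per_second(baud: int, bits_per_byte: int = 10) -> float:
    """Raw UART throughput; 10 bits per byte assumes 8N1 framing."""
    return baud / bits_per_byte


def frame_time_ms(frame_len: int, baud: int) -> float:
    """Time on the wire, in milliseconds, for a frame of frame_len bytes."""
    return 1000.0 * frame_len / bytes_per_second(baud)


if __name__ == "__main__":
    for baud in (115200, 1_000_000):
        print(baud, "baud:", bytes_per_second(baud), "bytes/s")
```

Under that assumption, 115200 baud carries at most 11520 bytes per second, and 1 Mbit/s carries 100000. Whether the link is actually saturated depends on the advertising traffic, which this sketch does not try to model.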
With cpcd, I still get failures, and similar cpcd errors too: "FATAL system call in function 'server_push_data_to_endpoint' in file server_core/server/server.c at line #1741 : Invalid argument". On the client side, I had to modify things a little to not issue a reset, but instead to wait 2 seconds before trying the HELLO command. It still crashes very rapidly, so it's ~equivalent to the normal NCP case, just with more moving pieces :) |
I've got some long-lived applications using bgapi, and they (far too) regularly have the internal thread die:
I presume this is related to the inherently unreliable nature of the serial channel, but that's unfixable...
Unfortunately, there's no event generated, or caught, so there doesn't seem to be any way for my application to handle this. It would be nice if it could all be safely caught and generate some system-level event, please...
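To make the request concrete, here is a minimal sketch of the general pattern being asked for: a worker thread whose `run()` catches any exception and posts it to a queue the application already reads from. This is not pybgapi code and not the workaround in the fork; `SupervisedThread` and the `"reader_died"` / `"reader_stopped"` event names are hypothetical.

```python
import queue
import threading


class SupervisedThread(threading.Thread):
    """Worker thread that reports its own death as an event (hypothetical)."""

    def __init__(self, work, events: queue.Queue, name=None):
        super().__init__(name=name, daemon=True)
        self._work = work        # callable doing the actual read/parse loop
        self._events = events    # queue the application consumes events from

    def run(self):
        try:
            self._work()
        except Exception as exc:
            # Surface the failure to the application instead of dying silently.
            self._events.put(("reader_died", exc))
        else:
            self._events.put(("reader_stopped", None))
```

The application's main loop can then treat `"reader_died"` like any other event: log it, reopen the connection, or exit so a supervisor restarts the process.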