20221102 zstack 3.x.0 coordinator build throws NWK_TABLE_FULL after ~7 days #402

Closed
sjorge opened this issue Dec 5, 2022 · 11 comments

@sjorge
Contributor

sjorge commented Dec 5, 2022

This has happened three times now: after about a week, some devices drop off the network (and sometimes even vanish from the database if not caught quickly).

It seems to be caused by NWK_TABLE_FULL errors. I usually power-cycle my lights in blocks at the breaker to get them back.
When they rejoin, I sometimes (but not for every bulb) get:

error 2022-12-05 20:23:04: Error: ConfigureReporting 0x680ae2fffe11ed6f/1 lightingColorCtrl([{"attribute":"currentY","minimumReportInterval":3,"maximumReportInterval":3600,"reportableChange":1}], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SREQ '--> ZDO - extRouteDisc - {"dstAddr":20051,"options":0,"radius":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))

When the devices go offline on their own, it's just an availability ping failure with no other error, and the devices that are still online keep working fine for a while.

Unplugging the coordinator stick for a few minutes and plugging it back in once the network is mostly back seems to help.

I'm using a zzhp and my mesh is rather dense with a lot of routers, but I am still below 100 devices (85). I had over 100 before without issues on the same stick.

While the devices are offline, buttons controlling the unavailable devices usually still work fine. So I think it's just the coordinator running out of space and not having a route to the device.

image

I did notice that on this firmware the coordinator has way more lines attached than before. Previously it would have about 8 lines to routers, and they would mesh from there.

Sadly the frontend can't show the source routing map, and it's too big to draw manually with Graphviz :(

@sjorge
Contributor Author

sjorge commented Dec 5, 2022

Attached is the Graphviz data for the map, but I never managed to get it to draw: it runs out of memory after exhausting 32G.

map.dot.txt

Edit: I did manage to render it with 64G of memory:
https://drive.google.com/file/d/1H53a1fjodf3NlWzhUE9D4t4Xt4af2_ck/view

Edit 2: this is a fresh map with the 20220219 firmware. It seems to have fewer connections from the coordinator; I wonder if that is also why I was able to render this one 🤔

@sjorge
Contributor Author

sjorge commented Dec 5, 2022

Going to revert to the previous firmware, CC1352P2_CC2652P_other_coordinator_20220219. The issue seems similar enough to #383, which mentioned 20220219 as the last known good, and that was also the one I was running before the issue started.

@Koenkk
Owner

Koenkk commented Dec 6, 2022

Did the issue also occur with 20220219?

@sjorge
Contributor Author

sjorge commented Dec 6, 2022

I don't remember it happening before upgrading, and that was the newest firmware I had in my downloads folder before 20221102, so I flashed that one again yesterday.

I guess if the mesh stays up for more than a week we’ll know.

@ellnic

ellnic commented Dec 7, 2022

I've just lost a Hue motion sensor after 5 days on 20221102. Haven't had any issues on 20220219 since removing Ikea battery powered devices. I'll stay on 20221102 for the time being to see if it replicates.

Edit: I've just checked the logs and I see no mention of NWK_TABLE_FULL so whatever threw my motion sensor isn't the same as above.
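Checking a log for this status can be scripted. What follows is a minimal sketch, not part of Zigbee2MQTT: the function name `find_status_errors` and both regular expressions are assumptions, written against the single error line quoted earlier in this thread (abbreviated here as `SAMPLE`). Other log formats may need different patterns.

```python
import re

# Status text as it appears in the error quoted in this thread, e.g.
# "failed with status '(0xc7: NWK_TABLE_FULL)'". Pattern is an assumption.
STATUS_RE = re.compile(r"failed with status '\(0x[0-9a-fA-F]{2}: ([A-Z_]+)\)'")
# Device address plus endpoint, e.g. "0x680ae2fffe11ed6f/1".
DEVICE_RE = re.compile(r"\b(0x[0-9a-fA-F]{16})/\d+")
# Timestamp, e.g. "2022-12-05 20:23:04".
TIME_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

# Abbreviated copy of the log line from the first comment.
SAMPLE = (
    "error 2022-12-05 20:23:04: Error: ConfigureReporting 0x680ae2fffe11ed6f/1 "
    "lightingColorCtrl(...) failed (SREQ '--> ZDO - extRouteDisc - ...' "
    "failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))"
)


def find_status_errors(lines, status="NWK_TABLE_FULL"):
    """Return (timestamp, device) pairs for lines that failed with `status`.

    Either element is None when it cannot be found in the line.
    """
    hits = []
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None or match.group(1) != status:
            continue
        when = TIME_RE.search(line)
        device = DEVICE_RE.search(line)
        hits.append((
            when.group(0) if when else None,
            device.group(1) if device else None,
        ))
    return hits
```

To scan a whole log, pass an open file object as `lines`. An empty result means the log contains no line matching this status pattern.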

@sjorge
Contributor Author

sjorge commented Dec 10, 2022

So far it seems stable at 4 days with the old firmware. I had to reboot the node, so it has not made it to ~7 yet.

@sjorge
Contributor Author

sjorge commented Dec 13, 2022

Everything is still good on 20220219, so I guess I'll be sticking to this one for a bit longer.

@Koenkk
Owner

Koenkk commented Dec 13, 2022

I'll try to publish a new fw this week, see #383 (comment)

@Jabe

Jabe commented Dec 14, 2022

Same issue with 20221102: NWK_TABLE_FULL after a good number of days. Lost 3 devices. One I was able to rejoin without problems, but the other two only after downgrading to 20220219. I have 50 routers and 33 end devices, so I will try the new version when it's out!

@Koenkk
Owner

Koenkk commented Dec 14, 2022

Try with 20221214: #383 (comment)

I'll close this thread, let's continue in #383

Koenkk closed this as completed Dec 14, 2022
@Koenkk
Owner

Koenkk commented Apr 1, 2023

Update: let's continue in #439

Repository owner locked as resolved and limited conversation to collaborators Apr 1, 2023