Almost (not quite) all devices suddenly lost connectivity #19071

Closed
blacknell opened this issue Sep 25, 2023 · 8 comments
Labels
problem Something isn't working stale Stale issues

Comments

@blacknell

blacknell commented Sep 25, 2023

What happened?

I've had my zigbee setup running pretty well without incident for about 2 years when suddenly I noticed almost all devices were reported offline.

After some digging I found this error.

warn  2023-09-25 08:21:19: Failed to ping 'router_zigbee_downstairs' (attempt 1/1, Read 0x00124b0029e84b54/8 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'MAC channel access failure' (225)))

This pointed to probable Wi-Fi interference. I was using Zigbee channel 15, which partially overlaps Wi-Fi channels 1 and 6. I fixed my Wi-Fi access points to channel 11 (which shouldn't overlap at all). It all worked again for a few days.
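
For anyone checking their own channel plan, here is a minimal sketch of the frequency arithmetic behind that reasoning. The centre-frequency formulas are the standard 2.4 GHz channel plans; the bandwidth constants (about 2 MHz for Zigbee, 22 MHz for classic Wi-Fi) and the helper names are assumptions made for illustration. Real transmitters leak energy beyond these nominal edges, so a gap near zero should still be read as likely interference.

```python
# Nominal channel widths; both values are simplifying assumptions.
ZIGBEE_BW_MHZ = 2   # approximate occupied bandwidth of one 802.15.4 channel
WIFI_BW_MHZ = 22    # classic 802.11b channel width (OFDM modes use 20 MHz)


def zigbee_center_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz Zigbee channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11-26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz Wi-Fi channel (1-13 only)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers Wi-Fi channels 1-13")
    return 2407 + 5 * channel


def edge_gap_mhz(zigbee_channel: int, wifi_channel: int) -> float:
    """Gap between the nominal band edges; zero or negative means overlap."""
    separation = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return separation - (ZIGBEE_BW_MHZ + WIFI_BW_MHZ) / 2


if __name__ == "__main__":
    for wifi in (1, 6, 11, 13):
        print(f"Zigbee 15 vs Wi-Fi {wifi}: gap {edge_gap_mhz(15, wifi):+.0f} MHz")
```

Under these assumptions Zigbee channel 15 (2425 MHz) sits within 1 MHz of the nominal edges of Wi-Fi channels 1 and 6, and 25 MHz clear of channel 11.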

Then it stopped once more. This time I tried wifi channel 13 (no luck).

Then I switched off all the 2.4GHz radios. Still no luck. I should add that wifi signals from neighbours are very weak.

What did you expect to happen?

I expect to see all my devices online.

Instead most but not all are suddenly now constantly offline.

Weirdly, some are showing online in the frontend (one of these is a powered device that can route, the others are batteries). When I try to switch on/off, it doesn't respond and I get front end error messages.

I don't think this is channel interference (it's happening with wifi OFF). It could be a hardware failure but I have no clue how to validate this.

Would really really appreciate some help from @Koenkk and the community

How to reproduce it (minimal and precise)

Difficult, to say the least. It happens 100% of the time on my setup, which is as follows:

I have one coordinator (of course), 3 dedicated routers (distributed across the house), 15 powered devices (not all can route) and 8 battery devices. The coordinator and dedicated routers are all Sonoff Zigbee 3.0 USB Dongle Plus running the zStack3x0 firmware from @Koenkk, versions 20230507 (coordinator) and 20221102 (routers).

The LQIs were all pretty high when it was working.

I keep all the add-ons and firmware up to date (the last change was to 1.33 about 3 weeks ago).

Zigbee2MQTT version

1.33.0-1

Adapter firmware version

20230507

Adapter

Sonoff Zigbee 3.0 USB Dongle Plus

Debug log

log-20230925.txt

@blacknell blacknell added the problem Something isn't working label Sep 25, 2023
@blacknell blacknell changed the title from "Almost (not all) devices suddenly lost connectivity" to "Almost (not quite) all devices suddenly lost connectivity" Sep 25, 2023
@Koenkk
Owner

Koenkk commented Sep 25, 2023

'MAC channel access failure' (225) is indeed due to interference, but note that it is not only Wi-Fi. It can also be Bluetooth or proprietary devices: https://www.zigbee2mqtt.io/advanced/zigbee/02_improve_network_range_and_stability.html#interference-from-other-2-4-ghz-devices

It can also be due to interference from other devices or from the computer itself.

Note there is nothing I can do from z2m to fix this error.

@Fabiancrg

I have the same problem; it started after I upgraded to 1.33 and updated the firmware to 20230507 on my Sonoff ZDongle-P.
After some days, all routers suddenly go offline and I have to restart Z2M to get them back online.
I also checked the Wi-Fi channel setup and made sure my Zigbee channel was free, and it is, so I don't really understand why this is happening.

I have a second setup at a different location where I don't have any issues; the only difference is the adapter, which is a ZZH instead of the Sonoff.

Also, I don't get the MAC error but these two on all routers:
Failed to PING with message Timeout - 58034 - 1 - 189 - 0 - 1 after 10000ms or SRSP - AF - dataRequest after 6000ms

@blacknell
Author

blacknell commented Sep 26, 2023

> 'MAC channel access failure' (225) is indeed due to interference, but note that it is not only Wi-Fi. It can also be Bluetooth or proprietary devices: https://www.zigbee2mqtt.io/advanced/zigbee/02_improve_network_range_and_stability.html#interference-from-other-2-4-ghz-devices
>
> It can also be due to interference from other devices or from the computer itself.
>
> Note there is nothing I can do from z2m to fix this error.

@Koenkk - So I moved the dongle to somewhere else in the room and I am getting connectivity now. I only moved it by about 1 metre, and it's on a 2 metre extension lead, so the dongle itself is at least 1 metre from the nearest computer. I have Tx power set to 20 on the 4 routers/coordinator.

Most, but not all, the devices are showing green again but...

Several of them are showing green but are not working. For example, my light_landing_stairs is showing green but refuses to switch when commanded. However, its on/off state is reported correctly. How weird is that?!

Physically operating the switch

debug 2023-09-27 13:55:52: Received Zigbee message from 'light_landing_stairs', type 'attributeReport', cluster 'genOnOff', data '{"onOff":0}' from endpoint 1 with groupID 0
info  2023-09-27 13:55:52: MQTT publish: topic 'zigbee2mqtt/light_landing_stairs', payload '{"last_seen":"2023-09-27T13:55:52+01:00","linkquality":109,"power_on_behavior":"previous","state":"OFF","update":{"installed_version":-1,"latest_version":-1,"state":null},"update_available":null}'
debug 2023-09-27 13:55:54: Received Zigbee message from 'light_landing_stairs', type 'attributeReport', cluster 'genOnOff', data '{"onOff":1}' from endpoint 1 with groupID 0
info  2023-09-27 13:55:54: MQTT publish: topic 'zigbee2mqtt/light_landing_stairs', payload '{"last_seen":"2023-09-27T13:55:54+01:00","linkquality":109,"power_on_behavior":"previous","state":"ON","update":{"installed_version":-1,"latest_version":-1,"state":null},"update_available":null}'

Operating switch via HA

error 2023-09-27 13:53:11: Publish 'set' 'state' to 'light_landing_stairs' failed: 'Error: Command 0x003c84fffece3a80/1 genOnOff.off({}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205))'
debug 2023-09-27 13:53:11: Error: Command 0x003c84fffece3a80/1 genOnOff.off({}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205))
    at ZStackAdapter.sendZclFrameToEndpointInternal (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/zStackAdapter.ts:415:23)
    at Queue.executeNext (/app/node_modules/zigbee-herdsman/src/utils/queue.ts:32:32)

Any ideas how to fix this? Re-pairing is a big inconvenience because I have to undo all the switch plates.
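
To see which failure dominates across a whole log file, the error lines above can be tallied with a short script. This is only a sketch: the regular expressions are inferred from the log lines quoted in this thread rather than from any documented log format, the function name is hypothetical, and the sample lines are abbreviated copies of the ones above (the elided parts are marked with `...`). Only `warn` and `error` lines are counted, so the `debug` repeat of the same failure is not counted twice.

```python
import re
from collections import Counter

# Start of a log line as seen in this thread, e.g. "warn  2023-09-25 08:21:19: ".
LINE_RE = re.compile(r"^(\w+)\s+\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}: ")
# Tail of a failed data request, e.g. "... error: 'No network route' (205)".
ERROR_RE = re.compile(r"Data request failed with error: '([^']+)' \((\d+)\)")


def tally_errors(lines):
    """Count (code, message) pairs on warn/error lines only."""
    counts = Counter()
    for line in lines:
        header = LINE_RE.match(line)
        if header is None or header.group(1) not in ("warn", "error"):
            continue  # skips debug lines and stack-trace continuation lines
        found = ERROR_RE.search(line)
        if found:
            counts[(int(found.group(2)), found.group(1))] += 1
    return counts


# Abbreviated copies of lines quoted in this thread.
SAMPLE = [
    "warn  2023-09-25 08:21:19: Failed to ping 'router_zigbee_downstairs' (attempt 1/1, ... "
    "(Data request failed with error: 'MAC channel access failure' (225)))",
    "error 2023-09-27 13:53:11: Publish 'set' 'state' to 'light_landing_stairs' failed: "
    "'Error: Command ... failed (Data request failed with error: 'No network route' (205))'",
    "debug 2023-09-27 13:53:11: Error: Command ... failed "
    "(Data request failed with error: 'No network route' (205))",
    "    at Queue.executeNext (/app/node_modules/zigbee-herdsman/src/utils/queue.ts:32:32)",
]

if __name__ == "__main__":
    for (code, message), count in sorted(tally_errors(SAMPLE).items()):
        print(f"{code} {message}: {count}")
```

In this thread, 225 was attributed to interference and 205 to the lack of a route to the device, so the split between the two gives a rough hint of which problem is in play.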

@Koenkk
Owner

Koenkk commented Sep 27, 2023

I see you are running the 20230507 firmware which is not considered stable, can you try 20230923?

@blacknell
Author

@Koenkk - I updated to 20230923. I didn't know whether it should be launch coordinator or other coordinator. I couldn't find out what this meant. So I went with launch coordinator.

I found a missing router (when I requested coordinator check) which I removed from the network. Now I get a clean reply {"data":{"missing_routers":[]},"status":"ok"}. I restarted zigbee2mqtt.
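
For reference, a reply like that can be checked in a script rather than by eye. This is a minimal sketch that relies only on the reply shape quoted above; the function name is hypothetical, and the structure of individual `missing_routers` entries is not shown in this thread, so they are returned untouched.

```python
import json


def missing_routers(payload: str) -> list:
    """Return the missing_routers list from a coordinator check reply."""
    reply = json.loads(payload)
    if reply.get("status") != "ok":
        raise RuntimeError(f"coordinator check did not succeed: {reply}")
    # Entries are passed through as-is; their fields are not assumed here.
    return reply.get("data", {}).get("missing_routers", [])


# The clean reply quoted above.
CLEAN_REPLY = '{"data":{"missing_routers":[]},"status":"ok"}'

if __name__ == "__main__":
    routers = missing_routers(CLEAN_REPLY)
    print("no missing routers" if not routers else f"missing: {routers}")
```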

The same 3 devices are showing their on/off status correct but are not controllable.

@Fabiancrg

Hi
> @Koenkk - I updated to 20230923. I didn't know whether it should be launch coordinator or other coordinator. I couldn't find out what this meant. So I went with launch coordinator.

As you have the Sonoff ZDongle-P, the firmware is the CC1352P2_CC2652P_launchpad_coordinator*

@Koenkk
Owner

Koenkk commented Sep 28, 2023

@blacknell so it seems the device is not reachable and no other device has a route to it.

  • Is it close to the coordinator? (could it be the direct parent)
  • If not, I think the only solution is to re-pair it.

@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale Stale issues label Oct 29, 2023