ESP8266 Doesn't respond to ARP requests #6886
Comments
In ESPEasy I added a send Gratuitous ARP option to overcome this issue. (hide the symptoms actually) |
I can confirm that! (Fritzbox) Sent with GitHawk |
@TD-er cool, can you share the snippet of code for that? Maybe it is worth adding it directly to the core, because it is needed when using sleep. Thanks |
@ascillato Well it is already shared as you may know ;) I think this is where most of the magic happens: Some of the magic lies in how often I send it. The Gratuitous ARP packet is also sent right after the connection to WiFi is fully active (got IP event + a few 100 msec). This is still not a fix for the problem, as you can sometimes see when the first attempt to connect to a node after some time takes a few seconds. |
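For readers unfamiliar with the packet being discussed, here is a minimal sketch of what a gratuitous ARP frame contains. It is not the ESPEasy code referred to above; the names `buildGratuitousArp`, `MacAddress`, `Ipv4Address` and `kArpFrameSize` are hypothetical, and the request form (opcode 1) is an assumption, since implementations may also announce with a reply. The defining property is that the sender and target protocol addresses are both the node's own IP, and the frame is broadcast so that every host and switch can refresh its tables.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper types; not taken from ESPEasy or the ESP8266 core.
using MacAddress = std::array<std::uint8_t, 6>;
using Ipv4Address = std::array<std::uint8_t, 4>;

// 14-byte Ethernet header + 28-byte ARP payload (Ethernet/IPv4).
constexpr std::size_t kArpFrameSize = 42;

// Builds a gratuitous ARP request announcing `ip` as owned by `mac`.
std::array<std::uint8_t, kArpFrameSize> buildGratuitousArp(const MacAddress& mac,
                                                           const Ipv4Address& ip) {
    std::array<std::uint8_t, kArpFrameSize> frame{};
    std::size_t pos = 0;

    // Ethernet header: broadcast destination, our MAC as source, EtherType ARP.
    for (int i = 0; i < 6; ++i) frame[pos++] = 0xFF;
    for (std::uint8_t b : mac) frame[pos++] = b;
    frame[pos++] = 0x08;
    frame[pos++] = 0x06;

    // ARP payload.
    frame[pos++] = 0x00; frame[pos++] = 0x01;  // HTYPE: Ethernet
    frame[pos++] = 0x08; frame[pos++] = 0x00;  // PTYPE: IPv4
    frame[pos++] = 6;                          // HLEN: MAC address length
    frame[pos++] = 4;                          // PLEN: IPv4 address length
    frame[pos++] = 0x00; frame[pos++] = 0x01;  // OPER: request (assumed form)
    for (std::uint8_t b : mac) frame[pos++] = b;      // sender hardware address
    for (std::uint8_t b : ip) frame[pos++] = b;       // sender protocol address
    for (int i = 0; i < 6; ++i) frame[pos++] = 0x00;  // target hardware address (unknown)
    for (std::uint8_t b : ip) frame[pos++] = b;       // target protocol address == sender's
    return frame;
}
```

On a real node the frame would be handed to the network stack rather than built by hand; the sketch only illustrates why a single broadcast is enough to repopulate ARP caches that have dropped the entry.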
Great, thanks. That is simple to add, but IMHO this should be managed inside the core. What do you think? @d-a-v @devyte @earlephilhower |
Well, Gratuitous ARP is not a fix for the problem, it is to hide the symptoms. |
Searching a bit more, this issue has been discussed before but not added to the core. There are workarounds inside the projects. |
Exactly. But this only happens if using sleep. So, the SDK doesn't answer because the ARP request is made while the device is sleeping (request outside the DTIM time)? And the weird thing is that this issue does not happen with other routers. Just with MikroTik and Fritzbox, AFAIK |
#6484 made the FW much more stable, but this issue is still beyond our control. We can add a gratuitous ARP trigger API. I think only STA interface is relevant. |
That's right. With AP mode active the WiFi is not put to sleep. However, it would be a welcome addition to send out the Gratuitous ARP as soon as a connection is made. |
Have any of you been able to identify the low level cause of the issue? My guess is some interoperability issue with the power saving negotiation on these vendors. I plan on sniffing the WiFi later and seeing if I can spot anything of interest. |
I don't think it is limited to these brands (Fritzbox/MikroTik). But what I do find odd and still have no explanation for is this. This looks like the AP does try to send out the ping packet more often (or the ESP does receive it while dormant?) and packets that don't expect a reply like ARP or UDP will not be attempted again. |
Not to my knowledge
Please have a look at #2330 (I hope you'll not see me as evil for suggesting). #6889 is aimed at anyone concerned with this issue. |
My suspicion is that this is the sleep level. Up to SDK 3, the sleep level of the WiFi can't be set and is fixed at max. That level is not appropriate for sending packets to the ESP and then waiting for a response, but is rather meant for the ESP initiating communications. |
I got the feeling the level was not fixed, but somewhat dynamic, based on rules we don't know and can't control. Still having set the sleep mode to no sleep is also no guarantee it will remain awake. |
Unicast and Broadcast / Multicast traffic is handled differently by the Access Point as I understand it, which may explain this discrepancy. If I set Regarding the Gratuitous ARP suggestion, if this is a fault with the AP vendors then I think this is acceptable. If it's masking an issue within the ESP8266 core then perhaps it's not such a great idea. |
I changed the |
I've connected an ESP8266 to an open wifi network and done some packet captures with airodump-ng. I can see the ESP8266 notifying the AP it's going to sleep and periodically waking up and notifying the AP. However the pcap only shows a single frame for a broadcast and doesn't show a retransmission when the ESP8266 wakes up. I suspect this is down to airodump-ng or the parameters I'm using. If anyone familiar with low level Wifi troubleshooting could give me some hints that would be appreciated. |
I've been doing more testing and observed the same issue with missing ARP responses across other devices that are using Wifi Power Save features, namely Huawei P30 and a Samsung Galaxy 9 Android Phones. If some other Mikrotik users in this thread could test the theory with nping that would be great: As it stands this doesn't look like an issue with the ESP8266 Arduino Core so will wait for some further feedback and then close. |
Well it may be less of an issue on mobile phones, as they do normally not run services which must be reachable from the network. But since you're sniffing for ARP, can you see if those mobiles send out gratuitous ARP packets? |
Yeah, this is why I have not noticed before. But regardless, the AP should buffer the ARP requests and re-transmit them when the device wakes up and requests them. I do not see these Android devices sending Gratuitous ARPs. Interestingly, ESPHome devices are also sending Gratuitous ARPs, but I cannot see in their code where they are being sent. |
I've created a post on the Mikrotik user forum - https://forum.mikrotik.com/viewtopic.php?f=2&t=154613 As this doesn't appear to be an issue in the core I'll close it. Thanks all for your assistance and comments. In the meantime I think the Gratuitous ARP workaround @d-a-v has submitted in #6889 is probably the best solution for these misbehaving Access Points. |
"Should" as in "would be nice" or "according to some standard" ? |
According to some standard (Legacy Power Save / UAPSD ), although I will admit I am not an expert. And there are still unanswered questions like how long it should be buffered for before dropping.
I did see something that implied Aruba APs would respond to ARP requests on behalf of connected clients but I guess this is vendor specific. |
I was thinking about this issue with ARP for a very specific use case. For example, you have 2 access points and want to switch to "the other AP". So one second you are connected to AP1 and the switches in the network know where to route packets to for your ESP node. That's exactly what I am seeing in my network, when I intentionally switch to the other AP. Is there a way to "de-register" any existing ARP cache on the network? |
Just to quote myself. |
I'm not following. As soon as you send a DHCP request the switch will learn the MAC address. On the other hand, I don't think you'd send a DHCP request if you're roaming between APs with the same SSID. So you may be right that immediately after roaming the switch will assume you're connected to the old IP until a packet is sent. I think more enterprise level access points handle this stuff for you. |
When I switch AP, I do perform a disconnect and a connect to the other AP, so there will be a DHCP request. |
@TD-er |
Well I was thinking, what if we send out a gratuitous ARP packet with IP address 0.0.0.0 right before sending out a DHCP request (so right after we get a connection)? |
Ah right.
Is that what you tried ? |
Have not tried it with the commented-out code, but I can try for sure (will try this evening, have to pick up my daughter soon) |
This has been affecting me on Tasmota with a FreeBSD router (OPNsense/PFSense) as well. BSD has a 20 min ARP cache timeout, and if the entry is not refreshed it is removed. This is slightly different from Linux, which seems to keep the entry but mark it stale. It's pretty important for networking to have working ARP replies. That the WiFi is actively talking to other machines but not replying to an ARP request is a pretty big bug. Is there any chance of this getting fixed? Tasmota has a workaround to hide it, where you can send an ARP broadcast on a timer, but that doesn't fix it 100%. @TD-er did you get a chance to test that change? |
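The timer-based workaround mentioned above boils down to re-announcing at an interval shorter than the shortest ARP cache timeout on the network (20 minutes in the BSD case described). A minimal sketch of the timing check follows, assuming a 32-bit millisecond tick counter that wraps around like Arduino's `millis()`; the function name `gratuitousArpDue` and its parameters are hypothetical and not taken from Tasmota or the core.

```cpp
#include <cstdint>

// Returns true once at least `intervalMs` has elapsed since `lastSentMs`.
// Unsigned subtraction keeps the result correct across counter wraparound.
bool gratuitousArpDue(std::uint32_t nowMs, std::uint32_t lastSentMs,
                      std::uint32_t intervalMs) {
    return static_cast<std::uint32_t>(nowMs - lastSentMs) >= intervalMs;
}
```

A caller would check this in its main loop, send the announcement when it returns true, and then store the current tick as the new `lastSentMs`. As noted in the thread, this hides the symptom rather than fixing the missing ARP replies.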
Also, without a specific ARP entry, my AP (which is not my router) could not ping my Tasmota light even though it had a valid DHCP lease and an active connection to my MQTT broker |
Gratuitous arp should help to solve that. I don't know if it is enabled by default in tasmota.
Did you try with tcpdump to check these packets are really received ? |
The way I use Gratuitous Arp is a bit more elaborate compared to the fix offered here. About the 2nd issue you mentioned, that may be a different bug. |
A ping doesn't work if there's no arp reply. You can try with When the ESP isn't replying to ARP requests it's not in compliance with the RFCs. What could be done to fix it? It's not in a deep sleep or anything |
@beren12 said:
I thought this arp thing was already addressed since the ets_intr_lock thing. If that's not the case, then the official answer from us core maintainers here in this repo, pending some new hint that hasn't shown up in the already extensive investigation and work already done on this matter, is "reverse engineer Espressif's closed source libraries, starting with the wifi lib". The wifi lib is closed source and belongs to Espressif, who won't open source it. So without reverse engineering that code to gain visibility into what's happening, there is no solution. |
You can also try switching between SDK builds yourself to see if it makes a difference for your board. I suspect (as in no proof) there is a slight timing difference between nodes. Anyway, if the node is running normally (with or without sleep state specifically set), the power consumption will differ based on the load of the node. That does not sound very odd, but the ESP's power consumption doesn't really change a lot between computing pi or performing busy waits. The thing that really makes a difference is the WiFi module. What I think is happening... Devices entering sleep mode can notify the access point and thus force a higher DTIM interval for all connected devices. Apart from DTIM, there is another metric which may contribute to these issues. This is what I think may be factors in this issue. Edit: |
There is also |
Which (sadly) does not guarantee the power consumption will not go down after a while. |
My ESP8266 regularly stops responding to ARP requests. I raised this in #6873 but I was asked to troubleshoot with the Tasmota Devs first (Github Issue) and provide a MCVE example. Tasmota ruled out an issue with their code, but did identify that this may be an interoperability issue between the ESP8266 and Mikrotik access points.
Basic Infos
Platform
Problem Description
I have discovered that if wifi_set_sleep_type is light or modem, ARP requests will consistently fail. When set to none I reliably get responses to ARP requests. This appears to affect ESP8266 devices connected to Mikrotik Access Points.
ARP responses when sleep type set to none: (1 lost is acceptable IMO)
ARP responses when sleep type set to light:
ARP responses when sleep type set to modem:
MCVE Sketch