Regular hostByName via mDNS failure with Ticker #6650

Closed
6 tasks done
lbussy opened this issue Oct 17, 2019 · 13 comments
Comments

lbussy commented Oct 17, 2019

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it. (N/A)
  • I have filled out all fields below.

Platform

  • Hardware: ESP-12S
  • Core Version: SDK:2.2.1(cfd48f3)/Core:2.5.2=20502000/lwIP:STABLE-2_1_2_RELEASE/glue:1.1-7-g82abda3/BearSSL:a143020
  • Development Env: PlatformIO
  • Operating System: Windows

Settings in IDE

  • Module: Wemos D1 mini r2
  • Flash Mode: qio
  • Flash Size: 4Mb
  • lwip Variant: v2 Lower Memory
  • Reset Method: ck
  • Flash Frequency: 40Mhz
  • CPU Frequency: 80Mhz
  • Upload Using: SERIAL
  • Upload Speed: 460800

Problem Description

This is a repost of #6639 with updated verbiage and MCVE.

hostByName resolution fails regularly with -5 when called with Ticker using mDNS. When Ticker is attached with a 5 second interval, it fails every other time. When called with a 2 second interval, it fails every 5th time.

When using DNS/Internet addresses, I cannot reproduce the issue.

MCVE Sketch

#include <ESP8266mDNS.h>
#include <ESP8266WiFi.h>
#include <Ticker.h>
#include <Arduino.h>

void setup() {
    Serial.begin(74880);
    Serial.setDebugOutput(true);
    Serial.flush();
    WiFi.begin(F("Xxxxxxxx"), F("xxxxxxxx"));
    Serial.println();
    Serial.print(F("Waiting for connection."));
    while (WiFi.status() != WL_CONNECTED) {
        Serial.print(F("."));
        delay(500);
    }
    Serial.println();
    Serial.println(F("Connected."));
    Serial.print(F("DNS #1: "));
    Serial.print(WiFi.dnsIP().toString().c_str());
    Serial.print(F(", DNS #2: "));
    Serial.println(WiFi.dnsIP(1).toString().c_str());
}

void loop() {
    IPAddress resolvedIP;
    const char* host = "raspberrypi.local";
    if (!WiFi.hostByName(host, resolvedIP)) {
        Serial.print(F("(Loop) Host lookup failed for "));
        Serial.println(host);
    } else {
        Serial.print(F("(Loop) Host: "));
        Serial.print(host);
        Serial.print(", IP: ");
        Serial.println(resolvedIP.toString().c_str());
    }

    Ticker lookup;
    lookup.attach(5, [lookup]() {
        IPAddress resolvedIP;
        const char* host = "raspberrypi.local";
        if (!WiFi.hostByName(host, resolvedIP)) {
            Serial.print(F("(Ticker) Host lookup failed for "));
            Serial.println(host);
        } else {
            Serial.print(F("(Ticker) Host: "));
            Serial.print(host);
            Serial.print(", IP: ");
            Serial.println(resolvedIP.toString().c_str());
        }
    });

    while (true) {
        yield();
    }
}

Debug Messages

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)

load 0x4010f000, len 1384, room 16
tail 8
chksum 0x2d
csum 0x2d
v8b899c12
~ld

SDK:2.2.1(cfd48f3)/Core:2.5.2=20502000/lwIP:STABLE-2_1_2_RELEASE/glue:1.1-7-g82abda3/BearSSL:a143020
wifi evt: 2
scandone

Waiting for connection..scandone
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 7
cnt

connected with Xxxxxxxx, channel 8
dhcp client start...
wifi evt: 0
....ip:192.xxx.xxx.155,mask:255.255.255.0,gw:192.xxx.xxx.1
wifi evt: 3

Connected.
DNS #1: 1.1.1.1, DNS #2: 1.0.0.1
[hostByName] request IP for: raspberrypi.local
[hostByName] Host: raspberrypi.local IP: 192.xxx.xxx.131
(Loop) Host: raspberrypi.local, IP: 192.xxx.xxx.131
[hostByName] request IP for: raspberrypi.local
[hostByName] Host: raspberrypi.local IP: 192.xxx.xxx.131
(Ticker) Host: raspberrypi.local, IP: 192.xxx.xxx.131
pm open,type:2 0
[hostByName] request IP for: raspberrypi.local
[hostByName] Host: raspberrypi.local lookup error: -5!
(Ticker) Host lookup failed for raspberrypi.local
[hostByName] request IP for: raspberrypi.local
[hostByName] Host: raspberrypi.local IP: 192.xxx.xxx.131
(Ticker) Host: raspberrypi.local, IP: 192.xxx.xxx.131
[hostByName] request IP for: raspberrypi.local
[hostByName] Host: raspberrypi.local lookup error: -5!
(Ticker) Host lookup failed for raspberrypi.local

The cadence demonstrated here continues forever.

Author

lbussy commented Oct 17, 2019

Most strangely, I tried setting station mode with a short delay immediately following it and the cadence changed consistently from pass - fail - pass - fail to pass - pass - fail - pass - pass - fail.

The lines added immediately after connecting to wifi were:

    WiFi.mode(WIFI_STA);
    delay(200);

At this point, my brain exploded and I ran out of guesses.

Sorry about the close/open - I hit the wrong button.

@lbussy lbussy closed this as completed Oct 17, 2019
@lbussy lbussy reopened this Oct 17, 2019
Collaborator

devyte commented Oct 23, 2019

@lbussy AFAICT, what you're trying to do isn't supposed to work at all, i.e. WiFi.hostByName() isn't supposed to resolve the ".local" domain and should just fail. I have no idea why it works sometimes for you.

Author

lbussy commented Oct 23, 2019

@devyte I'm at a loss then - close this I suppose.

Can you tell me, before I go away and nurse my wounds, how I am supposed to do an mDNS lookup by name properly? The examples I see in the libs all seem to look for a service. The host I am seeking may not be advertising a service.

Searching the web, well, all sorts of answers and there's even a published lib for it - which does not work and causes a WDT reset.

Collaborator

mcspr commented Oct 23, 2019

Technically, it should work. When the built-in resolver sees .local, it sends the DNS request to the mDNS multicast address (also see #6613 and the example dig -p 5353 @224.0.0.251 something.local).
This looks like an issue with the Ticker usage and not something wrong with the Core per se. hostByName expects to be called from loop() or setup() and may sometimes call delay():


So it will do nothing (rather than trigger a crash) whenever the INPROGRESS condition is true.
edit: (I think I was mistaking delay with yield here)

Have you tried using Ticker::attach_scheduled() instead or calling in loop checking millis() / PolledTimeout?
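
A minimal sketch of the "call it in loop, checking millis()" option, assuming a 5 second interval and placeholder WiFi credentials. `IntervalGate` is a hypothetical helper, not part of the core, and nothing in this thread confirms that polling from loop() removes the -5 failures; it only keeps hostByName in the context it expects.

```cpp
#include <stdint.h>

#ifdef ARDUINO
#include <ESP8266WiFi.h>
#endif

// Hypothetical helper (not part of the core): true at most once per interval.
// Unsigned subtraction keeps the comparison correct across millis() rollover.
struct IntervalGate {
    uint32_t intervalMs;
    uint32_t lastMs;

    bool due(uint32_t nowMs) {
        if (nowMs - lastMs < intervalMs) {
            return false;
        }
        lastMs = nowMs;
        return true;
    }
};

#ifdef ARDUINO
static IntervalGate lookupGate = {5000, 0};

void setup() {
    Serial.begin(74880);
    WiFi.mode(WIFI_STA);
    WiFi.begin("your-ssid", "your-password");  // placeholders
    while (WiFi.status() != WL_CONNECTED) {
        delay(500);
    }
}

void loop() {
    // Return quickly unless the interval has elapsed.
    if (!lookupGate.due(millis())) {
        return;
    }
    IPAddress ip;
    const char* host = "raspberrypi.local";
    if (WiFi.hostByName(host, ip)) {
        Serial.print(host);
        Serial.print(F(" -> "));
        Serial.println(ip);
    } else {
        Serial.print(F("Lookup failed for "));
        Serial.println(host);
    }
}
#endif
```

Because loop() returns on every pass, the lookup runs in the main task rather than in the Ticker callback.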

Author

lbussy commented Oct 23, 2019

Thank you @mcspr for detailing that. Now I know I'm not crazy at least.

I have no doubt that my issue may lie in/with Ticker(). I'll try Ticker::attach_scheduled() and see how that impacts my sketch. I'd never considered it.

Incidentally, I ended up using a local cache to get by the sporadic host lookup issues. At least I can get past this for now and will revisit it later.
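
The local cache is not shown in this thread. One way such a workaround could look, as a sketch only: keep the last address that resolved and fall back to it when a lookup fails. `LastGoodCache` and `resolveWithCache` are hypothetical names, and the cache never expires an entry, which is an assumption that only suits hosts with stable addresses.

```cpp
#ifdef ARDUINO
#include <ESP8266WiFi.h>
#endif

// Hypothetical last-known-good cache; Addr is IPAddress on the device.
template <typename Addr>
class LastGoodCache {
public:
    LastGoodCache() : addr_(), valid_(false) {}

    // Record one lookup attempt. A failed attempt keeps the previous entry.
    void record(bool ok, const Addr& fresh) {
        if (ok) {
            addr_ = fresh;
            valid_ = true;
        }
    }

    bool valid() const { return valid_; }
    const Addr& get() const { return addr_; }

private:
    Addr addr_;
    bool valid_;
};

#ifdef ARDUINO
static LastGoodCache<IPAddress> hostCache;

// Returns true if a fresh or previously cached address is available.
bool resolveWithCache(const char* host, IPAddress& out) {
    IPAddress fresh;
    bool ok = WiFi.hostByName(host, fresh) == 1;
    hostCache.record(ok, fresh);
    if (!hostCache.valid()) {
        return false;
    }
    out = hostCache.get();
    return true;
}
#endif
```

This hides the intermittent failure rather than fixing it, so a host that changes address is only picked up on the next successful lookup.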

Collaborator

devyte commented Oct 23, 2019

@mcspr you're right. The lwip mdns is rather broken, so that internal mdns resolver was supposed to be disabled.

Author

lbussy commented Oct 23, 2019

Have you tried using Ticker::attach_scheduled() instead or calling in loop checking millis() / PolledTimeout?

Tried that as a straight replacement and it seemed to fail to fire. The way I understand it, this would change it to an interrupt-driven event?

Anyway, I don't have physical access to my test rig right now and having blown up the connection somehow I need to wait till later to try to debug.

Collaborator

mcspr commented Oct 23, 2019

@devyte commented 1 hour ago
The lwip mdns is rather broken, so that internal mdns resolver was supposed to be disabled.

Not to go off topic, but are there any specific problems there? It is not lwip-mdns lib though, just a basic dns resolver sending request to 224.0.0.251:5353 instead of configured dns server and then receiving dns response back.

@lbussy commented 1 hour ago
The way I understand it, this would change it to an interrupt-driven event?

It's a quite short alias for attach + schedule_function:

attach(seconds, [callback]() { schedule_function(callback); });

Where schedule_function will run the function some time after loop() function finishes:
extern "C" void __loop_end (void)
{
    run_scheduled_functions();
    run_scheduled_recurrent_functions();
}
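
To make the ordering concrete, here is a toy host-side model of that behaviour. It is not the core's implementation: `ToyScheduler` and `runOnePass` are hypothetical names, and the only thing modelled is that a queued function runs after the user's loop body returns.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for the core's scheduler; all names here are hypothetical.
struct ToyScheduler {
    std::vector<std::function<void()>> pending;

    // Plays the role of schedule_function(): queue now, run later.
    void schedule(std::function<void()> fn) {
        pending.push_back(std::move(fn));
    }

    // Plays the role of run_scheduled_functions(): drain what was queued.
    // Functions queued while draining wait for the next pass.
    int runPending() {
        std::vector<std::function<void()>> batch;
        batch.swap(pending);
        for (auto& fn : batch) {
            fn();
        }
        return static_cast<int>(batch.size());
    }
};

// One pass of the main task: the user's loop body, then the end-of-loop hook.
template <typename LoopFn>
int runOnePass(ToyScheduler& scheduler, LoopFn userLoop) {
    userLoop();
    return scheduler.runPending();
}
```

In this model a loop body that never returns means `runPending()` is never reached, so nothing queued ever fires.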

Author

lbussy commented Oct 23, 2019

It's a quite short alias for attach + schedule_function:

I did look at that and, as you say, it seems simple enough. However, it does not seem to ever fire the scheduled task:

void loop() {
    Ticker test;
    test.attach_scheduled(5, [test](){Serial.print(F("."));});
    while (true) { yield(); }
}

Where schedule_function will run the function some time after loop() function finishes:

You don't mean "finished" as in returned from loop() do you? I may not be following there.

Collaborator

devyte commented Oct 23, 2019

@lbussy Scheduled functions fire in between calls to loop. You're not letting that happen, because you're holding the loop with a busy loop that calls yield.
There was a PR that allows scheduled functions to fire on yield in addition to in between loops. I think the issues have been resolved and fixes merged, in which case it would work only with latest git and not 2.5.2.
The correct way would be to move the Ticker declaration outside of loop() and the attach_scheduled to the setup.
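
A minimal sketch of that arrangement, assuming placeholder WiFi credentials and the same 5 second interval as the MCVE. `LookupTally` and `lookupOnce` are hypothetical names added only to make the pass/fail cadence visible in the log.

```cpp
#ifdef ARDUINO
#include <ESP8266WiFi.h>
#include <Ticker.h>
#endif

// Hypothetical pass/fail counter so the cadence shows up in the serial log.
struct LookupTally {
    unsigned ok;
    unsigned failed;

    void record(bool success) {
        if (success) {
            ++ok;
        } else {
            ++failed;
        }
    }
};

#ifdef ARDUINO
static Ticker lookupTicker;           // global: outlives any single loop() pass
static LookupTally tally = {0, 0};

// Runs in the main task between loop() passes, not inside the timer callback.
static void lookupOnce() {
    IPAddress ip;
    const char* host = "raspberrypi.local";
    bool success = WiFi.hostByName(host, ip) == 1;
    tally.record(success);
    Serial.printf("(Scheduled) %s: %s (ok=%u, failed=%u)\n", host,
                  success ? ip.toString().c_str() : "lookup failed",
                  tally.ok, tally.failed);
}

void setup() {
    Serial.begin(74880);
    WiFi.mode(WIFI_STA);
    WiFi.begin("your-ssid", "your-password");  // placeholders
    while (WiFi.status() != WL_CONNECTED) {
        delay(500);
    }
    lookupTicker.attach_scheduled(5, lookupOnce);
}

void loop() {
    // Return promptly so queued functions can run between passes.
}
#endif
```

The two differences from the MCVE are that the Ticker is no longer a local inside loop() and that loop() returns instead of spinning on yield().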

Author

lbussy commented Oct 23, 2019

Aha! Okay, that makes much more sense now - thank you. Tested with the TickerBasic.ino moved to Ticker::attach_scheduled() and it operated as it should so I'll play around with how I am doing my work and see if it can provide some relief.

Finding a given PR in this project is daunting, but there are reasonable ways to get around it for now so I'll let it come out in the wash.

Collaborator

d-a-v commented Oct 23, 2019

There was a PR that allows scheduled functions to fire on yield in addition to in between loops

Scheduled functions are not called from yield() because they can themselves call yield(); they are only called at the next loop().

Recurrent scheduled functions, which must not call yield(), are called from yield().

Collaborator

devyte commented Oct 24, 2019

@d-a-v I found that flaw, and it keeps tripping me up 😆
@lbussy glad it's working for you. Closing.

@devyte devyte closed this as completed Oct 24, 2019