-
-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESP-NOW sync (including audioreactive) with slave search for master's channel #4066
Conversation
- attempt at solving #4063 - sync master transmits beacon/heartbeat every 5s - slave hops WiFi channels until master is heard - master can be connected to WiFi, slave cannot
@DedeHai & @SteveEisner you are welcome to review. |
ESPNow is really thrifty with bytes, it doesn't use the antenna for very long and there's no danger of flooding available bandwidth - especially with a feature someone has to to turn on. Why not set your beacon ping to 1 per sec, or even less? Then the remotes can scan channels more quickly than one per 10sec. At 10sec, it could take 2+ minutes to find & sync (worst case.) Also, when the beacon disappears (device crashes or leaves the vicinity) the most likely scenario is that it's going to return on the same channel, right? Once a remote detects the beacon it should be very biased to listen on that channel more than any other. Imagine the scenario where someone unplugs or reflashes the master and plugs it back in a bit more than 10 seconds later. We wouldn't want all the remotes to go hunting through channels for 130 seconds before they sync again. We can probably design this to explore other channels while mostly paying attention to the same one they were using, in the case where a sync had been established. |
@@ -504,6 +504,8 @@ WLED_GLOBAL byte statusESPNow _INIT(ESP_NOW_STATE_UNINIT); // state of ES | |||
WLED_GLOBAL bool useESPNowSync _INIT(false); // use ESP-NOW wireless technology for sync | |||
WLED_GLOBAL char linked_remote[13] _INIT(""); // MAC of ESP-NOW remote (Wiz Mote) | |||
WLED_GLOBAL char last_signal_src[13] _INIT(""); // last seen ESP-NOW sender | |||
WLED_GLOBAL byte channelESPNow _INIT(1); // last channel used when searching for master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per my comment I would suggest having a last_signal_channel
and when it's non-zero, change the scanning algorithm a bit.
wled00/wled.cpp
Outdated
// already established connection, send ESP-NOW beacon every 5s if we are in sync mode (AKA master device) | ||
if (useESPNowSync && statusESPNow == ESP_NOW_STATE_ON && sendNotificationsRT && now - scanESPNow > 5000) { | ||
uint8_t buffer[12]; | ||
memcpy_P(buffer, PSTR("WLED MASTER."), 12); // use PROGMEM for string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although a string like this is easy to debug, it might be better to send a packet of information. While there's probably not much info you need now, you could define a fixed-length struct EspNowBeacon
to which info could be added later. You could have a bunch of recognizable header bytes in it, but save a bit for data.
For instance, it will be useful to have a byte for the actual channel ID in the packet of info. Why?
- maybe someday the master can send beacons to other channels (the other design we talked about)
- remotes can retransmit the beacon signal - take advantage of the fact that remotes don't have to worry about dropping wifi, they can be re-broadcasters that go to other channels temporarily to help "spread the word"
- if the master wants to move all of its remotes to a new channel when the wifi channel setting is changed
You could also store a message version in the packet.
Also can use sizeof(EspNowBeacon)
instead of 12
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
I forced this POC to get some feedback.
@@ -920,6 +918,18 @@ void WLED::initInterfaces() | |||
#endif | |||
interfacesInited = true; | |||
wasConnected = true; | |||
|
|||
#ifndef WLED_DISABLE_ESPNOW |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this new code should go before interfacesInited = true;
above?
and that there could be a common initialization function instead of the 3 copies. Could both the one in initAP
and this one use quickEspNow.begin(CURRENT_WIFI_CHANNEL)
? while the one in handleConnection
passes a channel parameter.
wled00/udp.cpp
Outdated
// is received packet a master's heartbeat | ||
if (len == 12 && strncmp_P((const char *)data, PSTR("WLED MASTER."), 12) == 0) { | ||
DEBUG_PRINTLN(F("ESP-NOW master heartbeat heard.")); | ||
scanESPNow = millis(); // disable scanning for next 10s and do no futrher processing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the scenario I described in previous comment, you could have this switch to the channel contained in the beacon packet. (no action necessary if it's the current channel)
wled00/wled.cpp
Outdated
DEBUG_PRINTLN(F("ESP-NOW initing in unconnected STA mode.")); | ||
bool espNowOK = quickEspNow.begin(channelESPNow); // use changeable channel STA mode | ||
statusESPNow = espNowOK ? ESP_NOW_STATE_ON : ESP_NOW_STATE_ERROR; | ||
scanESPNow = now; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depending on the seeking method in the next if
, you may not need to set scanESPNow
at this point or above when the wifi times out.
I understand that while ESPNow was off, you weren't scanning, but it seems like this code is going to run immediately after that timeout anyway --
previous if
:
if (apActive && apBehavior == AP_BEHAVIOR_TEMPORARY && now > WLED_AP_TIMEOUT && stac == 0)
sets apActive = false
this one:
if (!apActive && apBehavior == AP_BEHAVIOR_TEMPORARY && enableESPNow && wifiConfigured && !sendNotificationsRT)
presumably all these are now true if the last if
also ran, so any gaps in ESPNow will be temporary?
wled00/wled.cpp
Outdated
scanESPNow = now; | ||
} | ||
if (statusESPNow == ESP_NOW_STATE_ON && now - scanESPNow > 10000) { | ||
if (++channelESPNow > 13) channelESPNow = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking ahead...
today you store:
- time of last scan
- current scanning channel
if instead you store:
- time of last beacon ping
- last beacon ping channel
you could make this a function f(now - time of last ping) -> channel to try
To be equivalent to a simple increment that would just be (last ping channel + (t / 10000)) % 14 + 1
Then in this function, you always recalculate the channel to try, and if it's != QuickEspNow.channel, you switch to that channel.
This would introduce the ability to change the seek pattern easily in one function, for instance in the case where we are seeking but also want to listen to the last beacon channel more often, you could split time between an incrementing seek channel and the master channel.
@SteveEisner thank you for thorough review and useful feedback. I can only agree with you on every point you provided. This POC was a product of 2h work so it is bound to be flawed. Still, it should show the basis of how to implement "searching for master" that does not require a lot of new code. 😄 Indeed I was thinking of using slaves as relays but that is a bit more involved as current sync message may span several ESP-NOW packets and each needs to be re-broadcast (possibly on several channels). In such case each packet would also need to carry master's signature so that slaves could recognise it. As for 5s heatbeat, it was chosen random as I really have no feeling how much stress ESP-NOW causes. If it is safe to broadcast heartbeat every second then it can be done so. Having master's channel embedded in heartbeat would only make sense if slaves would retransmit heartbeat on all channels IMO. I see no real benefit for now. I do agree, though, that additional info should be added into heartbeat packet (including channel). FYI the version of QuickEspNow I did check with had a limitation of only 2 messages in the queue (one transmitting and one waiting). If you used |
No worries! I expected it to be simple given your comments in discord. Reviewing it as a prototype :)
I was only thinking about them possibly being relays for the beacon, rather than the full sync data. Since it is "expensive" for the master to change channels (might kill wifi, etc) and "cheap" for the remotes, it seems like once they're connected, they could every once in a while switch to another channel and broadcast the beacon, on behalf of the master. Then switch back to listen for sync packets. Having more beacons get sent would mean other remotes would find the beacon faster, but they would have to follow the beacon to the proper channel (not just stop scanning) The downside to this is that they might miss sync packets while they're switched away. I'm imagining this is a pretty quick operation though (espnow is very fast)
I think it is - ESPNow has a 1mb/s theoretical limit, and these packets will be < 30 bytes. So no real impact on the network. And in my brach I don't detect any lag or framerate drag while doing multiple ESPnow sends per second. Channel switching, I'm not sure about.
My various ideas about the channel had to do with slightly more complex operations. For instance the relay I mentioned above. Or when the master wants to send its remotes to another channel - it could send a beacon on its current channel, directing them to the new channel. This would be useful at times when the master is changing its setting & doesn't want to lose the remotes.
I believe this, but sending is very fast and you're not likely to overwrite the queue unless you're sending very quickly or the network is very noisy. But since ESPNow is an un-acknowledged send, you'll need to assume that the message might not make it to the remotes for a variety of reasons. In my branch I have the followers rebroadcast the leader at random times. This makes them all more noisy but has eliminated message loss. Unacknowledged isn't any different than UDP, but the effect of a missed beacon is worse than a single missed UDP sync packet. If a remote misses the beacon it might have to cycle through 13 more channels before it listens again. So I figured the best thing to do is to "flood" the channel with beacons, preferably with the help of remotes. |
To cool off from legal issues, I've spent yesterday chasing the idea presented by @ChuckMash on Discord: "Use slave to ping master (unicast) on every channel and use send callback function to check for successful delivery and so determine master's channel." While the above should work in theory I was unsuccessful in receiving any acknowledgement that master received a ping. While investigating why I noticed that WiFi channel used by slave changed very often and had to be reset on each unicast. Thinking of the reasons it may be that while the slave is in STA mode and is periodically trying to connect to WiFi it scans all channels where a possible known WiFi may be. This renders the unicast method unusable and fixing channel in STA mode impossible. I noticed another oddity while slave was in (temporary) AP mode. Even though the AP channel selected was 1 it did occasionally receive a beacon/heartbeat from master (mine used channel 11) so I tried to set I will push new code later for you to try out and possibly spot an error on my part. BTW connection (and re-connection) logic is convoluted beyond sanity. It would benefit if someone could make it simpler (I have difficulty following execution even after 2 years of trying to understand it; most comments in that part of code are mine while I was trying to decipher the logic behind). |
It may be worth pursuing raw ESP-NOW API instead of QuickEspNow library as WLED's use of ESP-NOW is limited to broadcast and known slave-to-master unicast. |
For this part, this could be the dreaded ESP-NOW channel cross talk, perhaps have the heartbeat itself contain the channel. That way even if the message was received on the wrong channel, it contains the correct channel to use. |
That's how it is made now and no, there was no crosstalk encountered on my system. The WiFi.channel() reported 11 and master.channel was also 11. Somehow slave switched to channel 11 without any call to |
In order to 'judge' different approaches and get everyone on the same page: |
Ok, a brief description how ESP-NOW sync should work and what features should it include:
|
One more note regarding ESP-NOW and device in STA mode. As noted in Espressif documentation (FAQ) once device switches to STA mode it can no longer change channels (looks like even if unconnected as STA mode searches for a configured SSID across channels). There are two possibilities: |
- rename structures - change message header (unify) - allow *any* master (rely on sync groups) - prevent WiFi reconnection while unconnected and master is available
@SteveEisner @ChuckMash @DedeHai this code is now test-worthy. To enable ESP-NOW sync (with WiFi configured) do:
Current implementation will not work if WiFi is un-configured. |
Quick update ... I spent an hour tonight working with two devices and was never able to get them to sync via ESPNow. I'm sure this is due to my own errors. I tried to follow your directions for syncing, but I don't have the ability to move my devices out of the range of wifi [my wife is using the wifi, I can't turn it off], so I tried to go into the code and get it to work without having to do that. I can attach to the master in debug mode, and see it sending its beacon on channel 11, which is the AP channel I set in WiFi config. But I can't get the remote to begin the channel search, even by hacking the code to bypass some of the Even though I wasn't able to test it properly, I've spent a lot of time in this code now. TBH, I think the conditions for enabling ESPnow modes vs. WiFi are too complex as written now. I'd like to suggest a new pattern. (The following is just my opinion)
The idea of this pattern is to make most other code unaware of all the details of ESPNow except for the one part that they care about. (UDP)Sync just knows that it's a sync engine, and tries to use it. Config just stores its values for it. Usermods just know that they'll sometimes get ESPnow packets and should handle them as a listener would. And the rest of the logic is all in a single place where it can be reviewed & updated. if you want I can help write this.... |
constexpr SRate_t SAMPLE_RATE = 16000; // 16kHz - use if FFTtask takes more than 20ms. Physical sample time -> 32ms | ||
#define FFT_MIN_CYCLE 30 // Use with 16Khz sampling | ||
#elif defined(CONFIG_IDF_TARGET_ESP32S2) | ||
constexpr SRate_t SAMPLE_RATE = 20480; // Base sample rate in Hz - 20Khz is experimental. Physical sample time -> 25ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blazoncek I think this modification is not related to ESP-NOW.
Why would you use a non-standard sampling rate on -S2 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FFT time on S2 is about 20ms.
I already have too many open branches and having one more was out of the question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use 16000 same as -C3. Or 18000 maybe.
// variables used by getSample() and agcAvg() | ||
int16_t micIn = 0; // Current sample starts with negative values and large values, which is why it's 16 bit signed | ||
double sampleMax = 0.0; // Max sample over a few seconds. Needed for AGC controller. | ||
double micLev = 0.0; // Used to convert returned value to have '0' as minimum. A leveller | ||
float sampleMax = 0.0f; // Max sample over a few seconds. Needed for AGC controller. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Veto. please don't change types without first consulting with me - some double
vars are used intentionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to follow logic and it didn't seem to need that kind of precision. At least "not for a few lights that blink to a beat" as someone (bluntly) pointed out a while ago.
But I am not an expert and I will let you educate me.
Question: Do you really need precision on 8th decimal point? or are values in excess of 10^38?
Perhaps you do if you want great precision and do signal processing, but IMO there really is no need for that when "a few LEDs blink".
I understand quantization errors, but I still think that 7 decimal places might be enough for what we do with audio data in LED driver. Unless we a lot of looping while processing data. 7 digits is about 0.001‰ error in a single pass.
Do not get me wrong, I do not want to "prove" anything I'm just trying to understand if there really is a need for that kind of precision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, for sampleMax
and micLev
, it might be possible to go for less resolution. Actually I put the variables back to 'double' during my experiments 2 years ago, because I sometimes had very strange behaviour when the value is "float" and mic input is very noisy.
For control_integrated
the case was very clear and double is needed. This is an integration value where up to 1000 small differences are added per second. Using float, the integrator tends to "freak out" after some time, so it's better to use double. I've actually seem this "freak-out" happen, you suddenly get amplification values that bounce around between max and min.
if (fabs(control_integrated) < 0.01) control_integrated = 0.0; | ||
else control_integrated *= 0.91; | ||
if (fabs(control_integrated) < 0.01f) control_integrated = 0.0f; | ||
else control_integrated *= 0.91f; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
control_integrated (the PID integrator) is double
intentionally, because this value is very sensitive to quantization error.
You can leave all operations affecting this variable without f
suffix.
@@ -821,9 +825,9 @@ class AudioReactive : public Usermod { | |||
|
|||
if (((multAgcTemp > 0.085f) && (multAgcTemp < 6.5f)) //integrator anti-windup by clamping | |||
&& (multAgc*sampleMax < agcZoneStop[AGC_preset])) //integrator ceiling (>140% of max) | |||
control_integrated += control_error * 0.002 * 0.25; // 2ms = integration time; 0.25 for damping | |||
control_integrated += control_error * 0.002f * 0.25f; // 2ms = integration time; 0.25 for damping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
else | ||
control_integrated *= 0.9; // spin down that beasty integrator | ||
control_integrated *= 0.9f; // spin down that beasty integrator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
double control_integrated = 0.0; // persistent across calls to agcAvg(); "integrator control" = accumulated error | ||
|
||
|
||
float control_integrated = 0.0f; // persistent across calls to agcAvg(); "integrator control" = accumulated error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Veto.
double sampleMax = 0.0; // Max sample over a few seconds. Needed for AGC controller. | ||
double micLev = 0.0; // Used to convert returned value to have '0' as minimum. A leveller | ||
float sampleMax = 0.0f; // Max sample over a few seconds. Needed for AGC controller. | ||
float micLev = 0.0f; // Used to convert returned value to have '0' as minimum. A leveller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note for understanding - neither getSample()
(runs microphone input filters) nor agcAvg()
(runs AGC algo) are time critical - they are just a handful of math expressions in a sequence, without loops. So it's ok to have some computations in double, because even -C3 will be able to handle them without problem.
Only FFTcode()
is time critical.
In case your intention was to avoid linking double
libm functions, there are other libs (like ArduinoFFT) that still use double, so I doubt this will help much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ArduinoFFT uses float when you construct FFT object using float in template.
It will use double if you want SQRT approximation (which we don't).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
90% agreed. Only the windowing function (which we need) are still using sin/cos instead of sinf/cosf
And the library is still full of "double" constants (1.0, 2.0, etc)
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wdouble-promotion
floating point literals are implicitly of type double. [...] the compiler performs the entire computation with double because the floating-point literal is a double.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK C++ will use correct overloads.
https://en.cppreference.com/w/cpp/numeric/math/cos
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the library is still full of "double" constants (1.0, 2.0, etc)
Some are stored in float[]
so they are truncated during compilation. But the comment is valid for in-line computations.
We should open an issue with @kosme perhaps.
- prepend signature to distinguish packet
- part 1
There seems to be an issue with WiFi re-connection after prolonged uptime. |
FYI I am continuing development of ESP-NOW sync in my own fork. I will not be updating this PR any further. |
I wrote code for an ESPNow remote for WLED and noticed similar things. On top, there are issues in the core libraries: I started out with arduino, switched to IDF in expectation to gain more control but then had to switch back to Arduino as there are just too many undocumented and convoluted features, each one working on its own but if you start combining, everything starts to break or be erratic. I have it working now but I can imagine running ESPNow along wifi in STA and AP mode is a difficult task to balance, especially with the very limited and often vague documentation. |
Working in STA for slave is impossible as WiFi stack (if enabled to try to reconnect to SSID) will switch channels periodically when not able to connect. For master it is ok as it will switch channel only when roaming APs. Working in AP mode works (kind of) but is not what I desire (as it defeats the purpose I need). My fork has a working solution but may not be stable as I messed with WiFi reconnection logic substantially. |
Attempt to resolve #4063 by forcing slave unit to scan WiFi channels for master's beacon/heartbeat.
Sync master will broadcast beacon/heartbeat every 5s on channel to which it is connected to or on which it has AP open.
Sync slave will listen on a channel for beacon/heartbeat for 10s then it will select another channel and listen there. Repeating until master is heard. When/each time master is heard it will postpone listening for 10s.