Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESP32 RMT odd behaviour when looping #11

Closed
jonathanhogg opened this issue Jan 16, 2020 · 39 comments
Closed

ESP32 RMT odd behaviour when looping #11

jonathanhogg opened this issue Jan 16, 2020 · 39 comments

Comments

@jonathanhogg
Copy link

Hi @mattytrentini,

So I'm working with your esp32_rmt2 branch (merged against current micropython/master HEAD) and have found that the behaviour when loop(True) is called is strange. I'm seeing duplicated pulses in the output signal.

Small example:

from esp32 import RMT
from machine import Pin

pin = Pin(12, Pin.OUT)
rmt = RMT(0, pin=pin, clock_div=8)

rmt.loop(True)
rmt.write_pulses([2500, 7500, 5000, 5000, 7500, 2500])

(Not entirely clear to me whether it matters if loop() is called before or after first write_pulses(), but it doesn't seem to make a difference either way.)

With looping off, I see three pulses each of increasing duty cycle (Figure 1 below, captured with a USB logic-analyser). However, with looping on, I see a repeated pattern of four pulses with the first duplicated (Figure 2 below).

Screenshot 2020-01-16 at 16 41 50

In my actual code I've seen up to three pulses from a longer sequence being duplicated.

I've stared at the code in esp32_rmt.c and the ESP-IDF RMT documentation and I can't see any reason for this behaviour. I'm hoping that you might have more experience in using this hardware, though, and may have an idea what's going wrong.

Cheers,

Jonathan

@mattytrentini
Copy link
Owner

Interesting! I agree that the result is unexpected. I have hardly used looping personally so I'm afraid I don't have a lot of first-hand experience to draw from. However, googling around suggests that there may be an issue when the stream of pulses is quite short - an interrupt needs to be generated at the end of each sequence that will re-start it. If I've carried my zero's correctly that's an interrupt every 30ms for your sequence:

duration = (1/(80MHz/8) * (2500 + 7500 + 5000 + 5000 + 7500 + 2500) = 30ms

Which doesn't sound too frequent at all... However, it would be interesting to see what happens if you extend your sequence:

rmt.write_pulses([2500, 7500, 5000, 5000, 7500, 2500] * 10)

I'm afraid I don't have my logic analyser handy this weekend otherwise I'd try it myself! Please let me know how you go.

@jonathanhogg
Copy link
Author

Hmmm… have just tried the * 10 idea, but seeing the same problem: the first pulse is doubled-up, so every 30th pulse I see an additional short one. I've tried changing the pulse durations and the divider to produce sequences varying from 3ms to 300ms and it's the same for all.

I've verified that it is only as it loops that the first pulse is repeated, so the very first sequence of pulses that are generated when the RMT is started are correct. Also, if I write a sequence of pulses with looping disabled, the sequence is correct; then if I enable looping without writing anything new, the sequence restarts correctly but has a doubled pulse again each time it loops around to the start.

I'm now thinking that I may need to just work around this for the moment using the new data= argument to specify a [1, 1, …] 'null pulse' at the beginning that can be safely repeated without corrupting the data.

@jonathanhogg
Copy link
Author

jonathanhogg commented Jan 18, 2020

Right. Through incredibly laborious testing I have determined that 20µs after the RMT has started running through the pulse sequence it will restart from the beginning and then run through correctly to the end. It will thus repeat however many pulses entirely or partly fit within those 20µs.

For the relatively slow examples I was using above, this is just the first pulse that is repeated. When I crank up to the actual speed of the protocol I am attempting to output, which has 6.67µs period pulses, I get three being repeated.

It will always repeat an exact number of rmt_item32_t items, so whatever internal timer the RMT is using is not being reset, just the index of the next pulse to output. It is as if the interrupt handler is resetting the index twice, 20µs apart.

As far as I can see, you're just using the default ISR that is installed by rmt_driver_install() so I don't see how this can be going wrong.

@mattytrentini
Copy link
Owner

As far as I can see, you're just using the default ISR that is installed by rmt_driver_install() so I don't see how this can be going wrong.

That's right; I'm handing as much as possible off to the ESP-IDF. So I'm very surprised this isn't working properly; it's hard to believe there are not more people running into this (who are using the raw C API). Maybe few people are using looping?

I don't think it's anything we're doing in the library or application code anyway!

@nevercast
Copy link

Perhaps we need to raise this with Espressif, or ask some of them to take a look at this issue. @jonathanhogg are you able to write a repro' in C on ESP-IDF ? Do you have the setup/means for that? Then we can post that here and call in some Espressif engineers to help debug the matter.

@jonathanhogg
Copy link
Author

jonathanhogg commented Jan 20, 2020

So I've managed to repro it with the 3.3 IDF (tag 143d26a) and this stripped-back piece of C:

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_log.h"
#include "driver/rmt.h"


static const char *TAG = "test";

rmt_item32_t items[] = {
    {{{ 250, 1, 750, 0 }}},
    {{{ 500, 1, 500, 0 }}},
    {{{ 750, 1, 250, 0 }}},

    // RMT end marker
    {{{ 0, 1, 0, 0 }}}
};

void app_main(void *ignore)
{
    rmt_config_t config;
    config.rmt_mode = RMT_MODE_TX;
    config.channel = RMT_CHANNEL_0;
    config.gpio_num = 12;
    config.mem_block_num = 1;
    config.tx_config.carrier_en = 0;
    config.tx_config.loop_en = 1;
    config.tx_config.idle_output_en = 1;
    config.tx_config.idle_level = 0;
    config.clk_div = 80;

    ESP_LOGI(TAG, "rmt_config");
    ESP_ERROR_CHECK(rmt_config(&config));
    ESP_LOGI(TAG, "rmt_driver_install");
    ESP_ERROR_CHECK(rmt_driver_install(config.channel, 0, 0));
    int number_of_items = sizeof(items) / sizeof(rmt_item32_t);
    ESP_LOGI(TAG, "rmt_write_items");
    ESP_ERROR_CHECK(rmt_write_items(RMT_CHANNEL_0, items, number_of_items, false));

    while (1) {
        ESP_LOGI(TAG, "Waiting");
        vTaskDelay(1000 / portTICK_PERIOD_MS);
    }
    vTaskDelete(NULL);
}

As for the Python example, the trace looks like:

Screenshot 2020-01-20 at 12 13 10

It'd be good if someone else can confirm the same behaviour to be sure I've not got a bad board or something (it's an AdaFruit HUZZAH32).

@jonathanhogg
Copy link
Author

I've been doing some reading of the ESP-IDF RMT driver code and the ESP32 technical manual and, as far as I can make out, the looping is a feature of the hardware itself and is not actually handled by the interrupt handler. I can't find any errata or complaints about looping, so I'm now seriously wondering if something is wrong with this board. I'm going to dig out another when I'm back in the studio in a day or two and see if I can reproduce.

Also of note is that the ESP-IDF RMT driver will explicitly disable looping if you try to send more than 64 items as it can only do this by filling the channel buffer with the first 64 items from the sequence and then setting an interrupt to quickly swap in the next 64 items just before the end of transmission. Note that rmt_write_items() adds a zero item at the end – required by the hardware to signal stop/loop transmission (I'm not sure why the examples explicitly include a zero duration "end marker" when the driver adds one as well).

So I wonder if write_pulses() should throw an exception if looping is enabled and the sequence is longer than 63 pulses? Presumably, likewise, loop(True) would have to throw an exception if a previously configured sequence is longer than 63.

@nickzoic
Copy link

nickzoic commented Jan 21, 2020

OK I'm trying out your C code and see how it behaves on hardware,
First trial: ESP-IDF 3.3 6ccb4cf and a DevKit C with Rev1 silicon, doubled first pulse is apparent:
double pulses
Second trial: ESP-IDF 4.0 310beae ... same output.

@nickzoic
Copy link

nickzoic commented Jan 21, 2020

Seeing the same timing behaviour as you mention above when sending a bunch of thing pulses { 1, 1, 1, 0 }, and interestingly when you speed things up by decreasing the clk_div the number of repeated pulses increases, so yes the timer is outside the clock.

(EDIT: see below: this isn't as simple as I thought)
sds00018
This is meant to be a wide pulse followed by eight narrow ones and then a long gap. With clk_div = 20 there's 6 pulses repeated, with larger clk_div there's fewer.

The repeat time is less that 20µs though, for me, and it does seem to change a bit depending on clk_div ... maybe 3.3µs with clk_div = 20 and 1.5µs with clk_div = 10 ...

Hmmm, I'll have to revisit this more carefully and be a bit more thorough.

sds00021

Here's a shot of the same pulses at clk_div = 10 showing a shorter reset time (1.8µs or so) and also some pulse trains going through unaffected!

@jonathanhogg
Copy link
Author

While I'm pleased to see that it's not just me, I'm somewhat thrown that looping can be this broken and it not have come up before somewhere. Perhaps it would be good to get some input from an Espressif engineer as to whether we're just doing something fundamentally wrong.

@nickzoic, I think I was keeping clk_div = 1 and varying the durations for my testing, so it's possible that my 20µs result was dependent on that.

@nickzoic
Copy link

nickzoic commented Jan 21, 2020 via email

@nickzoic
Copy link

Bonus wild observation: the little transient spike you can see at the start of those traces is caused by the { 0, 1, 0, 0 } terminator: change it to { 0, 0, 0, 0 } and it goes away!

@nickzoic
Copy link

clk_mhz: configured clock in MHz
clk_div: RMT clk_div value
n_rep: number of repeated pulses, including the wide one
t_rep: time from start of initial pulse to start of repeated initial pulse (approx, us)

pulse pattern:

rmt_item32_t items[] = {
    {{{ 2, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 1, 1, 1, 0 }}},
    {{{ 100, 0, 0, 0 }}}
};

clk_mhz clk_div n_rep   t_rep   
240     10      11      2.8
240     15      7       2.75
240     20      6       3.25
240     30      4       3.33
240     40      3       3.5     
240     50      2       3.1

160     20      8       4.25
160     40      4       4.6

80      10      N/A
80      20      16      8.3
80      30      11      8.6
80      40      8       8.5
80      50      7       9
80      100     3       9

@nickzoic
Copy link

rev0 silicon

80      10      N/A
80      20      16      12.8
80      30      11      13.3
80      40      8       16
80      100     3       14

160     20      8       6.5

240     10      11      5.3
240     20      6       5
240     40      3       5.4

This is on a Sparkfun ESP32 Thing with Rev0 silicon.
How can it have the exact same number of pulses gone wrong but a different timing?
Oh, because this board is a 26MHz clock board and the DevKitC was a 40MHz clock board.
And I had the SDK set up for 40MHz ... compensating for that gives about the predicted values.
Setting the SDK to autodetect crystal also fixes this. Still, it's nice that whatever it is is clearly clocked not eg: some weird gate capacitance effect.

@nickzoic
Copy link

As a bonus, my results tabulated above don't agree with the previous screenshots.
Not sure what's going on there.
Ah, okay, if I trim the list down to 1 long and 8 short pulses, the reset time gets shorter and but also quite a lot of the time the pattern comes out fine. The longer pulse train used in today's tests always triggers the problem.

Anyone got any theories? I'm all out of time & brains for today.

@jonathanhogg
Copy link
Author

Bonus wild observation: the little transient spike you can see at the start of those traces is caused by the { 0, 1, 0, 0 } terminator: change it to { 0, 0, 0, 0 } and it goes away!

{{{0, 0, 0, 0}}} is the terminator that is added within rmt_write_items(), {{{0, 1, 0, 0}}} is the terminator that the ESP-IDF examples all have. I think you can just take that explicit terminator away entirely and let the driver add its own one.

Anyone got any theories? I'm all out of time & brains for today.

I'm lost as to why looping is so broken honestly.

@nickzoic
Copy link

Yep. If you don't need an actual continuous loop it's easy enough to work around: put a 20µs or so blank as your first RMT item. Hopefully it's fixable in firmware though.

@jonathanhogg
Copy link
Author

Yeah, that's my current workaround. Luckily I'm generating a group of pulses every 1ms, so I just moved some of the space after the group to before instead.

@nickzoic
Copy link

OK I've got and added this to Espressif's issues tracker
espressif/esp-idf#4664

@mattytrentini
Copy link
Owner

I've also reached out by email to Espressif - they agree that it looks like an issue on their side and will take a look. Chinese New Year may slow down the response however!

Thanks to both of you @jonathanhogg and @nickzoic for digging in to this in such detail. I'll keep pestering Espressif and will update you guys here if I hear anything.

@jonathanhogg
Copy link
Author

As per my comment on the Espressif issue, the problem goes away for me if I disable the TX-complete interrupt when looping is enabled. I have simply patched the esp32_rmt_loop function a la:

STATIC mp_obj_t esp32_rmt_loop(mp_obj_t self_in, mp_obj_t loop) {
    esp32_rmt_obj_t *self = MP_OBJ_TO_PTR(self_in);
    int loop_en = mp_obj_get_int(loop);
    check_esp_err(rmt_set_tx_intr_en(self->channel_id, !loop_en));
    check_esp_err(rmt_set_tx_loop_mode(self->channel_id, loop_en));
    return mp_const_none;
}
STATIC MP_DEFINE_CONST_FUN_OBJ_2(esp32_rmt_loop_obj, esp32_rmt_loop);

And then my Python usage is to do this every time I need to change the looping output signal:

rmt.loop(False)
rmt.wait_done()
rmt.write_pulses(pulses, data=data)
rmt.loop(True)

@nickzoic
Copy link

Yay! A good result with many thanks to @koobest !

@nickzoic
Copy link

nickzoic commented Apr 1, 2020

espressif/esp-idf#4664 just got closed, so we should check if this fixes this issue ...

@mogplus88
Copy link

FWIW I still have this problem as of today. I am using a Heltec WiFi kit 32, Linux 19.3, VS Code and PlatformIO plugin. I am using RMT to generate a PPM signal on a pin to drive a standard RF transmitter module (FrSky in my case) for radio control of models (planes, boats, whatever). Fortunately I was able to use the workaround as suggested by @jonathanhogg above, as the low pulse is always 300us. So I was able to add a dummy first pulse of 40us (to be sure, to be sure), and make the low pulse of the final "sync" pulse 260 instead of 300.

@jonathanhogg
Copy link
Author

Yes, this branch is a bit out of date now. Looking at mainline I see that there's been no major updates on that either since initial inclusion. @mattytrentini / @dpgeorge: what's the deal at the moment with the RMT updates? Is this branch going to be updated and pulled in any time soon?

If nothing else, it might be good to at least get the looping interrupt fix from above in since that's a clear bug in the current functionality. Would it be useful if I did my own micropython/micropython branch and pull request just for that?

@dpgeorge
Copy link

If nothing else, it might be good to at least get the looping interrupt fix from above in since that's a clear bug in the current functionality.

The IDF fix for this is not part of a release yet, so would be good to work around it as suggested.

But also note that, in the git comment of that IDF fix, it says that "users needs to call rmt_driver_install before rmt_config" so that should probably also be done as part of the workaround.

Would it be useful if I did my own micropython/micropython branch and pull request just for that?

Yes!

@mogplus88
Copy link

I tried reversing the order of the rmt_driver_install and rmt_config in my program and it solved the problem! Works a treat now.
Thanks @dpgeorge for pointing that out.
;-)

@dpgeorge
Copy link

Ok, great! Let's get that in mainline then.

@jonathanhogg
Copy link
Author

OK. I’ll try to get some new tests done this week and put in a PR.

I’m a little lost as to what exactly is implied by the “fix” to espressif/esp-idf#4664. The code referenced contains a mass of refactoring which, I assume from the commit description, is on top of a refactoring that was done some time in the past (not hunted that down). If it is apparently fixing the need to call rmt_driver_install before rmt_config, then does that mean either order is fine after this hits main or are we just introducing a workaround that will need removing at some point in the future?

I’m not sure I’ll be able to make enough time to test the necessary permutations!

@dpgeorge
Copy link

At this stage it's enough to just switch the order of rmt_driver_install and rmt_config, that's confirmed by @mogplus88 to work. And it should continue to work after we pull in that fix from the IDF. So no need for excessive testing, if you could just test that the original bug reported here is fixed, that'd be great.

@jonathanhogg
Copy link
Author

@dpgeorge: I've created a new issue for this at micropython#6167 – it felt weird to have the issue not documented somewhere on the main project. PR in against that.

@jonathanhogg
Copy link
Author

@dpgeorge / @mattytrentini: I meant to say, there's a bunch of other very useful new RMT functionality that is languishing in this branch.

In particular, the idle_level initialisation and ability to explicitly specify levels in write_pulses is something I am using in my own code.

If it would be useful, I am happy to pick up some or all of this work and submit a PR against main.

@dpgeorge
Copy link

If it would be useful, I am happy to pick up some or all of this work and submit a PR against main.

It sounds like a good idea to me, but I'll let @mattytrentini make the final decision

@mogplus88
Copy link

@jonathanhogg a bit off topic maybe but I was wondering if you have tried changing the values written to the output by this function. I have been trying to do it but it seems once the write to the output has been sent (either rmt_write_items or rmt_write_sample) the function blocks regardless of whether looping is enabled or not, or anything else. I have tried every possible combination and the write function always blocks, effectively stopping the program in its tracks. I've tried everything I can think of, but I must be missing something. I just can't get it to work. Any hints? Thanks, Ian

@mattytrentini
Copy link
Owner

I'm trying to find time to massage the esp32_rmt2 branch in to shape. Sorry for the delays @jonathanhogg!

Of course, feel free to take my branch and complete it if you are so motivated - but this is back on my radar.

@mogplus88
Copy link

@jonathanhogg @mattytrentini , done some more playing with this and I think I've found the answer to my problem.
I have been using VS Code with PlatformIO plugin for running all the tests I've been doing so far. Just for fun I decided to try Espressif's development tool, which is a kludgy command line tool but works sort of okay once you get it installed. (for me it seems to need to be reinstalled every time I open a console window to run it, but that's a whole other problem). Anyway, long story short, the rmt_write_items function (haven't tried rmt_write_sample yet) works as expected when it's uploaded to the board using the idf.py function provided by Espressif. It doesn't work with PlatformIO, so maybe their version of the library is not up to date or something, I'll put a question on their forum shortly.
I should mention that the only test I've done is to put the command into a while(true) loop with looping set to false (i.e. no looping) so it outputs one "frame" of pulses (corresponding to the number of items in the input array) and then returns to the program. I can then update the values and the new values are reflected in the next "frame" in the output stream.
Hope this helps.
;-)

@jonathanhogg
Copy link
Author

Hi @mogplus88, sorry for delay getting back to you. I've just had a go at hooking my kit back up and doing a couple of tests and – indeed – something is up. The write_pulses() function is locking-up the interpreter on a second call.

Sorry, this was clearly insufficient testing on my part: I only checked that my simple looping example was working before declaring this a win.

I'm going to do some more digging, but I'm gonna report back over on micropython#6167 so I can keep what I find associated with the main project and my last PR.

@jonathanhogg
Copy link
Author

Hey @mattytrentini,

Just to let you know that I'm working on a copy of your branch over at jonathanhogg/micropython. I've rebased it (with my usual fat-fingered mistakes) to a recent HEAD, worked on the things you'd marked as "fixme", added support for disabling idle output (with idle_level=None) and updated the documentation.

I've got to do some proper testing tomorrow (UK time) with the oscilloscope to make sure I've not bust anything, but will be putting in a PR soon.

@jonathanhogg
Copy link
Author

Closing this issue since the original problem has been fixed. The additional RMT features are still languishing in the PR, but nothing useful on that can be said here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants