ESP32: Fix watchdog timeout handling #1249

tve · 2023-11-18T01:53:04Z

The WDT support in the esp32 xsHost has gotten a bit wonky due to changes in the CONFIG_ESP_TASK constants. Specifically, CONFIG_ESP_TASK_WDT is now CONFIG_ESP_TASK_WDT_EN and there's also CONFIG_ESP_TASK_WDT_INIT.

In addition, the task WDT can be enabled at run-time and the timeout can be set/changed at run-time too.

This PR fixes the various #ifdef around the config constants. I wrote a little watchdog module that exercises all this. It can be found at https://github.com/tve/modtve/tree/main/modules/drivers/wdt with an example at https://github.com/tve/modtve/tree/main/modules/examples/wdt. If this module looks good I'm happy to add it to this PR or to modify it to fit. I was hoping it could fit into an ecma-419 Host scheme of sorts...
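To make the rename concrete, here is a minimal sketch of a guard that accepts the new constant and falls back to the old one. It is not the code from this PR: MOD_TASK_WDT_ENABLED and feedWatchdog are hypothetical names, and esp_task_wdt_reset is replaced by a desktop stand-in so the fragment builds without ESP-IDF.

```c
#include <assert.h>

/* Normally generated into sdkconfig.h; defined here only so the sketch
   builds on a desktop compiler. */
#ifndef CONFIG_ESP_TASK_WDT_EN
	#define CONFIG_ESP_TASK_WDT_EN 1
#endif

/* Desktop stand-in for the ESP-IDF function of the same name. */
static int gResetCalls = 0;
int esp_task_wdt_reset(void)
{
	gResetCalls += 1;
	return 0;	/* ESP_OK */
}

/* Hypothetical helper: accept the new name, fall back to the old one. */
#if defined(CONFIG_ESP_TASK_WDT_EN)
	#define MOD_TASK_WDT_ENABLED CONFIG_ESP_TASK_WDT_EN
#elif defined(CONFIG_ESP_TASK_WDT)
	#define MOD_TASK_WDT_ENABLED CONFIG_ESP_TASK_WDT
#else
	#define MOD_TASK_WDT_ENABLED 0
#endif

#if MOD_TASK_WDT_ENABLED
	#define modWatchDogReset() esp_task_wdt_reset()
#else
	#define modWatchDogReset() ((void)0)
#endif

/* Feed the watchdog n times and report how many resets actually happened. */
int feedWatchdog(int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		modWatchDogReset();
	return gResetCalls;
}
```

With the constant set, every call reaches the reset function; with neither name defined the macro compiles to nothing.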

In any case, there's an issue with the WDT timeout since it can be changed dynamically. The calculation of maxDelayMS in xsHost.c doesn't necessarily work in that case. To add insult to injury, one can't query the current timeout from esp-idf. The best I can come up with is to at least call some function in xsHost from the WDT module when the user changes the timeout.
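A rough sketch of that idea, assuming the host splits long delays into chunks shorter than the watchdog timeout and feeds the watchdog between chunks. The names hostWatchdogTimeoutChanged and hostDelay are hypothetical, the half-of-timeout rule is an assumption rather than the actual xsHost.c calculation, and the sleep and reset calls are replaced by counters so the fragment runs on a desktop.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t gWdtTimeoutMS = 5000;	/* assumed build-time default */
static uint32_t gFeeds = 0;		/* stand-in for esp_task_wdt_reset() calls */
static uint32_t gSleptMS = 0;		/* stand-in for time spent in vTaskDelay() */

/* Hypothetical hook: the WDT module calls this after changing the timeout,
   because the host has no way to query the current value. */
void hostWatchdogTimeoutChanged(uint32_t timeoutMS)
{
	assert(timeoutMS > 0);
	gWdtTimeoutMS = timeoutMS;
}

/* Longest single sleep the host allows: half the current timeout (assumed rule). */
static uint32_t maxDelayMS(void)
{
	uint32_t half = gWdtTimeoutMS / 2;
	return half ? half : 1;
}

/* Delay for ms milliseconds, feeding the watchdog after every chunk. */
void hostDelay(uint32_t ms)
{
	while (ms) {
		uint32_t limit = maxDelayMS();
		uint32_t chunk = (ms < limit) ? ms : limit;
		gSleptMS += chunk;
		gFeeds += 1;
		ms -= chunk;
	}
}
```

Because the chunk size is read on every iteration, a timeout change reported through the hook takes effect on the next chunk of a delay that is already in progress.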

I tested on esp32 and esp32-c3.

Feedback welcome.

tve · 2023-11-19T23:50:42Z

Oops, I missed one instance of CONFIG_ESP_TASK_WDT, will push that later.

phoddie · 2023-11-20T05:50:48Z

This looks very reasonable. I'm not sure how we overlooked the #define name change. I remember testing with the WDT recently and it was working.

The driver and example would be welcome additions.

The dynamic configuration of the WDT timeout is obscure but I suppose worth trying to support. Your approach is reasonable as a first order attempt. A trickier solution would be to patch into esp_task_wdt_reconfigure to get the value. You could use --wrap to have gcc reroute the call to you.
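For illustration, a sketch of what the --wrap approach could look like. With the linker option --wrap=esp_task_wdt_reconfigure, calls to the function are routed to __wrap_esp_task_wdt_reconfigure and the original stays reachable as __real_esp_task_wdt_reconfigure. The typedefs below are desktop stand-ins for the ESP-IDF declarations, the __real_ stub exists only because there is no ESP-IDF to link against here, and hostWatchdogTimeoutMS is a hypothetical accessor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Desktop stand-ins; on a device these come from esp_err.h and esp_task_wdt.h. */
typedef int esp_err_t;
#define ESP_OK 0
typedef struct {
	uint32_t timeout_ms;
	uint32_t idle_core_mask;
	bool trigger_panic;
} esp_task_wdt_config_t;

/* On a device the linker supplies this symbol (the original function).
   Off-target a stub takes its place. */
esp_err_t __real_esp_task_wdt_reconfigure(const esp_task_wdt_config_t *config)
{
	(void)config;
	return ESP_OK;
}

static uint32_t gLastTimeoutMS = 0;	/* 0 means "never reconfigured" */

/* Every call to esp_task_wdt_reconfigure lands here when --wrap is in effect. */
esp_err_t __wrap_esp_task_wdt_reconfigure(const esp_task_wdt_config_t *config)
{
	esp_err_t err = __real_esp_task_wdt_reconfigure(config);
	if ((ESP_OK == err) && config)
		gLastTimeoutMS = config->timeout_ms;	/* remember what was accepted */
	return err;
}

/* Hypothetical accessor the host could use when computing its maximum delay. */
uint32_t hostWatchdogTimeoutMS(void)
{
	return gLastTimeoutMS;
}
```

When building with gcc the option would be passed through to the linker as -Wl,--wrap=esp_task_wdt_reconfigure. How that flag gets added to the project build is not shown here.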

tve · 2023-11-20T20:07:11Z

> The dynamic configuration of the WDT timeout is obscure

I ended up there more through trying to figure out the whole thing than by hard requirement. When I discovered that the timeout ends up being global my enthusiasm dropped significantly. I think that a 2-level WDT may be more appropriate with the "application level WDT" implemented in JS. That would be more portable too.
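One possible reading of the two-level idea, sketched in C only to keep the examples in this thread in one language (the comment above proposes doing it in JS). The hardware watchdog keeps its fixed global timeout, and the host feeds it only while the application has checked in within its own deadline. AppWatchdog and its functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
	uint32_t appTimeoutMS;	/* per-application deadline, chosen by the script */
	uint32_t lastCheckInMS;	/* time of the most recent application check-in */
} AppWatchdog;

void appWatchdogInit(AppWatchdog *wd, uint32_t timeoutMS, uint32_t nowMS)
{
	assert(wd && timeoutMS > 0);
	wd->appTimeoutMS = timeoutMS;
	wd->lastCheckInMS = nowMS;
}

/* Called by the application whenever it has made progress. */
void appWatchdogCheckIn(AppWatchdog *wd, uint32_t nowMS)
{
	wd->lastCheckInMS = nowMS;
}

/* Called periodically by the host, more often than the hardware timeout.
   Returns true while the hardware watchdog should still be fed.
   Unsigned subtraction keeps the comparison valid across counter wrap-around. */
bool appWatchdogShouldFeed(const AppWatchdog *wd, uint32_t nowMS)
{
	return (uint32_t)(nowMS - wd->lastCheckInMS) <= wd->appTimeoutMS;
}
```

Once the function returns false the host stops feeding and the hardware watchdog fires after its own, unchanged timeout, so the application deadline can be much longer than the global one.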

phoddie · 2023-11-20T20:44:46Z

Agreed. The one place I can think of where extending the timeout could be useful is your goal of disabling it when stopped at a breakpoint. But, that could be done transparently to the scripts.

I think that means that nothing needs to change in this PR, so it is good to go as-is?

tve · 2023-11-20T20:51:06Z

I need to add the commit for the one instance I missed. I don't think I'll get to it today (lots of git wrangling).
I'm fine if you want to add it:

diff --git a/modules/base/worker/modWorker.c b/modules/base/worker/modWorker.c
index 904880f93..54cbc04c2 100644
--- a/modules/base/worker/modWorker.c
+++ b/modules/base/worker/modWorker.c
@@ -512,7 +512,7 @@ void workerLoop(void *pvParameter)
 {
        modWorker worker = (modWorker)pvParameter;
 
-#if CONFIG_ESP_TASK_WDT
+#if CONFIG_ESP_TASK_WDT_EN
        esp_task_wdt_add(NULL);
 #endif

phoddie · 2023-11-20T22:13:53Z

It'll be cleanest if you can update the PR. Thanks.

tve · 2023-11-28T05:27:28Z

I added the missing commit. Ready from my point of view.

phoddie · 2023-11-29T05:21:11Z

Thanks for that. We'll try to get this one merged.

phoddie · 2023-12-01T00:31:46Z

Just FYI – I did some testing with this today. It all seems to work nicely. Setting up the watchdog in sdkconfig seems to get increasingly complex with each ESP-IDF. Nothing we can do about that. FWIW, this worked for me:

CONFIG_ESP_TASK_WDT_EN=y
CONFIG_ESP_TASK_WDT_INIT=y
CONFIG_ESP_TASK_WDT_PANIC=y

Not setting CONFIG_ESP_TASK_WDT_PANIC gave a really strange behavior. The watchdog fires and outputs a bunch of state to the console. Then it keeps running.

phoddie · 2023-12-18T23:12:53Z

Merged.

tve · 2023-12-21T03:11:16Z

Sooo... it turns out this patch is not entirely OK. The problem is that CONFIG_ESP_TASK_WDT_EN defaults to 'y' and CONFIG_ESP_TASK_WDT_INIT is set to 'n' in the default sdkconfig in XS. The result is that esp-idf doesn't like the calls to esp_task_wdt_reset that are happening because of the former (i.e. "enabled but not initialized"). This is hidden because logging is disabled by default. If one just enables logging, everything hangs on the mxDebugMutex because of the avalanche of messages about esp_task_wdt_reset:

I (136) main_task: Calling app_main()                                                  
I (163) uart: queue free spaces: 8                                                     
E (163) task_wdt: esp_task_wdt_add(747): TWDT was never initialized                    
App started                                                                            
This is fx_vprintf                                                                     
I (164) main_task: Returned from app_main()                                            
<?xs.3FFBF7E8?>E (415) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized   
E (415) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized                  
E (418) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (424) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (430) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (436) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (442) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (449) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (455) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (461) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (467) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (473) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (479) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (486) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (492) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
...

I don't know what your thoughts are about the TWDT. I would enable and init it by default. But disabling and not init'ing it by default ought to work as well. But it looks like both CONFIG variables need to be set in concert. I would also set the ESP_LOG to ERROR level and not NONE, but that's probably just me.
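The three combinations discussed here can be spelled out as a small classification. This is a sketch, not code from the Moddable SDK: TWDTConfigState, twdtClassify, and twdtBuildState are hypothetical names, and on a desktop build neither constant is defined, so twdtBuildState reports the disabled case.

```c
#include <assert.h>

typedef enum {
	kTWDTDisabled,		/* EN off: no watchdog calls are compiled in */
	kTWDTReady,		/* EN and INIT on: ESP-IDF initializes the TWDT at startup */
	kTWDTNotInitialized	/* EN on, INIT off: someone else must call esp_task_wdt_init() */
} TWDTConfigState;

TWDTConfigState twdtClassify(int en, int init)
{
	if (!en)
		return kTWDTDisabled;
	return init ? kTWDTReady : kTWDTNotInitialized;
}

/* sdkconfig.h defines a symbol as 1 when it is set to y and leaves it
   undefined otherwise, so normalize to 0/1 before using the values in C. */
#if defined(CONFIG_ESP_TASK_WDT_EN) && CONFIG_ESP_TASK_WDT_EN
	#define TWDT_EN 1
#else
	#define TWDT_EN 0
#endif
#if defined(CONFIG_ESP_TASK_WDT_INIT) && CONFIG_ESP_TASK_WDT_INIT
	#define TWDT_INIT 1
#else
	#define TWDT_INIT 0
#endif

/* State implied by the configuration this file was compiled with. */
TWDTConfigState twdtBuildState(void)
{
	return twdtClassify(TWDT_EN, TWDT_INIT);
}
```

The kTWDTNotInitialized case is the one that produces the "TWDT was never initialized" messages in the log above, unless something initializes the watchdog at run time.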

tve · 2023-12-21T17:04:09Z

For reference, this is what I ended up doing. First changes to sdkconfig.defaults to enable some logging, get a backtrace on crash, and enable the task WDT:

--- a/build/devices/esp32/xsProj-esp32/sdkconfig.defaults
+++ b/build/devices/esp32/xsProj-esp32/sdkconfig.defaults
@@ -205,9 +205,9 @@ CONFIG_ESP32_MEMMAP_TRACEMEM_TWOBANKS=n
 CONFIG_ESP32_TRAX=n
 CONFIG_ESP32_TRACEMEM_RESERVE_DRAM=0x0
 CONFIG_ESP_COREDUMP_ENABLE_TO_FLASH=n
-CONFIG_ESP_COREDUMP_ENABLE_TO_UART=y
+CONFIG_ESP_COREDUMP_ENABLE_TO_UART=n
 CONFIG_ESP_COREDUMP_ENABLE_TO_NONE=n
-CONFIG_ESP_COREDUMP_ENABLE=y
+CONFIG_ESP_COREDUMP_ENABLE=n
 CONFIG_ESP_COREDUMP_UART_DELAY=0
 CONFIG_ESP32_CORE_DUMP_LOG_LEVEL=1
 CONFIG_ESP32_UNIVERSAL_MAC_ADDRESSES_TWO=n
@@ -225,20 +225,21 @@ CONFIG_NEWLIB_STDIN_LINE_ENDING_CRLF=n
 CONFIG_NEWLIB_STDIN_LINE_ENDING_LF=n
 CONFIG_NEWLIB_STDIN_LINE_ENDING_CR=y
 CONFIG_NEWLIB_NANO_FORMAT=n
-CONFIG_ESP_CONSOLE_UART_DEFAULT=n
+CONFIG_ESP_CONSOLE_UART_DEFAULT=y
 CONFIG_ESP_CONSOLE_UART_CUSTOM=n
-CONFIG_ESP_CONSOLE_NONE=y
+CONFIG_ESP_CONSOLE_NONE=n
 CONFIG_ESP_CONSOLE_UART_NUM=0
 CONFIG_ULP_COPROC_ENABLED=n
 CONFIG_ULP_COPROC_RESERVE_MEM=0
 CONFIG_ESP_SYSTEM_PANIC_PRINT_HALT=n
-CONFIG_ESP_SYSTEM_PANIC_PRINT_REBOOT=n
+CONFIG_ESP_SYSTEM_PANIC_PRINT_REBOOT=y
 CONFIG_ESP_SYSTEM_PANIC_SILENT_REBOOT=n
 CONFIG_ESP_SYSTEM_PANIC_GDBSTUB=n
 CONFIG_ESP_DEBUG_OCDAWARE=y
 CONFIG_ESP_DEBUG_STUBS_ENABLE=n
 CONFIG_ESP_INT_WDT=n
-CONFIG_ESP_TASK_WDT_INIT=n
+CONFIG_ESP_TASK_WDT_EN=y
+CONFIG_ESP_TASK_WDT_INIT=y
 CONFIG_ESP_BROWNOUT_DET=y
 CONFIG_ESP_BROWNOUT_DET_LVL_SEL_0=y
 CONFIG_ESP_BROWNOUT_DET_LVL_SEL_1=n
@@ -404,13 +405,13 @@ CONFIG_HEAP_TRACING=n
 #
 # Log output
 #
-CONFIG_LOG_DEFAULT_LEVEL_NONE=y
+CONFIG_LOG_DEFAULT_LEVEL_NONE=n
 CONFIG_LOG_DEFAULT_LEVEL_ERROR=n
-CONFIG_LOG_DEFAULT_LEVEL_WARN=n
+CONFIG_LOG_DEFAULT_LEVEL_WARN=y
 CONFIG_LOG_DEFAULT_LEVEL_INFO=n
 CONFIG_LOG_DEFAULT_LEVEL_DEBUG=n
 CONFIG_LOG_DEFAULT_LEVEL_VERBOSE=n
-CONFIG_LOG_DEFAULT_LEVEL=0
+CONFIG_LOG_DEFAULT_LEVEL=2
 CONFIG_LOG_COLORS=n
 
 #

And in main.c I added a TX buffer for the uart (IIRC the HW fifo is 128 bytes):

--- a/build/devices/esp32/xsProj-esp32/main/main.c
+++ b/build/devices/esp32/xsProj-esp32/main/main.c
@@ -247,10 +256,10 @@ void app_main() {
 
 #ifdef mxDebug
        QueueHandle_t uartQueue;
-       uart_driver_install(USE_UART, UART_FIFO_LEN * 2, 0, 8, &uartQueue, 0);
+       uart_driver_install(USE_UART, UART_FIFO_LEN * 2, UART_FIFO_LEN * 2, 8, &uartQueue, 0);
        xTaskCreate(debug_task, "debug", (768 + XT_STACK_EXTRA) / sizeof(StackType_t), uartQueue, 8, NULL);
 #else
-       uart_driver_install(USE_UART, UART_FIFO_LEN * 2, 0, 0, NULL, 0);
+       uart_driver_install(USE_UART, UART_FIFO_LEN * 2, UART_FIFO_LEN * 2, 0, NULL, 0);
 #endif
 
        xTaskCreate(loop_task, "main", kStack, NULL, 4, NULL);

And in xsPlatform.c I added a timeout to taking the mxDebug mutex:

--- a/xs/platforms/esp/xsPlatform.c
+++ b/xs/platforms/esp/xsPlatform.c
@@ -74,12 +74,14 @@ static void doRemoteCommand(txMachine *the, uint8_t *cmd, uint32_t cmdLen);
        SemaphoreHandle_t gDebugMutex;
 
        #define mxDebugMutexTake() xSemaphoreTake(gDebugMutex, portMAX_DELAY)
+       #define mxDebugMutexTakeTicks(ticks) xSemaphoreTake(gDebugMutex, ticks)
        #define mxDebugMutexGive() xSemaphoreGive(gDebugMutex)
        #define mxDebugMutexAllocated() (NULL != gDebugMutex)
 
        static int fx_vprintf(const char *str, va_list list);
 #else
        #define mxDebugMutexTake()
+       #define mxDebugMutexTakeTicks()
        #define mxDebugMutexGive()
        #define mxDebugMutexAllocated() (true)
 #endif
@@ -1291,9 +1295,9 @@ int fx_vprintf(const char *str, va_list list)
 {
        int result;
 
-       mxDebugMutexTake();
-               result = vprintf(str, list);
-       mxDebugMutexGive();
+       int ok = mxDebugMutexTakeTicks(10);
+       result = vprintf(str, list);
+       if (ok) mxDebugMutexGive();
 
        return result;
 }

There is an issue with the mxDebug mutex. I suspect the cause is that fxReceiveLoop holds the mutex while calling modMessagePostToMachine, c_malloc, and modWatchDogReset, all of which can trigger an esp-idf log message; that results in the deadlock I observed. I worked around this by adding more fifo and a timeout in fx_vprintf, but the correct solution would be to rework fxReceiveLoop IMHO. (Moving the modWatchDogReset out of the critical section seems trivial.)
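The suspected pattern and the suggested rework can be modeled on a desktop with a toy lock. This is only a sketch under that assumption: it does not use FreeRTOS, all names are hypothetical, and a second take of an already held lock is counted instead of blocking forever as a real non-recursive mutex taken with portMAX_DELAY would.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool gLocked = false;
static int gLogLines = 0;
static int gWouldDeadlock = 0;	/* times the lock was taken while already held */

static void lockTake(void)
{
	if (gLocked)
		gWouldDeadlock += 1;	/* a real mutex would block here forever */
	gLocked = true;
}

static void lockGive(void)
{
	gLocked = false;
}

/* Stand-in for the log output path: printing needs the same lock. */
static void logLine(const char *msg)
{
	(void)msg;
	lockTake();
	gLogLines += 1;
	lockGive();
}

/* Stand-in for work that may emit a log message as a side effect. */
static void noisyWork(void)
{
	logLine("message from noisy work");
}

/* Problematic shape: the noisy work runs while the lock is held. */
void receiveHoldingLock(void)
{
	lockTake();
	noisyWork();
	lockGive();
}

/* Reworked shape: hold the lock only to copy the data, then do the work. */
void receiveCopyThenWork(const char *in, char *out, size_t outSize)
{
	assert(in && out && outSize > 0);
	lockTake();
	strncpy(out, in, outSize - 1);
	out[outSize - 1] = 0;
	lockGive();
	noisyWork();
}
```

In the reworked shape the log path can take the lock freely because the receive path has already released it.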

phoddie · 2023-12-21T18:44:59Z

These changes (mostly?) seem related to the watchdog when debugging is enabled. As a rule, the watchdog shouldn't be enabled in debug builds because it conflicts with debugging.

I understand that CONFIG_ESP_TASK_WDT_EN and CONFIG_ESP_TASK_WDT_INIT need to be set together. Maybe the right change is to set those both to off (N) in the defaults to reinforce that?

tve · 2023-12-21T20:37:41Z

If you want to disable the watchdog in debug mode, then yes, they should both be set to 'n'. I did a quick test and don't see any error/warning messages.

Can you remind me how it interferes with debugging other than the fact that Timer.delay(n) with n>5_000 will trigger the WDT? I just tried stopping at a break point for several minutes and nothing untoward happened. (You're probably correct that it's best to disable it for debug builds, though.)

phoddie · 2023-12-21T20:42:41Z

I'm not confident that we can always guarantee the debugger behavior will be good with the watchdog enabled. I tried to get that right more than once and it resisted.

> ...then yes, they should both be set to 'n'. I did a quick test and don't see any error/warning messages.

Since you are in that code now and can easily test, would you do a PR for that? It will be a much bigger project for me.

tve · 2023-12-21T20:45:58Z

Haha, not easy here either, but you did fix a bunch of things for me... Do you agree with having error logging enabled and backtrace on crash? If you'd like something different, LMK so I can prepare the right PR.

phoddie · 2023-12-21T20:47:55Z

> Do you agree with having error logging enabled and backtrace on crash?

On a debug build? I'm OK with backtrace on crash. Not wild about logging. That can generate a fair amount of noise. Generally that is useful when tracking down native code issues, not script issues, so leaving logging to instrumentation builds seems appropriate for the majority of developers.

linfan68 · 2024-06-08T02:25:44Z

I'm currently on the latest public branch and encountering an issue where every time the Timer executes, I see the following error messages:

E (1369) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (1569) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (1769) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (1969) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized
E (2169) task_wdt: esp_task_wdt_reset(774): TWDT was never initialized

The Timer itself executes correctly despite these errors. Adding these to sdkconfig works:

CONFIG_ESP_TASK_WDT_EN=y
CONFIG_ESP_TASK_WDT_INIT=y
CONFIG_ESP_TASK_WDT_PANIC=y

phoddie · 2024-06-08T19:59:20Z

esp_task_wdt_reset is only ever called through this macro:

moddable/xs/platforms/esp/xsHost.h

Lines 303 to 307 in 03fc510

#if CONFIG_ESP_TASK_WDT_EN
	#define modWatchDogReset() esp_task_wdt_reset()
#else
	#define modWatchDogReset()
#endif

The message "TWDT was never initialized" means that CONFIG_ESP_TASK_WDT_EN is defined (since we are calling esp_task_wdt_reset ) but CONFIG_ESP_TASK_WDT_INIT is not (because the WDT hasn't been initialized). The Moddable SDK doesn't explicitly initialize the WDT. It assumes that a host that does not set CONFIG_ESP_TASK_WDT_INIT wants to manage the WDT initialization itself.

Which host are you using when you see this?

linfan68 · 2024-06-09T06:12:04Z

@phoddie I'm using esp32/esp32s3 platform.
Here is the problem

moddable/build/devices/esp32/xsProj-esp32s3/sdkconfig.defaults

Line 531 in 03fc510

# CONFIG_ESP_INT_WDT is not set

# CONFIG_ESP_INT_WDT is not set
# CONFIG_ESP_TASK_WDT_INIT is not set

and the generated sdkconfig becomes:

# CONFIG_ESP_INT_WDT is not set
CONFIG_ESP_TASK_WDT_EN=y
# CONFIG_ESP_TASK_WDT_INIT is not set

We need to add this:

# CONFIG_ESP_INT_WDT is not set
# CONFIG_ESP_TASK_WDT_EN is not set <<<< Adding this line
# CONFIG_ESP_TASK_WDT_INIT is not set

and the generated sdkconfig becomes:

# CONFIG_ESP_INT_WDT is not set
# CONFIG_ESP_TASK_WDT_EN is not set
# CONFIG_ESP_TASK_WDT_INIT is not set

phoddie · 2024-06-10T16:48:26Z

@linfan68 – thank you for the hint. I wouldn't have guessed to add a comment to solve this. The ESP32-S2 and ESP32-C3 sdkconfig.defaults had the same issue, so I updated those in addition to the ESP32-S3.

ESP32: Fix watchdog timeout handling

d79f4b7

handle CONFIG_ESP_TASK_WDT_EN in modworker

fe70680

phoddie closed this Dec 18, 2023

mkellner pushed a commit that referenced this pull request Jun 11, 2024

ESP32 WDT config clean-up (#1249 @linfan68)

603628a
