-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESP32: Fix watchdog timeout handling #1249
Conversation
Oops, I missed one instance of CONFIG_ESP_TASK_WDT, will push that later. |
This looks very reasonable. I'm not sure how we overlooked the #define name change. I remember testing with WDT recently and was working. The driver and example would be welcome additions. The dynamic configuration of the WDT timeout is obscure but I suppose worth trying to support. Your approach is reasonable as a first order attempt. A trickier solution would be to patch into |
I ended up there more through trying to figure out the whole thing than by hard requirement. When I discovered that the timeout ends up being global my enthusiasm dropped significantly. I think that a 2-level WDT may be more appropriate with the "application level WDT" implemented in JS. That would be more portable too. |
Agreed. The one place I can think of where extending the timeout could be useful is your goal of disabling it when stopped at a breakpoint. But, that could be done transparently to the scripts. I think that means that nothing needs to change in this PR, so it is good to go as-is? |
I need to add the commit for the one instance I missed. I don't think I'll get to it today (lots of git wrangling).
|
It'll be cleanest if you can update the PR. Thanks. |
I added the missing commit. Ready from my point of view. |
Thanks for that. We'll try to get this one merged. |
Just FYI – I did some testing with this today. It all seems to work nicely. Setting up the watchdog in sdkconfig seems to get increasingly complex with each ESP-IDF. Nothing we can do about that. FWIW, this worked for me:
Not setting |
Sooo... it turns out this patch is not entirely OK. The problem is that CONFIG_ESP_TASK_WDT_EN defaults to 'y' and CONFIG_ESP_TASK_WDT_INIT is set to 'n' in the default sdkconfig in XS. The result is that esp-idf doesn't like the calls to esp_task_wdt_reset that are happening because of the former (i.e. "enabled bu not initialized"). This is hidden because logging is disabled by default. If one just enables logging everything hangs on the mxDebugMutex because of the avalanche of messages about esp_task_wdt_reset:
I don't know what your thoughts are about the TWDT. I would enable and init it by default. But disabling and not init'ing it by default ought to work as well. But it looks like both CONFIG variables need to be set in concert. I would also set the ESP_LOG to ERROR level and not NONE, but that's probably just me. |
For reference, this is what I ended up doing. First changes to sdkconfig.defaults to enable some logging, get a backtrace on crash, and enable the task WDT: --- a/build/devices/esp32/xsProj-esp32/sdkconfig.defaults
+++ b/build/devices/esp32/xsProj-esp32/sdkconfig.defaults
@@ -205,9 +205,9 @@ CONFIG_ESP32_MEMMAP_TRACEMEM_TWOBANKS=n
CONFIG_ESP32_TRAX=n
CONFIG_ESP32_TRACEMEM_RESERVE_DRAM=0x0
CONFIG_ESP_COREDUMP_ENABLE_TO_FLASH=n
-CONFIG_ESP_COREDUMP_ENABLE_TO_UART=y
+CONFIG_ESP_COREDUMP_ENABLE_TO_UART=n
CONFIG_ESP_COREDUMP_ENABLE_TO_NONE=n
-CONFIG_ESP_COREDUMP_ENABLE=y
+CONFIG_ESP_COREDUMP_ENABLE=n
CONFIG_ESP_COREDUMP_UART_DELAY=0
CONFIG_ESP32_CORE_DUMP_LOG_LEVEL=1
CONFIG_ESP32_UNIVERSAL_MAC_ADDRESSES_TWO=n
@@ -225,20 +225,21 @@ CONFIG_NEWLIB_STDIN_LINE_ENDING_CRLF=n
CONFIG_NEWLIB_STDIN_LINE_ENDING_LF=n
CONFIG_NEWLIB_STDIN_LINE_ENDING_CR=y
CONFIG_NEWLIB_NANO_FORMAT=n
-CONFIG_ESP_CONSOLE_UART_DEFAULT=n
+CONFIG_ESP_CONSOLE_UART_DEFAULT=y
CONFIG_ESP_CONSOLE_UART_CUSTOM=n
-CONFIG_ESP_CONSOLE_NONE=y
+CONFIG_ESP_CONSOLE_NONE=n
CONFIG_ESP_CONSOLE_UART_NUM=0
CONFIG_ULP_COPROC_ENABLED=n
CONFIG_ULP_COPROC_RESERVE_MEM=0
CONFIG_ESP_SYSTEM_PANIC_PRINT_HALT=n
-CONFIG_ESP_SYSTEM_PANIC_PRINT_REBOOT=n
+CONFIG_ESP_SYSTEM_PANIC_PRINT_REBOOT=y
CONFIG_ESP_SYSTEM_PANIC_SILENT_REBOOT=n
CONFIG_ESP_SYSTEM_PANIC_GDBSTUB=n
CONFIG_ESP_DEBUG_OCDAWARE=y
CONFIG_ESP_DEBUG_STUBS_ENABLE=n
CONFIG_ESP_INT_WDT=n
-CONFIG_ESP_TASK_WDT_INIT=n
+CONFIG_ESP_TASK_WDT_EN=y
+CONFIG_ESP_TASK_WDT_INIT=y
CONFIG_ESP_BROWNOUT_DET=y
CONFIG_ESP_BROWNOUT_DET_LVL_SEL_0=y
CONFIG_ESP_BROWNOUT_DET_LVL_SEL_1=n
@@ -404,13 +405,13 @@ CONFIG_HEAP_TRACING=n
#
# Log output
#
-CONFIG_LOG_DEFAULT_LEVEL_NONE=y
+CONFIG_LOG_DEFAULT_LEVEL_NONE=n
CONFIG_LOG_DEFAULT_LEVEL_ERROR=n
-CONFIG_LOG_DEFAULT_LEVEL_WARN=n
+CONFIG_LOG_DEFAULT_LEVEL_WARN=y
CONFIG_LOG_DEFAULT_LEVEL_INFO=n
CONFIG_LOG_DEFAULT_LEVEL_DEBUG=n
CONFIG_LOG_DEFAULT_LEVEL_VERBOSE=n
-CONFIG_LOG_DEFAULT_LEVEL=0
+CONFIG_LOG_DEFAULT_LEVEL=2
CONFIG_LOG_COLORS=n
# And in main.c I added a TX buffer for the uart (IIRC the HW fifo is 128 bytes): --- a/build/devices/esp32/xsProj-esp32/main/main.c
+++ b/build/devices/esp32/xsProj-esp32/main/main.c
@@ -247,10 +256,10 @@ void app_main() {
#ifdef mxDebug
QueueHandle_t uartQueue;
- uart_driver_install(USE_UART, UART_FIFO_LEN * 2, 0, 8, &uartQueue, 0);
+ uart_driver_install(USE_UART, UART_FIFO_LEN * 2, UART_FIFO_LEN * 2, 8, &uartQueue, 0);
xTaskCreate(debug_task, "debug", (768 + XT_STACK_EXTRA) / sizeof(StackType_t), uartQueue, 8, NULL);
#else
- uart_driver_install(USE_UART, UART_FIFO_LEN * 2, 0, 0, NULL, 0);
+ uart_driver_install(USE_UART, UART_FIFO_LEN * 2, UART_FIFO_LEN * 2, 0, NULL, 0);
#endif
xTaskCreate(loop_task, "main", kStack, NULL, 4, NULL); And in xsPlatform.c I added a timeout to taking the mxDebug mutex: --- a/xs/platforms/esp/xsPlatform.c
+++ b/xs/platforms/esp/xsPlatform.c
@@ -74,12 +74,14 @@ static void doRemoteCommand(txMachine *the, uint8_t *cmd, uint32_t cmdLen);
SemaphoreHandle_t gDebugMutex;
#define mxDebugMutexTake() xSemaphoreTake(gDebugMutex, portMAX_DELAY)
+ #define mxDebugMutexTakeTicks(ticks) xSemaphoreTake(gDebugMutex, ticks)
#define mxDebugMutexGive() xSemaphoreGive(gDebugMutex)
#define mxDebugMutexAllocated() (NULL != gDebugMutex)
static int fx_vprintf(const char *str, va_list list);
#else
#define mxDebugMutexTake()
+ #define mxDebugMutexTakeTicks()
#define mxDebugMutexGive()
#define mxDebugMutexAllocated() (true)
#endif
@@ -1291,9 +1295,9 @@ int fx_vprintf(const char *str, va_list list)
{
int result;
- mxDebugMutexTake();
- result = vprintf(str, list);
- mxDebugMutexGive();
+ int ok = mxDebugMutexTakeTicks(10);
+ result = vprintf(str, list);
+ if (ok) mxDebugMutexGive();
return result;
} There is an issue with the mxDebug mutex, which I suspect is the fact that fxReceiveLoop holds the mutex while calling modMessagePostToMachine, c_malloc, and modWatchDogReset, all of which can trigger an esp-idf log message, which results in deadlock, which I observed. I worked around this by adding more fifo and a timeout in fx_vprintf but the correct solution would be to rework fxReceiveLoop IMHO. (Moving the modWatchDogReset out of the critical section seems trivial.) |
These changes (mostly?) seem related to the watchdog when debugging is enabled. As a rule, the watchdog shouldn't be enabled in debug builds because it conflicts with debugging. I understand that |
I you want to disable the watchdog in debug mode, then yes, they should both be set to 'n'. I did a quick test and don't see any error/warning messages. Can you remind me how it interferes with debugging other than the fact that |
I'm not confident that we can always guarantee the debugger behavior will be good with the watchdog enabled. I tried to get that right more than once and it resisted.
Since you are in that code now and can easily test, would you do a PR for that? It will be a much bigger project for me. |
Haha, not easy here either, but you did fix a bunch of things for me... Do you agree with having error logging enabled and backtrace on crash? If you'd like different LMK so I prepare the right PR. |
On a debug build? I'm OK with backtrace on crash. Not wild about logging. That can generate a fair amount of noise. Generally that Is useful when tracking down native code issues, not script issues, so leaving logging to instrumentation builds seems appropriate for the majority of developers. |
I'm currently on the latest public branch and encountering an issue where every time the Timer executes, I see the following error messages:
The Timer itself executes correctly despite these errors. adding these to sdkconfig works:
|
moddable/xs/platforms/esp/xsHost.h Lines 303 to 307 in 03fc510
The message "TWDT was never initialized" means that Which host are you using when you see this? |
@phoddie I'm using esp32/esp32s3 platform.
and the generated sdkconfig becomes:
We need to add this:
and the generated sdkconfig becomes:
|
@linfan68 – thank you for the hint. I wouldn't have guessed to add a comment to solve this. The ESP32-S2 and ESP-C3 |
The support in the esp32 xsHost has gotten a bit wonky due to changes in the CONFIG_ESP_TASK constants. Specifically, CONFIG_ESP_TASK_WDT is now CONFIG_ESP_TASK_WDT_EN and there's CONFIG_ESP_TASK_WDT_INIT.
In addition, the task WDT can be enabled at run-time and the timeout can be set/changed at run-time too.
This PR fixes the various #ifdef around the config constants. I wrote a little watchdog module that exercises all this. It can be found at https://github.com/tve/modtve/tree/main/modules/drivers/wdt with an example at https://github.com/tve/modtve/tree/main/modules/examples/wdt. If this module looks good I'm happy to add it to this PR or to modify it to fit. I was hoping it could fit into an ecma-419 Host scheme of sorts...
In any case, there's an issue with the WDT timeout since it can be changed dynamically. The calculation of maxDelayMS in
xsHost.c
doesn't really work necessarily. To add insult to injury, one can't query the current timeout from esp-idf. The best I can come up with is to at least call some function in xsHost from the WDT module if the user changes the timeout.I tested on esp32 and esp32-c3
Feedback welcome.