Capture stack trace and reset after startup? #1152

Closed
penfold42 opened this issue Dec 5, 2015 · 30 comments

@penfold42

penfold42 commented Dec 5, 2015

Is there any way to capture the reset reason and stack trace after a restart in my sketch?

I want to send this over the network rather than capture it on the serial interface.

Thanks !


@Links2004
Collaborator

for the reset reason:

ESP.getResetInfo();
//or
ESP.getResetInfoPtr();

Arduino/cores/esp8266/Esp.h

Lines 128 to 129 in 342c4ae

String getResetInfo();
struct rst_info * getResetInfoPtr();

String EspClass::getResetInfo(void) {
    if(resetInfo.reason != 0) {
        char buff[200];
        sprintf(&buff[0], "Fatal exception:%d flag:%d (%s) epc1:0x%08x epc2:0x%08x epc3:0x%08x excvaddr:0x%08x depc:0x%08x", resetInfo.exccause, resetInfo.reason, (resetInfo.reason == 0 ? "DEFAULT" : resetInfo.reason == 1 ? "WDT" : resetInfo.reason == 2 ? "EXCEPTION" : resetInfo.reason == 3 ? "SOFT_WDT" : resetInfo.reason == 4 ? "SOFT_RESTART" : resetInfo.reason == 5 ? "DEEP_SLEEP_AWAKE" : resetInfo.reason == 6 ? "EXT_SYS_RST" : "???"), resetInfo.epc1, resetInfo.epc2, resetInfo.epc3, resetInfo.excvaddr, resetInfo.depc);
        return String(buff);
    }
    return String("flag: 0");
}

struct rst_info * EspClass::getResetInfoPtr(void) {
    return &resetInfo;
}

@igrr
Member

igrr commented Dec 5, 2015

You can also register a crash callback (see this commit)
Within it, you can save the stack somewhere in flash, and load it on next restart. You have to be careful with what you do in that callback though.

Edit: when defining custom_crash_callback within the sketch, be sure to mark it extern "C".

@penfold42
Author

Thanks guys,

What does "careful" mean?
I'm guessing it means calling as little as possible, because you don't know what is still working?

What happens if the crash handler crashes ?

@igrr
Member

igrr commented Dec 6, 2015

Pretty much, yes. Don't use dynamic memory allocation, don't use blocking functions (delay, Serial, network), don't spend too much time inside the interrupt handler (because hardware watchdog timer is still ticking).
If you cause another exception inside your handler, you can still rely on hardware WDT to reset the ESP in a few seconds.

@TheAustrian

Are there any plans to incorporate this saving to flash of the stack as a ready-made function or at least an example or some more pointers on how to implement such a thing? It seems a bit over my head since I didn't dabble with memory before...

Background: I would like to be able to send a stack dump over the net after a reset, since I have an ESP that basically gets a hardware WDT reset every 10-20 hours and which isn't easily accessible.

@djoele

djoele commented May 19, 2016

I got it working. Below is the code that I used. I'm not the best coder, so any comments are welcome. At least it works; I still have to make it better.

What it is doing:

  1. Get the complete failure message at a crash, including the stack trace
  2. Write this message to EEPROM
  3. At startup, read the message from EEPROM

Then you have it available and can do anything with it.

//This function gathers the stack itself from starter to ender
//buf2 must be declared above this point as a char array (see below)
void getStack(uint32_t starter, uint32_t ender){
  //one formatted line is 48 characters plus the terminating NUL, so 46 bytes was too small
  char stackline[64];

  for (uint32_t pos = starter; pos < ender; pos += 0x10) {
      uint32_t* values = (uint32_t*)(pos);
      //rough indicator: stack frames usually have SP saved as the second word
      bool looksLikeStackFrame = (values[2] == pos + 0x10);
      snprintf(stackline, sizeof(stackline), "%08x:  %08x %08x %08x %08x %c", pos, values[0], values[1], values[2], values[3], (looksLikeStackFrame)?'<':' ');
      //stop before buf2 overflows; a deep stack does not fit in 2000 bytes
      size_t used = strlen(buf2);
      if (used + strlen(stackline) + 1 > sizeof(buf2)) break;
      strcpy(buf2 + used, stackline);
  }
}

//g_cont is defined by the core (2.x series); refer to it here instead of
//defining a new local one. Newer cores may expose the cont stack differently.
extern cont_t g_cont;

extern "C" void custom_crash_callback(struct rst_info * rst_info, uint32_t stack, uint32_t stack_end ){
  uint32_t cont_stack_start = (uint32_t) &(g_cont.stack);
  uint32_t cont_stack_end = (uint32_t) g_cont.stack_end;
  uint32_t offset = 0;

  if (rst_info->reason == REASON_SOFT_WDT_RST) {
      offset = 0x1b0;
  }
  else if (rst_info->reason == REASON_EXCEPTION_RST) {
      offset = 0x1a0;
  }
  else if (rst_info->reason == REASON_WDT_RST) {
      offset = 0x10;
  }
  //the trailing space matches the "ctx: cont sp: ..." output shown further down
  if (stack > cont_stack_start && stack < cont_stack_end) {
      snprintf(buf2 + strlen(buf2), sizeof(buf2) - strlen(buf2), "%s", "ctx: cont ");
  }
  else {
      snprintf(buf2 + strlen(buf2), sizeof(buf2) - strlen(buf2), "%s", "ctx: sys ");
  }
  snprintf(buf2 + strlen(buf2), sizeof(buf2) - strlen(buf2), "sp: %08x end: %08x offset: %04x\n", stack, stack_end, offset);
  getStack(stack, stack_end);

  eeprom_erase_all();
  eeprom_write_string(0, buf2);
  EEPROM.commit();
}

Then at file scope and in setup():

....
#include <cont.h>

//file scope, declared before the crash callback
char buf2[2000];

//in setup():
EEPROM.begin(EEPROM_MAX_ADDR);
SPIFFS.begin();
stack = loadStack();

This is the loadStack function:

String loadStack(){
  char *rinfo;
  String reset;
  reset = ESP.getResetInfo();
  rinfo = &reset[0];
  char rr[2500];

  eeprom_read_string(0, buf, EEPROM_MAX_ADDR);
  String stack = urlencode(buf);
  const char find[4] = "ctx";
  const char find2[10] = "Exception";
  const char * stackkie = stack.c_str();
  char *ret;
  ret = strstr(stackkie, find2);
  if (ret==NULL){
    ret = strstr(stackkie, find);
  }

  strcpy(rr, rinfo);
  if (ret != NULL){
    strcat(rr, (const char *)ret);
  }
  String bla = urlencode(rr);
  eeprom_erase_all();
  EEPROM.commit();
  return bla;
}

@djoele

djoele commented May 25, 2016

For those interested: with the above code I got results. See below for a SOFT_WDT, an EXCEPTION and a WDT result. When these are decoded with the ESP Exception Decoder they give the expected results.

I implemented 3 test functions that run on the ESP Webserver and cause these 3 cases.

Fatal+exception%3A4+flag%3A3+%28SOFT%5FWDT%29+epc1%3A0x4020e9c9+epc2%3A0x00000000+epc3%3A0x00000000+excvaddr%3A0x00000000+depc%3A0x00000000ctx%253A%2Bsys%2Bsp%253A%2B3fff3d80%2Bend%253A%2B3fff3ec0%2Boffset%253A%2B01b0%250A%2B3fff3d80%253A%2B%2B3ffe8c70%2B00000007%2B3fff164c%2B4020e9d8%2B%2B3fff3d90%253A%2B%2B00000002%2B00000001%2B3ffe8ca0%2B4020709a%2B%2B3fff3da0%253A%2B%2B00000000%2B00000000%2B00000000%2B4010068c%2B%2B3fff3db0%253A%2B%2B00000000%2B00000000%2B3fff622c%2B4020cb9e%2B%2B3fff3dc0%253A%2B%2B3fff622c%2B3fff0db0%2B3fff622c%2B4020cbda%2B%2B3fff3dd0%253A%2B%2B00000000%2B00000000%2B00000000%2B4020f3f0%2B%2B3fff3de0%253A%2B%2B3fff622c%2B3fff0db0%2B3fff0d70%2B4020cc69%2B%2B3fff3df0%253A%2B%2B3fff648c%2B0000000f%2B00000007%2B40209728%2B%2B3fff3e00%253A%2B%2B00000000%2B00000004%2B00000004%2B00000001%2B%2B3fff3e10%253A%2B%2B00000002%2B00000004%2B0000000e%2B3fff2ea0%2B%2B3fff3e20%253A%2B%2B00000000%2B00000000%2B3fff0d70%2B3fff2e8c%2B%2B3fff3e30%253A%2B%2B00000001%2B3fff0d94%2B3fff0d70%2B4020ce77%2B%2B3fff3e40%253A%2B%2B3ffe9168%2B00000000%2B000003e8%2B4020a964%2B%2B3fff3e50%253A%2B%2B00000000%2B3fff6534%2B000003e8%2B4020a9e2%2B%2B3fff3e60%253A%2B%2B3fffdad0%2B00000000%2B3fff2e85%2B4020874f%2B%2B3fff3e70%253A%2B%2B00000000%2B00000000%2B00000000%2B00000000%2B%2B3fff3e80%253A%2B%2B00000000%2B00000000%2B402073e4%2B40208484%2B%2B3fff3e90%253A%2B%2B00000000%2B00000000%2B00000001%2B3fff2e8c%2B%2B3fff3ea0%253A%2B%2B3fffdad0%2B00000000%2B3fff2e85%2B4020fdb0%2B%2B3fff3eb0%253A%2B%2Bfeefeffe%2Bfeefeffe%2B3fff2ea0%2B40100718%2B%2B%2B%253Cnull%253E

Fatal+exception%3A28+flag%3A2+%28EXCEPTION%29+epc1%3A0x402071b2+epc2%3A0x00000000+epc3%3A0x00000000+excvaddr%3A0x00000000+depc%3A0x00000000ctx%253A%2Bsys%2Bsp%253A%2B3fff3d90%2Bend%253A%2B3fff3ec0%2Boffset%253A%2B01a0%250A%2B3fff3d90%253A%2B%2B00000002%2B00000001%2B00000000%2B402071b0%2B%2B3fff3da0%253A%2B%2B32317830%2B36353433%2B00000000%2B4010068c%2B%2B3fff3db0%253A%2B%2B00000000%2B00000000%2B3fff606c%2B4020cb9e%2B%2B3fff3dc0%253A%2B%2B3fff606c%2B3fff0db0%2B3fff606c%2B4020cbda%2B%2B3fff3dd0%253A%2B%2B00000000%2B00000000%2B00000000%2B4020f3f0%2B%2B3fff3de0%253A%2B%2B3fff606c%2B3fff0db0%2B3fff0d70%2B4020cc69%2B%2B3fff3df0%253A%2B%2B3fff6624%2B0000000f%2B00000006%2B40209728%2B%2B3fff3e00%253A%2B%2B00000000%2B00000004%2B00000004%2B00000001%2B%2B3fff3e10%253A%2B%2B00000002%2B00000004%2B0000000e%2B3fff2ea0%2B%2B3fff3e20%253A%2B%2B00000000%2B00000000%2B3fff0d70%2B3fff2e8c%2B%2B3fff3e30%253A%2B%2B00000001%2B3fff0d94%2B3fff0d70%2B4020ce77%2B%2B3fff3e40%253A%2B%2B3ffe9168%2B00000000%2B000003e8%2B4020a964%2B%2B3fff3e50%253A%2B%2B00000000%2B3fff65e4%2B000003e8%2B4020a9e2%2B%2B3fff3e60%253A%2B%2B3fffdad0%2B00000000%2B3fff2e85%2B4020874f%2B%2B3fff3e70%253A%2B%2B00000000%2B00000000%2B00000000%2B00000000%2B%2B3fff3e80%253A%2B%2B00000000%2B00000000%2B402073e4%2B40208484%2B%2B3fff3e90%253A%2B%2B00000000%2B00000000%2B00000001%2B3fff2e8c%2B%2B3fff3ea0%253A%2B%2B3fffdad0%2B00000000%2B3fff2e85%2B4020fdb0%2B%2B3fff3eb0%253A%2B%2Bfeefeffe%2Bfeefeffe%2B3fff2ea0%2B40100718%2B%2B%2B%253Cnull%253E

Fatal+exception%3A4+flag%3A1+%28WDT%29+epc1%3A0x402097c7+epc2%3A0x00000000+epc3%3A0x00000000+excvaddr%3A0x00000000+depc%3A0x00000000
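
The results above are URL-encoded, and the stack part is encoded twice (note the %253A, which is an encoded %3A), because loadStack() calls urlencode() on the stack and then again on the combined string. To read them or paste them into the decoder, they can be decoded on the receiving side. A minimal sketch in plain C++; urlDecode is a hypothetical helper written for this post, not something from the sketch or the core:

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Decode form-style URL encoding: "+" becomes a space and "%XX" becomes
// the byte with hex value XX. Malformed escapes are copied unchanged.
std::string urlDecode(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (size_t i = 0; i < in.size(); ++i) {
    const char c = in[i];
    if (c == '+') {
      out += ' ';
    } else if (c == '%' && i + 2 < in.size() &&
               std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
               std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
      const char hex[3] = { in[i + 1], in[i + 2], '\0' };
      out += static_cast<char>(std::strtol(hex, nullptr, 16));
      i += 2;  // skip the two hex digits just consumed
    } else {
      out += c;
    }
  }
  return out;
}
```

Calling urlDecode(urlDecode(message)) handles both parts: the reset info needs one pass and contains no "%" or "+" afterwards, so the second pass leaves it alone.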

@djoele

djoele commented May 26, 2016

I want to do some more testing with these functions. I have the ability to call a URL like http://ESP IP:80/crash and then execute code that causes a crash.

All examples that I used give the correct stack trace after startup and when decoded also give the correct information.

This is what I have:

server.on("/crash", [](){
  if(!server.authenticate(www_username, www_password))
    return server.requestAuthentication();
  server.send(200, "text/plain", "ESP8266 is going to crash with EXCEPTION..");
  //ap is deliberately left uninitialized so that strtol writes through a wild pointer
  char linea[]="0x123456",**ap;
  int num;
  num=strtol(linea,ap,0);
  printf("%d\n%s",num,*ap);
  int k;
});
server.on("/crash2", [](){
  if(!server.authenticate(www_username, www_password))
    return server.requestAuthentication();
  server.send(200, "text/plain", "ESP8266 is going to crash with SOFT_WDT..");
  //never yields, so the software watchdog fires
  while (true){
    serverClient.println("Crashing...");
  }
});
server.on("/crash3", [](){
  if(!server.authenticate(www_username, www_password))
    return server.requestAuthentication();
  server.send(200, "text/plain", "ESP8266 is going to crash with WDT..");
  //software watchdog off, so the hardware watchdog fires instead
  ESP.wdtDisable();
  while (true){
    serverClient.println("Crashing...");
  }
});
server.on("/crash4", [](){
  if(!server.authenticate(www_username, www_password))
    return server.requestAuthentication();
  server.send(200, "text/plain", "ESP8266 is going to crash with EXCEPTION..");
  crashme();
});
server.on("/crash5", [](){
  if(!server.authenticate(www_username, www_password))
    return server.requestAuthentication();
  server.send(200, "text/plain", "ESP8266 is going to crash with EXCEPTION..");
  crashme2();
});

And then:

void crashme(){
  //write through a null pointer
  int* i = NULL;
  *i = 80;
}

void crashme2(){
  char *svptr = NULL;
  static char *str_input = NULL;
  const char delim[] = " ";
  char input[] = "bla";
  //allocate zero bytes, then copy a string into it to corrupt the heap
  size_t malloc_amount = (sizeof(char) * 0) & (~3);
  str_input = (char *)malloc(malloc_amount);
  memset(str_input, '\0', 0);
  strcpy(str_input, input);
}

@djoele

djoele commented Jun 2, 2016

Hello interested people, I created a minimal Arduino sketch with the following functionality:

  1. At startup it connects to Wifi and registers 5 crash functions that can be called
  2. The reset reason and stack (if present) are printed to Serial at startup
  3. A custom crash callback stores the stack in EEPROM

Simply do the following:

  • Open the sketch
  • Set your ssid and password
  • Fire up the sketch

Then call the following URLs one after another, for example with wget:

  1. wget IP:80/crash
  2. wget IP:80/crash2
  3. wget IP:80/crash3
  4. wget IP:80/crash4
  5. wget IP:80/crash5

The stack printed at startup can be given to ESP Exception Decoder and there you have your stack in human-readable format.

The output I receive is attached. It shows that after startup the reset reason and stack are available.
crash_test_output.txt
minimal.zip

@djoele

djoele commented Jun 2, 2016

@TheAustrian and @penfold42: could you have a try with this sketch I created?

@djoele

djoele commented Jun 4, 2016

Cool! This morning I had a crash and got the crash below in my inbox. So it seems to be working fine. I was able to decode it in ESP Exception Decoder.

Fatal exception:28 flag:2 (EXCEPTION) epc1:0x4022a136 epc2:0x00000000 epc3:0x00000000 excvaddr:0x00000008 depc:0x00000000
ctx: cont sp: 3fff3c70 end: 3fff3f40 offset: 01a0
3fff3c70: 402090cc 3fff6984 00000001 4022a0f0 3fff3c80: 3ffefbe0 00000000 00000000 3fff3d70 3fff3c90: 00000002 0000000f 3fff3cd0 3fff6a43 3fff3ca0: 3fff6a41 07077c0a 00001387 00000030 3fff3cb0: 4023705b 6db5cf70 ebff7ce3 00000000 3fff3cc0: 3fff6984 00080000 00000003 4022affd 3fff3cd0: 00020315 3fff6a02 3fff852c 00000940 3fff3ce0: 00000002 00000015 0000000b 00000004 3fff3cf0: 3fff6984 3fff6a41 00000004 00000000 3fff3d00: 00000000 00080000 3fff6984 4022b61e 3fff3d10: 00000001 3fff675c 00000015 401004d8 3fff3d20: 3ffe9c8c 00000028 3fff6654 00000000 3fff3d30: 3fff3d70 0000000e 00000010 00001387 3fff3d40: 07077c0a 00000000 3fff6984 4022b67a 3fff3d50: 07077c0a 3fff6984 3fff6654 40208cc2 3fff3d60: 3fff675c 3fff4034 3fff6654 402090cc 3fff3d70: 00000000 3fff4034 3fff4034 4020842a 3fff3d80: a2a75754 00000000 3fff3dc0 00000000 3fff3d90: 000001bb 3fff4034 3fff675c 40209409 3fff3da0: 3ffe9740 a2a75754 3ffe9740 a2a75754 3fff3db0: 000001bb 00000000 3fff650c 4020a760 3fff3dc0: 00000000 00000001 3fff3e40 4020e138 3fff3dd0: 3fff650c 00000000 3fff650c 4020abc0 3fff3de0: 00000000 00000000 00000000 00000000 3fff3df0: 00000000 00000000 00000000 00000000 3fff3e00: 3ffe9078 00000012 00000012 4010020c 3fff3e10: 000001bb 3fff3e9c 3fff3e40 00047c50 3fff3e20: 000001bb 3fff3e9c 3fff650c 4020ac66 3fff3e30: 000001bb 3fff3e9c 3fff650c 402070e8 3fff3e40: 00000000 00000000 00000000 00000000 3fff3e50: 00000000 00000000 00000000 00000000 3fff3e60: 00000000 3ffe8dec 3fff3e90 4020e31c 3fff3e70: 3fff3e90 3ffe8dec 0000012c 4020e344 3fff3e80: 3fff3e9c 3ffe8dec 0000012c 40207236 3fff3e90: 3fff63dc 0000001f 00000014 3fff6384 3fff3ea0: 0000004f 00000046 00000000 00000000 3fff3eb0: 00000000 3fff64b4 0000004f 00000046 3fff3ec0: 3fff3e9c 3ffe8d0c 3fff2d18 402096fe 3fff3ed0: 4072c000 00000000 00000008 3fff2f18 3fff3ee0: 3fff2d08 402072d4 3fff2d08 402072fd 3fff3ef0: 3fff2d08 402072d4 3fff2d08 40209984 3fff3f00: 00000000 070779f2 000003e8 402099d4 3fff3f10: 3fffdad0 00000000 3fff1420 40207653 3fff3f20: feefeffe 
00000000 3fff2f10 4020ebac 3fff3f30: feefeffe feefeffe 3fff2f20 40100718
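
The dump arrives as one long run because getStack() does not add a newline after each 16-byte row. If you prefer one row per line when reading it by eye, it can be split on the receiving side. A small sketch in plain C++; splitStackDump is a hypothetical helper, and it assumes every row starts with an 8-digit address followed by a colon, as in the dump above:

```cpp
#include <sstream>
#include <string>

// Put each "xxxxxxxx:" address token at the start of a new line and
// join the remaining words with single spaces.
std::string splitStackDump(const std::string& dump) {
  std::istringstream words(dump);
  std::string word, out;
  while (words >> word) {
    const bool isAddress = word.size() == 9 && word.back() == ':';
    if (!out.empty()) out += isAddress ? '\n' : ' ';
    out += word;
  }
  return out;
}
```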

@TheAustrian

@djoele Awesome, thanks! I'll have a go at it in a few days and report back.

@djoele

djoele commented Jun 15, 2016

@TheAustrian: have you been able to check the sketch?

@djoele

djoele commented Jun 23, 2016

Hello everybody, this question is over half a year old.
I provided a sketch that is working, at least in my situation.

The original poster has not responded, and the one person who did has not reported any results from testing the sketch.

Is anybody interested, or should we mark the question as closed?

@TheAustrian

Hi! I'm still interested and have tested the sketch you provided and it seems it works (thanks for that and sorry I did not immediately respond!), but I haven't had the time to adapt the sketch to my needs i.e. either integrate it into my existing program and transmit the crash info with websocket to the website I use or log it in a database on my server.

Work has been a pain lately (everyone wants everything ready before the holidays, it seems)...

@djoele

djoele commented Jun 25, 2016

Good that it worked. Let us know when you're done putting it in your own sketch.
Perhaps this functionality could be converted into some general functionality, as written by you on Apr 13.

@djoele

djoele commented Jul 11, 2016

@TheAustrian: did you find any time to work on your sketch?

Has anybody else tried it?

@krzychb
Contributor

krzychb commented Jul 11, 2016

Hi @djoele,

I believe your sketch is a very useful tool.

Like @penfold42, I have run into exceptions / ESP restarts that occur infrequently or at a remote location, so it makes no sense, or is not practical, to watch the serial monitor waiting for them to happen.

This thread, and your sketch in particular, inspired me to write a library that turns what you have developed into some general functionality. I am tied up with other tasks now, but I plan to return to it in the next couple of weeks. If anybody is interested, please share your thoughts.

Thank you for development of this functionality and sharing!

Krzysztof

@djoele

djoele commented Jul 11, 2016

@krzychb: I appreciate that you are picking up where I left off. Looking forward to seeing the results of the library you are making. Perhaps the code as I wrote it could be improved as well.

@djoele

djoele commented Aug 4, 2016

@krzychb: have you made any progress on your library?

@krzychb
Contributor

krzychb commented Aug 5, 2016

Hi @djoele,
I jumped on another exciting project and did not have a chance to get to this one.
Thank you for getting me motivated.
I think I can aim for the end of August to release a preliminary version for review.
Krzysztof

@djoele

djoele commented Aug 5, 2016

No problem :) Looking forward to your first release.

@krzychb
Contributor

krzychb commented Aug 15, 2016

Hi @djoele,

I have prepared a preliminary version of a library to capture and save crash information: https://github.com/krzychb/EspSaveCrash.

Examples are included. They let you crash the ESP and then show the captured data on a serial monitor. There is also an example that shows crash information in a web browser. I plan to redo this one and prepare a more compelling version 😄

Krzysztof

@djoele

djoele commented Aug 17, 2016

This is done in a good way, I think.
I tried your examples and in my case they are working.

I created something similar for your web browser functionality. I created a call that is handled when my server tries to get the crash information from the ESP device.

Good job!

@krzychb
Contributor

krzychb commented Aug 17, 2016

Thank you for quickly checking it and good news!
Krzysztof

@devyte
Collaborator

devyte commented Oct 19, 2017

I think having this in its own repo is ok. Thank you, I use it myself.
Closing as resolved.

@popok75

popok75 commented Dec 28, 2018

I'm not sure if I was careful enough in that callback, but I made a little lib to save the esp8266 exception trace in a SPIFFS file, in case someone is interested. It is very useful if you run a server or if you have long stack traces ;) It is based on Krzysztof's work:
crashsaverspiffs.zip

@brainelectronics

@popok75 I've developed your code a little bit further. You and other interested hackers may want to check out my EspSaveCrashSpiffs lib. It's based on @krzychb's EEPROM lib, but uses the filesystem to create a crash log file every time a crash occurs. That way the files stay small and you can create almost as many crashes as you want.

@arihantdaga

Is there any function that will be called right before the ESP shuts down because of power loss?
I have an application where I want to store the cumulative energy consumed by an appliance in the EEPROM just before power goes, so that when the power comes back, I can continue from the last energy value.
custom_crash_callback is called only when it crashes; I am looking for something that will be called every time before the device shuts down. I am sure something is done, because ESP.getResetInfo can tell that the last restart reason was power on/power off. So it's writing this somewhere.

@d-a-v
Collaborator

d-a-v commented Jan 23, 2020

can tell that last restart reason was power on/power off.

Maybe because something is not written before boot (default value on cold boot means power-on, other values are written right before reboots).

Anyway, I personally am not aware of such a function and I'm pretty sure there's none.

I would use a 3v3powersupply->schottky diode->supercapacitor-on-vcc,
and an interrupt on a pin connected to 3v3power-before-diode to trigger the emergency function.
An electronics engineer may comment :)
