Feature Request: Using the RTC memory for telemetry / crash data #1839

Open
s0170071 opened this issue Oct 2, 2018 · 3 comments
Labels
Status: Needs Info (needs more info before action can be taken); Type: Feature Request (add a completely new feature, e.g. controller/plugin)

Comments

@s0170071
Contributor

s0170071 commented Oct 2, 2018

As far as I know, the RTC memory is currently only used for keeping track of the flash cycles per day. But it can do much more, debugging for example.

How-to for using the RTC memory:

https://www.youtube.com/watch?v=r-hEOL007nw
It's 512 bytes arranged in multiples of 4. It survives deep sleep and reset, and you can read/write it as often as you like.
Beware: if you write it you have to write it all at once. You must not update individual bytes.
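
A minimal sketch of the "write it all at once" pattern, assuming a small record that carries its own checksum so stale or uninitialised contents can be rejected after a reset. Everything here is hypothetical: `RtcTelemetry`, `saveTelemetry`, `loadTelemetry` and the `fakeRtc` array (a stand-in for the 512-byte area so the logic can be tried off-device). On real hardware the two `memcpy` calls would be replaced by the core's RTC memory read/write functions.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in for the 512-byte RTC area: 128 blocks of 4 bytes (assumption for
// host-side testing only; not the real hardware interface).
static uint32_t fakeRtc[128];

// Hypothetical record; its size must be a multiple of 4 bytes.
struct RtcTelemetry {
  uint32_t previousUptime;  // seconds
  uint32_t freeHeap;        // bytes
  uint32_t checksum;        // covers the two fields above
};
static_assert(sizeof(RtcTelemetry) % 4 == 0, "record must fill whole 4-byte blocks");

// Simple checksum; the constant makes an all-zero record invalid.
uint32_t telemetryChecksum(const RtcTelemetry& t) {
  return 0xA5A5A5A5u ^ t.previousUptime ^ (t.freeHeap * 31u);
}

// Write the whole record in one go, never single bytes.
bool saveTelemetry(RtcTelemetry t, uint32_t blockOffset) {
  if (blockOffset + sizeof(t) / 4 > 128) return false;
  t.checksum = telemetryChecksum(t);
  std::memcpy(&fakeRtc[blockOffset], &t, sizeof(t));
  return true;
}

// Read the whole record back and accept it only if the checksum matches.
bool loadTelemetry(RtcTelemetry& out, uint32_t blockOffset) {
  if (blockOffset + sizeof(RtcTelemetry) / 4 > 128) return false;
  RtcTelemetry t;
  std::memcpy(&t, &fakeRtc[blockOffset], sizeof(t));
  if (t.checksum != telemetryChecksum(t)) return false;
  out = t;
  return true;
}
```

On boot, a failed `loadTelemetry` would simply mean "no previous data" (for example after power loss), and the record would be initialised fresh.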

As soon as I find the time, I will play with it to implement the new %previousUptime% system variable. It will hold the uptime before the latest reset/crash. A lot of other variables could also be stored here to survive a crash.

My suggestion for the RTC memory would be :

  1. Uptime1
  2. Uptime2
  3. Uptime3
  4. Uptime4
  5. UptimePointer
  6. freeHeap
  7. freeStack
  8-18. CallStack (maybe only when debug output is enabled; requires an exception decoder compatible output)
  19. WifiEvent
  20. @TD-er indicated some other WiFi data would be interesting

...

If an email service is configured, that could allow for a small real-time crash-report.

It can also hold measuring values so that the ESP waking up from deep sleep does not necessarily need to transmit values after each wake up.

If all four (previous) uptimes are <10s, that could indicate a boot loop and trigger a start with a default configuration.
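
A rough sketch of how the four uptime slots and the pointer could drive such a boot-loop check. The names `UptimeHistory`, `initHistory`, `recordUptime` and `looksLikeBootLoop` are made up for illustration, and the 10 s threshold is taken from the sentence above.

```cpp
#include <cstdint>

// Hypothetical layout for Uptime1..Uptime4 and UptimePointer.
struct UptimeHistory {
  uint32_t uptime[4];      // uptime in seconds before each of the last four resets
  uint32_t uptimePointer;  // index of the slot to overwrite next
};

// After power loss the slots hold no real data, so fill them with a large
// sentinel; otherwise empty slots would look like very short uptimes.
void initHistory(UptimeHistory& h) {
  for (uint32_t& u : h.uptime) u = UINT32_MAX;
  h.uptimePointer = 0;
}

// Store the uptime of the run that just ended, overwriting the oldest slot.
void recordUptime(UptimeHistory& h, uint32_t seconds) {
  h.uptime[h.uptimePointer % 4] = seconds;
  h.uptimePointer = (h.uptimePointer + 1) % 4;
}

// True only if all four stored uptimes are below the threshold.
bool looksLikeBootLoop(const UptimeHistory& h, uint32_t thresholdSeconds = 10) {
  for (uint32_t u : h.uptime) {
    if (u >= thresholdSeconds) return false;
  }
  return true;
}
```

Since the uptime before a crash is not known in advance, the running firmware would have to refresh the current slot periodically; how often is a trade-off left open here.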

You see, lots of possibilities.

As a first step I would suggest to collect your thoughts on the matter and maybe assemble a conclusive list of data that should survive a reset / deep sleep, and what to do with it on reboot / wake-up.

@TD-er
Member

TD-er commented Oct 2, 2018

It is currently also used to keep the last values of plugins.

The ones you mention are all rather large (16 - 32 bit).
Also you may want to have a properly aligned struct to lose as little storage as possible.

An example of bad design:

  • uint32_t
  • bool
  • byte
  • uint16_t
  • bool
  • uint32_t

This may take up to 6 x 32 bits of storage.
To get an idea, please have a look at this part I recently added to the SettingsStruct:

  // VariousBits1 defaults to 0, keep in mind when adding bit lookups.
  bool appendUnitToHostname() {  return !getBitFromUL(VariousBits1, 1); }
  void appendUnitToHostname(bool value) { setBitToUL(VariousBits1, 1, !value); }

See also #1597
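
To make the alignment point measurable, here is a small sketch that lays out the fields in the order listed above and compares it with a variant where the two bools are folded into one flags byte. The struct names and the `getBit` / `setBit` helpers are hypothetical stand-ins (not the actual `getBitFromUL` / `setBitToUL` code), and the exact padding depends on compiler and ABI, so `sizeof` is the thing to check rather than a guess.

```cpp
#include <cstdint>

// Fields in the order of the "bad design" list; the compiler may insert
// padding so each member starts at a multiple of its own alignment.
struct ListedOrder {
  uint32_t a;
  bool     b;
  uint8_t  c;
  uint16_t d;
  bool     e;
  uint32_t f;
};

// Same information: largest members first, both bools stored as bits.
struct WithFlags {
  uint32_t a;
  uint32_t f;
  uint16_t d;
  uint8_t  c;
  uint8_t  flags;  // bit 0 = b, bit 1 = e
};

// Hypothetical bit helpers in the spirit of the snippet above.
inline bool getBit(uint32_t value, uint8_t pos) {
  return ((value >> pos) & 1u) != 0;
}

inline void setBit(uint32_t& value, uint8_t pos, bool on) {
  const uint32_t mask = uint32_t(1) << pos;
  if (on) {
    value |= mask;
  } else {
    value &= ~mask;
  }
}
```

Comparing `sizeof(ListedOrder)` with `sizeof(WithFlags)` on the target shows how much RTC space the reordering actually saves.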

@TD-er TD-er added the Type: Feature Request and Status: Needs Info labels Oct 2, 2018
@TD-er
Member

TD-er commented Oct 2, 2018

For current implementation, see functions in Misc.ino:

  • saveToRTC
  • initRTC
  • readFromRTC
  • saveUserVarToRTC
  • readUserVarFromRTC

And RTCStruct in ESPEasy-globals.h

It is currently used to keep track of factoryResetCounter, flashCounter and flashDayCounter (to protect flash).
It also holds bootFailedCount and bootCounter, which are used to detect boot loops and start disabling plugins/controllers/notifications one by one to be able to boot again.

I don't know why that struct is limited to 40 bytes, but it looks like about 15 bytes are currently used (maybe more due to bad alignment).
It starts at RTC_BASE_STRUCT (= 64 * 4).
The user vars are stored at RTC_BASE_USERVAR (= 74 * 4), and that is defined as:

float UserVar[VARS_PER_TASK * TASKS_MAX];

=> 4 * 12 * 4 bytes + 4 for checksum = 196 bytes (49 blocks of 4 bytes)
So at "address" (74 + 49) * 4 there should be some room left for new data.

On the other hand, this is some kind of cache, which will be lost and re-initialized after power loss. So we're not really bound to the current layout definitions, as long as we're not using parts already used by the core libraries.
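
The offsets above can be double-checked at compile time. In this sketch the two base offsets come from the comment above; `VARS_PER_TASK = 4` and `TASKS_MAX = 12` are read off the "4*12" arithmetic, and `RTC_END_BLOCK = 192` is an assumption (64 system blocks followed by 128 user blocks of 4 bytes). Note that the gap between the two base offsets is exactly 10 blocks, which would explain the 40-byte limit of the struct.

```cpp
#include <cstdint>

// Offsets quoted above, counted in 4-byte blocks.
constexpr uint32_t RTC_BASE_STRUCT  = 64;
constexpr uint32_t RTC_BASE_USERVAR = 74;

// Inferred from the "4*12" in the size calculation above (assumption).
constexpr uint32_t VARS_PER_TASK = 4;
constexpr uint32_t TASKS_MAX     = 12;

// Assumed end of RTC memory: 64 system blocks + 128 user blocks.
constexpr uint32_t RTC_END_BLOCK = 192;

constexpr uint32_t FLOAT_BYTES = 4;
static_assert(sizeof(float) == FLOAT_BYTES, "UserVar is expected to hold 4-byte floats");

// Space reserved for the RTC struct: the gap between the two base offsets.
constexpr uint32_t structBytes = (RTC_BASE_USERVAR - RTC_BASE_STRUCT) * 4;

// UserVar array plus a 4-byte checksum.
constexpr uint32_t userVarBytes  = VARS_PER_TASK * TASKS_MAX * FLOAT_BYTES + 4;
constexpr uint32_t userVarBlocks = userVarBytes / 4;

// First block after the user vars, and what is left up to the assumed end.
constexpr uint32_t firstFreeBlock = RTC_BASE_USERVAR + userVarBlocks;
constexpr uint32_t freeBytes      = (RTC_END_BLOCK - firstFreeBlock) * 4;

static_assert(structBytes == 40, "struct area is 10 blocks");
static_assert(userVarBytes == 196, "48 floats + checksum");
static_assert(firstFreeBlock == 123, "matches 74 + 49");
```

Under the stated assumption about the end of RTC memory, `freeBytes` works out to 276 bytes available for new telemetry data.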

@s0170071
Contributor Author

s0170071 commented Oct 8, 2018

Some more info to consider (so it does not get lost): There is a custom_crash_callback used by core_esp8266_postmortem.c. Somebody uses this to save the stack and email it on reboot. You could also toggle a GPIO if you like. There is also a library.
