
No proper check of memory allocation when using DynamicJsonDocument results in sometimes temporary, unexpected behavior at runtime #591

Closed
helgeerbe opened this issue Jan 4, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@helgeerbe
Collaborator

What happened?

I opened the bug first in the upstream project: tbnobody#1615

But we have to fix it in our code extensions as well.

DynamicJsonDocument is used to create datasets for the Web API, for MQTT messaging, and to back up config data.

The problem is that DynamicJsonDocument doesn't throw an exception if memory allocation fails. This results in strange and unexpected (sometimes temporary) behavior of OpenDTU at runtime.

  • Live view API returns null
  • MQTT doesn't send messages
  • Saved config files are corrupted

Since this error is not handled, there is no logging of this resource problem.

To Reproduce Bug

Increase INV_MAX_COUNT significantly and you will never see live data, nor will you get any logging.

A try block catching std::bad_alloc catches errors from serializeJson() when there is enough memory for the DynamicJsonDocument but not enough memory for serialization.
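For reference, a minimal sketch of that pattern, assuming ArduinoJson 6 and serialization into a std::string (whose growth can throw std::bad_alloc); the function and sizes are illustrative, not OpenDTU's actual code:

```cpp
#include <Arduino.h>
#include <ArduinoJson.h>
#include <string>

// Illustrative sketch, not OpenDTU's actual code. The catch fires when the
// document itself was allocated fine but the output string runs out of
// memory during serialization. It does NOT fire when DynamicJsonDocument
// failed to allocate -- that failure is silent (capacity() is simply 0).
std::string buildPayload()
{
    DynamicJsonDocument doc(4096); // allocation may fail without any error
    doc["example"] = 42;

    std::string out;
    try {
        serializeJson(doc, out); // std::string growth can throw std::bad_alloc
    } catch (const std::bad_alloc&) {
        Serial.println("serializeJson: out of memory");
        out.clear();
    }
    return out;
}
```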

Expected Behavior

At least I would expect some logging for live view and MQTT that there is not enough memory.
Config files should never be saved when they are corrupted.

Install Method

Self-Compiled

What git-hash/version of OpenDTU?

7946dfc

Relevant log/trace output

No response

Anything else?

After each construction of a DynamicJsonDocument there should be a check that DynamicJsonDocument::capacity() != 0.
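A minimal sketch of that check, assuming ArduinoJson 6 (function name and size are illustrative):

```cpp
#include <Arduino.h>
#include <ArduinoJson.h>

bool sendLiveData()
{
    DynamicJsonDocument doc(8192); // size is illustrative
    if (doc.capacity() == 0) {
        // Allocation failed silently: log and abort instead of returning a
        // null API response or writing a corrupted config file.
        Serial.println("DynamicJsonDocument: heap allocation failed");
        return false;
    }
    // ... populate and serialize doc ...
    return true;
}
```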

During my tests I used:

  • ESP.getFreeHeap()
  • ESP.getMaxAllocHeap()

While both functions tell me that there is enough memory, DynamicJsonDocument::capacity() returns 0. This may be due to my particular ESP32, as I suspect that its flash memory is no longer working properly. I will have to desolder it and replace it with a new ESP32.
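A diagnostic sketch (ESP32 Arduino core) that logs both heap figures next to the capacity check:

```cpp
#include <Arduino.h>
#include <ArduinoJson.h>

// getFreeHeap() is the total free heap; getMaxAllocHeap() is the largest
// single block that can still be allocated. With a fragmented heap, the
// latter is the relevant figure for one big DynamicJsonDocument buffer.
void logHeapState()
{
    DynamicJsonDocument doc(8192);
    Serial.printf("free heap: %u, max alloc: %u, doc capacity: %u\n",
        (unsigned)ESP.getFreeHeap(), (unsigned)ESP.getMaxAllocHeap(),
        (unsigned)doc.capacity());
}
```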

@helgeerbe helgeerbe added the bug Something isn't working label Jan 4, 2024
@helgeerbe
Collaborator Author

tbnobody#1615 (comment)

Running on a D1 Mini, it seems we reach a critical level of memory allocation if TLS is enabled for MQTT.

As a workaround, don't use TLS, or reduce the maximum number of supported inverters. I tend to just wait for the live view refactoring.

@schlimmchen
Member

schlimmchen commented Jan 5, 2024

@helgeerbe I wrote this in #595:

notice that we will still miss the case where two processes (not TaskScheduler tasks) are both trying to allocate more than 50% of the largest free block.

Has anyone ever thought about this? If a TaskScheduler task (or formerly any of the loop()s) was using a large chunk of memory, other threads (AsyncWebServer and MQTT client) might not be able to allocate their big chunk of memory. However, I only know of large memory allocations in the AsyncWebServer context: Prometheus, websocket JSON chunks. Are there TaskScheduler tasks that use a large block of memory for a short amount of time?
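To make the race concrete: both threads can pass a free-heap check and then collide on the same largest free block. A minimal sketch of one possible mitigation, assuming a shared lock around the check-allocate-use sequence (names are hypothetical, not OpenDTU code):

```cpp
#include <Arduino.h>
#include <ArduinoJson.h>
#include <mutex>

// Hypothetical guard: serialize "check heap, allocate, use, free" so that
// two threads (e.g. a websocket handler and the MQTT client) cannot both
// pass the check and then race for the same largest free block.
static std::mutex g_bigAllocMutex;

bool withJsonDocument(size_t size, void (*fill)(DynamicJsonDocument&))
{
    std::lock_guard<std::mutex> lock(g_bigAllocMutex);
    if (ESP.getMaxAllocHeap() < size) { // approximate pre-check
        Serial.println("not enough contiguous heap for JSON document");
        return false;
    }
    DynamicJsonDocument doc(size);
    if (doc.capacity() == 0) {
        return false; // allocation can still fail despite the pre-check
    }
    fill(doc); // document is freed before the lock is released
    return true;
}
```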

@helgeerbe
Collaborator Author

I have already observed this behavior. That's the reason MQTT sometimes is not working, or you can't open a settings page, or the saved config file is corrupted.

tbnobody will refactor the live view API.

@helgeerbe
Collaborator Author

Btw., with release v24.1.4 we will see a log entry if OpenDTU runs out of memory using DynamicJsonDocument.

@schlimmchen
Member

@helgeerbe Is this issue resolved from your point of view with the upstream changes in ca18d2c? Additionally, we now have 8bfb5c6, which I am going to propose upstream as well.

@helgeerbe
Collaborator Author

There is definitely an improvement. I'm not sure if we handle all memory allocations for JSON documents correctly throughout the whole code, but I think we should close this issue and open specific tickets if we find unfixed code.


This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 24, 2024