Take 2 on #[ram] soundness #1677
It will now only zero persistent RAM after initial boot since, I think, that's the only time the RTC RAM is not preserved. IMO, this should be expanded to all the reset reasons that could theoretically happen before the init has finished. It seems to me that this would be anything not caused by a watchdog or software (for S3: brown out, clock glitch, efuse error, and USB UART/JTAG resets). I also updated Additionally, I noticed the
I tend to agree - maybe good to hear others' opinions on that, too
Looks quite good overall - I guess after adding a CHANGELOG.md entry this should be good to go. One thing maybe worth considering would be to have a (doc-hidden) function in esp-hal like
This is looking quite good to me, thanks for taking the time to tackle those issues. Once CHANGELOG.md has been updated I think this should be good to go. I'll give @bjoernQ the final say on this, though.
Great. My last concern is with the edge case of an externally triggered RTC RAM-preserving reset (brown out, power glitch, JTAG reset, etc.) occurring very early in the first boot such that the zero initialization gets skipped, causing undefined behavior. If this is a worry, which kinds of resets should be added to the list to trigger zeroing? I wrote the condition in assembly to match the rest of the RISC-V init, since I'm not sure how it interacts with the soundness issues that prompted the decision to avoid Rust there.
Probably all the reset reasons which are chip-reset or system-reset should always trigger zeroing: (from C6 TRM)
Yes - makes sense. In theory, a carefully written Rust function should be okay, but sure - in assembly it's easier to know what really happens 👍
If that looks good, I'll update the changelog and rebase. The RISC-V code could also use some testing on real chips; I've only been able to test on an S3 and QEMU.
Seems like the description in the TRM fooled me a bit and "System Reset: resets the whole digital system, including LP system." doesn't mean LP/RTC RAM is reset. So only "Chip Reset: resets the whole chip." seems to be a reason to zero the memory. (Sorry for the confusion) Maybe we should just assume "Brown-out system reset" is "Chip Reset" (it can also be System Reset according to the TRM) and only zero out memory for these two reasons |
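The rule proposed here (zero persistent RAM only after a chip reset or a brown-out reset) could be sketched roughly as follows. This is a host-side illustration only, not the assembly in this PR: `ResetReason`, `from_code`, and `must_zero_persistent` are hypothetical names, and only the two reset codes quoted later in this conversation (0x01 and 0x0F) are modeled.

```rust
/// Hypothetical reset reasons; only the two codes quoted in this thread are modeled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResetReason {
    /// Chip reset (code 0x01): the whole chip is reset, so RTC RAM contents are lost.
    ChipPowerOn,
    /// Brown-out reset (code 0x0F): treated like a chip reset, to be cautious.
    BrownOut,
    /// Any other code (watchdog, software, ...): RTC RAM is assumed to be preserved.
    Other(u8),
}

impl ResetReason {
    /// Map a raw reset code to a reason (assumed encoding, see the lead-in).
    pub fn from_code(code: u8) -> Self {
        match code {
            0x01 => ResetReason::ChipPowerOn,
            0x0F => ResetReason::BrownOut,
            other => ResetReason::Other(other),
        }
    }
}

/// Returns true when persistent RAM should be zeroed during startup.
pub fn must_zero_persistent(reason: ResetReason) -> bool {
    matches!(reason, ResetReason::ChipPowerOn | ResetReason::BrownOut)
}
```

Keeping the decision in one small predicate makes it easy to widen or narrow the list of reasons once the discussion settles.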
I interpreted that to mean that a minor brown out could cause a system reset (preserving RTC RAM) and would give a reset code of 0x0F, while a more severe brownout could cause a chip reset with code 0x01. If it could give code 0x0F even after causing a chip reset, then I agree—definitely need to init after that. I had expanded the list to guard against something like this:
I am unsure which resets could occur at step 2, hence the long list. Perhaps anything that could trigger a reset that early would result in Perhaps the time between a reset that could cause the above sequence becoming possible and the RAM init finishing is simply so short that it should not be considered. I personally dislike that solution as a soundness-purist, but I'll follow your lead there.
t.b.h. I guess the above scenario is not completely impossible in a development setup but maybe unlikely in production - not too sure. At least not zeroing on e.g. WDT resets can be useful, I guess. On the other hand, if code is manipulating data stored in RTC RAM and gets interrupted by a WDT reset (or any other data-preserving reset), the data might be invalid then. I start to wonder if there would be a 100% solution other than e.g. using checksums?
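The checksum idea could look roughly like the sketch below: the payload is stored next to a checksum, and a reader accepts the payload only when the two agree. Everything here is hypothetical (the `Guarded` type, the 8-byte payload, and the choice of FNV-1a as the checksum); it is a host-side illustration, not something esp-hal provides.

```rust
/// Hypothetical record that would live in persistent RAM, stored with a checksum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Guarded {
    pub payload: [u8; 8],
    pub checksum: u32,
}

/// 32-bit FNV-1a hash, used here as a simple checksum (an arbitrary choice).
fn fnv1a(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

impl Guarded {
    /// Store a payload together with its checksum.
    pub fn new(payload: [u8; 8]) -> Self {
        Guarded { payload, checksum: fnv1a(&payload) }
    }

    /// Return the payload only if the stored checksum still matches it.
    pub fn read(&self) -> Option<[u8; 8]> {
        if self.checksum == fnv1a(&self.payload) {
            Some(self.payload)
        } else {
            None
        }
    }
}
```

With this particular checksum an all-zero record does not validate, so freshly zeroed memory reads as "no data" rather than as a valid payload. A reset between writing the payload and writing the checksum is detected the same way.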
Oh yeah, interrupted writes are an issue. What about this?
Also, from reading this ESP-IDF source, it looks like some cases of brownout resets are detected by the IDF, so Shall I split the fix for #1650 out into its own PR so that isn't held up by finalizing these specifics?
Yes, we probably should explain these things more
That is basically just
Yes, sounds good. Regarding splitting the PR: ideally, I would love to see all of this in the next release - not sure if splitting out parts of the PR helps or causes more work for you. I'm fine with both, I guess.
I noticed this note on
Combined with this part of the IDF, I think that a chip level brownout will not give
I guess it's close, yeah. But this can be used without any
#[ram(rtc_fast, persistent)]
static BOOT_MODE: AtomicU8 = AtomicU8::new(0);
match BOOT_MODE.load(order) {
    // ...
}
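A fuller version of that pattern might look like the sketch below. It runs on a host, so the static is a plain one here; on the device it would carry the `#[ram(rtc_fast, persistent)]` attribute shown above. The `BootMode` enum, the meaning of the values 0 and 1, and both function names are assumptions made for illustration.

```rust
use core::sync::atomic::{AtomicU8, Ordering};

// On the device this static would be annotated with `#[ram(rtc_fast, persistent)]`;
// it is a plain static here so that the sketch runs on a host.
static BOOT_MODE: AtomicU8 = AtomicU8::new(0);

/// Hypothetical boot modes; the numeric encoding is an assumption.
#[derive(Debug, PartialEq, Eq)]
pub enum BootMode {
    Normal,
    Update,
    Unknown(u8),
}

/// Decode the persisted value into a boot mode.
pub fn current_boot_mode() -> BootMode {
    match BOOT_MODE.load(Ordering::Relaxed) {
        0 => BootMode::Normal,
        1 => BootMode::Update,
        other => BootMode::Unknown(other),
    }
}

/// Record that the next boot should enter update mode.
pub fn request_update_on_next_boot() {
    BOOT_MODE.store(1, Ordering::Relaxed);
}
```

Because zero decodes to `Normal`, a static that was zeroed after a chip reset falls back to the default mode.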
Well, I missed the note in more recent versions of the
Looks good but I'm on vacation this week - not sure if and when I will get to this before next week. So, if anyone would like to review this: feel free
Sorry forgot about that 😅
@MabezDev would you mind taking a look at the changes to the assembly in riscv-rt? I think everything looks okay, just want another set of eyes on that specifically.
Sorry for the delay on my end, this LGTM, thanks for taking the time to write this PR!
Submission Checklist 📝
cargo xtask fmt-packages command to ensure that all changed code is formatted correctly.
CHANGELOG.md in the proper section.
Extra:
Pull Request Details 📖
Description
Closes #1110, closes #1650, closes #1649
New PR as this is a full rewrite. It also fixes #1650 since both #[ram(persistent)] and #[ram(zeroed)] require the same check that the type can take an all-zero bit pattern. I used bytemuck::Zeroable since it seems like the most widely used, rather than implement. It did mean adding bytemuck as a dependency of esp-hal, but I used version 1.0.0 to maximize compatibility with whatever version is probably already in users' dependency graphs.
Error message for usage on non-Zeroable types
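A compile-time check of this kind can be sketched as below. To keep the sketch free of dependencies, `ZeroOk` is a hypothetical stand-in for `bytemuck::Zeroable`, and `assert_zero_ok` is a made-up name for the sort of hidden helper a macro could call; neither is the code in this PR.

```rust
/// Hypothetical stand-in for `bytemuck::Zeroable`, so the sketch has no dependencies.
///
/// # Safety
/// Implementors must be valid values when every byte is zero.
pub unsafe trait ZeroOk {}

unsafe impl ZeroOk for u8 {}
unsafe impl ZeroOk for u32 {}
unsafe impl<T: ZeroOk, const N: usize> ZeroOk for [T; N] {}

/// Hypothetical helper; calling it only compiles when `T: ZeroOk`.
pub const fn assert_zero_ok<T: ZeroOk>() {}

// What an attribute macro might emit next to the user's static (an assumption):
const _: () = assert_zero_ok::<[u32; 4]>();

// A reference has no valid all-zero value, so uncommenting the next line is
// rejected with an unsatisfied trait bound that names the helper function:
// const _: () = assert_zero_ok::<&'static u8>();
```

Since the bound lives on the helper, the compiler error mentions the helper's name, which is the exposure of a hidden function name discussed in this description.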
If exposing the #[doc(hidden)] function name in the error message is undesirable, I can use the method I did in #1649.
Testing
The updated ram example ran on an S3 dev board. I do not have any of the RISC-V ESPs, so someone else will need to test that once it's implemented.
Additional To-dos
esp-riscv-rt