
littlefs checksum behaviour #843

Open
ebasle opened this issue Jun 16, 2023 · 15 comments
ebasle commented Jun 16, 2023

Hello,

From previous reading I understand that littlefs CRC is calculated only on the file metadata and not the actual file data.
To ensure file data integrity in my application, I would like to implement my own integrity check at file level using file attributes.

I wanted to check what happens when file data integrity is altered so I made the following test:

  1. Open/Create a file "file1" (FS_O_CREATE | FS_O_RDWR)
  2. Write "hello1"
  3. Close "file1"
  4. Read file size ---> 6
  5. Using a debugger tool, modify (in the flash device) the "file1" data to "helli1" (1 byte altered)
  6. Open "file1" (LFS_O_RDWR)
  7. Tell ---> 0 (to be sure I point to the start of file)
  8. Read 6 bytes ---> 0 byte read
  9. Close "file1"
  10. Read file size ---> 0

note: littlefs 2.5 is integrated and works otherwise fine in my environment.

A few questions/remarks:

  • Step6: Open does not return error ---> OK with that since if I understand littlefs is not aware of integrity loss on file data
  • Step8: I was expecting to be able to read the 6 bytes here ("helli1"). So it seems that littlefs detected an integrity loss on the file data and took the responsibility to discard the file to open a new one?

[Screenshot: memory state before step 5 (initial_state)]

[Screenshot: memory state after step 5 (file_data_altered)]

Is this the expected behavior or am I missing something?

With this behavior I don't see how I can perform my custom integrity check since file data is lost when trying to open the file.

Thank you


geky commented Jun 20, 2023

Hi @ebasle, I can see where this can be confusing, and in a dangerous way.

To be clear, littlefs does not provide any form of error detection at either the data or metadata level.

It does checksum the metadata, but in a way that only provides power-loss guarantees. It doesn't provide protection against general bitrot like metadata checksumming in other filesystems.

On a checksum failure, littlefs assumes a power-loss must have occurred and rolls back that metadata transaction. But since littlefs is built out of many independent metadata logs, this can put littlefs into an unstable state.

When you modify hello1 -> helli1, it breaks the metadata checksum, causing littlefs to assume a power-loss occurred. littlefs then rolls that metadata log back to a time when that file wasn't written, since that's the last time a checksum was valid. That's most likely what's happening (note that hello1 is small enough to be inlined, so it's part of a metadata log and doesn't get its own data block yet).


Unfortunately, the only way to provide reliable error detection with littlefs right now would need to be under the filesystem, so at the block-device layer.

This may not be that difficult, though it is extra work.

One scheme:

  1. Break each block into some number of slices, where each slice is checksummed:

           prog_size       checksum
     .---------+---------. .---+---.
    +---------------------+---------+ -.
    |        data         |  csum   |  |
    +---------------------+---------+  |
    |        data         |  csum   |  |
    +---------------------+---------+  |
    |        data         |  csum   |  +- block_size
    +---------------------+---------+  |
    |              ...              |  |
    +---------------------+---------+  |
    |        data         |  csum   |  |
    +---------------------+---------+ -'
    
  2. Set read_size/prog_size to the slice size minus the checksum, and block_size to the effective block size minus the checksums. This will limit littlefs to only read/write in units of this slice.

  3. On read/write, validate each slice's checksum. If you return LFS_ERR_CORRUPT, littlefs should report the error to the user.

This can be extended to full ECC, but that's another can of worms.

Depending on what you set your prog_size to, you can trade off write granularity for storage overhead. A 15-byte slice + 1-byte CRC, as an arbitrary example, would probably be fairly unnoticeable granularity at ~7% storage overhead with detection of any 4 bit errors (depending on polynomial).
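The slice scheme above can be sketched with a concrete checksum. This is a minimal example in C, not littlefs API: the CRC-8 polynomial (0x07), the slice size, and the function names are all arbitrary illustrative choices.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

// 15 bytes of data + 1-byte CRC = 16-byte slice
#define SLICE_DATA 15

// bitwise CRC-8, polynomial 0x07 (an arbitrary choice)
static uint8_t crc8(const uint8_t *data, size_t len) {
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++) {
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

// append the checksum before programming a slice
static void slice_seal(uint8_t slice[SLICE_DATA+1]) {
    slice[SLICE_DATA] = crc8(slice, SLICE_DATA);
}

// validate the checksum after reading a slice; on failure the block
// device would return LFS_ERR_CORRUPT to littlefs
static bool slice_ok(const uint8_t slice[SLICE_DATA+1]) {
    return crc8(slice, SLICE_DATA) == slice[SLICE_DATA];
}
```

With the 15+1 layout the storage overhead is 1/16 ≈ 6.25%, in line with the ~7% figure above.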


The good news is I do have some features in the works that should improve the state of things:

  1. Global checksum => should prevent invalid rollback, allow external storage for no rollback
  2. Metadata redundancy => should allow error correction at the metadata level

Unfortunately these are only in the design phase and will take some time to implement.


ebasle commented Jun 21, 2023

Hello @geky

Thank you for your answer!
I understand the idea of adding error detection at the block-device level, however I'm not sure how to configure littlefs to achieve this.

In my case:

  • I use an 8 MB NOR flash with 4 KB erasable sectors and 256 B pages
  • I would like to use a 32-byte HMAC instead of a CRC

And this is my current littlefs configuration:

  • read_size = 256
  • prog_size = 256
  • block_size = 4096
  • block_count = 2048
  • cache_size = 256
  • lookahead_size = 32
  • block_cycles = 512

So if I follow your instructions and set up 256 B slices, it would result in:

  • read_size = 224 (256-32)
  • prog_size = 224 (256-32)
  • block_size = 3584 (4096-32*16) ?
  • block_count = 2048 (8388608/4096) ?
  • cache_size = 224 (256-32) ?
  • lookahead_size = 32
  • block_cycles = 512

I am really not sure about the block_size/block_count/cache_size values.
For example, I wonder how littlefs could still manage to erase 4 KB sectors with this config.
Could you point me toward a correct littlefs configuration for this use case? Thank you.


geky commented Jun 21, 2023

That configuration looks correct to me.

From littlefs's perspective, it really is only managing 3.5KiB sectors, since that's the amount of usable space. The trick is that the block device is the one doing the work to map the 3.5KiB sectors -> 4KiB sectors.

This does make implementing the block device a bit trickier. Here's some pseudocode of what a block device might look like, sorry if there's bugs:

PROG_SIZE = 224
HMAC_SIZE = 32

int bd_read(block, off, buffer, size):
    assert(off % PROG_SIZE == 0)
    assert(size % PROG_SIZE == 0)

    # block address is unchanged, but we need to map off -> raw_off
    raw_off = (off / PROG_SIZE) * (PROG_SIZE+HMAC_SIZE)

    # littlefs may read in _multiples_ of read_size, handle this here
    while size > 0:
        # buffer may not have enough space for the hmac, can either allocate
        # another buffer here, or call rawbd_read twice
        uint8_t raw_buffer[PROG_SIZE+HMAC_SIZE];
        rawbd_read(block, raw_off, raw_buffer, PROG_SIZE+HMAC_SIZE)

        # find corrupt blocks here
        if !hmac_validate(raw_buffer):
            return LFS_ERR_CORRUPT

        memcpy(buffer, raw_buffer, PROG_SIZE)
        buffer += PROG_SIZE
        raw_off += PROG_SIZE+HMAC_SIZE
        size -= PROG_SIZE

    return 0

int bd_prog(block, off, buffer, size):
    assert(off % PROG_SIZE == 0)
    assert(size % PROG_SIZE == 0)

    # block address is unchanged, but we need to map off -> raw_off
    raw_off = (off / PROG_SIZE) * (PROG_SIZE+HMAC_SIZE)

    # littlefs may program in _multiples_ of prog_size, handle this here
    while size > 0:
        # buffer doesn't include the hmac, so stage each slice in a
        # separate buffer before programming
        uint8_t raw_buffer[PROG_SIZE+HMAC_SIZE];
        memcpy(raw_buffer, buffer, PROG_SIZE)
        hmac_hmac(raw_buffer)

        rawbd_prog(block, raw_off, raw_buffer, PROG_SIZE+HMAC_SIZE)

        buffer += PROG_SIZE
        raw_off += PROG_SIZE+HMAC_SIZE
        size -= PROG_SIZE

    return 0

int bd_erase(block):
    # block address is unchanged
    return rawbd_erase(block)

It's also worth noting littlefs reads back every piece of data written to see if it was written correctly. So the above indirectly validates writes even though it doesn't appear that way.


ebasle commented Jun 26, 2023

Hello @geky, thanks for the detailed response.

So I implemented the HMAC on the block device layer as you describe.
HMAC computation/verification works fine and I return LFS_ERR_CORRUPT to littlefs when I am not able to verify the HMAC.

However, when I corrupt a file (as explained in my initial post), I see the following behavior when the block device layer returns LFS_ERR_CORRUPT to littlefs:
1 - littlefs tries to return the previous valid version of the file (which might be empty if the file had not been written yet)
2 - littlefs's opencfg API still returns 0 (LFS_ERR_OK) to the application
-> It is the same behavior as I had with the HMAC at the file level

Additionally, it looks like a corrupted sector/slice causes a side effect on other file opens (lfs_file_opencfg).
Let me explain with this test:

  1. open & write file1 (hello1)
  2. open & write file2 (hello2)
  3. open & write file3 (hello3)
  4. Corrupt file2
  5. open & read file1 --> OK
  6. open & read file2 --> OK (but roll back on empty file)
  7. open & read file3 --> KO (LFS_ERR_NOENT)
    When inspecting flash access I can see the open fails because littlefs reads the corrupted sector (even though it's not part of file3).
    The block device layer returns LFS_ERR_CORRUPT (as expected), which in this case causes littlefs to return LFS_ERR_NOENT to the application layer.

In this test, I was expecting:

  • step6: littlefs API returns LFS_ERR_CORRUPT when opening file2
  • step7: No error when opening file3

In the end, I would just like littlefs to report to the application that a file has been corrupted.
However, seeing this behavior makes me think it might be complicated... Any thoughts or leads? Thank you.


geky commented Jun 26, 2023

Ah you are right. Sorry, I made a mistake. I'm not sure what I was thinking.

littlefs can't tell the difference between a bad HMAC and power-loss for the same reason it can't rely on its own checksum to detect bit-errors. Both incomplete writes and bit-errors look "bad" so littlefs assumes a power-loss occurred.

I'm trying to think of a workaround but currently turning up empty. Returning LFS_ERR_IO gets reported directly to the user without any special handling in littlefs, but it's probably not what you want since it would also make power-loss reported as LFS_ERR_IO. Same for reporting errors out-of-band.

I'll let you know if I think of anything, but at the moment it may not be possible for littlefs to handle both power-loss and bit-errors until either metadata redundancy or global checksums are added.


Additionally, it looks like having a corrupted sector/slice cause a side effect on other file open (lfs_opencfg).

In littlefs, files share metadata logs. Most likely all three of these files are in the same metadata log, and when file 2 is corrupted littlefs stops parsing the metadata log before it finds file 3. Over time, metadata logs are eventually compacted, so what gets rolled back may differ. This makes metadata rollbacks particularly nasty.


ebasle commented Jul 4, 2023

Thank you for your time @geky, let me know if you think of anything!

@mikk-leini-krakul

Happened to read this discussion. Do I understand correctly that a single bit error in metadata could cause revert back to old file contents (if it exists)? If so, then it doesn't sound good and I'm wondering if it makes sense to write critical files, such as configuration files, at least twice?


geky commented Sep 19, 2024

@mikk-leini-krakul, it's a bit worse in that a single bit error can corrupt the filesystem's metadata itself. Unless you have some form of error-correction in the block device layer.

There are two features in the works to improve this: metadata redundancy and global checksums, but these are not ready yet.

The best option right now is to correct bit errors in the block device layer using a Reed-Solomon/BCH code or something else. Some block devices even provide error-correction in hardware. I realize this is not a great answer but it is what it is.


geky commented Sep 19, 2024

@ebasle, it's a bit late and I realize you're probably not looking at this problem anymore, but for posterity it's probably worth noting you can turn any checksum/hash/hmac into a limited error correcting code via brute force.

If the hash fails, you can try recalculating the hash with the first bit flipped, then with the second bit flipped, then the third, etc, to see if any single bit flips would result in a valid hash. You could do the same for 2 bit flips, but it would be $O(n^2)$ and grows $O(n^b)$ per $b$ bit flips so it's probably only practical for correcting 1 or maybe 2 bit errors.

And depending on how expensive your hash/hmac is to calculate, this approach may not be tenable... Nesting your hmac inside an RS/BCH/CRC error correcting code would probably be much more efficient at a storage cost.
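The brute-force search described above can be sketched in C as follows; CRC-32 stands in for whatever hash/HMAC is actually used, and the function names are illustrative.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

// standard reflected CRC-32, standing in for the hash/hmac
static uint32_t crc32(const uint8_t *data, size_t len) {
    uint32_t crc = 0xffffffff;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++) {
            crc = (crc & 1) ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        }
    }
    return ~crc;
}

// try every single-bit flip until the checksum matches; O(n) checksum
// recomputations for n bits of data; returns true if data now matches
static bool correct1(uint8_t *data, size_t len, uint32_t expected) {
    if (crc32(data, len) == expected) {
        return true; // nothing to correct
    }
    for (size_t i = 0; i < 8*len; i++) {
        data[i/8] ^= (uint8_t)(1 << (i%8));
        if (crc32(data, len) == expected) {
            return true; // found the single flipped bit
        }
        data[i/8] ^= (uint8_t)(1 << (i%8)); // undo and keep searching
    }
    return false; // more than one bit error, can't correct this way
}
```

Extending this to 2-bit correction means a nested loop over bit pairs, which is where the $O(n^b)$ growth mentioned above starts to bite.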


mikk-leini-krakul commented Sep 23, 2024

@mikk-leini-krakul, it's a bit worse in that a single bit error can corrupt the filesystem's metadata itself. Unless you have some form of error-correction in the block device layer.

There are two features in the works to improve this: metadata redundancy and global checksums, but these are not ready yet.

The best option right now is to correct bit errors in the block device layer using a Reed-Solomon/BCH code or something else. Some block devices even provide error-correction in hardware. I realize this is not a great answer but it is what it is.

I realized it would take quite a lot of time to do a full impact analysis of soft errors on LFS, and that time I don't have. So I did what the LFS design documentation suggested and used software ECC. Unfortunately there are very few FOSS options. I found a single-bit error correcting ECC in the Linux kernel:
https://github.com/torvalds/linux/blob/master/drivers/mtd/nand/ecc-sw-hamming.c
And here's some blog-style explanation about it:
https://docs.kernel.org/driver-api/mtd/nand_ecc.html

On every 256-byte page of the external QSPI flash I gave LFS 252 bytes of payload and used 3 bytes for ECC, leaving 1 byte for a CRC-8 (although a 16- or 32-bit CRC would be much safer). In addition, every write is done with verify (read-compare). I can't say if this is sufficient or not, but it certainly feels better than no ECC/CRC.


jacobschloss commented Sep 25, 2024

I'm interested in doing something similar, but on a small EEPROM or FRAM using CRC-16 with small blocks (150-180 B LFS block, 30 B LFS prog size, 32 B EEPROM pages split into 30 B data and 2 B CRC). I am hoping to catch bus-transfer errors more than flipped bits in flash, so my block device layer would be able to retry the read a few times (e.g. before lfs starts trying to do power-loss recovery).

Previously I was treating erase as a no-op and it has been working fine - #77 (comment)

With block-level CRCs, if I don't scrub the EEPROM and pre-fill all blocks with valid CRCs, lfs_mount fails after format (which succeeds). It seems it is trying to read a block it has not written yet, so I return LFS_ERR_CORRUPT and it fails during lfs_mount.

Is it expected that all blocks need to be made valid for LFS first when using block-level CRCs? I'd like to avoid the extra flash program cycles of an explicit erase if I can. It is also possible my block layer still has bugs, I just hacked it together for a quick test. I'll play with it some more.


geky commented Oct 2, 2024

I realized it needs quite a lot of time to do full impact analysis of soft errors on LFS and that time I don't have. So I did like the LFS design documentation suggested and used software ECC. Unfortunately there are a very few FOSS options. I found single-bit error correcting ECC from Linux kernel:
https://github.com/torvalds/linux/blob/master/drivers/mtd/nand/ecc-sw-hamming.c
And here's some blog-style explanation about it:
https://docs.kernel.org/driver-api/mtd/nand_ecc.html

That is interesting, thanks for sharing. I've gotten kind of stuck in the theory side of things, so don't have the best understanding of what's out there practically.

Using Hamming codes is an interesting choice because I believe they're limited to single-bit error correction? It seems like you could do something better with crc32s, but maybe I'm missing something...


I am hoping to catch bus-transfer errors more than flipped bits in flash so my block device layer would be able to retry the read a few times (eg before lfs starts trying to do power loss recovery).

Quite an interesting use case, should work fine.

Is this expected to need to make all the blocks valid for LFS first, when using block level crcs?

No, you should be able to return LFS_ERR_CORRUPT and littlefs will treat it as not written yet. Though you do need to return LFS_ERR_CORRUPT consistently until the block is written.

With block-level CRCs, If I don't scrub the eeprom and pre-fill all blocks with valid crcs lfs_mount is failing after format (which succeeds).

This is especially curious because littlefs tries to mount the superblock in lfs_format before returning success:

littlefs/lfs.c

Lines 4385 to 4386 in d01280e

// sanity check that fetch works
err = lfs_dir_fetch(lfs, &root, (const lfs_block_t[2]){0, 1});

Is it possible there's some issue on the device's RAM side of things?


jacobschloss commented Oct 14, 2024

Quite an interesting use case, should work fine.

Is it possible there's some issue on the device's RAM side of things?

Looks like it was just a bug in my code: I was not correctly handling certain read-modify-write cases spanning pages, or RMW cycles on blocks that were never erased but also not fully written (I now lazily zero-fill them if there is a partial page write and the page had a bad CRC, and assume the block was unallocated). After fixing that, both my block layer and LFS are happy with or without erasing. I'll play with this some more. Thanks!


BenBE commented Oct 14, 2024

Regarding software ECC, there are a few things you can do. While Hamming codes are the easiest, there are also BCH, Viterbi and Golay codes for bit errors, and Reed-Solomon for burst errors. There are implementations available for all of these (I've written one myself for each). If on the bus side you were worried about missing data blocks (erasures), you could even use fountain codes like RaptorQ (although that one is a hassle to get working without malloc; guess how I know) ...


geky commented Oct 31, 2024

I figured I'd also take a crack at it since these seem genuinely useful for littlefs users:

Not exactly drop-in solutions, because of existing API issues, but they shouldn't be too difficult to adapt to your own use cases.

I'll probably add them to littlefs's README.md as examples at some point. Feedback welcome.
