-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
commit: fix ostree deployment on 64-bit inode fs #2874
Conversation
This commit addresses a bug that was causing ostree deployment to become corrupted on the large fs, when any package was installed using 'rpm-ostree install'. In such instances, multiple files were assigned the same inode. For example, the '/home' directory and a regular file 'pkg-get' were assigned the same inode (2147484070), making the deployment unusable. A root cause analysis was performed, running the process under gdb, which revealed a lossy conversion from guint64 to guint32, for example 6442451366 converted to 2147484070: (gdb) p name $10 = 0x7fe9224d2d70 "home" (gdb) p inode $73 = 6442451366 (gdb) s device=66311, modifier=0x7fe914791840) at src/libostree/ostree-repo-commit.c:1590 The conversion resulted in entirely independent files potentially receiving the same inode. The issue was discovered on PoC machine equipped with a large NVME (3.4TB), but the bug can be easily reproduced using `cosa run -m 4000 --qemu-size +3TB', followed by installation of any package using `rpm-ostree install`. The resulting deployment will be unusable due to many files being "corrupted" by the aforementioned issue.
Hi @aospan. Thanks for your PR. I'm waiting for a ostreedev member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, ouch. Nice find. This bug has been lurking for a long time...I think we've mostly been saved by how most Linux filesystems keep inodes generally below 32 bits unless there's pressure to go higher.
/ok-to-test |
Side note: Rust doesn't allow silent integer truncations like this. Hmm...we could try building with -Werror=conversion...there's definitely stuff to fix for that in a quick check it looks like it may be tractable in an hour of work. Well, maybe...with this I also see warnings from glib.h like
But maybe we can juggle that with a pragma that only enables the warning for our C code, not headers (particularly not external ones). Well, a project for another time. Thanks for the fix! |
Absolutely, happy to contribute! Agreed, it's certainly essential to have some safeguards in place to defend against such bugs. |
I want to do a bit of a "risk analysis" on this. It's important to emphasize that I believe right now that this bug does not affect "plain ostree" usage, where the ostree client is fetching objects. This issue affects "builds" - that's where we do the device/inode caching. rpm-ostree's local installs here count as a "build". But, if one was doing an ostree based build system and using |
See ostreedev/ostree#2874 "commit: fix ostree deployment on 64-bit inode fs"
See ostreedev/ostree#2874 "commit: fix ostree deployment on 64-bit inode fs"
Thanks so much for finding this. It turns out that I was wrong...in the ostree-container flow today, we do use a devino cache to "flatten" container image layers, and so anything using the ostree-container flow is subject to this. To xref I'm going to be backporting this with high urgency to RHEL; see https://bugzilla.redhat.com/show_bug.cgi?id=2224081 |
I wrote a lot of new code to detect this situation (to prove it's happening): ostreedev/ostree-rs-ext#504 |
This commit addresses a bug that was causing ostree deployment to become corrupted on the large fs, when any package was installed using 'rpm-ostree install'.
In such instances, multiple files were assigned the same inode. For example, the '/home' directory and a regular file 'pkg-get' were assigned the same inode (2147484070), making the deployment unusable.
A root cause analysis was performed, running the process under gdb, which revealed a lossy conversion from guint64 to guint32, for example 6442451366 converted to 2147484070:
(gdb) p name
$10 = 0x7fe9224d2d70 "home"
(gdb) p inode
$73 = 6442451366
(gdb) s
device=66311, modifier=0x7fe914791840) at
src/libostree/ostree-repo-commit.c:1590
The conversion resulted in entirely independent files potentially receiving the same inode.
The issue was discovered on PoC machine equipped with a large NVME (3.4TB), but the bug can be easily reproduced using
cosa run -m 4000 --qemu-size +3TB', followed by installation of any package using
rpm-ostree install`. The resulting deployment will be unusable due to many files being "corrupted" by the aforementioned issue.