Normalise various aspects of the RPM database that vary build-to-build #3165
Hi @jeamland. Thanks for your PR. I'm waiting for a coreos member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks for the PR! I've left a few comments in review. I agree that some of those changes may be a bit controversial. Other developers are offline for a few days, so I'll leave this hanging for a bit so that they can chime in. As it stands, I think the shadow normalization logic may be good to land anyway quickly (but see comments). Let's maybe split that one to its own PR? |
Hi @jeamland, thanks for this PR!
Feels like handing of Not completely against landing something in rpm-ostree either, though that's a lot of bit-flipping behind librpm's back to carry. Re. Berkeley DB, support for writing it has been removed upstream, so I wouldn't care too much about that. You mentioned rpm 4.14 -- are you also trying to reproducibly build a system using that rpm version? In that case, it might be easier to carry the bdb tweaking in a separate utility that you run before |
I've split the shadow stuff into its own PR (#3174) as requested. Happy to continue discussion on the other changes here. |
So the With regards to Berkeley DB, I suspect I'm not the only person who's going to be building images based on Oracle Linux 8 or RHEL 8 or the like for a while, which means we have to care about 4.14 for a bit. If we had 4.15 I'd simply be looking at one of the other backends, but unfortunately I don't really have that option unless I want to push hard in certain directions. I'm happy to make this purely an internal thing, but for that to land I'll need the unified core argument changes in #3162 (or something equivalent) to land. Alternatively, I'm also happy for these to land for now and get removed later once they're definitely not hanging around any more. I'll do a pass over these to address the feedback above. Would it be best to force-push those into this PR, or would it be better to drop this PR and start a new one, especially since some of the pieces have been moved to #3174? Thanks for the discussion so far!
Force pushing is fine and expected from my PoV!
Ultimately though, we do want to head towards reproducibility for standard container images that have RPMs inside, for similar reasons. And actually, that intersection becomes even more obvious with coreos/enhancements#7. I'd at least reach out to rpm upstream; I am happy to champion this cause there.
To clarify, even if rpm upstream accepts this, I am OK to carry normalization code here in the short term to unblock shipping it relatively soon. Particularly since you are in a position to e.g. provide custom build options, we can easily add something like Also: since we already write the rpmdb as a separate phase, perhaps we could fork that off into a separate process, and do a crude hack of an It would actually make sense to do that forking unconditionally because it'd increase isolation versus the host daemon. We could even drop root privileges for doing it, and just provide a writable tempdir that we take ownership of.
8f1a2b4
to
12ea901
Compare
So I've retitled/redescribed this to focus on the RPM database aspects, given those are the bits that remain. I've reworked the RPM database backend detection to address @lucab's feedback above. I've also started working with rpm upstream to try and get the necessary APIs in place so that we can remove the RPMTAG_INSTALLTID and RPMTAG_INSTALLTIME mucking around, although that will be dependent on having a version of librpm with those APIs in place (assuming they land): rpm-software-management/rpm#1803. Lastly, even if those land, the BDB normalisation will need to hang around for as long as people are using BDB-based installations. I'm going to dig into what might need to be done to the SQLite and/or NDB backends. Those either may not need anything or may be amenable to accepting patches, but either way I think keeping this stuff is worthwhile.
52a926b
to
3429534
Compare
Oh, and to respond to the LD_PRELOAD idea: I tried that, and I needed to override time(2) and it caused a lot of things to break. 🙃 |
3429534
to
a82ea32
Compare
/ok-to-test |
I can't see exactly what failed in the test that didn't pass. If you can point me at what the issue is I can look into it.
@jeamland the CI worker failed to spawn. I re-triggered the job and everything is green now. |
The logic to normalise rpmdb looks fine to me now, thanks! I haven't yet looked at the BDB logic, as I will need to learn more about internal details of the format first. |
a82ea32
to
3a7fe03
Compare
3a7fe03
to
58de328
Compare
So there's a basic smoke test in the Rust code now. I can have a go at a more comprehensive test but I'd need a better idea of how to get the tests running locally. Is there a stepwise guide anywhere? |
Uh, wow...really? That's rather bad. Seems almost CVE worthy. |
We all tend to use something like toolbox here but it's basically I'd probably add it to |
If I'm reading the source right (yes, that's how I worked all this out) then what ends up there is data that may have been in other database pages. Basically the code constructs a page cache and reuses no-longer-needed pages from that so I don't think you'll get random process data in there, just random other bits of database. |
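To make the "stale bytes in reused pages" point concrete, here is a toy sketch of the normalisation idea: keep the regions known to hold live data and zero everything else. The page layout and the `used_regions` list are assumptions made up for illustration; this is not Berkeley DB's real on-disk format, and `zero_unused` is a hypothetical helper, not a function from the patch.

```python
def zero_unused(page, used_regions):
    """Zero every byte of `page` that is not covered by a (start, length) region.

    `page` is a bytearray standing in for one database page; `used_regions`
    lists the spans that hold live data (hypothetical layout, see lead-in).
    """
    keep = bytearray(len(page))  # a 1 marks a byte that must be preserved
    for start, length in used_regions:
        if start < 0 or length < 0 or start + length > len(page):
            raise ValueError("region lies outside the page")
        keep[start:start + length] = b"\x01" * length
    for i, flag in enumerate(keep):
        if not flag:
            page[i] = 0  # leftover bytes from an earlier use of the page
    return page


page = bytearray(b"HDRxxxxDATA")
zero_unused(page, [(0, 3), (7, 4)])
```

After the call only the header and data spans survive; the four leftover bytes in between are zero, so two builds that reused different cache pages would produce the same output.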
RPM package headers may contain several values that are either timestamps or derived from timestamps. These introduce variation into the RPM database. This patch looks for the SOURCE_DATE_EPOCH environment variable and, if that is present, rewrites these values to match the value it contains.
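A minimal sketch of the behaviour described above, assuming a header can be modelled as a plain dictionary. The tag names in `TIMESTAMP_TAGS` and both function names are hypothetical stand-ins for illustration; the real change works on librpm headers, not Python dicts.

```python
import os

# Hypothetical tag names for this sketch; real code would use librpm's tags.
TIMESTAMP_TAGS = ("INSTALLTIME", "INSTALLTID")


def source_date_epoch(environ=os.environ):
    """Return SOURCE_DATE_EPOCH as an int, or None if unset or invalid."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        return None
    try:
        value = int(raw)
    except ValueError:
        return None
    return value if value >= 0 else None


def normalise_header(header, environ=os.environ):
    """Return a copy of `header` with timestamp tags set to SOURCE_DATE_EPOCH."""
    fixed = dict(header)
    epoch = source_date_epoch(environ)
    if epoch is None:
        return fixed  # variable not present: leave the values alone
    for tag in TIMESTAMP_TAGS:
        if tag in fixed:
            fixed[tag] = epoch
    return fixed
```

Passing the environment in as a parameter, rather than caching it globally, keeps the function easy to exercise with different values in tests.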
We originally used (lazy) statics to hold the value of SOURCE_DATE_EPOCH if we were using it but this can interfere with unit testing.
Berkeley DB has several issues that cause unreproducible builds:

1) Upon creation each file is assigned a unique ID generated using a mixture of process ID, current time, and some randomness.
2) Pages used to hold data to be written out to disk are not zeroed prior to use. This leads to arbitrary data from the current process being written out to disk.
3) Unused fields in structures are not zeroed, leading to arbitrary stack data being written out to disk.

Replacing the unique file ID causes no issues broadly, but to ensure "sufficient" uniqueness these are replaced with a value generated by feeding the current time or the current value of SOURCE_DATE_EPOCH along with a partial file path into sha256 and using the first 20 bytes as the ID.

For the other problems, areas known to be unused are found and zeroed out.

In order to ensure no change to data, the `db_dump` utility is run prior to any changes and the output is hashed using sha256. After changes the `db_verify` utility is run and, assuming this is successful, `db_dump` is re-run and the hash of the contents is compared. Any variation is considered a failure.

This change does not look at any potential reproducibility issues in the ndb or sqlite backends.
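The file ID replacement and the dump/verify/dump check described in this commit message can be sketched roughly as follows. The way the timestamp and path are encoded before hashing is an assumption (the message does not say), and `replacement_file_id`, `dump_digest`, and `normalise_checked` are hypothetical names. The sketch also assumes `db_dump` and `db_verify` are on `PATH` and accept a database file path as their only argument.

```python
import hashlib
import subprocess
import time

FILE_ID_LEN = 20  # the first 20 bytes of the sha256 digest become the ID


def replacement_file_id(partial_path, epoch=None):
    """Derive a file ID from a timestamp and a partial path.

    `epoch` would be SOURCE_DATE_EPOCH when set; otherwise the current time
    is used. The byte encoding of the inputs is an assumption of this sketch.
    """
    seconds = int(time.time()) if epoch is None else int(epoch)
    h = hashlib.sha256()
    h.update(str(seconds).encode("ascii"))
    h.update(b"\0")
    h.update(partial_path.encode("utf-8"))
    return h.digest()[:FILE_ID_LEN]


def dump_digest(db_path):
    """Return the sha256 hex digest of `db_dump` output for one file."""
    result = subprocess.run(["db_dump", db_path], check=True, capture_output=True)
    return hashlib.sha256(result.stdout).hexdigest()


def normalise_checked(db_path, normalise):
    """Run a caller-supplied `normalise(db_path)` and confirm the data is intact."""
    before = dump_digest(db_path)
    normalise(db_path)
    subprocess.run(["db_verify", db_path], check=True)  # raises on failure
    if dump_digest(db_path) != before:
        raise RuntimeError("database contents changed during normalisation")
```

With a fixed `epoch`, `replacement_file_id` returns the same 20 bytes on every run for a given path, while different files in the same database directory still receive different IDs.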
58de328
to
3bd2bdb
Compare
Ok, this should address all the review feedback. I haven't done any further testing stuff yet. |
This is awesome work!
I have some minor nits, but really, it can come after this merges.
Thanks so much for your contribution!
Nice work here! Thanks! |
@jeamland I just want you to know I've cited this PR so far at least 3-4 times as an exemplary pull request. One of the things I love about working in FOSS is when someone appears out of the blue (from my PoV) with a big, nice improvement. |
There are various parts of the compose process that result in build-to-build variations in the RPM database.
RPMTAG_INSTALLTIME and RPMTAG_INSTALLTID values in the RPM database are based on the time the compose process was run and do not respect SOURCE_DATE_EPOCH.

With regards to the second item, the ndb and sqlite backends have not been examined yet, as all my stuff is based around Oracle Linux 8, which uses rpm 4.14, which doesn't have them.

Lastly, I'm aware that what these patches do is rather... "vigorous"... and may not be to some people's taste. I'm happy to discuss putting these behind flags or options or whatever solution is felt to be appropriate.