compose: normalise underlying BDB files in RPM database
Berkeley DB has several issues that cause unreproducible builds:

1) Upon creation, each file is assigned a unique ID generated from a mixture of the process ID, the current time, and some randomness.
2) Pages used to hold data to be written out to disk are not zeroed prior to use, so arbitrary data from the current process is written out to disk.
3) Unused fields in structures are not zeroed, leading to arbitrary stack data being written out to disk.

Replacing the unique file ID broadly causes no issues, but to ensure "sufficient" uniqueness each ID is replaced with a value generated by feeding the current time (or the current value of SOURCE_DATE_EPOCH, when set) together with a partial file path into SHA-256 and using the first 20 bytes of the digest as the ID. For the other problems, areas known to be unused are located and zeroed out.

To ensure the data itself is unchanged, the `db_dump` utility is run prior to any changes and its output is hashed with SHA-256. After the changes, the `db_verify` utility is run and, assuming it succeeds, `db_dump` is re-run and the hash of its output is compared against the original. Any variation is considered a failure.

This change does not address any potential reproducibility issues in the ndb or sqlite backends.
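The file-ID derivation described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name `deterministic_file_id` and the exact concatenation order of timestamp and path are assumptions; only the inputs (SOURCE_DATE_EPOCH or current time, plus a partial file path), the SHA-256 hash, and the 20-byte truncation come from the commit message.

```python
import hashlib
import os
import time

def deterministic_file_id(relative_path: str) -> bytes:
    """Sketch: derive a 20-byte replacement for Berkeley DB's random file ID."""
    # SOURCE_DATE_EPOCH gives a reproducible timestamp for builds;
    # fall back to the current time when it is not set.
    epoch = os.environ.get("SOURCE_DATE_EPOCH", str(int(time.time())))
    # Mix the timestamp with a partial file path so two databases created
    # in the same build still get distinct IDs.
    digest = hashlib.sha256((epoch + relative_path).encode()).digest()
    # Berkeley DB file IDs (DB_FILE_ID_LEN) are 20 bytes long.
    return digest[:20]
```

With SOURCE_DATE_EPOCH fixed, the same path always yields the same ID, while different paths yield different IDs, which is the "sufficient uniqueness" the commit aims for.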
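The dump/verify/compare safety check can be sketched as a small driver. This is a hypothetical harness, not the commit's code: the function names `dump_hash` and `check_unchanged` and the `dump_cmd` parameter (which exists here only so the hashing helper can be exercised without Berkeley DB installed) are assumptions; the sequence — hash `db_dump` output, apply the normalisation, run `db_verify`, re-hash and compare — follows the commit message.

```python
import hashlib
import subprocess

def dump_hash(db_path, dump_cmd=("db_dump",)):
    # Hash the logical contents of the database as printed by db_dump.
    # dump_cmd is parameterised purely for illustration/testing.
    out = subprocess.run([*dump_cmd, db_path], check=True,
                         capture_output=True).stdout
    return hashlib.sha256(out).hexdigest()

def check_unchanged(db_path, normalise):
    # Hypothetical driver: reject the normalisation pass if it alters
    # the logical contents of the database in any way.
    before = dump_hash(db_path)
    normalise(db_path)  # replace file ID, zero unused regions, etc.
    subprocess.run(["db_verify", db_path], check=True)  # structural check
    if dump_hash(db_path) != before:
        raise RuntimeError("db_dump output changed after normalisation")
```

Because `db_dump` prints the logical key/value contents rather than raw pages, this check tolerates the intended byte-level edits (file ID, zeroed padding) while catching any accidental change to the stored data.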