-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ostree archive's mtime modification results in slower python execution #1469
Comments
I think the ideal fix here is to change RPM to canonicalize those timestamps to 0 as well, basically bringing the libostree model there. Alternatively, cPython could learn to stop doing timestamp checks on read-only filesystems in general, or perhaps just |
I suspect in practice a bigger win though would be teaching cPython to generate a larger cache file that could be easily |
Seems like related to https://gitlab.com/freedesktop-sdk/freedesktop-sdk/issues/554 (although not identical root cause) |
I currently looking into a related issue in meta-updater, which creates an OSTree rootfs using OpenEmbedded: OpenEmbedded has reproducible build support. Setting the |
I've been debugging a related RHEL 9 issue and I found the following. On Fedora Silverblue 40 Beta, some of the For example:
(The code above returns 1689897600 on my non-Silverblue Fedora 39 system.) This is good because the However, apparently, only some of the
The See:
This is even broken for the standard library:
This seems like a design flaw of the mtime zeroing thing. When a Python app is running in a restricted SELinux context, any attempt to rewrite a
|
FWIW, ostree proper doesn't do anything with python modules besides changing the modification time like it does for all files. Recompiling must be happening at a different layer in Fedora. |
A few things here: At some point, we will switch to entirely relying on composefs which gives us all the advantages of an ostree-like model in terms of on-disk/page cache sharing, without the mtime canonicalization need to ensure hardlinking works. Today with e.g. podman (without composefs) content-identical files in distinct layers are not shared on disk or memory, partially because of this issue.
Hmm. That is odd. I was going to say I wasn't sure how that would be possible, but actually...on the build system, in some cases we may materialize cached filesystem trees using zeroed mtime, and if the Python code happens to be executed at build time, I think that could result in this. Python code not executed at build wouldn't rewrite its mtimes. That would explain some of the mystery here if true. |
I've been using composefs with ostree in CentOS Automotive Stream Distribution on a couple of devices the last few weeks and it's been working great FWIW |
But that's still running rpm-ostree which canonicalizes all mtimes to zero, right? There's a much bigger step to take in this project where we have a mode where we hard require composefs - and actually note that ostree very intentionally doesn't include mtimes in its metadata, so moving towards a model where they are included again would lead more closely to including e.g. OCI container or other metadata as source of truth. (All of this of course heavily intersects with https://reproducible-builds.org/ - it still doesn't make sense to let mtimes just randomly "float" in images as it will cause spurious changes, even though that's what happens with most image builds (including podman/docker) by default. At least nowdays with RPM one can canonicalize by setting |
This is by the way why freedesktop-sdk and all derivative runtimes use hash-based bytecode, not timestamp-based bytecode. (effectively that SOURC_DATE_EPOCH approach) It works correctly with ostree. |
Notably semantically correct use of SOURCE_DATE_EPOCH is not to set it to 0 but the timestamp when your sources have last been modified. |
We produce the same RPM content also for standard distributions where timestamp-based invalidation of bytecode cache files is faster. |
Solution ideas: A. Fedora Python RPM packages can switch from timestamp-based to hash-based invalidation
B. Python timestamp-based invalidation can special case mtime==0
C. ostree can regenerate all .pyc files when building
D. ostree can patch all .pyc files when building
Footnotes
|
Thanks for the detailed options! Could option C be implemented by running How would we do option D? |
You don't want to run it on all .py files, only those that already have their pyc files. There are .py files in /usr without .pyc files and we want to keep them that way. But otherwise, yes, this is correct.
If we edit the files in place and we never write anything if the third word is already zero, the hardlinks mentioned earlier should be preserved. The listed implementations of C a D assume we only care for Python bytecode for the "main" Python version (e.g. 3.12 in Fedora 40). If it needs to support all Pythons available in Fedora, it might be a bit more complex. |
Could you explain why we should not do that? Won't those Python files not be optimized as well? |
Only some .py files are importable modules. See https://fedoraproject.org/wiki/Changes/No_more_automagic_Python_bytecompilation#Detailed_Description on why we don't blindly bytecompile them all. |
For Fedora Atomic Desktops, we can plug this at the end of https://pagure.io/workstation-ostree-config/blob/main/f/fedora-common-ostree.yaml#_104, which is basically a script with full RW access to the system during the build of the ostree commit (image of the system). |
At that point, are the mtimes of files in /usr already zero? If they are, we can plug in both C and D. If they are not, we would need to pre-zero them before C (or use D). |
Let's try to find the rough list of commands to do option C & D: Option C:
Option D:
Given the above, it feels like option D would be easier to do.
Good point, I don't think they are, I'll give it a try. |
I've confirmed that they don't have a 0 mtimes yet as they are not part of the ostree commit yet at this point. So option D is likely the preferred route. |
all paths matching the following path regexes:
For option C, we would need to find them separately. For option D, we can find them all at once. For option C, we need to find the source. To do that, we replace the
Assume we group the sources to the following buckets:
(Other combinations should not exist in Fedora, but if we are paranoid, we need to check.) Now we call:
The
Absolutely. |
MIN_MAGIC = 3390 # The first magic number supporting PEP 552
ZERO = bytes((0, 0, 0, 0))
def pyc_set_zero_mtime(pyc_path):
with open(pyc_path, "r+b") as f:
w = f.read(4)
if len(w) < 4:
return
magic = (w[0] + (w[1] << 8) + (w[2] << 16) + (w[3] << 24)) & 0xFFFF
if magic < MIN_MAGIC:
invalidation = ZERO
else:
invalidation = f.read(4)
if len(invalidation) < 4:
return
if invalidation == ZERO:
f.write(ZERO) Here's a Python function that implements D for one given pyc path. It silently skips invalid (short) pyc files or pyc files with hash-based invalidation modes. It supports pyc files for Python 3.4+ |
We've been compiling all python modules at ostree build time on Endless for years. But then Debian doesn't include the compiled modules in the packages. Note the hard linking at compile time is mostly pointless here since ostree will already deduplicate the objects when committing. |
@cgwalters it's conceptually not build timestamp. It's .py file modification timestamp. Coincidentally this ends up being build timestamp in RPM but this is coincidence only and not they purpose of the timestamp. |
I've adapted your script to include finding the files:
But it fails on:
Investigating but someone with more Python experience might find the answer faster than me :). |
- regex = re.compile(".*/__pycache__/[^/]+\.cpython-.*(\.opt-1|\.opt-2)?\.pyc$")
+ regex = re.compile(r".*/__pycache__/[^/]+\.cpython-.*(\.opt-1|\.opt-2)?\.pyc$") |
Also, |
if __name__ == "__main__":
regex = re.compile(r".*/__pycache__/[^/]+\.cpython-.*(\.opt-1|\.opt-2)?\.pyc$")
for root, dirs, files in os.walk("/usr"):
for file in files:
path = os.path.join(root, file)
if regex.match(path):
print(path)
# pyc_set_zero_mtime(path) |
Looks better now indeed. Thanks |
PR to add the workaround for Fedora Atomic Desktops: https://pagure.io/workstation-ostree-config/pull-request/505 |
Something also to keep in mind with long-term solutions Fedora might move to checked hash as part of https://fedoraproject.org/wiki/Changes/ReproduciblePackageBuilds. The performance hit is quite negligible price for reproducible bytecode. |
That's unlikely. We use the timestamp invalidation mode and we clamp mtimes to SOURCE_DATE_EPOCH (or whatever is that called). |
OK so there's basically an impedance mismatch between the current container -> ostree flow. In the more medium term what I'm thinking is that we will move away from the existing ostree-container model implemented in https://github.com/ostreedev/ostree-rs-ext/ to one where the canonical storage is composefs backed. There's a lot going on on this topic, but I started on containers/composefs#286 related to this, which would be a potential new composers-native storage for both podman and bootc/ostree. |
xref https://github.com/keszybz/add-determinism/ which is a tool which forcibly canonicalizes python timestamps; it'd probably be good to glom onto that. |
Well, "installs that use it" are … all installs? Fedora/Suse/anybody-else-who-cares-about-reproducible-rpm-builds set I came here from keszybz/add-determinism#26. Generally, it looks like something that can be added to add-det without too much trouble. In particular, add-det has logic to preserve hard links, which came up in the discussion. (Even if the answer is that it doesn't matter for rpm-ostree, I think it's nicer to preserve them, not least because So let's say that we implement |
This comes from https://gitlab.com/fedora/bootc/tracker/-/issues/3 and the current workaround code is in https://pagure.io/workstation-ostree-config/pull-request/505. |
OK, so after looking at the script in 505, it seems that add-det should set embedded mtime to 0 unconditionally, and ignore mtimes. |
Well, Im not sure we want to unconditionally set mtime to 0. Other non-ostree systems likely dont need or want that. I think having it set as an argument which can be triggered by a caller who knows what they want is best. The logic has to remain fairly conditional as we use the conditional logic to verify the version of python before trying to replace any bytes since the location that the mtime is placed is version dependent. |
Right, sorry, my comment was not clear. What I wanted to say is: when invoked by ostree for the special ostree cleanup add-det should set embedded mtime to 0 unconditionally, and ignore mtimes. |
Pyc files store the mtime, size, and optionally hash of the original py file. Mtime is used to validate the cached bytecode: the mtime of the .py file (if present) must be equal to the stored value in the .pyc file. If not, the pyc file is considered stale. On ostree systems, all mtimes are set to 0. The problem is that the .py file is created first, then the byte compilation step creates .pyc with some embedded timestamp, and later ostree discards mtimes on all files. On the end system, the bytecode cache is considered invalid. This new handler does the two very simple things: 1. zero out the mtime in the .pyc file 2. set mtime (in the filesystem) to zero for the .py file This second step is also done by ostree, so strictly speaking, it's not necessary. But it's easy to do it, and this way the bytecode still matches the file on disk, so if we called Python after this operation, it would be able to use the bytecode and wouldn't try to rewrite it. Replaces #26. See ostreedev/ostree#1469, https://gitlab.com/fedora/bootc/tracker/-/issues/3, https://pagure.io/workstation-ostree-config/pull-request/505.
Please see if keszybz/add-determinism#27 would work for you: |
Fixes: ostreedev/ostree#1469 Assuming that add-determinism is installed on the system, apply add-determinism to set the embedded mtime in all .pyc files to zero. Signed-off-by: Luke Yang <luyang@redhat.com> Co-authored-by: Steven Presti <spresti@redhat.com>
Python mtime timestamp logic does not work well with ostree force 0 mtime. Use a small script to post-process all compiled '.pyc' and force the timestamp to 0. See: ostreedev/ostree#1469 Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
A lot of history here, but essentially today the pycache files generated at compose time are invalid because they're of course created before the epoch mtime canonicalization ostree does and pyc files include the mtime in their headers to know if they're out of date. Since `/usr` is not read-only in the layering flow, Python will happily regenerate the cache files as Python code gets executed. This isn't necessarily a big deal and actually would make dnf faster, but it's also awkward and confusing to have only a subset of Python code in the OS cached correctly and sloshing that diff around in container layers when what we really want is having it fixed correctly at base compose time. So let's just avoid it for consistency and to reduce the diff with the legacy variants. We do this by moving the base ones out of the way and then moving them back in place, possibly overwriting regenerated ones. Note that using `ENV PYTHONDONTWRITEBYTECODE=1` is not sufficient here; that doesn't seem to reach RPM scriptlets which use Python. See also: ostreedev/ostree#1469
A lot of history here, but essentially today the pycache files generated at compose time are invalid because they're of course created before the epoch mtime canonicalization ostree does and pyc files include the mtime in their headers to know if they're out of date. Since `/usr` is not read-only in the layering flow, Python will happily regenerate the cache files as Python code gets executed. This isn't necessarily a big deal and actually would make dnf faster, but it's also awkward and confusing to have only a subset of Python code in the OS cached correctly and sloshing that diff around in container layers when what we really want is having it fixed correctly at base compose time. So let's just avoid it for consistency and to reduce the diff with the legacy variants. We do this by moving the base ones out of the way and then moving them back in place, possibly overwriting regenerated ones. Note that using `ENV PYTHONDONTWRITEBYTECODE=1` is not sufficient here; that doesn't seem to reach RPM scriptlets which use Python. See also: ostreedev/ostree#1469
System information
High level problem
At a high level, the problem is that
atomic
and other python programs are slow by default.As we can clearly see, the atomic cli has a performance gain of 3x to 4x simply by letting it rewrite the pyc files. This performance issue will likely be shared by most other python software.
Why does this happen?
ostree sets a file's timestamps to 0:
ostree/src/libostree/ostree-repo-libarchive.c
Lines 986 to 989 in 6bf4b3e
Unfortunately, this results in some files being considered incorrect. For example, a
.pyc
file, per this blog, includes information about the 'mtime' of the.py
file it was sourced from.Since ostree modifies that timestamp, the
.pyc
file created during the rpmbuild becomes invalid and is ignored.We can verify this is the problem by using perf to notice where time is spent, or using strace to notice which files are changed that result in the performance speedup, e.g.:
Additional considerations
Note that python is not unique in using
mtime
to store information about whether a given cache file is up to date. I don't know off-hand of other software that's definitely impacted by this choice., but I wouldn't be surprised if the go toolchain had a similar problem(edit, nope, go'spkg
directory on usr seems to be handled fine)The text was updated successfully, but these errors were encountered: