nixos/qemu-vm: Allow building a Nix store image instead of relying on 9p + more #140792
Conversation
Force-pushed from fe6016e to ce971b7
I wonder if something like fscache could be a solution. Potentially it gives the best of both worlds: not building an entire image, yet avoiding 9p for repeated accesses. Could you compare performance with …
@roberth Tried setting …
I have no idea but could https://virtio-fs.gitlab.io/ be an alternative?
@mohe2015 Yeah, I was considering it, but figured it wouldn't give quite the same performance boost, plus it needs a userspace daemon running on the host, so this seemed like the safer bet. It could certainly be interesting to try, though - if the benchmarks are correct, it should give a decent boost over 9p :)
I'm curious about the downsides of this. Will this increase disk usage or network traffic on hydra or the binary cache? Do we need to worry about that?
That currently requires cooperation from the host OS, as virtiofs requires a daemon running as root (and a socket available to QEMU during execution).
Yeah, that was already mentioned above. As far as I can tell, this solution has the disadvantage of more disk usage, but the performance is just so much better.
@knedlsepp AFAIK, this should only increase disk usage on the builder. The disk image doesn't end up in the test's closure and I don't see why any of it should end up in the binary cache. Maybe someone with more insight into hydra can chime in, though :)
Force-pushed from 1bfc013 to 67a6de6
Anything more to do here? Any objections to merging it?
Found a potential simplification. Otherwise lgtm!
Force-pushed from 67a6de6 to 5d7977e
Okay, now it's done!
```nix
shared = { source = ''"''${SHARED_DIR:-$TMPDIR/xchg}"''; target = "/tmp/shared"; };
nix-store = mkIf (!cfg.useNixStoreImage) {
  source = builtins.storeDir;
  target = "/nix/store";
```
Suggested change:

```diff
-target = "/nix/store";
+target = "/nix/store"; # should be storeDir, but is not supported in scripts yet
```
Well, technically, it should be set to whatever `storeDir` was set to in the nix derivation used in the VM, which currently isn't exposed, but could be added to its `passthru` attributes.
If it were possible, that would be the solution, but you can't choose `storeDir` from inside the Nix language.
Oh yeah, you're right. I guess the reasonable thing to do then is just to un-hardcode all the values.
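A minimal sketch of that `passthru` idea, assuming an overlay on the `nix` package (the `storeDir` attribute and its fallback are illustrative, not an existing API):

```nix
# Sketch: let the nix package advertise the store directory it was
# configured with, so modules could read it instead of hardcoding
# "/nix/store". The attribute name is hypothetical.
final: prev: {
  nix = prev.nix.overrideAttrs (old: {
    passthru = (old.passthru or { }) // {
      # assumption: reuse a storeDir attribute if the derivation sets one,
      # falling back to the conventional default
      storeDir = old.storeDir or "/nix/store";
    };
  });
}
```

A consumer such as `qemu-vm.nix` could then refer to the package's `storeDir` attribute rather than a literal path.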
…n useNixStoreImage = true

The involved test was nixosTests.nextcloud.basic21. It has a testScript that is strict in nodes.nextcloud.config.system.build.vm, in assertions about imagemagick being in the system closure. The recursion was introduced in 329a446 from NixOS#140792
While #144014 breaks the infinite recursion when …

The first has no performance impact. The second does.
@roberth It seems like hydra doesn't want to build the disk images due to their size. Raising the …
This may be a hard sell.
Not 100% sure, but I believe it caches everything it builds. You generally don't want to throw away your tools.
A shame if it's true, since that would mean all tests using this will always fail :/
True, but I would have guessed that all tools we want to keep are primary inputs anyway. It would be interesting to know if there are any attributes that control this, like …
Yep, everything ends up in the cache, it seems. At least I'm able to fetch stuff that shouldn't be referenced by any build's final closure. Unless the outputs are GC'd at some point, they would eat up quite a lot of disk space and be essentially useless.
A potential solution is to override this setting in …
Yes, there's no GC unfortunately. I'm not sure if we should keep all this in the binary cache. As long as there's no way to tell hydra to not upload that, I'm afraid we might need to revert the tests :-/
Well, currently, hydra fails to build the disk images, since they are larger than the maximum output size. It's all a bit ironic, since making this build better on hydra was one of my goals with this - the GitLab …
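For a self-hosted Hydra, a hedged sketch of raising that limit via the NixOS module, assuming `hydra.conf`'s `max_output_size` setting (the value below is illustrative):

```nix
# Sketch: raise Hydra's per-output size limit so large store images
# are not rejected. The 8 GiB figure is made up for illustration.
{
  services.hydra.extraConfig = ''
    max_output_size = ${toString (8 * 1024 * 1024 * 1024)}
  '';
}
```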
As for …
Another possibility is to create an image not for the entire system, but only for the paths that matter for performance. When the filesystem is smaller, it may also be feasible to generate it at the last minute, in the test runner, so that it doesn't have to be stored in the store. If the filesystem generation is quick enough, that'd be worthwhile. This can be done with … instead of …
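A hedged sketch of that partial-image idea using stock nixpkgs helpers (`closureInfo` and `squashfsTools`; the payload package is an arbitrary example):

```nix
# Sketch: pack only a chosen closure into a squashfs instead of imaging
# the whole system. Shown as a derivation; the same commands could run
# in the test driver to avoid storing the image in the Nix store.
{ pkgs ? import <nixpkgs> { } }:
let
  # the paths that matter for performance in a given test (example payload)
  closure = pkgs.closureInfo { rootPaths = [ pkgs.postgresql ]; };
in
pkgs.runCommand "store-overlay.img"
  { nativeBuildInputs = [ pkgs.squashfsTools ]; }
  ''
    mkdir store
    # closureInfo lists the full runtime closure, one store path per line
    xargs -I {} cp -a {} store/ < ${closure}/store-paths
    mksquashfs store "$out" -no-xattrs -comp zstd -b 1M
  ''
```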
So our options for improving this are: …
I've done some exploration with adding an overlay containing a subset of derivations, with squashfs and ext4. Conclusion so far: a non-stored squashfs is a viable replacement for a stored ext4 image in the gitlab test, but not quite as efficient as what could be achieved by patching … Squashfs work in #181537.

Findings: …
Open questions: …
My data and some notes:

- ext4 store image (this PR baseline)
- 9p host store + reversed overlay (ineffective; probably close to baseline before this PR)
- 9p host store + squashfs overlay
- 9p host store + uncompressed squashfs overlay
- 9p host store + erofs overlay (default options)
- erofs (writableStore = false)
- 9p host store + ext4 overlay
- 9p host store + uncompressed squashfs overlay with entire …

Hardware: AMD Ryzen 9 3900 12-Core
Motivation for this change
The goal of this PR is to speed up some IO-heavy tests by building a Nix store disk image instead of using 9p to access the host store.
To this end, the following things are done:

- `make-disk-image.nix`:
  - Add an `onlyNixStore` argument, which allows building an image containing only the Nix store
  - Restore the `installBootLoader` argument, which seemed to have been accidentally removed
  - Add a `copyChannel` argument, to control whether a default nix channel is copied to the image
  - Add an `additionalPaths` argument to allow additional paths to be copied to the image's nix store
- `qemu-vm.nix`:
  - Add a `useNixStoreImage` option, which controls whether an image is used rather than 9p
  - Rename the `pathsInNixDB` option to the more general `additionalPaths`, since its purpose is different when building an image
- `trivial-builders.nix`:
  - Add `writeStringReferencesToFile`, a builder which extracts a string's references to derivations and paths and writes them to a text file, removing the input string itself from the dependency graph (a minimal sketch follows this list)
- `testing-python.nix`:
  - Make use of `additionalPaths`, since paths used by the test need to be explicitly made available when building an image
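To illustrate the new builder, a minimal sketch (the string content is made up): building it yields a text file listing the store paths the string references, here `hello`'s, while the string itself stays out of the closure.

```nix
# Sketch: the resulting file contains hello's store path, but the
# message text itself is not retained in the dependency graph.
with import <nixpkgs> { };
writeStringReferencesToFile ''
  Try ${hello}/bin/hello for a friendly greeting.
''
```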
These changes are then put to use in the `discourse` and `gitlab` tests. With my laptop (AMD Ryzen 7 PRO 2700U) as the reference system, I see the following test run times:

Discourse

- No change: 25 mins, 49 secs
- Building a Nix store image: 4 mins, 44 secs
- Building a Nix store image and bumping the core count to 4: 3 mins, 6 secs
GitLab

- No change: times out after 28 mins
- Building a Nix store image: 7 mins, 48 secs
- Building a Nix store image and bumping the core count to 4: 7 mins, 17 secs
The times include the time it takes to build the image (~40 secs for `discourse` and ~1 min, 20 secs for `gitlab`).
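For reference, a hedged sketch of how a test could opt in, assuming the usual `nixos/tests` harness (the test name and script are made up; `virtualisation.useNixStoreImage` is the option added here and `virtualisation.cores` the existing core-count knob):

```nix
# Sketch: a NixOS test using the store image instead of 9p (names made up).
import ./make-test-python.nix ({ pkgs, ... }: {
  name = "store-image-example";

  nodes.machine = { ... }: {
    # build an ext4 Nix store image instead of mounting the host store via 9p
    virtualisation.useNixStoreImage = true;
    # bumping the core count gave a further speedup in the numbers above
    virtualisation.cores = 4;
  };

  testScript = ''
    machine.wait_for_unit("multi-user.target")
  '';
})
```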
Things done

- `sandbox = true` set in `nix.conf`? (See Nix manual)
- Tested with `nix-shell -p nixpkgs-review --run "nixpkgs-review wip"`
- Tested execution of all binary files (usually in `./result/bin/`)