-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tell in detail why remote building fails on -v #3927
base: master
Are you sure you want to change the base?
Tell in detail why remote building fails on -v #3927
Conversation
7468a80
to
5063f8c
Compare
Got a screenshot of output from this? |
581e999
to
ddd4556
Compare
I've squashed the commit suggestions. |
@bburdette I didn't make a screenshot but it's litearlly just another of the many lines in the
|
@nh2 could you rebase and we'll merge it? |
@Ericson2314 Can you help me rebase it? You deleted the message I enhanced in https://github.com/NixOS/nix/pull/3927/files#diff-19f999107b609d37cfb22c58e7f0bc1cf76edf1180e238dd6389e03cc279b604R4885 as part of #4114 here: 3bab1c5#diff-5784d1ad7c22601c75b86872534049bea5173377de5d61b7446452beabc16d3fL5324 Thanks! |
@Ericson2314 short ping :) |
🙏 this one is a huge UX improvement, it would be a shame to let it stale. |
I marked this as stale due to inactivity. → More info |
@Ericson2314 Short ping 2 years later ^ :) |
Just to chime in, I lost half an hour on this today. It'd be great to get this merged. |
Ah! I am seeing and all the pings this now :( |
As penance, I will just rebase this myself. |
ddd4556
to
28a52ff
Compare
@nh2 I have rebased, and applied the PR description template. |
As to the content of the new error messages, I hope @fricklerhandwerk can review them. |
Also further improve the final error message, giving a practical example of how to set machine features because that's what most people need to succeed in building large packages remotely.
28a52ff
to
b1d22a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be more explicit with the expectations that were failed as represented by the error message. I'd be in favor of having a very structured presentation similar to what we had in the profile conflict PR, e.g
error condition
expected: ...
actual: ...
hint on what to do
@roberth we may want to write a guideline :)
debug("declined remote machine '%s' for building '%s' because it does not have the needed system type '%s", m.storeUri, drvstr, neededSystem); | ||
} else if (!m.allSupported(requiredFeatures)) { | ||
std::string joinedFeatures = concatStringsSep(" ", requiredFeatures); // includes leading space | ||
debug("declined remote machine '%s' for building '%s' because it does not have all required features: [ %s ]", m.storeUri, drvstr, joinedFeatures); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be explicit on what was expected and what is available.
debug("declined remote machine '%s' for building '%s' because it does not have all required features: [ %s ]", m.storeUri, drvstr, joinedFeatures); | ||
} else if (!m.mandatoryMet(requiredFeatures)) { | ||
std::string joinedFeatures = concatStringsSep(" ", requiredFeatures); // includes leading space | ||
debug("declined remote machine '%s' for building '%s' because the derivation does not have all mandatory features to build on that machine: [ %s ]", m.storeUri, drvstr, joinedFeatures); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this means. I thought a machine may be lacking features to build a drv, how would it work the other way round?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likely the message is backwards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's accurate as is, but "decline" might not be the best word perhaps (for all of these)
"For example, to enable the 'big-parallel' feature, use:" | ||
"\n --builders 'yourbuilderhostname - - - - big-parallel'" | ||
"\nhttps://nixos.org/nix/manual/stable/advanced-topics/distributed-builds" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should instead refer to the features section in man nix.conf
as we did for other hints in error messages recently.
"For example, to enable the 'big-parallel' feature, use:" | ||
"\n --builders 'yourbuilderhostname - - - - big-parallel'" | ||
"\nhttps://nixos.org/nix/manual/stable/advanced-topics/distributed-builds" | ||
"\nYou can enable -v verbosity to see why any machine " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"\nYou can enable -v verbosity to see why any machine " | |
"\nEnable -v verbosity to show why any machine " |
} else if (neededSystem != "builtin" && | ||
std::find(m.systemTypes.begin(), m.systemTypes.end(), neededSystem) | ||
== m.systemTypes.end()) { | ||
debug("declined remote machine '%s' for building '%s' because it does not have the needed system type '%s", m.storeUri, drvstr, neededSystem); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which one does it have then?
Explaining why nothing doesn't satisfy a set of constraints is always a recipe for long and complicated error messages. That's just the nature of the problem: there's a lot of data involved and we can't guess what should match up with what. Realistically the error message would have to be
You could sort the build machines by relevance and truncate after some limit (say, 3) to reduce the length of the message. To make sure that the algorithm for checking and the algorithm for giving reasons are in sync, we'll want to use some abstractions that we don't have yet, or perhaps we should accept that we're computing and remembering too much in the happy flow. |
If I read the code correctly, the debug outputs are per machine, so all we need at this point are slightly more verbose texts that simply also contain the expected properties. And I only propose to make them multiline and possibly indented to make sure that the additional visual noise doesn't preclude noticing the relevant bits. I don't see why this would require additional infrastructure. |
I assumed that you wanted to incorporate the details into the exception message instead, so that extra log verbosity isn't needed. If it remains a |
@roberth I have no opinion on whether it should be in the exception. Setting up the infra to lift it to the exception level sounds like quite some work, so I'd rather merge it as debug messages and allow for following up in a more principled manner when time allows. |
#7865 structured exceptions would help with this. |
m.mandatoryMet(requiredFeatures)) | ||
{ | ||
if (!m.enabled) { | ||
debug("declined remote machine '%s' because it is disabled", storeUri); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debug("declined remote machine '%s' because it is disabled", storeUri); | |
debug("avoiding remote machine '%s' because it is disabled", storeUri); |
How about avoid for all of these?
The problem with decline is that it seems like the machine is initiating the remote build or something.
Motivation
See #1572. Further improves upon #3897 (CC @bburdette).
Also further improve the final error message, giving a practical example of
how to set machine features because that's what most people need to succeed
in building large packages remotely.
Context
Checklist for maintainers
Maintainers: tick if completed or explain if not relevant
tests/**.sh
src/*/tests
tests/nixos/*
Priorities
Add 👍 to pull requests you find important.