lib.path.subpath.{isValid,normalise}: init #205190
Force-pushed from 3d97c55 to c2584ae.
Need to fix a crucial term.
In #200718 we originally wanted to use |
Force-pushed from 8f1e221 to 6e85f9f.
That's the cleanest, most self-explaining code I have seen in months. Only lots of nitpicks about phrasing in the docs.
I added a few suggestions that add periods to the argument lines, but maybe this is a bad idea. Specifically, it makes lines that end in `./.` and the like harder to read, and I don't think it makes sense to reword those artificially. Feel free to amend them and remove the periods if you don't agree, or otherwise make the change consistent.
Most importantly I think we should be careful not to overdo the subpath thing. We can keep it generic where we can be sure it will stay generic. That would allow for minimal changes in case we ever need to extend the library towards actual relative paths in a safe, backwards-compatible way.
Force-pushed from eac0c14 to a568997.
> we don't even know if we need a relative normalisation yet. Personally I don't think we'll ever need it.
That's an argument in favor of not making that distinction in the name and going for `path.relative.normalise`, as we originally intended. We can still say in `lib.path.append` that it takes a subpath as a second argument, as defined in the design document. If we should ever need it, relaxing relative paths and splitting out a subpath variant will not break existing code, and can be supported by warnings to consumers.
@roberth please review. Other than the last open naming question, this looks fine to merge to me.
lib/path.nix
Outdated
# Splits and normalises a subpath string into its components.
# Errors for ".." components and doesn't include "." components
splitSubpath = path: errorPrefix:
Suggested change:
- # Splits and normalises a subpath string into its components.
- # Errors for ".." components and doesn't include "." components
- splitSubpath = path: errorPrefix:
+ # Split and normalise a relative path into its components, skipping "." components.
+ split = path:
There is nothing specific to subpaths left here. Since this function is internal we may as well use a short name.
We'd run into infinite recursion with `split` though, because we use `inherit (builtins) split` and use it in this function.
`splitRelPath` and `joinRelPath`?
Oh neat. Also similarly: You can select text from a GitHub comment, then click "Quote Reply" and it will insert only the selected text as a quote.
This is an internal name anyways, we don't need to care about the name that much imo :)
> because we use inherit (builtins) split and use it in this function
You can rename the import to `splitString`! Okay, this is getting into a level of detail that suggests the implementation is good enough.
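The name clash and the proposed rename can be sketched roughly as follows. This is only an illustration, not the code from `lib/path.nix`: the regular expression and the example input are assumptions, and `splitString` is simply the name suggested in this thread. Because `let` bindings in Nix are recursive, a local function named `split` would refer to itself rather than to the builtin, so the builtin is bound under another name first.

```nix
let
  # Bind the builtin under a different local name, so that a local
  # function called `split` does not end up calling itself.
  splitString = builtins.split;

  # Illustrative only: split on runs of "/" and keep the string parts.
  # builtins.split interleaves the non-matching strings with lists of
  # capture groups, so the lists are filtered out again.
  split = path:
    builtins.filter builtins.isString (splitString "/+" path);
in
split "foo//bar/baz"
```

Evaluating this expression with `nix-instantiate --eval --strict` should yield the three components `foo`, `bar` and `baz`.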
Yeah "Quote Reply" is also nice.
> internal
Is it a scandal if no-one knows?
I guess what I was going to say is that "relative" isn't quite right here.
> because we use `inherit (builtins) split`
We don't have to.
I guess the risk is someone might want to expose the function and then suddenly it's not internal anymore. Scandalous!
Should I add you, @roberth and @fricklerhandwerk, as code owners of `lib/path.{nix,md}`? I'm also already a code owner of `lib`. This way we'll get notifications for future updates to this library.
lib/path.nix
Outdated
if ! isString value then
  throw "${errorPrefix}:\n Not a string"
else if value == "" then
  throw "${errorPrefix}:\n The string is empty"
else if substring 0 1 value == "/" then
  throw "${errorPrefix}:\n The string is an absolute path because it starts with `/`"
else if match "(.*/)?\\.\\.(/.*)?" value != null then
  # We don't support ".." components, see ./path.md
  throw "${errorPrefix}:\n The string contains a `..` component, which is not allowed in subpaths"
Suggested change:
- if ! isString value then
-   throw "${errorPrefix}:\n Not a string"
- else if value == "" then
-   throw "${errorPrefix}:\n The string is empty"
- else if substring 0 1 value == "/" then
-   throw "${errorPrefix}:\n The string is an absolute path because it starts with `/`"
- else if match "(.*/)?\\.\\.(/.*)?" value != null then
-   # We don't support ".." components, see ./path.md
-   throw "${errorPrefix}:\n The string contains a `..` component, which is not allowed in subpaths"
+ if ! isString value then throw "${errorPrefix}:
+     Not a string"
+ else if value == "" then throw "${errorPrefix}:
+     The string is empty"
+ else if substring 0 1 value == "/" then throw "${errorPrefix}:
+     The string is an absolute path because it starts with `/`"
+ else if match "(.*/)?\\.\\.(/.*)?" value != null then throw "${errorPrefix}:
+     The string contains a `..` component, which is not allowed in subpaths"
If it's an ugly hack already, why not make it a creepy hack that leverages the code's own indentation! 😛
Haha damn, I'd rather not because it's really not clear how much it indents like this, but this is interesting.
We could do this though:
if ! isString value then throw ''
  ${errorPrefix}:
      Not a string''
else if value == "" then throw ''
  ${errorPrefix}:
      The string is empty''
else if substring 0 1 value == "/" then throw ''
  ${errorPrefix}:
      The string is an absolute path because it starts with `/`''
# We don't support ".." components, see ./path.md
else if match "(.*/)?\\.\\.(/.*)?" value != null then throw ''
  ${errorPrefix}:
      The string contains a `..` component, which is not allowed in subpaths''
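For readers less familiar with indented strings, here is a small stand-alone sketch of why this variant makes the indentation explicit. Nix strips the smallest common indentation from all lines of a `''…''` string, so only the relative indentation of the second line ends up in the message. The `errorPrefix` value below is a hypothetical stand-in modelled on the error message quoted in this thread, and the four-space relative indent is an assumption.

```nix
let
  # Hypothetical prefix; in the PR it is passed in by the caller.
  errorPrefix = "lib.path.subpath.normalise: Argument \"/foo\" is not a valid subpath string";
in
# The first line is indented by 2 spaces and the second by 6. Nix removes
# the common 2 spaces, leaving the second line indented by 4 in the result.
''
  ${errorPrefix}:
      The string is an absolute path because it starts with `/`''
```

The resulting string has no trailing newline, because the closing `''` directly follows the last character of the message.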
At least this still expresses intent, rather than a potential misunderstanding that kinda worked out.
It would be good to add the value `value` to the error message when appropriate. Should make troubleshooting a lot quicker in many cases.
The value is included in the prefix:
nix-repl> path.subpath.normalise "/foo/bar"
error: lib.path.subpath.normalise: Argument "/foo/bar" is not a valid subpath string:
The string is an absolute path because it starts with `/`
(I would like to see a streaming `toPretty` version that can also handle failures and early exit at some point though; see #98761 for a previous attempt of mine.)
You can't catch aborts, and currently not even C++ really catches a stack overflow... and then there's non-termination of course.
Not entirely unsolvable for the purpose of error messages, in theory, but not something today's Nix will let us solve.
Perhaps a nice trick for attrsets specifically is to make use of the fact that at least the keys exist. If the attrNames are few enough, we could print them; otherwise show the "first" and "last" attribute name and perhaps make use of `_type` if there is one, if you're in a daring mood.
Sounds like we could have a `safePrint` function for such heuristics. It's a general problem.
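A rough sketch of what such a heuristic might look like for attribute sets is given below. The helper `safePrintAttrs` and its `limit` parameter are hypothetical and not part of Nixpkgs; the sketch only relies on `builtins.attrNames` returning the names without evaluating the values, and it leaves out the `_type` idea.

```nix
let
  # Hypothetical helper, not part of Nixpkgs: summarise an attribute set
  # using only its attribute names, which never forces the values.
  safePrintAttrs = limit: attrs:
    let
      names = builtins.attrNames attrs;
      count = builtins.length names;
      # Show all names if there are few; otherwise only the first and last.
      shown =
        if count <= limit then names
        else [ (builtins.head names) "..." (builtins.elemAt names (count - 1)) ];
    in
    "{ ${builtins.concatStringsSep ", " shown} }";
in
safePrintAttrs 3 {
  a = throw "never evaluated";
  b = 1;
  c = 2;
  d = 3;
}
```

With four attributes and a limit of three, only the first and last names are shown, and the throwing value of `a` is never forced.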
Every time I try to do common-sense things in the Nix language it feels like writing a standard library from scratch. Even when writing a standard library from scratch.
I also think it should be good enough to account for the common errors at this point. The Nixpkgs Architecture Team can still open the topic of the standard library once the package reorg is done, and address the issue wholesale, with a targeted effort. Otherwise other targeted efforts like this one will constantly stall on side issues.
> Every time I try to do common-sense things in the Nix language it feels like writing a standard library from scratch. Even when writing a standard library from scratch.
That's an indication that such code shouldn't be written in Nix at all. It's not a general purpose language and attempts to use it as such will always be painful.
In fact it's not all that painful, it just feels a bit like pioneering because the tools are not readily available.
The reason why "just don't use the Nix language for that purpose" doesn't always work is that the alternative is to use intermediate build steps. And you want to avoid those at almost all costs, because they are orders of magnitude slower than some clumsy string wrangling.
And no, I don't think building the tools into the language wholesale is the right answer to that issue. Because that's where you surely won't ever get rid of unfortunate design decisions.
I'll repeat my earlier objection: Nix is not a suitable language for complex string manipulation, since it's way too slow for that. It's a DSL for describing compositions of packages/configurations. If we need operations for e.g. normalising paths, they should be added to the language (for instance, the
Also, I would suggest doing this as a flake first. If it turns out to be sufficiently useful, it can always be added to the standard library. But once it's in the standard library, we're stuck with it forever.
This is about first making it work. As @infinisil already said, if it turns out to be too slow, we can still make it fast by including it in the Nix language.
There is enough interest and capacity to add this to the standard library and maintain it for a reasonable amount of time. It's much less useful and practically impossible to discover outside of Nixpkgs. Also, the goal of this PR is to clear the ground for source combinators, which also have multiple interested parties waiting for them to land.
That's not completely true. We can have breaking changes with sufficient grace periods. Nixpkgs is easy to pin for consumers (as opposed to Nix), and can be refactored internally. The Nixpkgs Architecture Team has discussed strategies for handling compatibility multiple times now. Also, of all things, this design should rather be in the standard library as opposed to some other quite random-sounding functions which are already in there.
Impressive test. I might have done something a bit more humble like in the source combinator tests, but this is better.
Force-pushed from 82a4aa8 to 1304d92.
Rebased on master after merge conflicts. Also reorganized the commit history.
Adds initial work towards a `lib.path` library
Originally proposed in #200718, but has since gone through some revisions
Co-Authored-By: Valentin Gagarin <valentin.gagarin@tweag.io>
Co-Authored-By: Robert Hensing <robert@roberthensing.nl>
The first path library function
Force-pushed from 1304d92 to 0667ef5.
I think merging this is a mistake, and I don't think my objections have been addressed.
@edolstra Most of these have been addressed before:
Agreed, it's not obvious, but it is needed for things like source combinators, which need to handle subpaths. We plan to add
Was already addressed twice, in #200718 (comment) and #205190 (comment)
This was partly addressed here, but is also addressed in the merged design document.
Subpaths aren't part of the language, it's just a nixpkgs concept.
The tests fail with (at least) one particular random seed:
(I could reproduce that on current master and on this merge commit introducing the test.)
Description of changes
Creates a new `lib.path` library component, originally proposed in #200718. This commit adds two main parts of it:
- `lib.path.subpath.isValid`: a function to check whether a value is a valid subpath according to the design document.
- `lib.path.subpath.normalise`: a function implementing a safe normalisation of subpaths according to the design document.
In the future, further library functions will be built upon these two functions; they are the workhorses of this path library. The next functions from #200718 that can be implemented are `subpath.join` and `append`, see #210426 for progress.
This work is sponsored by Antithesis ✨
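As a hedged usage sketch of the two functions described above: the expression below assumes that `<nixpkgs>` resolves to a Nixpkgs revision that already contains this PR, and the inputs are made up for illustration. The expected results in the comments follow the rules discussed in this conversation (absolute paths and `..` components are rejected, `.` components and repeated slashes are dropped) and the documented `./`-prefixed output format, as I understand them.

```nix
let
  # Assumption: <nixpkgs> points at a revision that includes lib.path.
  lib = import <nixpkgs/lib>;
  inherit (lib.path) subpath;
in
{
  # A plain relative path is a valid subpath.
  valid = subpath.isValid "foo/bar";             # expected: true
  # Absolute paths are rejected.
  absolute = subpath.isValid "/foo";             # expected: false
  # ".." components are not supported.
  parent = subpath.isValid "../foo";             # expected: false
  # Repeated slashes, "." components and trailing slashes are normalised away.
  normalised = subpath.normalise "foo//./bar/";  # expected: "./foo/bar"
}
```

Calling `subpath.normalise` on an invalid subpath such as `"/foo/bar"` throws an error like the one quoted earlier in the review, rather than returning a value.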
Things done