lib.path.hasPrefix, lib.path.removePrefix: init #210423
lib/path/default.nix
in /* No rec! Add dependencies on this file at the top. */ {

  /*
    Whether the second path is a descendant of the first path, or equal to it.
- Whether the second path is a descendant of the first path, or equal to it.
+ Whether the second path is a suffix of the first path, or equal to it.
The parent/child relation only applies to trees. Paths are sequences (of nodes in a tree), and have a suffix and prefix.
The function name should change accordingly.
Paths do represent trees though, I don't think anybody would be confused by that. In contrast, calling these functions isPrefixOf or hasPrefix has a good chance of confusing users with the string-based lib.strings.hasPrefix, lib.strings.hasSuffix, etc.
Paths do represent trees though
I'm genuinely confused now. How is a path not a sequence but a tree?
And while common parlance conflates the two, that kind of fuzzy speaking leads to fuzzy thinking that we really want to avoid.
I'm not saying a path is a tree, I'm saying it represents one. E.g. if you import /foo into the store it will also import /foo/bar because it's a descendant.
But I now see that this doesn't really hold when the paths don't exist, a parent-descendant relationship doesn't make sense then. So yes agreed this naming isn't ideal.
Okay, you did not say it is one in terms of equivalence, but it also doesn't represent one. A path leads to a file system object, which can be a tree – or a single file. That is something very different than representation. A string of some shape may represent a path, or a drawing of some sort may represent a tree.
@roberth I don't think anyone here is trying to weasel their way into being right, and instead we try to find the most convincing answer. 🙂
I really don't like the potential confusion with lib.strings.hasPrefix
@infinisil We brought this up multiple times and while I see your concern, I consider it lower priority. The attribute-set-namespace explicitly says one thing is for strings, the other for paths. Relative paths are not regular strings. We should plaster that all over documentation. Yes, one can discard the namespace prefix on import, but you'd have to do it deliberately in each file if you follow best practices.
We could perhaps change [string.]hasPrefix to give an error for path values, but that's a breaking change.
most lib.strings functions don't make a lot of sense to call on paths directly, they all implicitly convert them to store paths.
Fully agree.
Paths are always fictional things for which existence is irrelevant.
Exactly, although with existence we may mean two different things:
- existence of a path in the directory tree, in the graph-theoretic sense
- existence of a path in the sense of being written down in some representation
The former indeed only starts to matter once we want to traverse the path in the directory tree.
What we care about in this library is only the latter, and suddenly the conceptual model (as opposed to some representation) matters, and that is – sorry for being repetitive – a sequence of path components.
Arguably we'll never say "this path does not exist" to mean "this path is not written down", but what we really mean by that is that it does not exist in a particular tree, which is usually implied.
Yes, one can discard the namespace prefix on import, but you'd have to do it deliberately in each file if you follow best practices.
The thing is, people aren't following best practices! In the grand scheme of things, what we do here is highly irregular, no third party takes that much care writing their code.
In particular, the (arguably anti-)pattern of using with lib; is all over the place. If both lib.path.hasPrefix and lib.hasPrefix exist, better make sure to never forget the "path." prefix, otherwise you'll get wrong results like the one I showed. And this is even more likely to happen because traditionally everything has been re-exported from lib.*, meaning people may assume that lib.path.hasPrefix also follows that, and sure enough lib.hasPrefix exists but is completely different.
Because of that I don't think we can "just" have lib.path.hasPrefix be a thing. The only reasonable thing that I can see working is to deprecate support for passing paths to lib.strings.hasPrefix.
Additionally we could also deprecate lib.hasPrefix and require people to use lib.strings.hasPrefix instead, but this would break a lot of code so I'd rather not unless/until we're sure that these lib.* re-exports are truly a bad idea.
I now opened #221204 to implement my suggestion of deprecating hasPrefix for paths.
I now switched this PR to use hasPrefix and hasProperPrefix, which I think is okay if the above PR is merged.
Weak typing was particularly bad for hasPrefix:
nix-repl> lib.hasPrefix ~/. ~/h/nixpkgs
^C^C^C^C^C^C^C^C^C^C^Z
[1]+ Stopped nix repl
$ kill -9 %%
Friends don't let friends add their home directory to the store ;)
lib/path/default.nix
      isDescendantOf /. /foo
      => true
  */
  isDescendantOf = ancestor: descendant:
Since we don't know if the relation holds, we can as well call the arguments a and b.
I do like how the names make the ordering clear. I'll change this to ancestor and potentialDescendant though.
I updated this PR with a rebase, changing the function names to |
This PR depends on #221204 though, see #210423 (comment)
I've read and pondered this for a long time now.
I had some thoughts about making the unequal-roots error scenario handleable, but these functions are the right tool for the job: most of the time, equality of roots is an ambient property that the caller can assume and forget, exactly because it's checked here.
We might want to help with detecting unequal roots ahead of time for functions that need to care about that for whatever reason, but I don't think we have an example of that yet, and it doesn't have to be in scope for this PR.
lib/path/default.nix
      removePrefix /. /foo
      => "./foo"
  */
  removePrefix = withTwoDeconstructedPaths "lib.path.removePrefix" (prefix: deconPrefix: path: deconPath:
I now added a third function to this PR, removePrefix, because it's very related to the other two and can share a lot of the code. Previously in #200718 I called it relativeTo.
With removePrefix introduced in a future commit this law can then be used to derive

    removePrefix p (append p s) == subpath.normalise s
    => (wrap with append)
    append p (removePrefix p (append p s)) == append p (subpath.normalise s)
    => (append is not influenced by subpath normalisation)
    append p (removePrefix p (append p s)) == append p s
    => (substitute q = append p s)
    append p (removePrefix p q) == q

Not included in the docs because it's not that important, just shows that the first statement is more general than the second one (because this derivation doesn't work the other way)
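A concrete instance of the first and last statements of this derivation can be checked directly. This is only a sketch: it assumes that <nixpkgs> points at a nixpkgs revision that already provides lib.path.append, lib.path.removePrefix and lib.path.subpath.normalise, and the values p and s are made up for illustration.

```nix
let
  # Assumption: <nixpkgs> resolves to a revision containing these functions.
  lib = import <nixpkgs/lib>;
  inherit (lib.path) append removePrefix subpath;

  # Illustrative values, not taken from the PR.
  p = /foo;
  s = "./bar//baz/"; # deliberately not normalised
  q = append p s;
in
{
  # First statement: removing p again yields the normalised subpath.
  general = removePrefix p (append p s) == subpath.normalise s;

  # Last statement: appending the removed prefix again yields q.
  derived = append p (removePrefix p q) == q;
}
```

Evaluating this with nix-instantiate --eval --strict should give true for both attributes if the two statements hold for these values.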
I now removed |
lib/path/default.nix
  # Type: String -> (DeconstructedPath -> DeconstructedPath -> a) -> (Path -> (Path -> a))
  #
  # Turn a binary path function on deconstructed paths into a function on
  # non-deconstructed paths. The resulting function takes one path after
  # another, deconstructing it, and caching it before taking the next path as
  # an argument, allowing the deconstruction to be reused between calls with
  # the same first argument. With both paths passed, a check ensures that they
  # have the same filesystem root, before the passed function on deconstructed
  # paths is called. The first argument is the context for error messages.
  withTwoDeconstructedPaths = context: f:
    path1:
      assert assertMsg
        (isPath path1)
        "${context}: First argument is of type ${typeOf path1}, but a path was expected";
      let
        deconPath1 = deconstructPath path1;
      in
        path2:
          assert assertMsg
            (isPath path2)
            "${context}: Second argument is of type ${typeOf path2}, but a path was expected";
          let
            deconPath2 = deconstructPath path2;
          in
            assert assertMsg
              (deconPath1.root == deconPath2.root) ''
                ${context}: Filesystem roots must be the same for both paths, but paths with different roots were given:
                    first argument: "${toString path1}" (root "${toString deconPath1.root}")
                    second argument: "${toString path2}" (root "${toString deconPath2.root}")'';
            f deconPath1 deconPath2;
This is way too clever to be maintainable.
Every day it looks more like what you actually want to implement is error tracing and type checking at the Nix language level.
Apparently what we really need here is
- generic type assertions, this seems reasonable to extract into a helper function
- the root check, but there is no reason to hard-code it to be about pairs of paths
  actually it's specific enough to duplicate it across the two functions, as it was originally.
I agree.
generic type assertions
This would perhaps be nice to have in the language, although we'd have to choose between having a static type system or slow evaluation or more infinite recursions. I won't go into more detail here.
To move this forward, I suggest removing this function, which is a non-abstraction, and use more general checks at the call sites.
I don't necessarily agree that this function is a big problem, it removes the need for duplication between hasPrefix and removePrefix and strips these functions down to what they effectively do, which is kind of what functions are for, compare
{
  hasPrefix = withTwoDeconstructedPaths "lib.path.hasPrefix" (prefix: path:
    take (length prefix.components) path.components == prefix.components
  );
}
to
{
  hasPrefix =
    path1:
      assert assertMsg
        (isPath path1)
        "lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
      let
        path1Deconstructed = deconstructPath path1;
      in
        path2:
          assert assertMsg
            (isPath path2)
            "lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
          let
            path2Deconstructed = deconstructPath path2;
          in
            assert assertMsg
              (path1Deconstructed.root == path2Deconstructed.root) ''
                lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
                    first argument: "${toString path1}" (root "${toString path1Deconstructed.root}")
                    second argument: "${toString path2}" (root "${toString path2Deconstructed.root}")'';
            take (length path1Deconstructed.components) path2Deconstructed.components == path1Deconstructed.components;
}
But I don't think this needs to be discussed at length, it's just an internal detail. I removed the function now, inlining it at both use sites.
lib/path/default.nix
    path1:
      assert assertMsg
        (isPath path1)
        "lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
      let
        path1Deconstructed = deconstructPath path1;
      in
        path2:
          assert assertMsg
            (isPath path2)
            "lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
          let
            path2Deconstructed = deconstructPath path2;
          in
            assert assertMsg
              (path1Deconstructed.root == path2Deconstructed.root) ''
                lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
                    first argument: "${toString path1}" (root "${toString path1Deconstructed.root}")
                    second argument: "${toString path2}" (root "${toString path2Deconstructed.root}")'';
-     path1:
-       assert assertMsg
-         (isPath path1)
-         "lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
-       let
-         path1Deconstructed = deconstructPath path1;
-       in
-         path2:
-           assert assertMsg
-             (isPath path2)
-             "lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
-           let
-             path2Deconstructed = deconstructPath path2;
-           in
-             assert assertMsg
-               (path1Deconstructed.root == path2Deconstructed.root) ''
-                 lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
-                     first argument: "${toString path1}" (root "${toString path1Deconstructed.root}")
-                     second argument: "${toString path2}" (root "${toString path2Deconstructed.root}")'';
+     path1: assert assertMsg (isPath path1) "lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
+     path2: assert assertMsg (isPath path2) "lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
+     let
+       path1Deconstructed = deconstructPath path1;
+       path2Deconstructed = deconstructPath path2;
+     in
+     assert assertMsg (path1Deconstructed.root == path2Deconstructed.root) ''
+       lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
+           first argument: "${toString path1}" with root "${toString path1Deconstructed.root}"
+           second argument: "${toString path2}" with root "${toString path2Deconstructed.root}"'';
Changes:
- a slight refactor that saves an Env and is a bit more readable in my opinion.
- removed parentheses from the error message.
- added a newline before the actual work starts for readability.
Consider these for removePrefix as well.
The benefit of the current implementation is that it memoizes the deconstruction of path1, so you can do
let
  inProject = hasPrefix ./.;
in {
  a = inProject ./a;
  b = inProject ./b;
}
without duplicated deconstructions of ./.
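The sharing described here can be made visible with a small self-contained sketch. It relies only on builtins; deconstructSketch, take and hasPrefixSketch are hypothetical stand-ins written for this illustration, not the library's actual deconstructPath or hasPrefix. Because the let binding sits between the two lambdas, the partially applied function keeps a single thunk for the first path, so its trace message should appear only once however often inProject is called.

```nix
let
  # Hypothetical stand-in for deconstructPath: split the path string into its
  # components and log each time the work is actually performed.
  deconstructSketch = path:
    builtins.trace "deconstructing ${toString path}"
      (builtins.filter (c: builtins.isString c && c != "")
        (builtins.split "/" (toString path)));

  # Minimal replacement for lib.take so the sketch needs no nixpkgs import.
  take = n: list:
    let len = builtins.length list;
    in builtins.genList (builtins.elemAt list) (if n < len then n else len);

  # Same shape as the implementation under review: the first path is
  # deconstructed in a let binding placed before the second lambda.
  hasPrefixSketch = path1:
    let
      components1 = deconstructSketch path1;
    in
    path2:
      let
        components2 = deconstructSketch path2;
      in
      take (builtins.length components1) components2 == components1;

  # Partial application: the components1 thunk is shared by all later calls.
  inProject = hasPrefixSketch /project;
in
{
  a = inProject /project/a;
  b = inProject /project/b;
  c = inProject /elsewhere/c;
}
```

Evaluated with nix-instantiate --eval --strict, the trace output should contain "deconstructing /project" a single time, plus one line for each second argument.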
Alright then I guess. ViewPatterns would be nice.
I fell into this rabbit hole by the way, while unaware of the uncached transformation NixOS/nix#8211
It stops just short of ViewPatterns, unless @= isn't deduplicated or has a non-deduped variant (though I guess at that point we need keywords instead of operator soup).
Subscribed to that issue, though I don't want to get into that rabbit hole myself right now :)
I applied your other suggestion to remove the parentheses from the error message though
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/tweag-nix-dev-update-47/27387/1
      removePrefix /. /foo
      => "./foo"
  */
  removePrefix =
I'd like to just rethink this a little bit. While working on #222981 I realized the frequent need to operate over the components of a subpath. This function as implemented here however returns the subpath as a string, which would then have to be parsed into a list of components again, which is a bit wasteful.
So I'm tilting towards an API like this:
lib.path.components.removePrefix /foo /foo/bar -> [ "bar" ]
lib.path.components.toSubpath [ "bar" ] -> "./bar"
lib.path.components.fromSubpath "./bar" -> [ "bar" ]
lib.path.parts.deconstruct /foo -> { root = /; components = [ "foo" ]; }
Needs more thought, so I don't consider this PR ready to merge anymore
That API does make sense though.
I now opened #237610 only containing |
I opened #238013 for a new |
Description of changes
Adds two new path library functions:
- lib.path.hasPrefix :: Path -> Path -> Bool: Whether the first path is a prefix of the second path, or equal to it.
- lib.path.removePrefix :: Path -> Path -> String: Remove a first prefix path from a second path, the result is a normalised subpath string, see lib.path.subpath.normalise.
This relates to other work in the path library effort.
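As a rough usage sketch of the two functions, assuming <nixpkgs> resolves to a nixpkgs revision that ships them under lib.path (the attribute names on the left are made up for this example):

```nix
let
  # Assumption: <nixpkgs> points at a revision that includes lib.path.hasPrefix
  # and lib.path.removePrefix.
  lib = import <nixpkgs/lib>;
  inherit (lib.path) hasPrefix removePrefix;
in
{
  # The first argument is the candidate prefix, the second the path to test.
  childOfFoo = hasPrefix /foo /foo/bar;   # => true
  fooItself = hasPrefix /foo /foo;        # => true
  wrongOrder = hasPrefix /foo/bar /foo;   # => false

  # The result is a normalised subpath string relative to the prefix.
  relative = removePrefix /foo /foo/bar/baz; # => "./bar/baz"
  fromRoot = removePrefix /. /foo;           # => "./foo"
}
```

Both functions compare whole path components, which is the difference from the string-based lib.strings.hasPrefix discussed in the review.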
This PR depended on #221204, which is now merged.
This work is sponsored by Antithesis ✨