Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.path.hasPrefix, lib.path.removePrefix: init #210423

Closed
wants to merge 3 commits into from

Conversation

infinisil
Copy link
Member

@infinisil infinisil commented Jan 12, 2023

Description of changes

Adds two new path library functions:

  • lib.path.hasPrefix :: Path -> Path -> Bool: Whether the second path is a prefix of the first path, or equal to it:
    lib.path.hasPrefix ./. ./lib
    -> true
  • lib.path.removePrefix :: Path -> Path -> String: Remove a first prefix path from a second path, the result is a normalised subpath string, see lib.path.subpath.normalise:
    lib.path.removePrefix ./. ./lib
    -> "./lib"

This relates to other work in the path library effort.
This PR depended on #221204, which is now merged.

This work is sponsored by Antithesis

Things done
  • Wrote docs
  • Wrote tests

@infinisil infinisil mentioned this pull request Jan 12, 2023
13 tasks
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Jan 12, 2023
lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
in /* No rec! Add dependencies on this file at the top. */ {

/*
Whether the second path is a descendant of the first path, or equal to it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Whether the second path is a descendant of the first path, or equal to it.
Whether the second path is a suffix of the first path, or equal to it.

The parent/child relation only applies to trees. Paths are sequences (of nodes in a tree), and have a suffix and prefix.

The function name should change accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paths do represent trees though, I don't think anybody would be confused by that. In contrast, calling these functions isPrefixOf or hasPrefix has a good chance of confusing users with the string-based lib.strings.hasPrefix, lib.strings.hasSuffix, etc.

Copy link
Contributor

@fricklerhandwerk fricklerhandwerk Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paths do represent trees though

I'm genuinely confused now. How is a path not a sequence but a tree?

And while common parlance conflates the two, that kind of fuzzy speaking leads to fuzzy thinking that we really want to avoid.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not saying a path is a tree, I'm saying it represents one. E.g. if you import /foo into the store is will also import /foo/bar because it's a descendant.

But I now see that this doesn't really hold when the paths don't exist, a parent-descendant relationship doesn't make sense then. So yes agreed this naming isn't ideal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, you did not say it is one in terms of equivalence, but it also doesn't represent one. A path leads to a file system object, which can be a tree – or a single file. That is something very different than representation. A string of some shape may represent a path, or a drawing of some sort may represent a tree.

Copy link
Contributor

@fricklerhandwerk fricklerhandwerk Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roberth I don't think anyone here is trying to weasel their way into being right, and instead we try to find the most convincing answer. 🙂

I really don't like the potential confusion with lib.strings.hasPrefix

@infinisil We brought this up multiple times and while I see your concern, I consider it lower priority. The attribute-set-namespace explicitly says one thing is for strings, the other for paths. Relative paths are not regular strings. We should plaster that all over documentation. Yes, one can discard the namespace prefix on import, but you'd have to do it deliberately in each file if you follow best practices.

We could perhaps change [string.]hasPrefix to give an error for path values, but that's a breaking change.

most lib.strings functions don't make a lot of sense to call on paths directly, they all implicitly convert them to store paths.

Fully agree.

Paths are always fictional things for which existence is irrelevant.

Exactly, although with existence we may mean two different things:

  • existence of a path in the directory tree, in the graph-theoretic sense
  • existence of a path in the sense of being written down in some representation

The former indeed only starts to matter once we want to traverse the path in the directory tree.
What we care about in this library is only the latter, and suddenly the conceptual model (as opposed to some representation) matters, and that is – sorry for being repetitive – a sequence of path components.

Arguably we'll never say "this path does not exist" to mean "this path is not written down", but what we really mean by that that it does not exist in a particular tree which is usally implied.

Copy link
Member Author

@infinisil infinisil Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, one can discard the namespace prefix on import, but you'd have to do it deliberately in each file if you follow best practices.

The thing is, people aren't following best practices! In the grand scheme of things, what we do here is highly irregular, no third party takes that much care writing their code.

In particular, the (arguably anti-)pattern of using with lib; is all over the place. If both lib.path.hasPrefix and lib.hasPrefix exist, better make sure to never forget the path. prefix, otherwise you'll get wrong results like the one I showed. And this is even more likely to happen because traditionally everything has been re-exported from lib.*, meaning people may assume that lib.path.hasPrefix also follows that, and sure enough lib.hasPrefix exists but is completely different.

Because of that I don't think we can "just" have lib.path.hasPrefix be a thing. The only reasonably thing that I can see working is to deprecate support for passing paths to lib.strings.hasPrefix.

Additionally we could also deprecate lib.hasPrefix and require people to use lib.strings.hasPrefix instead, but this would break a lot of code so I'd rather not unless/until we're sure that these lib.* are truly a bad idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now opened #221204 to implement my suggestion of deprecating hasPrefix for paths.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now switched this PR to use hasPrefix and hasProperPrefix, which I think is okay if the above PR is merged

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak typing was particularly bad for hasPrefix:

nix-repl> lib.hasPrefix ~/. ~/h/nixpkgs
^C^C^C^C^C^C^C^C^C^C^Z
[1]+  Stopped                 nix repl

$ kill -9 %%

Friends don't let friends add their home directory to the store ;)

lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
isDescendantOf /. /foo
=> true
*/
isDescendantOf = ancestor: descendant:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't know if the relation holds, we can as well call the arguments a and b.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like how the names make the ordering clear. I'll change this to ancestor and potentialDescendant though

lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
@infinisil infinisil mentioned this pull request Jan 17, 2023
2 tasks
@infinisil infinisil changed the title lib.path.is[Proper]DescendantOf: init lib.path.has[Proper]Prefix: init Mar 14, 2023
@infinisil
Copy link
Member Author

I updated this PR with a rebase, changing the function names to has[Proper]Prefix, and adding tests. This looks ready to me now.

@ofborg ofborg bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Mar 14, 2023
@infinisil infinisil force-pushed the lib.path-ordering branch 2 times, most recently from d385326 to f8c21b0 Compare March 14, 2023 20:21
@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Mar 14, 2023
@infinisil
Copy link
Member Author

This looks ready to me now.

This PR depends on #221204 though, see #210423 (comment)

Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read and ponder for a long time now.
Some thoughts about making the error scenario about unequal roots handle-able, but these functions are the right tool for the job, because most of the time, equality of roots is an ambient property that can be assumed and forgotten by the caller exactly because it's checked here.
We might want to help with detecting unequal roots ahead of time for functions that need to care about that for whatever reason, but I don't think we have an example of that yet, and it doesn't have to be in scope for this pr.

lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
@infinisil infinisil changed the title lib.path.has[Proper]Prefix: init lib.path.{hasPrefix,hasProperPrefix,removePrefix}: init Mar 15, 2023
removePrefix /. /foo
=> "./foo"
*/
removePrefix = withTwoDeconstructedPaths "lib.path.removePrefix" (prefix: deconPrefix: path: deconPath:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now added a third function to this PR, removePrefix because it's very related to the other two and can share a lot of the code. Previously in #200718 I called it relativeTo

@infinisil infinisil requested a review from roberth March 15, 2023 21:10
With removePrefix introduced in a future commit this law can then be
used to derive

        removePrefix p (append p s) == subpath.normalise s
        => (wrap with append)
        append p (removePrefix p (append p s)) == append p (subpath.normalise s)
        => (append is not influenced by subpath normalisation)
        append p (removePrefix p (append p s)) == append p s
        => (substitute q = append p s)
        append p (removePrefix p q) == q

Not included in the docs because it's not that important, just shows
that the first statement is more general than the second one (because
this derivation doesn't work the other way)
@infinisil infinisil force-pushed the lib.path-ordering branch from 701845d to 82deabf Compare April 5, 2023 18:34
@infinisil infinisil changed the title lib.path.{hasPrefix,hasProperPrefix,removePrefix}: init lib.path.hasPrefix, lib.path.removePrefix: init Apr 5, 2023
@infinisil
Copy link
Member Author

infinisil commented Apr 5, 2023

I now removed hasProperPrefix, because I don't think there are actually any use cases for that function. I also cleaned up the code and commit history.

@infinisil infinisil force-pushed the lib.path-ordering branch 2 times, most recently from 3161889 to 44c73ef Compare April 5, 2023 18:44
Comment on lines 123 to 152
# Type: String -> (DeconstructedPath -> DeconstructedPath -> a) -> (Path -> (Path -> a))
#
# Turn a binary path function on deconstructed paths into a function on
# non-deconstructed paths. The resulting function takes one path after
# another, deconstructing it, and caching it before taking the next path as
# an argument, allowing the deconstruction to be reused between calls with
# the same first argument. With both paths passed, a check ensures that they
# have the same filesystem root, before the passed function on deconstructed
# paths is called. The first argument is the context for error messages.
withTwoDeconstructedPaths = context: f:
path1:
assert assertMsg
(isPath path1)
"${context}: First argument is of type ${typeOf path1}, but a path was expected";
let
deconPath1 = deconstructPath path1;
in
path2:
assert assertMsg
(isPath path2)
"${context}: Second argument is of type ${typeOf path2}, but a path was expected";
let
deconPath2 = deconstructPath path2;
in
assert assertMsg
(deconPath1.root == deconPath2.root) ''
${context}: Filesystem roots must be the same for both paths, but paths with different roots were given:
first argument: "${toString path1}" (root "${toString deconPath1.root}")
second argument: "${toString path2}" (root "${toString deconPath2.root}")'';
f deconPath1 deconPath2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is way too clever to be maintainable.

Every day it looks more like what you actually want to implement is error tracing and type checking at the Nix language level.

Apparently what we really need here is

  1. generic type assertions, this seems reasonable to extract into a helper function

  2. the root check, but there is no reason to hard-code it to be about pairs of paths

    actually it's specific enough to duplicate it across the two functions, as it was originally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

generic type assertions

This would perhaps be nice to have in the language, although we'd have to choose between having a static type system or slow evaluation or more infinite recursions. I won't go into more detail here.

To move this forward, I suggest removing this function, which is a non-abstraction, and use more general checks at the call sites.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't necessarily agree that this function is a big problem, it removes the need for duplication between hasPrefix and removePrefix and strips these functions down to effectively do, which is kind of what functions are for, compare

{
  hasPrefix = withTwoDeconstructedPaths "lib.path.hasPrefix" (prefix: path:
    take (length prefix.components) path.components == prefix.components
  );
}

to

{
  hasPrefix =
    path1:
    assert assertMsg
      (isPath path1)
      "lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
    let
      path1Deconstructed = deconstructPath path1;
    in
      path2:
      assert assertMsg
        (isPath path2)
        "lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
      let
        path2Deconstructed = deconstructPath path2;
      in
        assert assertMsg
        (path1Deconstructed.root == path2Deconstructed.root) ''
          lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
              first argument: "${toString path1}" (root "${toString path1Deconstructed.root}")
              second argument: "${toString path2}" (root "${toString path2Deconstructed.root}")'';
        take (length path1Deconstructed.components) path2Deconstructed.components == path1Deconstructed.components;
}

But I don't think this needs to be discussed at length, it's just an internal detail. I removed the function now, inlining it at both use sites.

Comment on lines 203 to 221
path1:
assert assertMsg
(isPath path1)
"lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
let
path1Deconstructed = deconstructPath path1;
in
path2:
assert assertMsg
(isPath path2)
"lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
let
path2Deconstructed = deconstructPath path2;
in
assert assertMsg
(path1Deconstructed.root == path2Deconstructed.root) ''
lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
first argument: "${toString path1}" (root "${toString path1Deconstructed.root}")
second argument: "${toString path2}" (root "${toString path2Deconstructed.root}")'';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
path1:
assert assertMsg
(isPath path1)
"lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
let
path1Deconstructed = deconstructPath path1;
in
path2:
assert assertMsg
(isPath path2)
"lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
let
path2Deconstructed = deconstructPath path2;
in
assert assertMsg
(path1Deconstructed.root == path2Deconstructed.root) ''
lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
first argument: "${toString path1}" (root "${toString path1Deconstructed.root}")
second argument: "${toString path2}" (root "${toString path2Deconstructed.root}")'';
path1: assert assertMsg (isPath path1) "lib.path.hasPrefix: First argument is of type ${typeOf path1}, but a path was expected";
path2: assert assertMsg (isPath path2) "lib.path.hasPrefix: Second argument is of type ${typeOf path2}, but a path was expected";
let
path1Deconstructed = deconstructPath path1;
path2Deconstructed = deconstructPath path2;
in
assert assertMsg (path1Deconstructed.root == path2Deconstructed.root) ''
lib.path.hasPrefix: Filesystem roots must be the same for both paths, but paths with different roots were given:
first argument: "${toString path1}" with root "${toString path1Deconstructed.root}"
second argument: "${toString path2}" with root "${toString path2Deconstructed.root}"'';

Changes:

  • a slight refactor that saves an Env and is a bit more readable in my opinion.
  • removed parentheses from the error message.
  • added a newline before the actual work starts for readability.

Consider these for removePrefix as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit of the current implementation is that it memoizes the deconstruction of path1, so you can do

let
  inProject = hasPrefix ./.;
in {
  a = inProject ./a;
  b = inProject ./b;
}

Without duplicated deconstructions of ./.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright then I guess. ViewPatterns would be nice.

I fell into this rabbit hole by the way, while unaware of the uncached transformation NixOS/nix#8211

It stops just short of ViewPatterns, unless @= isn't deduplicated or has a non-deduped variant (though I guess at that point we need keywords instead of operator soup).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subscribed to that issue, though I don't want to get into that rabbit hole myself right now :)

I applied your other suggestion to remove the parenthesis from the error message though

lib/path/default.nix Outdated Show resolved Hide resolved
@infinisil infinisil requested a review from roberth April 12, 2023 19:33
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-47/27387/1

removePrefix /. /foo
=> "./foo"
*/
removePrefix =
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to just rethink this a little bit. While working on #222981 I realized the frequent need to operate over the components of a subpath. This function as implemented here however returns the subpath as a string, which would then have to be parsed into a list of components again, which is a bit wasteful.

So I'm tilting towards an API like this:

  • lib.path.components.removePrefix /foo /foo/bar -> [ "bar" ]
  • lib.path.components.toSubpath [ "bar" ] -> "./bar"
  • lib.path.components.fromSubpath "./bar" -> [ "bar" ]
  • lib.path.parts.deconstruct /foo -> { root = /; components = [ "foo" ]; }

Needs more thought, so I don't consider this PR ready to merge anymore

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That API does make sense though.

@infinisil infinisil marked this pull request as draft June 13, 2023 18:41
@infinisil infinisil mentioned this pull request Jun 13, 2023
2 tasks
@infinisil
Copy link
Member Author

I now opened #237610 only containing hasPrefix, so that can be merged before figuring out things here

@infinisil infinisil mentioned this pull request Jun 15, 2023
2 tasks
@infinisil
Copy link
Member Author

I opened #238013 for a new removePrefix design. I'll close this PR in favor of the above two PR's.

@infinisil infinisil closed this Jun 15, 2023
@infinisil infinisil deleted the lib.path-ordering branch June 15, 2023 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants