Path library design #200718

infinisil · 2022-11-11T15:34:37Z

Description of changes

There is a need for path operations in nixpkgs and third-party tools. This PR drafts a design document for what such a path library should look like.

Also ping @aakropotkin

This work is sponsored by Antithesis ✨

roberth · 2022-11-11T16:29:15Z

For path values, having a single representation was a great design choice. It reduces the number of possibilities that users need to consider. It reduces the intrinsic complexity of the path operations. It reduces the extrinsic complexity by avoiding the possibility of doing something slightly surprising when a path happens to be relative instead of absolute. It simplifies code that deals with path values.

Of course a relative path pops up every now and then, but not to the point where I've felt the need more than single small operations.
By designing a shadow path library that aims to "make relative paths easy", we run the risk of increasing the prevalence of relative paths. This effect makes the whole effort counterproductive when considering that the goal of a library is to simplify its callers' code.

What are the use cases? Do they really need to work with relative paths?

Small braindump:

a unified representation of (thing, path) objects. Path-like objects can be relative to /., a store path, a source object (with source combinators: extend, trace, cutAt, focusAt and more #112083). A lazy tree is already such a pair. Relative paths are not that, because the "thing" part is unknown, and because the "path" part can be negative (../) if I may use that term liberally.

infinisil · 2022-11-11T16:48:13Z

@roberth Relative paths are needed on a very basic level of just appending path components, e.g.

join [ "/foo/bar" "baz/qux" ]
== "/foo/bar/baz/qux"

should work, the second element is a relative path. And the opposite:

split "/foo/bar"
== [ "/" "foo" "bar" ]

outputs relative paths. Another case is for your source combinators, where you want to pass a subpath via e.g. ./foo/bar, which then gets converted to an absolute path by Nix. To get back the relative path you need to

relativeTo ./. ./foo/bar
== "foo/bar"

That's like your absolutePathComponentsBetween function.

Regarding .., I'm not yet sure whether that should be supported at all (should it throw an error or be ignored?). See the document for arguments for/against it. The safer default would be to throw an error at least for now.

aakropotkin · 2022-11-11T20:08:17Z

Boy do I have some path routines for you 😁
I'll post my libs in a bit. I expect we can cherry pick and tweak what I have written for compatibility.

infinisil · 2022-11-11T20:51:13Z

@aakropotkin I'd rather discuss the design of such a library beforehand, that's the hard part to get right before implementing it :) Please take a look at the document and what you think of it. I'm also very interested in use cases for such a library, let me know about yours! We should have justification for each part of this library

aakropotkin · 2022-11-12T09:43:57Z

Gotcha. One I'm a fan of for relative paths in cases where I'm loading several files is:

{ lib }: let
  pathrel = {
    basedir = throw "you must set basedir before reading files" ;
    __functor = self: x: let
      p   = toString x;
      abs = if lib.isAbspath p then p else "${self.basedir}/${p}";
  in ( self.__innerFunction or ( y: y ) ) abs;

  readRel = pathrel // {
    __InnerFunction = builtins.readFile;
  };

  importRel = pathRel // {
    __InnerFunction = p:
      if ( builtins.match ".*\\.json" p )
      then builtins.fromJSON ( builtins.readFile p )
      else import p;
  };
in ...

Mostly this cuts down boilerplate, but I also like being able to override basedir later with a simple //.

roberth · 2022-11-12T10:48:13Z

@infinisil

@roberth Relative paths are needed on a very basic level of just appending path components, e.g.
join [ "/foo/bar" "baz/qux" ]
== "/foo/bar/baz/qux"

This is not a use case.

should work, the second element is a relative path. And the opposite:
split "/foo/bar"
== [ "/" "foo" "bar" ]

You're putting an absolute path in a list of path components?

Again, this is not a use case. Why would I need this?

outputs relative paths. Another case is for your source combinators, where you want to pass a subpath via e.g. ./foo/bar, which then gets converted to an absolute path by Nix. To get back the relative path you need to
relativeTo ./. ./foo/bar
== "foo/bar"
That's like your absolutePathComponentsBetween function.

Exactly. I didn't need a relative path library for this. Why would you make it stringly typed? That's not something we should encourage.

Regarding .., I'm not yet sure whether that should be supported at all (should it throw an error or be ignored?). See the document for arguments for/against it. The safer default would be to throw an error at least for now.

If we throw it out, path components are all we need, and those are just lists of strings.

Please, please, show me a use case where users need a relative path library. Relative paths are a mess, with tons of corner cases and silly representations. You've shown that very clearly in the design document. By engaging with bad data types or representations, you'll only make code more confusing and error prone.

If everyone jumps off a cliff, that's not a reason for us to jump off a cliff.

roberth · 2022-11-12T10:51:36Z

@aakropotkin

Mostly this cuts down boilerplate,

What was the boilerplate?

roberth · 2022-11-12T13:19:14Z

Alright, alternative proposition:

Use Nix-provided paths first and foremost. Avoid the concept of relative paths, because it comes with a lot of baggage that we'd rather avoid. Instead, work with path differences represented as lists of path component strings. Prefer to keep this as an internal representation only, returning Nix paths instead. This way we don't create a split between functions that take paths as usual and functions that take relative paths, and we aren't tempted to do add a ton of dynamic type matching and half-baked implicit conversion code to each and every filesystem related function. (We already have half a ton of this in lib.sources, and the interactions between every path-like representation will make this quadratically worse)

aakropotkin · 2022-11-12T15:33:35Z

@aakropotkin

Mostly this cuts down boilerplate,

What was the boilerplate?

In practice I usually pack a number of processing routines in the base functor. For example simplifying ././../foo/../bar -> ../bar. I also have markers in the base functor for "is this subtree locked?" and "make sure nobody uses .. above basedir". I pack an instance of filter = to be used to pre-process dirs in some cases.

Having that stuff centered around a common structure was helpful to me.

Having said that those things might not be useful in Nixpkgs.

roberth · 2022-11-12T18:28:12Z

@aakropotkin You're talking about your functions, but you didn't show me any boilerplate. Should I assume that you just use the term very liberally?

For example simplifying ././../foo/../bar -> ../bar.

These are already equivalent path values. Why would they be strings?

markers in the base functor for "is this subtree locked?"

What does this mean?

"make sure nobody uses .. above basedir"

I have a well tested pathHasPrefix function for this in #112083. It does not require relative paths.

pre-process dirs

What kind of preprocessing?

infinisil · 2022-11-14T19:12:17Z

join [ "/foo/bar" "baz/qux" ]
== "/foo/bar/baz/qux"

This is not a use case.

While "joining a path" is not a use-case on its own, such an operation is needed for a lot of things. These are ones I can think of right now:

Selecting a subdirectory to build, sourceRoot
Getting NixOS modules, modulesPath + "/profiles/minimal.nix"
Getting a specific output directory, someDrv + "/etc/systemd/system"

Note that such a join operation is essentially just + but with normalisation.

split "/foo/bar"
== [ "/" "foo" "bar" ]
You're putting an absolute path in a list of path components?

This is mainly for split to be inversible and the split of a normalised path to be unique. We could have an anchor and components function that return / and [ "foo" "bar" ] respectively.

Again, this is not a use case. Why would I need this?

It's not a use case on its own no, but it's mainly there to allow people to implement their own path functions without having to use strings for that. Ideally the path library provides everything that's needed though. Can't think of a use case right now though. The design document talks about perhaps making this function internal because of that.

That's like your absolutePathComponentsBetween function.

Exactly. I didn't need a relative path library for this. Why would you make it stringly typed? That's not something we should encourage.

Even if it's represented as a list, it's still a relative path. This is definitely a use case. Note that in your source combinators PR there's a "manual" lib.concatStringsSep "/" to undo the list representation of the relative path into a string.

infinisil · 2022-11-14T19:39:46Z

Edit for clarity: This comment is about whether Nix paths or strings should be used for a path library. It's not related to the above arguments about support for relative paths.

Use Nix-provided paths first and foremost.

Here's some arguments against Nix paths (should be added to the document for sure):

They get imported into the store when interpolated
- Notably unfiltered, which is often not what is wanted
- Only makes sense for eval-time paths, so the library would be problematic for buildtime/runtime paths
  - Poor error messages when accidentally using interpolation with buildtime/runtime paths (because they might not exist)
  - And if the path does exist there's no error, which is never intended for buildtime/runtime paths
    - Could even be a security problem with secrets being accidentally imported
They can't be used with paths containing string context, so any path referring to derivations
They normalise /foo/bar/../baz to /foo/baz, which may be wrong when bar is a symlink
- Additionally note that builtins accepting paths do resolve .. correctly when passed strings
They normalise away trailing /, which is not a problem in of itself, but when doing /some/path + "/" + "foo", it turns to /some/pathfoo, which is not right

Here's the arguments I can think of for Nix paths:

Implement normalisation, so we don't have to (though it messes with ..)
It's faster than string processing, because it's builtin

roberth · 2022-11-14T20:13:27Z

This is not a use case.

While "joining a path" is not a use-case on its own, such an operation is needed for a lot of things. These are ones I can think of right now:

Selecting a subdirectory to build, sourceRoot

Already supported without a relative path library.

Getting NixOS modules, modulesPath + "/profiles/minimal.nix"

Getting a specific output directory, someDrv + "/etc/systemd/system"

+ works fine for these.

Note that such a join operation is essentially just + but with normalisation.

No, because that would let you add an absolute path after a relative path, which is not a valid operation, but with the air of a legitimate operation because we have a function for it, obscuring the nastiness that's going on. You could forbid paths starting with / on the right hand side, but that'd probably cause more pain than it solves, especially as we're already used to + "/foo".

Use Nix-provided paths first and foremost.

Here's some arguments against Nix paths (should be added to the document for sure):

It's the devil we know.

They get imported into the store when interpolated

Useful. Where it's not desirable, you're not going to fix that with a relative path library.

Notably unfiltered, which is often not what is wanted

Often not a big deal, especially since flakes filter out at least the minimum, which is the gitignored stuff. A relative path library does not fix this user error.

Only makes sense for eval-time paths, so the library would be problematic for buildtime/runtime paths

Ok, show me a real world example.

* Poor error messages when accidentally using interpolation with buildtime/runtime paths (because they might not exist)

I don't think I've encountered these. Could you explain?

  * Could even be a security problem with secrets being accidentally imported

Not a problem that a relative path library solves.

They can't be used with paths containing string context, so any path referring to derivations

Where would you use these? In IFD? Does it get complicated enough to warrant a library?

They normalise /foo/bar/../baz to /foo/baz, which may be wrong when bar is a symlink

Or it may be correct. The behavior you call correct is still very surprising, even if POSIX(?) suggests it. Most people expect to operate on file system trees, not graphs. In a tree (no symlinks), the normalisation always holds up.

Additionally note that builtins accepting paths do resolve .. correctly when passed strings

Relative paths would be strings, according to your proposal, so this is a reason not to implement a relative path library.

They normalise away trailing /, which is not a problem in of itself, but when doing /some/path + "/" + "foo", it turns to /some/pathfoo, which is not right

This is unfortunate, but with a relative path library, you'll have the opposite problem that users may append a redundant slash.

Created NixOS/nix#7301

Here's the arguments I can think of for Nix paths:

Implement normalisation, so we don't have to (though it messes with ..)

It's faster than string processing, because it's builtin

Fewer types means fewer incompatible functions

infinisil · 2022-11-14T21:05:20Z

@roberth Please note that my arguments in this comment are not related to the discussion of supporting relative paths. This is about a path library in general, irrespective of whether relative paths are supported in any way. I'm not replying to all your arguments where that's the counter.

Note that such a join operation is essentially just + but with normalisation.

No, because that would let you add an absolute path after a relative path, which is not a valid operation, but with the air of a legitimate operation because we have a function for it, obscuring the nastiness that's going on. You could forbid paths starting with / on the right hand side, but that'd probably cause more pain than it solves, especially as we're already used to + "/foo".

Yes I simplified it.. The join described in the design document always adds a / between and only allows the first argument to be an absolute path. We should err on the side of caution.

Often not a big deal, especially since flakes filter out at least the minimum, which is the gitignored stuff.

Nix can't assume Flakes is used, but yes they do make this a bit better.

Poor error messages when accidentally using interpolation with buildtime/runtime paths (because they might not exist)

I don't think I've encountered these. Could you explain?

nix-repl> "${/var/lib/someService}"
error: getting status of '/var/lib/someService': No such file or directory

This would happen if a path is accidentally interpolated when not meant to

They can't be used with paths containing string context, so any path referring to derivations

Where would you use these? In IFD? Does it get complicated enough to warrant a library?

No just a simple join [ pkgs.hello "bin/hello" ], wouldn't work if this was converted to a Nix path. This is a generic path library, it shouldn't be limited to just "eval-time paths you're intending to import"

They normalise /foo/bar/../baz to /foo/baz, which may be wrong when bar is a symlink

Or it may be correct. The behavior you call correct is still very surprising, even if POSIX(?) suggests it. Most people expect to operate on file system trees, not graphs. In a tree (no symlinks), the normalisation always holds up.

Or it may be correct, but the Nix path type has no business assuming it doesn't matter. Yes, realpath supports two modes, --logical (that's what Nix does), or --physical (the default). Strong disagree that --logical is what people expect. If anything it should be --physical because it's the default. But in any case, it's really not up for the Nix path type to determine that. The code reading from that path should determine that.

Fewer types means fewer incompatible functions

Such as?

roberth · 2022-11-14T22:22:34Z

@roberth Please note that my arguments in this comment are not related to the discussion of supporting relative paths. This is about a path library in general, irrespective of whether relative paths are supported in any way. I'm not replying to all your arguments where that's the counter.

Yeah, we're discussing multiple things at once. Maybe I was too focused on relative paths.

Yes I simplified it.. The join described in the design document always adds a / between and only allows the first argument to be an absolute path. We should err on the side of caution.

That seems like a sensible function then.

Often not a big deal, especially since flakes filter out at least the minimum, which is the gitignored stuff.

Nix can't assume Flakes is used, but yes they do make this a bit better.

True. I don't think there's much we can do on the library side to improve this unless we reject Nix paths altogether, turn them into a library implementation detail and forbid path expressions in user code. This seems excessive.

Poor error messages when accidentally using interpolation with buildtime/runtime paths (because they might not exist)

I don't think I've encountered these. Could you explain?
nix-repl> "${/var/lib/someService}"
error: getting status of '/var/lib/someService': No such file or directory

Seems fixable: NixOS/nix#7303

This would happen if a path is accidentally interpolated when not meant to

They can't be used with paths containing string context, so any path referring to derivations

Where would you use these? In IFD? Does it get complicated enough to warrant a library?

No just a simple join [ pkgs.hello "bin/hello" ], wouldn't work if this was converted to a Nix path.

Build outputs can not be converted to path values. This would tend to cause IFD for no good reason.

This is a generic path library, it shouldn't be limited to just "eval-time paths you're intending to import"

I'm ok with a path join function, but I am not convinced that we need an extensive path library.

They normalise /foo/bar/../baz to /foo/baz, which may be wrong when bar is a symlink

Or it may be correct. The behavior you call correct is still very surprising, even if POSIX(?) suggests it. Most people expect to operate on file system trees, not graphs. In a tree (no symlinks), the normalisation always holds up.

Or it may be correct, but the Nix path type has no business assuming it doesn't matter. Yes, realpath supports two modes, --logical (that's what Nix does), or --physical (the default). Strong disagree that --logical is what people expect. If anything it should be --physical because it's the default. But in any case, it's really not up for the Nix path type to determine that. The code reading from that path should determine that.

So symlinks are a leaky abstraction. This is more reason to avoid over-abstraction in the form of a path library.

Fewer types means fewer incompatible functions

Such as?

This is a general observation. If your (total) functions support one type, all functions can be combined arbitrarily. This is useful. Often this is not quite achievable, but an approximation of this ideal is already pleasant to work with. This is the "algebraic" or "combinator" flavor of libraries. On the other end of the spectrum, you have libraries with many types, and few functions that operate on each particular type. For a library that is about a "single" type, paths, the latter would be a disaster.

A concrete example would be src in stdenv.mkDerivation. It supports strings with context, or paths. If we add a third type of path, support would have to be added for this new type of path. If it's a representation of relative paths, it is not even possible to reasonably implement it, because it doesn't contain enough information. In other cases, support for a new type of path may simple not have been added yet, and the extra code complexity imposed by matching the expectations around the new path type is far from ideal. It's also possible for such functions to consume the new path type in a wrong or inconsistent way that does not match other interpretations of the new path type.

roberth · 2022-11-14T22:25:07Z

only one new function liability (lib.path.join or something) which seems manageable
two suggested Nix improvements

I like the progress. If we keep this up, we can solve our problems without creating a mess for ourselves.

infinisil · 2022-11-14T23:20:59Z

Build outputs can not be converted to path values. This would tend to cause IFD for no good reason.

There's no problem with join [ hello "bin/hello" ] if a string is returned. And there's no reason why that would cause IFD, path operations don't cause IFD. Only calling builtins that read from the filesystem can cause IFD, irrespective of whether you pass them Nix paths or strings.

I'm ok with a path join function, but I am not convinced that we need an extensive path library.

This proposal is actually kept fairly minimal imo, which functions do you think are unnecessary?

So symlinks are a leaky abstraction. This is more reason to avoid over-abstraction in the form of a path library.

Perhaps symlinks are leaky, but people are still using them and expect them to work in a certain way. There's no abstraction introduced by this current proposal, it's really just the same absolute and relative paths strings used all throughout Linux, so not sure what you mean by over-abstracted.

A concrete example would be src in stdenv.mkDerivation. It supports strings with context, or paths. If we add a third type of path, support would have to be added for this new type of path. If it's a representation of relative paths, it is not even possible to reasonably implement it, because it doesn't contain enough information. In other cases, support for a new type of path may simple not have been added yet, and the extra code complexity imposed by matching the expectations around the new path type is far from ideal. It's also possible for such functions to consume the new path type in a wrong or inconsistent way that does not match other interpretations of the new path type.

That is an argument. Specifically something like mkDerivation { src = join [ ./. "src" ]; } wouldn't work, which is unexpected if you don't know that join returns a string that isn't imported into the store. A mild counter-argument is that in most cases you don't want to import a directory unfiltered and without a constant store path name. If you use lib.sources.* this all isn't a problem since it explicitly imports the path into the store using builtins.path.

This does give me an idea though: How about making it such that this path library returns the same data type as it gets as an input. So

join [ ./. "src" ] == ./src
join [ "/var/lib" "someService" ] == "/var/lib/someService"

Then src = join [ ./. "src" ] would work again without extra effort and as expected. Similarly src = join [ "/var/lib" "someService" ] wouldn't work as expected (you're just passing a string, just like src = "/var/lib/someService").

infinisil · 2022-11-18T19:48:50Z

I almost forgot about this, since recently, Nix supports paths with interpolation:

nix-repl> /foo/${"bar"}                  
/foo/bar

Only works when starting with a / though, so if you want to append to an absolute path in a variable:

nix-repl> foo = /base

nix-repl> /${foo}/bar 
/base/bar

Not sure about this, it's not very pretty, but a / path operator also has problems (I left a comment). An explicit function for this like join might be better after all, and it wouldn't have the problem you described with the proposal from my previous comment):

join [ /foo "bar" ] -> /foo/bar
# We could also do
join /foo "bar"

aameen-tulip · 2022-11-21T15:39:58Z

One example of boilerplate is this example. Here I have wrappers around any import and read* routines to allow emulating pure and no IFD modes even if Nix is running with --impure.

  coerceJSON' = { pure ? lib.inPureEvalMode, ifd ? true }: x: let
    isPjs = ( builtins.isAttrs x ) &&
            ( ! ( ( x ? outPath ) || ( x ? __toString ) ) );
    p'   = lib.coercePath x;
    pjs  = lib.importJSON p;
    msgD = "coercePjs: Cannot read path '${p'}' when `ifd' is disable.";
    forD = if ifd then pjs else throw msgD;
    msgP = "coercePjs: Cannot read unlocked path '${p'}' in pure mode.";
    forP = if ( ! pure ) || ( lib.isStorePath p' ) then pjs else throw msgP;
  in if isPjs then x else
     if ( lib.isDerivation x ) then forD else forP;

Nothing too special, but this would be tedious to write in multiple routines, so often times I was wrapping things in a path construct.

Again though, probably not useful in Nixpkgs, just wanted to provide an example of what I was talking about.

infinisil · 2022-11-23T00:41:41Z

The current path problems related to commitIdFromGitRepo in #199772 (and its follow-ups, #201779, #202370 and some related issues) are a really good example of why a string-based path library (not doing any store imports) would be very useful.

Looking at commitIdFromGitRepo, these functions from the current proposal could be used:

join
normalise
isAbsolute
relativeTo

But I'm also just realizing that lazy trees in its current state essentially breaks such a string-based path library, since it makes it such that toString doesn't just return the string version of a path anymore:

nix run github:edolstra/nix/lazy-trees -- repl
Welcome to Nix 2.12.0pre20221116_561440b. Type :? for help.

nix-repl> normalPath = ./.

nix-repl> normalPath
/home/tweagysil/src/nix

nix-repl> toString normalPath
"/home/tweagysil/src/nix"

nix-repl> lazyPath = (builtins.fetchTree { type = "git"; url = ./.; }).outPath

nix-repl> lazyPath
/home/tweagysil/src/nix/

nix-repl> toString lazyPath
"/nix/store/virtual000000000000000000000001c-source"

This notably also breaks the normalization function you're using in the source combinators PR:

nix-repl> toString (/. + lazyPath)
"/nix/store/virtual0000000000000000000000004-source"

Why is it broken? Because the lazy trees PR deprecates anything related to the /nix/store/virtual* paths. You can't read from it directly:

nix-repl> builtins.readDir (toString lazyPath)
error: cannot decode virtual path '/nix/store/virtual0000000000000000000000006-source'

And while you seem to be able to access a subdirectory, it's deprecated:

nix-repl> builtins.readDir (toString lazyPath + "/doc")
warning: applying 'toString' to path '/home/tweagysil/src/nix/doc' and then accessing it is deprecated, at «string»:1:1
{ manual = "directory"; }

(Oh also the error is wrong, I'm applying toString to /home/tweagysil/src/nix, not the doc subdirectory)

I can't see why this should be deprecated though, I'll mention this in the lazy trees PR: NixOS/nix#6530 (comment)

aakropotkin · 2022-11-23T01:46:10Z

Earlier in the thread I was asked why I always immediately toString ./. I had no idea lazy trees were going to do this to "virtual paths", but the rationale was similar.

I simply didn't trust Nix to keep my path as a regular FS path outside of the store, but if I toString before anything else I could "${mypath}/foo" to my heart's content without being concerned.

I'm pretty glad I went that way now that I see how lazy paths are going to blow up for any "late" attempts to stringize.

EDIT: I read the other PR thread. If I'm understanding it correctly it looks like they are going to flood my log with depreciation warnings for my approach... This is all well and good as long as the brand new replacement they just took out of the oven is well documented, backwards compatible, and bug free 😅. It's been a while since I locked into an old release but I might let the dust settle on this one for a few months. Hopefully I misunderstood the PR.

roberth · 2022-11-23T14:58:43Z

The current path problems related to commitIdFromGitRepo in #199772 (and its follow-ups, #201779, #202370 and some related issues) are a really good example of why a string-based path library (not doing any store imports) would be very useful.

No, they're examples of why Nix needs to be backwards compatible.

Looking at commitIdFromGitRepo, these functions from the current proposal could be used:

join

normalise

isAbsolute

relativeTo

I do see a couple of uses for join. Could you elaborate on the need for the other functions?

isAbsolute

The idea of "absoluteness" is only relevant to file system paths, not lazy trees. isRoot could apply to both traditional and lazy tree paths, but changes between instantiated and uninstantiated lazy trees. Perhaps store paths should also be considered "root"?

lazy trees

This introduces a lot of uncertainty.
If you ask me, it must prioritize compatibility over immediate performance improvements, but even if this was the case we have to take it into consideration; not for correctness but for performance.
Without more information about lazy tree usage, performance and compatibility, this design effort is a fool's errand.
I'll talk with Eelco about this.

infinisil · 2022-11-23T17:55:58Z

The current path problems related to commitIdFromGitRepo in #199772 (and its follow-ups, #201779, #202370 and some related issues) are a really good example of why a string-based path library (not doing any store imports) would be very useful.

No, they're examples of why Nix needs to be backwards compatible.

That too. Something can be an example of multiple issues. The issue of being unsure about which paths get imported into the store being hard to reason about is something that can be seen from these PRs. E.g. this diff causes the result to be a path instead of a string which can cause it to be imported into the store when it didn't previously, which is entirely non-obvious from this code:

-      gitRepo      = "${toString ./..}/.git";
+      gitRepo      = ./.. + "/.git";

I do see a couple of uses for join. Could you elaborate on the need for the other functions?

diff --git a/lib/sources.nix b/lib/sources.nix
index 3ad7dc63355..d8d15a9333b 100644
--- a/lib/sources.nix
+++ b/lib/sources.nix
@@ -185,12 +185,12 @@ let
   # not exported, used for commitIdFromGitRepo
   _commitIdFromGitRepoOrError =
     let readCommitFromFile = file: path:
-        let fileName       = path + "/${file}";
-            packedRefsName = path + "/packed-refs";
+        let fileName       = join [ path file ];
+            packedRefsName = join [ path "packed-refs" ];
             absolutePath   = base: path:
-              if lib.hasPrefix "/" path
+              if isAbsolute path
               then path
-              else toString (/. + "${base}/${path}");
+              else join [ base path ];
         in if pathIsRegularFile path
            # Resolve git worktrees. See gitrepository-layout(5)
            then
@@ -199,12 +199,11 @@ let
                 then { error = "File contains no gitdir reference: " + path; }
                 else
                   let gitDir      = absolutePath (dirOf path) (lib.head m);
-                      commonDir'' = if pathIsRegularFile "${gitDir}/commondir"
-                                    then lib.fileContents "${gitDir}/commondir"
+                      commonDir' = if pathIsRegularFile "${gitDir}/commondir"
+                                    then normalise (lib.fileContents "${gitDir}/commondir")
                                     else gitDir;
-                      commonDir'  = lib.removeSuffix "/" commonDir'';
                       commonDir   = absolutePath gitDir commonDir';
-                      refFile     = lib.removePrefix "${commonDir}/" "${gitDir}/${file}";
+                      refFile     = relativeTo commonDir (join [ gitDir file ]);
                   in readCommitFromFile refFile commonDir
 
            else if pathIsRegularFile fileName

roberth · 2022-11-23T18:33:14Z

E.g. this diff causes the result to be a path instead of a string which can cause it to be imported into the store when it didn't previously, which is entirely non-obvious from this code:

Adding more code isn't going to make this problem go away.

E.g. this diff causes the result to be a path instead of a string which can cause it to be imported into the store when it didn't previously, which is entirely non-obvious from this code:

Conversion to a string actually is a problem in this function, but the fix wasn't complete, causing trouble.
It could be fixed and simplified by working with path values exclusively; perhaps converting to paths at the boundary.

@fricklerhandwerk

- Changes the agreed-upon design slightly to make types of functions clearer: - Previously `path.join` worked on a list of paths, but required all but the first component to relative. This is now split into two functions: - `path.append <path> <string>` takes care of appending a relative path to an absolute path. - `path.relative.join [ <string> ]` takes care of joining relative paths together - `path.normalise` -> `path.relative.normalise`, because we don't need normalisation on absolute paths, Nix already takes care of that, and we use the `path.relative` namespace for anything only relating to relative paths - Some more bikeshedding for the `relativeTo` name. I think `relativeTo` is pretty good, but @fricklerhandwerk likes other suggestions more - Adds some suggestions for partial ordering checks on paths - Adds a `difference` function, which can take care of common prefix and subpath calculations between any number of paths.

infinisil · 2022-12-02T19:30:18Z

I paired with @fricklerhandwerk today and made some changes:

Changes the agreed-upon design slightly to make types of functions
clearer:
- Previously path.join worked on a list of paths, but required all but
  the first component to relative. This is now split into two functions:
  - path.append <path> <string> takes care of appending a relative
    path to an absolute path.
  - path.relative.join [ <string> ] takes care of joining relative
    paths together
- path.normalise -> path.relative.normalise, because we don't need
  normalisation on absolute paths, Nix already takes care of that, and
  we use the path.relative namespace for anything only relating to
  relative paths
Some more bikeshedding for the relativeTo name. I think relativeTo
is pretty good, but @fricklerhandwerk likes other suggestions more
Adds some suggestions for partial ordering checks on paths
Adds a difference function, which can take care of common prefix and
subpath calculations between any number of paths.

fricklerhandwerk · 2022-12-02T20:55:14Z

Yeah, stripPrefix or removePrefix says what it does and matches with hasPrefix.

@infinisil argued this opens potential for confusion with string functions, but that's what namespaces are for.

Getting the local scope consistent and unsurprising is more important for me.

I also like his idea of structuring the namespace by purpose in a fine-grained manner, such that the symbols are self-explanatory. That is really helpful reading such code, given we don't have a type system to assist us.

roberth · 2022-12-03T01:04:16Z

lib/path-design.md

+difference { left = /foo; right = /foo; } = { commonPrefix = /foo; suffix = { left = "."; right = "."; }; }
+
+# Requires at least one path
+difference {} = <error>


This function reminds me more of the greatest common divisor than difference.

The word difference applied to paths makes me think of set difference.

tangent

..., which something I've considered implementing but purposefully haven't

source combinators: extend, trace, cutAt, focusAt and more #112083

(noting that trees are quite different from sets of paths)

Set difference is not very different from sources.filter and might suggest a usage that may not perform well.

roberth · 2022-12-03T01:08:33Z

lib/path-design.md

+```nix
+difference { path = /foo/bar } = { commonPrefix = /foo/bar; suffix = { path = "."; }; }
+difference { left = /foo/bar; right = /foo/baz; } = { commonPrefix = /foo; suffix = { left = "bar"; right = "baz"; }; }
+difference { left = /foo; right = /foo/bar; } = { commonPrefix = /foo; suffix = { left = "."; right = "bar"; }; }


Suggested change

difference { left = /foo; right = /foo/bar; } = { commonPrefix = /foo; suffix = { left = "."; right = "bar"; }; }

difference { a = /foo; b = /foo/bar; c = /foo/baz } = { commonPrefix = /foo; suffix = { a = "."; b = "bar"; c = "baz"; }; }

Illustrate that the names are up to the caller, for those who don't normally read types.

roberth · 2022-12-03T01:14:40Z

lib/path-design.md

+### `difference`
+
+```haskell
+difference :: AttrsOf Path -> { commonPrefix :: Path; suffix :: AttrsOf String; }


Ideas:

commonAncestry :: f Path -> { commonAncestor :: Path; relativePaths :: f String; } where f is one of attrsOf listOf

Since a path is a sequence of nodes in a tree, the correct term for the object of interest is "prefix", not "ancestor". The latter applies to the relations of nodes, not elements in a path.

commonPrefix :: AttrsOf Path -> { prefix :: Path; suffix :: AttrsOf String }

Using suffix does not only save us characters, but also reads more natural when accessing it:

(commonPrefix { a = /abc/def; b = /abc/xzy; }).suffix.b

I'm not the biggest fan of taking a list. With @infinisil we crystallised a few key ideas about the library, also based on your feedback (specifically: keeping the type surface small), and one of them was that it should be as general as possible such that one can implement convenience functions on top of it in a straighforward manner.

That way we can keep the library itself slim, since consumers can just make their own thing easily, and we can adopt the convenience functions into the library once they start to proliferate.

A list has strictly less information than an attrset, and can always be projected down from the attrset, but not the other way around.

{ commonPrefixList = xs: let intermediate = commonPrefix (listToAttrs (imap0 (i: x: { ${toString i} = x; }))); in { prefix = intermediate.prefix; suffix = attrValues intermediate.suffix; }; }

Sure that's slow, but right now I don't even know if we have a use case for that interface.

Also lays down the assumptions we're making about paths, assumptions which notably also make the library work with the lazy trees Nix PR (without relying or interfering with any of its bugs)

infinisil · 2022-12-06T21:41:26Z

lib/path-design.md

+This library makes only these assumptions about paths and no others:
+- `dirOf path` returns the path to the parent directory of `path`, unless `path` is the filesystem root, in which case `path` is returned
+  - There can be multiple filesystem roots: `p == dirOf p` and `q == dirOf p` does not imply `p == q`
+    - While there's only a single filesystem root in stable Nix, the [lazy trees PR](https://github.com/NixOS/nix/pull/6530) introduces [additional filesystem roots](https://github.com/NixOS/nix/pull/6530#discussion_r1041442173)
+- `path + ("/" + string)` returns the path to the `string` subdirectory in `path`
+  - If `string` contains no `/` characters, then `dirOf (path + ("/" + string)) == path`
+  - If `string` contains no `/` characters, then `baseNameOf (path + ("/" + string)) == string`
+- `path1 == path2` returns true only if `path1` points to the same filesystem path as `path2`
+
+Notably we do not make the assumption that we can turn paths into strings using `toString path`.


I updated the PR with such a commonAncestry function now (just picked @roberth's recent suggestion for now, I don't have a strong opinion on the name yet).

I also updated the PR to lay down the assumptions about the path type that the library is based on. Notably also including one small assumption which makes it harmonize with the eventual lazy trees PR.

As a next step I'll split this PR into smaller parts

infinisil · 2022-12-08T21:33:50Z

I created a smaller PR for just the relativeNormalise function along with the design document now: #205190

Adds initial work towards a `lib.path` library Originally proposed in #200718, but has since gone through some revisions Co-Authored-By: Valentin Gagarin <valentin.gagarin@tweag.io>

Adds initial work towards a `lib.path` library Originally proposed in #200718, but has since gone through some revisions Co-Authored-By: Valentin Gagarin <valentin.gagarin@tweag.io> Co-Authored-By: Robert Hensing <robert@roberthensing.nl>

infinisil · 2023-01-04T22:29:04Z

#205190 is now merged with a much better design document and initial lib.path.subpath.{isValid,normalise} functions. Based on this I now opened #208887 for lib.path.append and #209099 for lib.path.subpath.join.

I might use some API descriptions and implementations from this draft PR, but this one can be closed.

Adds initial work towards a `lib.path` library Originally proposed in NixOS/nixpkgs#200718, but has since gone through some revisions Co-Authored-By: Valentin Gagarin <valentin.gagarin@tweag.io> Co-Authored-By: Robert Hensing <robert@roberthensing.nl>

lib: Initial path library design document

ca1a7bf

infinisil requested a review from fricklerhandwerk November 11, 2022 15:34

ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Nov 11, 2022

infinisil requested a review from roberth November 11, 2022 15:57

roberth mentioned this pull request Nov 23, 2022

how should paths be used in the context of lazy trees NixOS/nix#7335

Closed

roberth reviewed Dec 3, 2022

View reviewed changes

Update code to match the design document

2981d87

Also lays down the assumptions we're making about paths, assumptions which notably also make the library work with the lazy trees Nix PR (without relying or interfering with any of its bugs)

infinisil force-pushed the lib.path branch from f7dbc27 to 2981d87 Compare December 6, 2022 21:41

infinisil commented Dec 6, 2022

View reviewed changes

infinisil mentioned this pull request Dec 8, 2022

lib.path.subpath.{isValid,normalise}: init #205190

Merged

3 tasks

This was referenced Jan 3, 2023

lib.path.append: init #208887

Merged

lib.path.subpath.join: init #209099

Merged

infinisil closed this Jan 4, 2023

infinisil deleted the lib.path branch January 4, 2023 22:29

infinisil mentioned this pull request Jan 6, 2023

lib.path.difference: init #209375

Closed

2 tasks

This was referenced Jan 12, 2023

lib.path.hasPrefix, lib.path.removePrefix: init #210423

Closed

Path library functions #210426

Closed

infinisil added the significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. label Apr 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Path library design #200718

Path library design #200718

infinisil commented Nov 11, 2022 •

edited

Loading

roberth commented Nov 11, 2022

infinisil commented Nov 11, 2022 •

edited

Loading

aakropotkin commented Nov 11, 2022

infinisil commented Nov 11, 2022 •

edited

Loading

aakropotkin commented Nov 12, 2022

roberth commented Nov 12, 2022

roberth commented Nov 12, 2022

roberth commented Nov 12, 2022

aakropotkin commented Nov 12, 2022 •

edited

Loading

roberth commented Nov 12, 2022

infinisil commented Nov 14, 2022

infinisil commented Nov 14, 2022 •

edited

Loading

roberth commented Nov 14, 2022

infinisil commented Nov 14, 2022 •

edited

Loading

roberth commented Nov 14, 2022

roberth commented Nov 14, 2022

infinisil commented Nov 14, 2022 •

edited

Loading

infinisil commented Nov 18, 2022

aameen-tulip commented Nov 21, 2022

infinisil commented Nov 23, 2022 •

edited

Loading

aakropotkin commented Nov 23, 2022 •

edited

Loading

roberth commented Nov 23, 2022

infinisil commented Nov 23, 2022

roberth commented Nov 23, 2022

infinisil commented Dec 2, 2022

fricklerhandwerk commented Dec 2, 2022

roberth Dec 3, 2022

roberth Dec 3, 2022

roberth Dec 3, 2022 •

edited

Loading

fricklerhandwerk Dec 8, 2022 •

edited

Loading

infinisil Dec 6, 2022 •

edited

Loading

infinisil commented Dec 8, 2022

infinisil commented Jan 4, 2023

	difference { left = /foo; right = /foo/bar; } = { commonPrefix = /foo; suffix = { left = "."; right = "bar"; }; }
	difference { a = /foo; b = /foo/bar; c = /foo/baz } = { commonPrefix = /foo; suffix = { a = "."; b = "bar"; c = "baz"; }; }

Path library design #200718

Path library design #200718

Conversation

infinisil commented Nov 11, 2022 • edited Loading

Description of changes

roberth commented Nov 11, 2022

infinisil commented Nov 11, 2022 • edited Loading

aakropotkin commented Nov 11, 2022

infinisil commented Nov 11, 2022 • edited Loading

aakropotkin commented Nov 12, 2022

roberth commented Nov 12, 2022

roberth commented Nov 12, 2022

roberth commented Nov 12, 2022

aakropotkin commented Nov 12, 2022 • edited Loading

roberth commented Nov 12, 2022

infinisil commented Nov 14, 2022

infinisil commented Nov 14, 2022 • edited Loading

roberth commented Nov 14, 2022

infinisil commented Nov 14, 2022 • edited Loading

roberth commented Nov 14, 2022

roberth commented Nov 14, 2022

infinisil commented Nov 14, 2022 • edited Loading

infinisil commented Nov 18, 2022

aameen-tulip commented Nov 21, 2022

infinisil commented Nov 23, 2022 • edited Loading

aakropotkin commented Nov 23, 2022 • edited Loading

roberth commented Nov 23, 2022

infinisil commented Nov 23, 2022

roberth commented Nov 23, 2022

infinisil commented Dec 2, 2022

fricklerhandwerk commented Dec 2, 2022

roberth Dec 3, 2022

Choose a reason for hiding this comment

roberth Dec 3, 2022

Choose a reason for hiding this comment

roberth Dec 3, 2022 • edited Loading

Choose a reason for hiding this comment

fricklerhandwerk Dec 8, 2022 • edited Loading

Choose a reason for hiding this comment

infinisil Dec 6, 2022 • edited Loading

Choose a reason for hiding this comment

infinisil commented Dec 8, 2022

infinisil commented Jan 4, 2023

infinisil commented Nov 11, 2022 •

edited

Loading

infinisil commented Nov 11, 2022 •

edited

Loading

infinisil commented Nov 11, 2022 •

edited

Loading

aakropotkin commented Nov 12, 2022 •

edited

Loading

infinisil commented Nov 14, 2022 •

edited

Loading

infinisil commented Nov 14, 2022 •

edited

Loading

infinisil commented Nov 14, 2022 •

edited

Loading

infinisil commented Nov 23, 2022 •

edited

Loading

aakropotkin commented Nov 23, 2022 •

edited

Loading

roberth Dec 3, 2022 •

edited

Loading

fricklerhandwerk Dec 8, 2022 •

edited

Loading

infinisil Dec 6, 2022 •

edited

Loading