Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

%<...>% path expansion notation #2927

Closed
wants to merge 1 commit into from
Closed

Conversation

dra27
Copy link
Member

@dra27 dra27 commented May 3, 2017

There are many instances in packaging of the form %{some-directory-variable}%/more/path/components which result in an amateur/confused-looking mixed back/forward-slash path on Windows.

I propose adding the notation %<subst>% such that subst is expanded with / converted to the path separator for the OS. %{...}% may be nested inside %<...>% notation, allowing variables to be used within a path expansion. Thus %<%{lib}%/stublibs>% on Windows is expanded to the value of
lib with \\stublibs appended.

There are many occasions in packaging of the form
%{some-directory-variable}%/more/path/components which result in an
amateur/confused-looking mixed back/forward-slash path on Windows.
String interpolation is extended such that %<subst>% has subst expanded
with / converted to the path separator for the OS. %{...}% may be nested
inside %<...>% notation, allowing normal variables to be used within a
path expansion. Thus %<%{lib}%/stublibs>% is expanded to the value of
lib with "\\stublibs" on Windows.

Signed-off-by: David Allsopp <david.allsopp@metastack.com>
@dra27
Copy link
Member Author

dra27 commented May 3, 2017

This will obviously also require a docs update, but I'll do that after the notation is agreed!

let expand_string ?(partial=false) ?default env text =
let default fident = match default, partial with
(* Resolves ["%{x}%"] and ["%<x>%"] string interpolations *)
let rec expand_string_aux ?(regex=path_interp_regex) ?(partial=false) ?default env text =
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the use of ~regex means that this only introduces one recursive call - expansion of %{...}% within %<...>%. It doesn't add recursive expansion of variables by accident!

@dbuenzli
Copy link
Contributor

dbuenzli commented May 3, 2017

This is going to make things unreadable and people will always forget to do it. Why doesn't opam simply converts path variables to forward slashes on all platforms ? Consumers of these paths can then do the right thing according to the platform (e.g. by using a suitable path library).

@AltGr
Copy link
Member

AltGr commented May 3, 2017

I share some of @dbuenzli's concern, in particular about forgetting to add the delimiters. I am not sure it's wise to limit the expansion to forward slashes when it will be consumed by Windows commands, though, but I don't really have a good grasp of how this ends up being handled on Windows.

Since these strings are used as arguments to arbitrary commands, we have no way of knowing wether they are paths or not, so the only two solutions I can imagine (if dropping backward slashes altogether, like @dbuenzli suggests, is not a viable solution) are:

  • special syntax, as proposed. Something more light-weight could be possible, e.g. a whole-string qualifier rather than an expansion (I don't remember the details well but we've discussed it already). It could make things more readable, but remains the "forgetting" issue
  • make the / character special, meaning "the path separator", and rewrite it in all strings, unless it's escaped as \/. This may not be as crazy as it sounds, as I don't believe we have many uses of that char outside of paths in practice. Other uses would be in informative strings were it wouldn't break. Of course, this opens a different treatment on Unix and Windows that could lead to stuff breaking...

@dra27
Copy link
Member Author

dra27 commented May 3, 2017

I am virtually certain both from a presentational and technical perspective that forcing forward-slashes on Windows is a mistake (even if a mistake made by a lot of other "cross-platform" projects) and we should be sticking to the actual convention. As a tiny bit of background, while there are various layers of compatibility, the separator at the kernel level is backslash only (it is API functions which do the conversion, but that's at the subsystem level). More relevantly, there are quite a few user-level places where forward-slashes do not work (and these are not particularly well-defined places - i.e. there are shell commands where some forward-slashes work and others do not). Most importantly, there are no places where back-slashes will not work and you have to use forward-slashes...

Some further thoughts on both comments:

  • I hope that %<...>% is preferred to my original prototype, which was %{<$lib$/stublibs}% 😄
  • I also tried @AltGr's suggestion (ages ago) of a special literal syntax for paths, though I gave up because the impact on opam's own type system was vast and it was also hard to ensure that a path type didn't leak back to a string (OpamFilename.t is now even more pervasive than it was then, so this is probably less of a problem). However, while this made things neater for some cases, there still remained what happened if you wanted, say, to embed a path in a call to a Microsoft C compiler invocation, say, /FeC:\Users\DRA\Documents\.opam\default\.opam-switch\build\package\foo.exe where the /Fe must be part of the switch (i.e. you have to embed the path in a string, it can't exist on its own). That example is of course not commonplace, but that's not the point.
  • The suggestion of making / special could be very interesting and now would be the time to do it - the / prefix is for many Windows commands the equivalent of - or -- (Microsoft's compiler tools almost universally accept either / or -, but other Windows commands are not as liberal) - but if this were changed now, then it would be Windows which would be affected by the behaviour and it would be Windows where breakages would be likely to take place (e.g. cmd /c rd /s/q C:\some-temp-directory fails noisily if you forget to escape the forward-slashes). Also, we should already be suspicious of / in a prefix position, as absolute paths should not be being embedded in opam files.
  • As regards forgetting to use the %<...>% notation - a corollary of Louis's suggestion is that uses of / outside %<...>% could reasonably be reported by opam lint as probably wrong...

@dbuenzli
Copy link
Contributor

dbuenzli commented May 3, 2017

I am virtually certain both from a presentational and technical perspective that forcing forward-slashes on Windows is a mistake (even if a mistake made by a lot of other "cross-platform" projects) and we should be sticking to the actual convention. As a tiny bit of background, while there are various layers of compatibility, the separator at the kernel level is backslash only (it is API functions which do the conversion, but that's at the subsystem level).

This misses my point. I think that your proposal introduces a complexity that we could eschew. More specifically here's what I think should be done.

  1. opam peforms path variable substitution in opam files using forward slashes, the build
    systems should behave correctly and opam file writers should use forward slashes in their constants. It should be the duty of their build system to render and use them according to the platform conventions.
  2. opam when otherwise provided with path information should always accept either the platform convention or forward slashes.
  3. opam when otherwise queried for path information should always render it according to the platform conventions.

Now you may see 1. as already being a heresy but I think it's will make our life simpler without adding more idiosyncrasy to the already confusing opam language.

@dra27
Copy link
Member Author

dra27 commented May 3, 2017

For me, it escapes heresy by not requiring or displaying the forward-slashes! Actually I think we're in closer agreement than I possibly realised, but I'm not convinced that a key part of (1) is presently doable (@AltGr may have a different opinion). The problem is that there is no actual notion of a "path variable" (at present) - that's partly what this syntax introduces, though it's of course not the only way. Storing forward-slashes internally also shifts considerable complexity back into opam itself.

While it would still add complexity to the language, @AltGr's suggestion of a whole string qualifier would at least get rid of the commonplace %<%{ prefix (there would be a small, but probably insignificant, loss of expressivity with that change).

Overall, I'm afraid I see as requirement what you see as idiosyncrasy. It's also potentially a big (though not insurmountable) ask to expect all build systems (or, in general, commands being executed) to convert the forward slashes (shell invocations, where necessary, become harder - there are build systems, such as nmake, which cannot cope with them.

@dbuenzli
Copy link
Contributor

dbuenzli commented May 3, 2017

For me, it escapes heresy by not requiring or displaying the forward-slashes!

The log of invocations would still be heretical.

The problem is that there is no actual notion of a "path variable" (at present)

Aren't the %{bin}%, %{lib}%, etc. variables typed internally ?

Storing forward-slashes internally also shifts considerable complexity back into opam itself.

You don't store them forwardly. When you perform substitution in the build: section you force rendering of the path to string with the forward slash convention. But of course you need the variables to be typed to know on which ones you have to do this.

@AltGr
Copy link
Member

AltGr commented May 4, 2017

Aren't the %{bin}%, %{lib}%, etc. variables typed internally ?

No, the variables are only booleans or strings at the moment. Besides, that could trigger changes within the variable expansion, but not for the rest of the string, e.g. in the right part of "%{lib}%/foobar/".

I still only see "forward slashes everywhere" or "'/' is a special char meaning <os-specific-path-separator> everywhere" as solutions. The major pitfall of the second solution is that it still means '/' on most OSes, so people won't notice any mistake on their part...

@dbuenzli
Copy link
Contributor

dbuenzli commented May 4, 2017

I still only see "forward slashes everywhere" or "'/' is a special char meaning everywhere" as solutions.

I dislike both, especially the second. Isn't / anyways forbidden as variable names ? We could then simply have the notation %{bin/foo/bar}% do the path translations, i.e. the first segment defines a variable the rest after the first / is an arbitrary string on which path translations are performed.

If we want that to be more obvious we could use any other character that is not allowed in variable names, e.g. |, %{bin|foo/bar}% or even a wink to windows user with \, %{bin\foo/bar}%.

@AltGr
Copy link
Member

AltGr commented May 5, 2017

@dbuenzli I thought "forward slashes everywhere" was what you initially suggested ?
%{bin/foo}% is indeed lighter-weight than the original suggestion; it's also closer to @dra27's initial prototype, which used %{<$lib$/stublibs}% !
It's less expressive though: what about literal paths that don't use variables ? %{/foo}% or %{./foo/bar}% ? It's weird and confusing...

@dbuenzli
Copy link
Contributor

dbuenzli commented May 5, 2017

@dbuenzli I thought "forward slashes everywhere" was what you initially suggested ?

No I suggested 1. + 2. + 3. Now since it seems you don't have the information internally about what is a path and what is not it seems that if we have 1. we cannot have 3. which I see as an ux issue.

@AltGr AltGr added the PR: OUTDATED This PR might still have meaningful content, but is getting stale and might be closed label Mar 11, 2020
@AltGr
Copy link
Member

AltGr commented Mar 11, 2020

Note: we are now attempting to lock the opam file-format (https://github.com/ocaml/opam-file-format/) to allow for tighter integration with dune... so we should consider this shortly if something is to be done.
Cc. @dra27

@dra27
Copy link
Member Author

dra27 commented Nov 14, 2023

This is resolved in a much more principled, elegant and less syntactically-invasive way in #5636 🥳

@dra27 dra27 closed this Nov 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: OUTDATED This PR might still have meaningful content, but is getting stale and might be closed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants