Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ctx.actions.symlink doesn't make relative symlink, contrary to docs #14224

Open
pauldraper opened this issue Nov 4, 2021 · 21 comments · May be fixed by #24213
Open

ctx.actions.symlink doesn't make relative symlink, contrary to docs #14224

pauldraper opened this issue Nov 4, 2021 · 21 comments · May be fixed by #24213
Labels
not stale Issues or PRs that are inactive but not considered stale P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts type: bug

Comments

@pauldraper
Copy link
Contributor

pauldraper commented Nov 4, 2021

ctx.actions.symlink always makes absolute symlinks when target_path is used, regardless of whether target_path is absolute.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

WORKSAPCE.bazel

BUILD.bazel

load(":rules.bzl", "example")

example(
  name = "example",
)

rules.bzl

def _example_impl(ctx):
    example1 = ctx.actions.declare_symlink("example1")
    ctx.actions.symlink(output = example1, target_path = "../..")

    example2 = ctx.actions.declare_symlink("example2")
    ctx.actions.run(executable = "ln", arguments = ["-s", "../..", example2.path], outputs = [example2])

    return DefaultInfo(
        files = depset([example1, example2])
    )

example = rule(
  implementation = _example_impl
)
$ bazel build --experimental_allow_unresolved_symlinks :example
$ ls -l bazel-bin/
lrwxrwxrwx 1 paul paul 68 Nov  4 01:06 example1 -> /home/paul/.cache/bazel/_bazel_paul/fc54cd89f1e9b3d4a2b86f2842cad4ca
lrwxrwxrwx 1 paul paul  5 Nov  4 01:06 example2 -> ../..

I expected both symlinks to be ''../.." but only the latter (using executable ln) is.

Documentation: https://docs.bazel.build/versions/main/skylark/lib/actions.html#symlink

(Experimental) The exact path that the output symlink will point to. No normalization or other processing is applied. Access to this feature requires setting --experimental_allow_unresolved_symlinks.

What operating system are you running Bazel on?

Ubuntu 18.04

What's the output of bazel info release?

4.2.1

@philsc
Copy link
Contributor

philsc commented Nov 5, 2021

I ran into this as well. The issue is that, despite the documentation saying otherwise, the first thing SymlinkAction does is to turn the unresolved path into a PathFragment. The first thing a PathFragment does is to normalize the path.

I don't know what the right answer to fix this is, but I think that's the reason for the issue.

@pauldraper
Copy link
Contributor Author

My fix was to use the ln executable, and everything worked fine. (Platform-specific of course.)

@brandjon brandjon changed the title ctx.actions.symlink doesn't make relative symlink ctx.actions.symlink doesn't make relative symlink, contrary to docs Dec 2, 2021
@brandjon brandjon added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed untriaged labels Dec 2, 2021
@brandjon
Copy link
Member

brandjon commented Dec 2, 2021

This is surprising and contrary to documentation, though I will note that making symlinks to paths (as opposed to artifacts) is still experimental.

@fmeum
Copy link
Collaborator

fmeum commented Jun 26, 2022

@lberki @brandjon I'm currently working on resolving the cluster of issues around symlinks. Would you support adding a new basic operation to Bazel's file system abstraction that allows creating unnormalized symlinks (taking a String rather than a PathFragment)? Without this, I don't see a way to achieve the correct behavior

@lberki
Copy link
Contributor

lberki commented Jun 27, 2022

@fmeum I think there isn't another way to fix this bug, think so please go ahead (unless @haxorz objects but I don't think he will)

My guess is that this comes from the time when PathFragment wasn't forcibly normalized and thus it worked at that time, and we didn't test this case so it got accidentally broken at some point in time.

I'd much rather you replace FileSystem.createSymbolicLink(PathFragment) than add an overload. That will make merging a bit more difficult (we have an internal implementation of FileSystem), but carries less risk of having a more complicated interface forever.

@fmeum
Copy link
Collaborator

fmeum commented Jun 27, 2022

@lberki The only alternative I see is to specify the allowed operations on a PathFragment created with createAlreadyNormalized and mandate implementations of createSymbolicLink to use only these operations. That approach may not even require any changes to code, at the cost of encoding usage restrictions in documentation only. What do you think?

@lberki
Copy link
Contributor

lberki commented Jun 27, 2022

Yeah, strictly speaking, that's an alternative, but that one has a larger blast radius than a change localized to FileSystem.

The normalization was forcibly added in a729b9b with the reasoning that it would help RAM use. I don't know more history. @meisterT maybe you remember something I don't?

There is also 775d3a9 which replaced PathFragment with String specifically to support non-normalized paths, which makes me thing that doing so it the right thing to do here, too.

@fmeum
Copy link
Collaborator

fmeum commented Jun 29, 2022

@meteorcloudy While working on @lberki's recommendation to make FileSystem.createSymbolicLink take target in as a String rather than a PathFragment, I stumbled upon 6c07525 as well as the file/directory type check. Do you see any problems with the following approach that would allow for non-normalized symlink targets?

  1. If createSymbolicLinks is true, let WindowsFileSystem.createSymbolicLink use WindowsFileOperations.createSymlink directly without any file type or existence checks.
  2. If it is not set, proceed with the current logic to either create a junction or a copy.

Do you know whether using symlinks for directories (e.g., under the Bazel convenience symlinks) instead of junctions has any drawbacks?

@fmeum
Copy link
Collaborator

fmeum commented Jun 30, 2022

While working on a PR for this, I noticed https://cs.opensource.google/bazel/bazel/+/5d75c08d9542d35d90047369059cad2e03c6a1d4:src/main/java/com/google/devtools/build/lib/windows/WindowsFileOperations.java;l=229, which is an error raised when trying to create a symlink to a directory. https://cs.opensource.google/bazel/bazel/+/cffd417a4c4ec4d082f815f57a771ad1010972dd does not contain any explanation for why that should be an error though.

@meteorcloudy
Copy link
Member

@fmeum Please see https://docs.google.com/document/d/17YIqUdffxpwcKP-0whHM6TFELN8VohTpjiiEIbbRfts/edit#heading=h.hw82qobcal5z

Basically, creating symlinks on Windows requires some extra privilege, so we decided to always use junction for "directory symlink" and only when --windows_enable_symlinks is on, we try to create file symlink (otherwise, we copy the file).

In generally, I think it's safe to change the behaviour to what you described (IIUC):

  • If --windows_enable_symlinks is on, create symlink for both file and directory.
  • If --windows_enable_symlinks is off, create junction for directory and copy file.

@fmeum
Copy link
Collaborator

fmeum commented Jul 4, 2022

Thanks for the information, I got with that approach then.

@meteorcloudy Do you know whether symlinks on Windows can use forward slash paths? If not, then I should probably translate the path style in WindowsFileSystem.{create,read}SymbolicLink or it will be very difficult to make declare_symlink work cross-platform.

@meteorcloudy
Copy link
Member

@fmeum I think the CreateSymbolicLink API itself doesn't support forward slash paths. But we do translate the paths first before passing to the JNI functions. See https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/windows/WindowsFileOperations.java;l=225

@fmeum
Copy link
Collaborator

fmeum commented Jul 5, 2022

@meteorcloudy Thanks for pointing me to that function, that was the missing piece.

Unfortunately, I don't think that the current ctx.actions.symlink(target_path=...) API can work on Windows. According to the CreateSymbolicLink docs and as verified by my tests, the type of the eventual target of a symlink has to be specified during creation. Since it isn't known to Bazel in the case of an unresolved symlink, I think that we would have to introduce an additional boolean flag indicating the desired type of the target.

@meteorcloudy
Copy link
Member

meteorcloudy commented Jul 5, 2022

I think that we would have to introduce an additional boolean flag indicating the desired type of the target.

Can we use Files.isDirectory(target.toRealPath());
at https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/windows/WindowsFileSystem.java;l=90 to tell if the resolved path is a directory or not?

Related: https://docs.oracle.com/javase/7/docs/api/java/nio/file/Path.html#toRealPath(java.nio.file.LinkOption...)

@fmeum
Copy link
Collaborator

fmeum commented Jul 5, 2022

We could if we wanted to create a resolvable symlink, but in this particular case we might very well want to create a symlink pointing to ../../node-fetch without that path even existing yet.

I actually think that we should just commit to the following restriction: On Windows, ctx.actions.symlink(target_path=...) will always create a directory symlink if the target doesn't exist. This is because:

  1. It doesn't require adding a type attribute to ctx.actions.symlink and, more importantly, the remote execution API, just to deal with a Windows oddity.
  2. Given how recent of an addition symlinks are on Windows, allowing a strict superset of what dangling junctions could do (namely, allowing relative paths) should cover the needs of Windows specific tooling.
  3. Directories seem like much more interesting targets for unresolved symlinks to begin with (see PNPM, which is my main driver for working on this feature) and directory symlinks can at least somewhat emulate file symlinks.

@gregmagolan Die PNPM rely on unresolved file symlinks in any way?

@meteorcloudy
Copy link
Member

I did some experiments with mklink

pcloudy@PCLOUDY2-W C:\Users\pcloudy\tmp
# touch file

pcloudy@PCLOUDY2-W C:\Users\pcloudy\tmp
# mklink /D link_s file
symbolic link created for link_s <<===>> file

pcloudy@PCLOUDY2-W C:\Users\pcloudy\tmp
# mklink /J link_j file
Junction created for link_j <<===>> file

pcloudy@PCLOUDY2-W C:\Users\pcloudy\tmp
# echo hi > file

pcloudy@PCLOUDY2-W C:\Users\pcloudy\tmp
# cat link_s
hi

pcloudy@PCLOUDY2-W C:\Users\pcloudy\tmp
# cat link_j
hi

pcloudy@PCLOUDY2-W C:\Users\pcloudy\tmp
# dir
...
 Directory of C:\Users\pcloudy\tmp

07/06/2022  11:30 AM    <DIR>          .
07/06/2022  11:30 AM    <DIR>          ..
07/06/2022  11:30 AM                 7 file
07/06/2022  11:30 AM    <JUNCTION>     link_j [C:\Users\pcloudy\tmp\file]
07/06/2022  11:30 AM    <SYMLINKD>     link_s [file]
               1 File(s)              7 bytes

It looks like it actually works if a directory symlink or junction pointing to a file. So I guess yes, I agree with "On Windows, ctx.actions.symlink(target_path=...) will always create a directory symlink if the target doesn't exist". But maybe we keep the current behaviour of creating a junction since it doesn't require admin right?

@fmeum
Copy link
Collaborator

fmeum commented Jul 6, 2022

@meteorcloudy Thanks for the experiment, that's really helpful.

But maybe we keep the current behaviour of creating a junction since it doesn't require admin right?

Can a junction point to a relative path though? I thought that wasn't possible, but could you maybe give it a try (I don't have access to a Windows machine).

Admin access should no longer be required if we use SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE, at least using developer mode.

@meteorcloudy
Copy link
Member

Oh, yeah, junction can only take absolute path, see https://superuser.com/questions/343074/directory-junction-vs-directory-symbolic-link

Admin access should no longer be required if we use SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE, at least using developer mode.

Correct, but IIRC turning on developer mode requires admin right 😅.

Anyway, I'm fine with creating directory symlink if --windows_enable_symlinks is on. But if that flag is off, we have to fallback to create junction I guess? (which means ctx.actions.symlink cannot work correctly on Windows).

fmeum added a commit to fmeum/bazel that referenced this issue Jul 24, 2022
fmeum added a commit to fmeum/bazel that referenced this issue Jul 24, 2022
The documentation of `ctx.actions.symlink(target_path = ...)` states
that the given path is used as the symlink target without any changes,
but in reality this path has so far always been converted into an
absolute path by prepending the exec root. This is changed by this
commit, which gets very close to the documented behavior by preserving
the target path as is except for the standard normalization applied by
`PathFragment`. Improving the situation even further would require
modifying or adding to Bazel's core file system API, which may be done
in a follow-up PR.

Since relative symlinks only resolve correctly at the location they were
originally created, the logic for staging symlinks as runfiles in the
sandbox necessarily becomes more complicated: Analogously to the case of
local unsandboxed execution, symlinks in runfiles are now staged as
symlinks to the original exec path of the relative symlink under the
sandboxed exec root as well as the relative symlink itself in that
location.

As a side-effect of the switch to relative symlinks, this commit
resolves a non-hermeticity issue observed in
bazelbuild#10298 (comment)
Integration tests are added to verify that symlinks staged in the
sandbox are no longer resolved non-hermetically, including the case
where they are contained in runfiles.

Fixes bazelbuild#14224
fmeum added a commit to fmeum/bazel that referenced this issue Jul 26, 2022
The documentation of `ctx.actions.symlink(target_path = ...)` states
that the given path is used as the symlink target without any changes,
but in reality this path has so far always been converted into an
absolute path by prepending the exec root. This is changed by this
commit, which gets very close to the documented behavior by preserving
the target path as is except for the standard normalization applied by
`PathFragment`. Improving the situation even further would require
modifying or adding to Bazel's core file system API, which may be done
in a follow-up PR.

Since relative symlinks only resolve correctly at the location they were
originally created, they have to be handled specially when staged as
runfiles. This commit adds a new `declared_symlinks` parameter to the
rule context's `runfiles` method that takes in symlink artifacts
declared via `ctx.actions.declare_symlink`. These symlinks are staged at
their runfiles path directly, with no further processing of their target
and without any intermediate runfiles pointing back to the artifact's
location under the exec root. This has to main benefits:
* With local execution, symlinks are resolved within the runfiles tree,
  which is more hermetic than following the runfiles symlink back into
  the exec root and resolving the symlink artifact there.
* Actions can expect symlink artifacts to be stages as is without
  intermediate symlinks with local and sandboxed execution, both inside
  and outside the runfiles tree. This is important for packaging actions
  as well as rulesets sensitive to symlinks (e.g. rules_js).

As a side-effect of the switch to relative symlinks, this commit
resolves a non-hermeticity issue observed in
bazelbuild#10298 (comment)
Integration tests are added to verify that symlinks staged in the
sandbox are no longer resolved non-hermetically.

Fixes bazelbuild#14224
fmeum added a commit to fmeum/bazel that referenced this issue Jul 26, 2022
The documentation of `ctx.actions.symlink(target_path = ...)` states
that the given path is used as the symlink target without any changes,
but in reality this path has so far always been converted into an
absolute path by prepending the exec root. This is changed by this
commit, which gets very close to the documented behavior by preserving
the target path as is except for the standard normalization applied by
`PathFragment`. Improving the situation even further would require
modifying or adding to Bazel's core file system API, which may be done
in a follow-up PR.

Since relative symlinks only resolve correctly at the location they were
originally created, they have to be handled specially when staged as
runfiles. This commit adds a new `declared_symlinks` parameter to the
rule context's `runfiles` method that takes in symlink artifacts
declared via `ctx.actions.declare_symlink`. These symlinks are staged at
their runfiles path directly, with no further processing of their target
and without any intermediate runfiles pointing back to the artifact's
location under the exec root. This has to main benefits:
* With local execution, symlinks are resolved within the runfiles tree,
  which is more hermetic than following the runfiles symlink back into
  the exec root and resolving the symlink artifact there.
* Actions can expect symlink artifacts to be stages as is without
  intermediate symlinks with local and sandboxed execution, both inside
  and outside the runfiles tree. This is important for packaging actions
  as well as rulesets sensitive to symlinks (e.g. rules_js).

As a side-effect of the switch to relative symlinks, this commit
resolves a non-hermeticity issue observed in
bazelbuild#10298 (comment)
Integration tests are added to verify that symlinks staged in the
sandbox are no longer resolved non-hermetically.

Fixes bazelbuild#14224
fmeum added a commit to fmeum/bazel that referenced this issue Jul 26, 2022
The documentation of `ctx.actions.symlink(target_path = ...)` states
that the given path is used as the symlink target without any changes,
but in reality this path has so far always been converted into an
absolute path by prepending the exec root. This is changed by this
commit, which gets very close to the documented behavior by preserving
the target path as is except for the standard normalization applied by
`PathFragment`. Improving the situation even further would require
modifying or adding to Bazel's core file system API, which may be done
in a follow-up PR.

Since relative symlinks only resolve correctly at the location they were
originally created, they have to be handled specially when staged as
runfiles. This commit adds a new `declared_symlinks` parameter to the
rule context's `runfiles` method that takes in symlink artifacts
declared via `ctx.actions.declare_symlink`. These symlinks are staged at
their runfiles path directly, with no further processing of their target
and without any intermediate runfiles pointing back to the artifact's
location under the exec root. This has to main benefits:
* With local execution, symlinks are resolved within the runfiles tree,
  which is more hermetic than following the runfiles symlink back into
  the exec root and resolving the symlink artifact there.
* Actions can expect symlink artifacts to be stages as is without
  intermediate symlinks with local and sandboxed execution, both inside
  and outside the runfiles tree. This is important for packaging actions
  as well as rulesets sensitive to symlinks (e.g. rules_js).

As a side-effect of the switch to relative symlinks, this commit
resolves a non-hermeticity issue observed in
bazelbuild#10298 (comment)
Integration tests are added to verify that symlinks staged in the
sandbox are no longer resolved non-hermetically.

Fixes bazelbuild#14224
fmeum added a commit to fmeum/bazel that referenced this issue Jul 26, 2022
The documentation of `ctx.actions.symlink(target_path = ...)` states
that the given path is used as the symlink target without any changes,
but in reality this path has so far always been converted into an
absolute path by prepending the exec root. This is changed by this
commit, which gets very close to the documented behavior by preserving
the target path as is except for the standard normalization applied by
`PathFragment`. Improving the situation even further would require
modifying or adding to Bazel's core file system API, which may be done
in a follow-up PR.

Since relative symlinks only resolve correctly at the location they were
originally created, they have to be handled specially when staged as
runfiles. This commit adds a new `declared_symlinks` parameter to the
rule context's `runfiles` method that takes in symlink artifacts
declared via `ctx.actions.declare_symlink`. These symlinks are staged at
their runfiles path directly, with no further processing of their target
and without any intermediate runfiles pointing back to the artifact's
location under the exec root. This has to main benefits:
* With local execution, symlinks are resolved within the runfiles tree,
  which is more hermetic than following the runfiles symlink back into
  the exec root and resolving the symlink artifact there.
* Actions can expect symlink artifacts to be stages as is without
  intermediate symlinks with local and sandboxed execution, both inside
  and outside the runfiles tree. This is important for packaging actions
  as well as rulesets sensitive to symlinks (e.g. rules_js).

As a side-effect of the switch to relative symlinks, this commit
resolves a non-hermeticity issue observed in
bazelbuild#10298 (comment)
Integration tests are added to verify that symlinks staged in the
sandbox are no longer resolved non-hermetically.

Fixes bazelbuild#14224
@brandjon brandjon added untriaged team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts and removed team-Build-Language labels Nov 4, 2022
@comius comius removed the untriaged label Aug 22, 2023
Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Oct 26, 2024
@tjgq tjgq added not stale Issues or PRs that are inactive but not considered stale and removed stale Issues or PRs that are stale (no activity for 30 days) labels Oct 27, 2024
@tomdegoede
Copy link
Contributor

We are running into issues due to this as well. The symlinks created by rules_js are invalidated for each bazel run. This is because of a mismatch between bazels internal relative model, and the actual filesystem state:

image

We only run into this problem for bazel run commands because run ignores --noexperimental_check_output_files. See my comment on this pr: #20853

@fmeum
Copy link
Collaborator

fmeum commented Nov 5, 2024

It looks like Bazel ends up creating symlinks as absolute paths on Windows, which is not the same issue, but still a bug.

Cc @tjgq for your linked PR

@fmeum fmeum linked a pull request Nov 5, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
not stale Issues or PRs that are inactive but not considered stale P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts type: bug
Projects
None yet
10 participants