Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[opam 2.1~beta2] Sandboxed opam var fail when jobs > 1(?) #4354

Closed
kit-ty-kate opened this issue Sep 13, 2020 · 13 comments · Fixed by #4467
Closed

[opam 2.1~beta2] Sandboxed opam var fail when jobs > 1(?) #4354

kit-ty-kate opened this issue Sep 13, 2020 · 13 comments · Fixed by #4467
Assignees
Projects
Milestone

Comments

@kit-ty-kate
Copy link
Member

Using opam 2.1.0~beta2, I sometimes get failures when compiling ocaml-freestanding and/or mirage-crypto which both use opam var during their compilation.
For instance if I compile mirage-crypto alone, it compiles fine:

$ opam install mirage-crypto
The following actions will be performed:
  ∗ install mirage-crypto 0.8.5
Do you want to continue? [Y/n] y

<><> Processing actions ><><><><><><><><><><><><><><><><><><><><><><><><><><><>
⬇ retrieved mirage-crypto.0.8.5  (cached)
∗ installed mirage-crypto.0.8.5
Done.

But if I compile both ocaml-freestanding and mirage-crypto at the same time, opam fails during the opam var call:

$ opam reinstall mirage-crypto ocaml-freestanding
The following actions will be performed:
  ↻ recompile ocaml-freestanding 0.6.2
  ↻ recompile mirage-crypto      0.8.5
  ↻ recompile hacl_x25519        0.2.0 [uses ocaml-freestanding]
  ↻ recompile fiat-p256          0.2.1 [uses ocaml-freestanding]
===== ↻ 4 =====
Do you want to continue? [Y/n] y

<><> Processing actions ><><><><><><><><><><><><><><><><><><><><><><><><><><><>
⬇ retrieved fiat-p256.0.2.1  (cached)
⬇ retrieved hacl_x25519.0.2.0  (cached)
⬇ retrieved ocaml-freestanding.0.6.2  (cached)
⬇ retrieved mirage-crypto.0.8.5  (cached)
⊘ removed   fiat-p256.0.2.1
⊘ removed   hacl_x25519.0.2.0
⊘ removed   mirage-crypto.0.8.5
⊘ removed   ocaml-freestanding.0.6.2
∗ installed ocaml-freestanding.0.6.2
∗ installed fiat-p256.0.2.1
∗ installed hacl_x25519.0.2.0
[ERROR] The compilation of mirage-crypto.0.8.5 failed at "dune build -p mirage-crypto -j 4".

#=== ERROR while compiling mirage-crypto.0.8.5 ================================#
# context     2.1.0~beta2 | linux/x86_64 | ocaml-base-compiler.4.11.1 | https://opam.ocaml.org/#706bf1d6
# path        ~/.opam/4.11/.opam-switch/build/mirage-crypto.0.8.5
# command     ~/.opam/opam-init/hooks/sandbox.sh build dune build -p mirage-crypto -j 4
# exit-code   1
# env-file    ~/.opam/log/mirage-crypto-554126-ad6566.env
# output-file ~/.opam/log/mirage-crypto-554126-ad6566.out
### output ###
#       cflags freestanding/cflags-freestanding.sexp (exit 1)
# (cd _build/default/freestanding && ./cflags.sh) > _build/default/freestanding/cflags-freestanding.sexp
# Fatal error:
# /usr/local/bin/opam: "open" failed on /home/kit_ty_kate/.opam/4.11/.opam-switch/packages/cache: Read-only file system



<><> Error report <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
┌─ The following actions failed
│ λ build mirage-crypto 0.8.5
└─ 
┌─ The following changes have been performed
│ ⊘ remove    mirage-crypto      0.8.5
│ ↻ recompile fiat-p256          0.2.1
│ ↻ recompile hacl_x25519        0.2.0
│ ↻ recompile ocaml-freestanding 0.6.2
└─ 
@AltGr
Copy link
Member

AltGr commented Sep 14, 2020

Ouch, I see what happens here. This file is a new cache that can get written to after loading the installed opam packages files.
When a package gets installed or removed, opam wipes that cache to invalidate it, and then the next call to opam var will try to rewrite it. The package builds should really be calling opam var --readonly, though, in general any opam command that is possibly called from within opam should have --readonly.

Do you think we should push for more use of --readonly ? Solving this would otherwise be as simple as ignoring the error...

@AltGr
Copy link
Member

AltGr commented Sep 23, 2020

Same issue happens with opam install alt-ergo-lib.2.3.3
Cf. OCamlPro/alt-ergo#364

We should probably just silence the error for now.

@kit-ty-kate
Copy link
Member Author

I'm still unsure why opam var isn't readonly by default though. Is there a reason behind this? I think the mirage team (cc @hannesm) already asked this question somewhere but just in case

@AltGr
Copy link
Member

AltGr commented Oct 6, 2020

Well, opam keeps some internal caches in sync, and opam var might trigger an update of those in some cases. There would be no technical difficulty in making it read-only, but that might increase the cost of repeated calls in a normal (non reentrant) setting.

@dra27 dra27 added this to the 2.1.0~beta3 milestone Oct 30, 2020
@dra27 dra27 added this to To do in Opam 2.1.x via automation Oct 30, 2020
@kit-ty-kate
Copy link
Member Author

There would be no technical difficulty in making it read-only, but that might increase the cost of repeated calls in a normal (non reentrant) setting.

I don't think this is an issue. I don't think anyone use opam var outside of scripts to be used within opam itself.

kit-ty-kate added a commit to kit-ty-kate/ocaml-solo5 that referenced this issue Nov 6, 2020
@hannesm
Copy link
Member

hannesm commented Nov 6, 2020

I don't know much about these bits in opam, but I'm curious why this is only an issue in 2.1...

I understand "opam keeps some internal caches in sync, and opam var might trigger an update of those in some cases" - but I do not know the full background story: why are there caches that may be updated? Do they improve the execution time of sth tremendously?

From a user perspective, I'd have never guessed that opam var may lead to a disk write anywhere. Since opam 2.0 did not lead to the observation of the behaviour described above, I'd be in favor to revert to the 2.0 behaviour.

@dra27
Copy link
Member

dra27 commented Nov 13, 2020

Curiously, this should have been addressed #4405?

@rjbou rjbou linked a pull request Nov 13, 2020 that will close this issue
@rjbou rjbou moved this from To do to In Progress in Opam 2.1.x Nov 13, 2020
Opam 2.1.x automation moved this from In Progress to Done Dec 10, 2020
@kit-ty-kate
Copy link
Member Author

As noted in #4429, the fix doesn't fully work. Could this issue be reopened?

@rjbou rjbou reopened this Dec 10, 2020
Opam 2.1.x automation moved this from Done to To do Dec 10, 2020
@AltGr
Copy link
Member

AltGr commented Dec 15, 2020

@hannesm well, computing the environment is quite involved, since it requires listing all installed packages, getting their definitions, etc. so caching this means a much welcome speedup (esp. when used in the shell prompt, emacs startup script, etc.)

Now when you are in the middle of changing the state of that switch, the cache needs to be invalidated; the default in case of a missing cache is for opam var to update it after computing its value, unless --readonly was specified. The proposed fix is to just give up on the cache update if you happen not to be able to write at that point.

@AltGr
Copy link
Member

AltGr commented Dec 15, 2020

@kit-ty-kate can't manage to reproduce on master ; could you give a backtrace or at least the precise error you are getting ?

@hannesm
Copy link
Member

hannesm commented Dec 16, 2020

@AltGr thanks for explanation -- I'm now convinced that --readonly should be added to the invocations when reading a variable -- since when is --readonly available in opam, was it already in 2.0.0 (or even 1.x)?

@kit-ty-kate
Copy link
Member Author

@hannesm --readonly is available since 2.0.0, if you wish to be 1.x-compatible you might want to use --safe instead.

@dra27 dra27 modified the milestones: 2.1.0~beta4, 2.1.0~beta5 Jan 8, 2021
@kit-ty-kate
Copy link
Member Author

I haven't seen these errors for a while. Closing. I'll reopen the issue or recreate it if I see the issue again.

Opam 2.1.x automation moved this from To do to Done Feb 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Opam 2.1.x
  
Done
Development

Successfully merging a pull request may close this issue.

5 participants