Problems with .opam/download-cache and packages which depend on git information #4751

Closed
Alizter opened this issue Jul 14, 2021 · 17 comments
@Alizter

Alizter commented Jul 14, 2021

I have a problem with the download-cache folder of opam. To replicate the behaviour try the following:

  1. Get the coq dev packages: opam repo add coq-core-dev https://coq.inria.fr/opam/core-dev
  2. Create a switch and install coq.8.13.dev
  3. Now create another switch and install coq.dev
  4. The installation of coq will fail due to some confusion with the git history in download-cache.

I know that this is not a problem with the coq packaging, because if you delete the download-cache folder between steps 2 and 3 it will work. For some reason, opam is not doing something right with the git stuff here.

# opam config report
# opam-version         2.1.0~rc2 
# self-upgrade         no
# system               arch=x86_64 os=linux os-distribution=ubuntu os-version=18.04
# solver               builtin-mccs+glpk
# install-criteria     -removed,-count[avoid-version,changed],-count[version-lag,request],-count[version-lag,changed],-count[missing-depexts,changed],-changed
# upgrade-criteria     -removed,-count[avoid-version,changed],-count[version-lag,solution],-count[missing-depexts,changed],-new
# jobs                 15
# repositories         4 (http) (default repo at 8c58ed60)
# pinned               1 (version)
# current-switch       coq.dev
# ocaml:native         true
# ocaml:native-tools   true
# ocaml:native-dynlink true
# ocaml:stubsdir       /mnt/sdd1/.opam/coq.dev/lib/ocaml/stublibs:/mnt/sdd1/.opam/coq.dev/lib/ocaml
# ocaml:preinstalled   false
# ocaml:compiler       4.12.0

@kit-ty-kate
Member

Would you be able to post the error messages?

@Alizter
Author

Alizter commented Jul 14, 2021

<><> Processing actions <><><><><><><><><><><><><><><><><><><><><><><><><><><><>
[ERROR] The installation of coq failed at "make install".

#=== ERROR while installing coq.dev ===========================================#
# context     2.1.0~rc2 | linux/x86_64 |  | pinned(git+https://github.com/coq/coq.git#master#f684f34e)
# path        /mnt/sdd1/.opam/coq.dev/.opam-switch/build/coq.dev
# command     /mnt/sdd1/.opam/opam-init/hooks/sandbox.sh install make install
# exit-code   2
# env-file    /mnt/sdd1/.opam/log/coq-19150-d81af9.env
# output-file /mnt/sdd1/.opam/log/coq-19150-d81af9.out
### output ###
# [...]
# DUNE      sources
# dune install --display quiet  --mandir=""/mnt/sdd1/.opam/coq.dev/man"" --prefix="/mnt/sdd1/.opam/coq.dev" coq-core
# Installing /mnt/sdd1/.opam/coq.dev/lib/coq-core/META
#          git (internal) (exit 128)
# /usr/bin/git describe --always --dirty > /opam-tmp/dunecc2a25.output
# error: object directory /mnt/sdd1/.opam/download-cache/git/objects does not exist; check .git/objects/info/alternates.
# fatal: bad object HEAD
# Makefile.install:56: recipe for target 'install-dune' failed
# make[1]: *** [install-dune] Error 1
# make[1]: Leaving directory '/mnt/sdd1/.opam/coq.dev/.opam-switch/build/coq.dev'
# Makefile.make:122: recipe for target 'submake' failed
# make: *** [submake] Error 2



<><> Error report <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
┌─ The following actions failed
│ ∗ install coq dev
└─ 

@dra27
Member

dra27 commented Jul 15, 2021

Thanks, @Alizter - please could you give the precise instructions you are running to do this. For example, there are additional repositories and possibly pins which you haven't mentioned.

@Alizter
Author

Alizter commented Jul 15, 2021

@dra27 I haven't pinned anything in either switch. You need the coq dev packages for opam:

opam repo add coq-core-dev https://coq.inria.fr/opam/core-dev

After which, I create a switch and install coq.8.13.dev for example:

opam switch create myswitch ocaml-option-flambda.1
opam install coq.8.13.dev

AFAICT opam is cloning the git repo for coq in the background and putting it in download-cache, with the 8.13.dev branch checked out.

Now making another switch and installing the latest dev version of coq

opam switch create myotherswitch ocaml-option-flambda.1
opam install coq.dev

will fail at some point during the build. This is because the dune setup in coq uses the commit hash and something isn't being found. The error I posted above gets displayed.

The workaround is to delete download-cache which necessarily causes a fresh copy of the coq repo to be cloned. Therefore doing

opam install coq.dev

works according to plan.

I'm not sure how I can be more precise here?

@dra27
Member

dra27 commented Jul 15, 2021

Thanks - those commands are more detailed than the original report. In terms of being more precise, they could be commands which actually work! 😉 opam repo add doesn't update new switch selections, so the switches you create won't include coq-core-dev.

This Dockerfile is working for me (technically it's using the tip of the 2.1 opam branch, so it reports 2.1.0 rather than 2.1.0~rc2, but there're no relevant changes, as we're not using depexts here):

FROM ocaml/opam
RUN sudo apt-get install -yy libgmp-dev && \
    cd ~/opam-repository && \
    git pull origin master && \
    sudo ln -f /usr/bin/opam-2.1 /usr/bin/opam && \
    opam update
RUN opam --version

RUN opam repo add coq-core-dev https://coq.inria.fr/opam/core-dev --all-switches --set-default
RUN opam switch create myswitch ocaml-option-flambda
RUN opam install coq.8.13.dev
RUN opam switch create myotherswitch ocaml-option-flambda
RUN opam install coq.dev

That builds.

AFAICT opam is cloning the git repo for coq in the background and putting it in download-cache, with the 8.13.dev branch checked out.

There shouldn't be a checkout in the download-cache - are you sure that's what you've got? There should be a bare git repository in .opam/download-cache/git containing branches for each cached clone:

opam@312a669bd5ea:~/.opam/download-cache/git$ git branch -a
  remotes/70a69f9730c6b0a33869c5b29b3cd1ce
  remotes/fb9e53fc5266f3285fd2364f0292dd62

and those are used to populate the sources directory in the switches themselves.
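A quick way to check what is actually sitting in the cache directory is to ask git directly. This is only a minimal sketch: inspect_cache is a hypothetical helper name (not an opam command), and the path in the usage line assumes the default root location when OPAMROOT is unset.

```shell
# Hypothetical helper: report whether a directory is a bare git repository
# and, if so, list the branches it holds.
inspect_cache() {
    cache_dir=$1
    # rev-parse prints "true" only inside a bare repository; errors are hidden
    # so that a missing directory is reported the same way as a non-bare one.
    if [ "$(git -C "$cache_dir" rev-parse --is-bare-repository 2>/dev/null)" = "true" ]; then
        echo "bare repository: $cache_dir"
        git -C "$cache_dir" branch -a
    else
        echo "not a bare repository: $cache_dir"
        return 1
    fi
}

# Usage (path is an assumption about where the opam root lives):
#   inspect_cache "${OPAMROOT:-$HOME/.opam}/download-cache/git"
```

If the helper reports a non-bare repository, something other than opam has probably written to that directory.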

@Alizter
Author

Alizter commented Jul 16, 2021

OK here is another way to get my issue:

opam switch create myswitch ocaml-option-flambda.1
opam repo add coq-core-dev https://coq.inria.fr/opam/core-dev
opam install coq.8.13.dev
opam install coq.dev

Perhaps I am misdiagnosing download-cache as being the issue here.

However, the way to fix the above is still to delete download-cache.

@dra27
Member

dra27 commented Jul 19, 2021

That's working for me in Docker, as well (modulo adding --all-switches --set-default to your opam repo add command).

@dra27
Member

dra27 commented Jul 19, 2021

Would you be able to try doing it with a fresh opam root? That means doing opam init /mnt/sdd1/.opam-temp and then either exporting OPAMROOT=/mnt/sdd1/.opam-temp, temporarily adding it to all commands, or using --root=/mnt/sdd1/.opam-temp (i.e. please don't erase /mnt/sdd1/.opam!).

@Alizter
Author

Alizter commented Jul 19, 2021

So I tried doing a fresh opam root, and my instructions don't reproduce the problem there. Or rather, everything works as intended. The reason I didn't think this was to do with my particular opam root is that I did a fresh root very recently. I had a .opam from before 2.1 which I updated to be compatible, and I got the error above. Thinking I had corrupted my root somehow, I created a fresh one but ran into the same issue, hence I submitted the bug report.

I initially brought this up with the coq developers, but they said that it was probably an opam issue, and suggested that I file a report here. Now going back to my original problem, everything works as intended. So it probably was a Coq bug like I suspected in the beginning. I haven't tracked down where it happened, but I am more confident that it has nothing to do with opam.

I am therefore closing this issue. Thanks for all your help! And also for showing me how to do temp .opam dirs.

@Alizter Alizter closed this as completed Jul 19, 2021
@dra27
Member

dra27 commented Jul 19, 2021

No problem, @Alizter! The ocaml/opam docker images are very useful for putting reproduction cases together, as is the temporary opam root trick. If this does spring back up, do of course re-open/create a fresh issue.

@Alizter
Author

Alizter commented Jul 19, 2021

I had a lot of trouble running docker locally, but not enough time to investigate it. Specifically the sudo commands were causing problems.

@Alizter Alizter reopened this Jul 19, 2021
@Alizter
Author

Alizter commented Jul 19, 2021

I just ran into this:

The following actions will be performed:
  ↻ recompile coq      dev [upstream or system changes]
  ↻ recompile coqide   dev [upstream or system changes]
  ↻ recompile coq-hott dev [upstream or system changes]
===== ↻ 3 =====
Do you want to continue? [Y/n] y

<><> Processing actions <><><><><><><><><><><><><><><><><><><><><><><><><><><><>
⬇ retrieved coq.dev  (no changes)
⬇ retrieved coqide.dev  (no changes)
⬇ retrieved coq-hott.dev  (no changes)
⊘ removed   coq-hott.dev
⊘ removed   coqide.dev
⊘ removed   coq.dev
[ERROR] The installation of coq failed at "make install".

#=== ERROR while installing coq.dev ===========================================#
# context     2.1.0~rc2 | linux/x86_64 | ocaml-option-flambda.1 | https://coq.inria.fr/opam/core-dev#2021-07-12 15:20
# path        /mnt/sdd1/.opam/coq-hott-dep/.opam-switch/build/coq.dev
# command     /mnt/sdd1/.opam/opam-init/hooks/sandbox.sh install make install
# exit-code   2
# env-file    /mnt/sdd1/.opam/log/coq-466155-1e9b04.env
# output-file /mnt/sdd1/.opam/log/coq-466155-1e9b04.out
### output ###
# [...]
# make --warn-undefined-variable --no-builtin-rules -f Makefile.build install
# make[1]: Entering directory '/mnt/sdd1/.opam/coq-hott-dep/.opam-switch/build/coq.dev'
# DUNE      sources
# dune install --display quiet  --mandir=""/mnt/sdd1/.opam/coq-hott-dep/man"" --prefix="/mnt/sdd1/.opam/coq-hott-dep" coq-core
# Installing /mnt/sdd1/.opam/coq-hott-dep/lib/coq-core/META
#          git (internal) (exit 128)
# /usr/bin/git describe --always --dirty > /opam-tmp/dune0a8ea9.output
# error: object directory /mnt/sdd1/.opam/download-cache/git/objects does not exist; check .git/objects/info/alternates
# fatal: bad object HEAD
# make[1]: *** [Makefile.install:56: install-dune] Error 1
# make[1]: Leaving directory '/mnt/sdd1/.opam/coq-hott-dep/.opam-switch/build/coq.dev'
# make: *** [Makefile.make:122: submake] Error 2



<><> Error report <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
┌─ The following actions failed
│ ∗ install coq dev
└─ 
┌─ The following changes have been performed (the rest was aborted)
│ ⊘ remove coq      dev
│ ⊘ remove coq-hott dev
│ ⊘ remove coqide   dev
└─ 

The former state can be restored with:
    /usr/bin/opam switch import
"/mnt/sdd1/.opam/coq-hott-dep/.opam-switch/backup/state-20210719183632.export"
Or you can retry to install your package selection with:
    /usr/bin/opam install --restore

It's a bit tricky to test, but I suspect that upgrading a dev version of a package (in this case coq) doesn't change which branch is checked out, but something is getting left behind.

@dra27 Any idea how I might test this?

@dra27
Member

dra27 commented Jul 27, 2021

We discussed this again in last Friday's dev meeting and finally figured it out. Thanks for persisting, @Alizter!

It turns out that it's a bug in the sandbox which is why we're all struggling to reproduce it in Docker (where the sandbox is disabled, as Docker's already a sandbox...).

The issue is that you have your .opam root in a "funny" place - /mnt is not available in the sandbox. opam uses git alternates to share commits between clones (that's obviously a big win when you pin a single Git repository with more than one package in it). So what's happening is that the Git clone of the coq repo references the download cache which is then not found.

Deleting the download-cache fixed it for you because opam doesn't set up the alternates mechanism for the git clone when there isn't a download cache (it just does a normal clone).
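The failure mode can be reproduced with plain git and no opam at all. The sketch below is an illustration under stated assumptions, not what opam runs internally: the repository names (upstream, cache.git, checkout) are made up, git clone --shared stands in for however opam sets up alternates, and renaming the cache directory stands in for the sandbox not exposing that path.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the upstream repository, with a single (empty) commit.
# The identity is passed inline so the demo works without any git config.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Bare "cache" repository, playing the role of download-cache/git.
git clone -q --bare upstream "$work/cache.git"

# A clone that stores no objects of its own and borrows them from the cache
# through .git/objects/info/alternates (--shared sets that file up).
git clone -q --shared "$work/cache.git" checkout
echo "alternates: $(cat checkout/.git/objects/info/alternates)"

# The same command that dune runs in the build log above.
describe() { git -C checkout describe --always --dirty >/dev/null 2>&1; }

# Cache reachable: the command succeeds.
if describe; then visible_status=ok; else visible_status=failed; fi

# Cache hidden, as in a sandbox that does not mount its path: the command fails.
mv cache.git cache.git.hidden
if describe; then hidden_status=ok; else hidden_status=failed; fi

# Cache reachable again: the command succeeds again.
mv cache.git.hidden cache.git
if describe; then restored_status=ok; else restored_status=failed; fi

echo "visible: $visible_status, hidden: $hidden_status, restored: $restored_status"
```

Dropping the redirection inside describe shows git's own error text while the cache is hidden, which can be compared against the log posted earlier in this thread.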

Do you have OPAMROOT=/mnt/sdd1/.opam or is $HOME set to /mnt/sdd1? Either way, we should ensure that at least the download-cache (and probably the entire opamroot) is always mounted read-only in the sandbox.

There is a workaround at the moment, and it'd be handy if you could try it to confirm that we've definitely nailed the bug: running with OPAM_USER_PATH_RO=/mnt/sdd1/.opam should fix the problem.
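Trying the workaround might look roughly like the sketch below. It assumes the root may be reached through a symlink, so it resolves the real location first with GNU readlink -f. The fallback to $HOME/.opam and the variable names opam_root and real_root are illustrative assumptions, and the final opam command is left commented out.

```shell
# Resolve the opam root to its real location; a symlinked ~/.opam
# resolves to wherever it actually lives (e.g. under /mnt).
opam_root="${OPAMROOT:-$HOME/.opam}"
real_root=$(readlink -f "$opam_root")

# Ask the sandbox to expose that path read-only, using the variable
# named in the comment above.
export OPAM_USER_PATH_RO="$real_root"
echo "OPAM_USER_PATH_RO=$OPAM_USER_PATH_RO"

# Then retry the failing install from the same shell, e.g.:
#   opam install coq.dev
```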

@Alizter
Author

Alizter commented Jul 27, 2021

Do you have OPAMROOT=/mnt/sdd1/.opam or is $HOME set to /mnt/sdd1?

I believe I just symlinked the default .opam to live in sdd1.

I'll try what you suggested next week when I have access to my machine again.

@kit-ty-kate
Member

Fix proposed in #4795. Could you double check if it fixes your issue?

@AltGr
Member

AltGr commented Nov 8, 2021

Closing as "probably fixed"; feel free to reopen if not, and thanks again for the report and help tracking the issue.

@AltGr AltGr closed this as completed Nov 8, 2021
@Alizter
Author

Alizter commented Nov 9, 2021

Yep that's fine. I don't have access to the original machine where .opam was being installed to an external drive anymore so I wasn't able to test the fix.
