"Ctrl-C" during installation quits opam but the build continues #4400

Closed
RalfJung opened this issue Oct 20, 2020 · 10 comments · Fixed by #4530

@RalfJung

If your issue concerns a package not building, please report to
https://github.com/ocaml/opam-repository/issues or to the package maintainer
unless you are confident it is an issue in the opam tool itself.

I just noticed that when I cancel opam via Ctrl-C during the build of something that needs a long time to build, opam quits immediately -- but the build actually continues in the background, still taking up CPU and memory.

I would expect Ctrl-C to also cancel the running build.

# opam config report
# opam-version      2.0.7 
# self-upgrade      no
# system            arch=x86_64 os=linux os-distribution=debian os-version=testing
# solver            builtin-mccs+glpk
# install-criteria  -removed,-count[version-lag,request],-count[version-lag,changed],-changed
# upgrade-criteria  -removed,-count[version-lag,solution],-new
# jobs              8
# repositories      4 (http), 3 (local), 2 (version-controlled) (default repo at 6bbebb34)
# pinned            1 (git)
# current-switch    /home/r/Dokumente/Unisachen/perennial/coq
@rjbou
Collaborator

rjbou commented Oct 20, 2020

Opam relays signals, but if the process doesn't catch them, opam can't really do anything about it. See the discussion in #3373.

@RalfJung
Author

RalfJung commented Oct 20, 2020

After the Ctrl-C, I saw a make process still running; sending it a SIGTERM worked fine. make reacts to Ctrl-C, so I assume SIGINT would also have worked. Given that make handles Ctrl-C just fine, I think it is pretty clear that something on the opam side is not properly forwarding the signal to the actual build.

Maybe it is swallowed by bubblewrap?

@rjbou rjbou added this to the 2.1.0~beta3 milestone Oct 20, 2020
@AltGr
Member

AltGr commented Oct 22, 2020

That should absolutely not be the case in general, so I would be extremely interested if you could provide a reproducible case, or at least more details on your build and how it happened; what often happens is that a process (typically, it's difficult to handle C-c properly in OCaml) eats the signal. But yes, probably not in the case of make...

In fact, in any case, opam is expected to wait for the child processes to exit.

So indeed bubblewrap may be the one causing trouble here, they have e.g. this open issue

@RalfJung
Author

That should absolutely not be the case in general, so I would be extremely interested if you could provide a reproducible case,

opam pin add -nq coq-perennial.dev git+https://github.com/mit-pdos/perennial#coq/tested
opam install coq-perennial
# wait a minute
# Ctrl-C
# the build will go on for another hour...

@AltGr
Member

AltGr commented Oct 22, 2020

Thanks!

@AltGr
Member

AltGr commented Oct 22, 2020

Could indeed reproduce, thanks! Some early findings:

  • upon exiting opam, the make -j3 process has been adopted by init
  • this means that the bubblewrap process was properly terminated
  • sending SIGINT to that make process has no effect; I had to send SIGTERM to end it.

So what I am assuming is that bubblewrap propagates the signal correctly but terminates immediately; opam is only waiting for its direct child bwrap, so it doesn't see any problem and exits; but for some reason that I have not figured out yet, coq's Makefile doesn't seem to exit on SIGINT.
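
The process tree described here can be made concrete with a small sketch. It is not opam's code and not the eventual fix; it only illustrates, in Python, one general way for a parent to reach its descendants rather than just its direct child: start the child as the leader of its own process group and signal the whole group. The helper names `run_in_own_group` and `stop_group` are made up for this illustration.

```python
import os
import signal
import subprocess


def run_in_own_group(argv):
    """Start argv as the leader of a new session, hence of a new process group.

    With start_new_session=True the child calls setsid() before exec, so its
    process group id equals its pid and its descendants inherit that group.
    """
    return subprocess.Popen(argv, start_new_session=True)


def stop_group(proc, sig=signal.SIGTERM, timeout=5.0):
    """Send sig to every process in proc's group, then reap the direct child."""
    try:
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        pass  # the whole group has already exited
    return proc.wait(timeout=timeout)


if __name__ == "__main__":
    # The shell forks a long-running grandchild, like a sandbox wrapper would.
    wrapper = run_in_own_group(["sh", "-c", "sleep 30 & wait"])
    print("wrapper exit status:", stop_group(wrapper))
```

Waiting on `proc` alone only covers the direct child, which is the situation described above. A descendant that moves itself into a new session or process group would be out of reach of `os.killpg`, so this is a sketch of the idea, not a guarantee.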

@RalfJung
Author

RalfJung commented Oct 22, 2020

sending SIGINT to that make process has no effect; I had to send SIGTERM to end it.

I have seen that behavior in the past as well, but when I just tried to reproduce it, a SIGINT to make worked as expected and terminated the build.

@dra27 dra27 removed this from the 2.1.0~beta4 milestone Nov 13, 2020
@LasseBlaauwbroek
Contributor

I haven't tested this, but adding the --die-with-parent option to bwrap calls should fix this:

       --die-with-parent
           Ensures child process (COMMAND) dies when bwrap's parent dies. Kills (SIGKILL) all
           bwrap sandbox processes in sequence from parent to child including COMMAND process
           when bwrap or bwrap's parent dies. See prctl, PR_SET_PDEATHSIG.
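
The man page excerpt refers to `prctl` with `PR_SET_PDEATHSIG`. As a rough, Linux-only illustration of that mechanism (not of bwrap's own implementation), the Python sketch below asks the kernel to send SIGKILL to a child when its parent dies. The function names are hypothetical, the constant value 1 is taken from `<linux/prctl.h>`, and calling `prctl` through `ctypes` is an assumption about the host libc.

```python
import ctypes
import signal
import subprocess

PR_SET_PDEATHSIG = 1  # value from <linux/prctl.h>

# Load the C library of the running process (Linux only for prctl).
_libc = ctypes.CDLL(None, use_errno=True)


def _die_with_parent():
    """Runs in the child between fork and exec: request SIGKILL on parent death."""
    if _libc.prctl(PR_SET_PDEATHSIG, int(signal.SIGKILL), 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")


def spawn_tied_to_parent(argv):
    """Start argv so that the kernel kills it if the spawning parent dies."""
    return subprocess.Popen(argv, preexec_fn=_die_with_parent)


if __name__ == "__main__":
    child = spawn_tied_to_parent(["sleep", "1"])
    print("child exit status:", child.wait())
```

The setting survives `execve` for ordinary binaries but is cleared for the children of a later `fork`, so on its own it only covers the process that made the call.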

@dra27 dra27 added this to the 2.1.0~rc milestone Jan 22, 2021
@dra27 dra27 modified the milestones: 2.1.0~rc, 2.2.0~alpha Feb 5, 2021
@avsm
Member

avsm commented Feb 5, 2021

Dev meeting: we may be able to fix this with the bwrap option above, but it requires more work for bubblewrap version detection. Pushing this out to post 2.1.0 as it can go into a bugfix release and is not a critical bug for the 2.1.0 release.

@dra27 dra27 modified the milestones: 2.2.0~alpha, 2.1.1 Feb 5, 2021
@kit-ty-kate
Member

kit-ty-kate commented Feb 7, 2021

The --die-with-parent option exists since bubblewrap 0.1.8 (https://github.com/containers/bubblewrap/releases/tag/v0.1.8).
Looking at distribution support, it seems that only Debian 9 (EOL since July 2020) still ships a version older than 0.1.8: https://pkgs.org/search/?q=bubblewrap
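
The version detection mentioned in the dev meeting could look roughly like the Python sketch below. It is only an illustration, not what opam does: it assumes that `bwrap --version` prints a line such as `bubblewrap 0.4.1`, and the function names are hypothetical. The only fact taken from this thread is the 0.1.8 threshold.

```python
import re
import subprocess

# First bubblewrap release with --die-with-parent, per the comment above.
MIN_DIE_WITH_PARENT = (0, 1, 8)


def parse_bwrap_version(output):
    """Return (major, minor, patch) found in the version output, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def supports_die_with_parent(output):
    """True if the reported version is at least 0.1.8."""
    version = parse_bwrap_version(output)
    return version is not None and version >= MIN_DIE_WITH_PARENT


def sandbox_prefix(version_output):
    """Build the start of a bwrap command line, adding the flag when supported."""
    prefix = ["bwrap"]
    if supports_die_with_parent(version_output):
        prefix.append("--die-with-parent")
    return prefix


if __name__ == "__main__":
    # Assumes bwrap is installed and prints its version on stdout.
    result = subprocess.run(["bwrap", "--version"], capture_output=True, text=True)
    print(sandbox_prefix(result.stdout))
```

Comparing tuples of integers rather than strings keeps a version like 0.10.0 ordered after 0.1.8.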
