Check if pidlock process is still running #50213

Closed
maleadt opened this issue Jun 19, 2023 · 1 comment · Fixed by #50214

maleadt commented Jun 19, 2023

I somehow interrupted a precompilation process, resulting in a stale pidlock somewhere in my depot. This causes precompilation to hang, which for all intents and purposes looks like a plain hang (IIUC JuliaLang/Pkg.jl#3519 wouldn't help here, because this is triggered by using Package in code, not by Pkg.precompile()).

Setting JULIA_DEBUG shows that it's stuck waiting for a pidlock, as does the backtrace when interrupting:

┌ Debug: Waiting for another process (pid: 5719) to finish precompiling CUDA [052768ef-5323-5732-b1bb-66c8b64840ba]
└ @ Base loading.jl:2831
^C
[7169] signal (2): Interrupt
in expression starting at /home/tim/Julia/pkg/CUDA/wip.jl:1
epoll_wait at /usr/lib/libc.so.6 (unknown line)
uv__io_poll at /workspace/srcdir/libuv/src/unix/epoll.c:236
uv_run at /workspace/srcdir/libuv/src/unix/core.c:400
ijl_task_get_next at /home/tim/Julia/src/julia/src/partr.c:436
unknown function (ip: 0x7f59f774c93e)
unknown function (ip: 0x7f59f75d646c)
unknown function (ip: 0x7f59f77a27c5)
unknown function (ip: 0x7f59f704af20)
unknown function (ip: 0x7f59f6fa0c03)
watch_file at /home/tim/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/FileWatching/src/FileWatching.jl:778 [inlined]
#13 at /home/tim/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/FileWatching/src/pidfile.jl:257
unknown function (ip: 0x7f5a0d07daf9)
_jl_invoke at /home/tim/Julia/src/julia/src/gf.c:2870 [inlined]
ijl_apply_generic at /home/tim/Julia/src/julia/src/gf.c:3071
jl_apply at /home/tim/Julia/src/julia/src/julia.h:1963 [inlined]
start_task at /home/tim/Julia/src/julia/src/task.c:1238
unknown function (ip: (nil))
Allocations: 2836 (Pool: 2827; Big: 9); GC: 0

Crucially, process 5719 no longer exists. Instead of waiting for the pidlock to go stale, which seems to take ages (it's been almost 10 minutes), shouldn't we first check whether the process even exists and, if not, break the lock?

IanButterworth transferred this issue from JuliaLang/Pkg.jl Jun 19, 2023

IanButterworth commented

The lock should be broken if the process doesn't exist anymore, but I forgot to set stale_age, which disabled that check entirely.

function isvalidpid(hostname::AbstractString, pid::Cuint)
    # can't inspect remote hosts
    (hostname == "" || hostname == gethostname()) || return true
    # pid < 0 is never valid (must be a parser error or different OS),
    # and would have a completely different meaning when passed to kill
    !iswindows() && pid > typemax(Cint) && return false
    # (similarly for pid 0)
    pid == 0 && return false
    # see if the process id exists by querying kill without sending a signal
    # and checking if it returned ESRCH (no such process)
    return ccall(:uv_kill, Cint, (Cuint, Cint), pid, 0) != UV_ESRCH
end
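
For illustration, here is a minimal sketch of how a caller can opt into stale-lock handling by passing the stale_age keyword to mkpidlock from FileWatching.Pidfile. The lock path and the 10-second value are assumptions chosen for the demo; they are not the values used in the actual fix.

```julia
using FileWatching.Pidfile: mkpidlock

# Hypothetical lock location, used only for this demo.
lockpath = joinpath(mktempdir(), "demo.pid")

# stale_age > 0 turns on staleness handling. With the default of 0,
# an existing pidfile is never treated as stale, so a lock left behind
# by a dead process would block waiters indefinitely.
result = mkpidlock(lockpath; stale_age = 10, wait = true) do
    # Critical section: runs while this process holds the lock.
    21 * 2
end
```

The do-block form releases the lock when the block exits, even if it throws, and returns the block's value.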

Fixed in #50214
