I did some digging. The process runs fine for several hundred kernels, then the JIT call to GCC fails for no apparent reason. The kernel where compilation fails is unremarkable and varies between runs.
I noticed the following warning when starting the run:
--------------------------------------------------------------------------
A process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
Local host: [[14689,0],0] (PID 32590)
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
It seems that fork(), and thus subprocess::Popen, is not supported in applications running under MPI. Could this be causing the problem?
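One way to test this hypothesis outside of Bohrium might be a small stress test that spawns a child process many times and records which iterations fail. The sketch below is not Bohrium code: the helper name `stress_spawn` and the example command are made up for illustration. For the test to say anything about the fork warning, it would also have to run in a process where MPI is actually initialized (for example via an MPI binding for Python, which is an assumption and not something used in this thread).

```python
import subprocess


def stress_spawn(cmd, runs):
    """Run `cmd` `runs` times and collect the failing iterations.

    Hypothetical helper: returns a list of (iteration, returncode, stderr)
    tuples, where returncode is None if the child could not be started.
    """
    failures = []
    for i in range(runs):
        try:
            # Each call creates a child process, similar in spirit to the
            # JIT invoking the compiler once per kernel.
            proc = subprocess.run(cmd, capture_output=True, text=True)
        except OSError as exc:
            # Spawning itself failed (e.g. out of resources, missing binary).
            failures.append((i, None, str(exc)))
            continue
        if proc.returncode != 0:
            failures.append((i, proc.returncode, proc.stderr.strip()))
    return failures


# Example use (assumes gcc is on PATH; the command is only a stand-in
# for the real JIT compiler invocation):
#
#     failures = stress_spawn(["gcc", "--version"], 1000)
#     print(len(failures), "spawns failed")
```

If spawns fail sporadically only when launched through mpirun, that would point towards the fork issue; if they never fail, the cause is more likely elsewhere.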
I tried running Bohrium on multiple nodes on the cluster, but it crashes with
This only happens when using more than 1 node (multiple processes on the same node work fine), so it might be another filesystem issue (#598)?
I tried disabling the persistent cache, to no avail.