Running rMATS on HPC #457

Open
FarzanehRah opened this issue Nov 28, 2024 · 5 comments

@FarzanehRah

Hi,

I ran rMATS on an HPC cluster, and there are two versions available as modules: rmats/4.1.2 and rmats/4.1.1. I also installed v4.3.0 from GitHub on the cluster (our clusters do not support conda). However, I’m encountering the same issue with all three versions:

traceback (most recent call last):
  File "~/rmats-turbo-master/rMATS_P/FDR.py", line 53, in <module>
    ifile = open(sys.argv[1]); title = ifile.readline()
FileNotFoundError: [Errno 2] No such file or directory: '../../output_11_24_comparison2/tmp/JC_SE/rMATS_result_P-V.txt'
paste: ../../output_11_24_comparison2/tmp/JC_SE/rMATS_result_FDR.txt: No such file or directory
........

I tried to use the dependency versions listed in the documentation, but some of them were not available on the cluster. For example, I used python/3.8.10 instead of python/3.6.12, and proj4-fortran/1.0 (a set of f77 and f90 wrappers for proj4) instead of gfortran (Fortran 77).
I get this error when running the following command (v4.3.0):

gdb ~/rmats-turbo-master/rMATS_C/rMATSexe 
r -i ../../output_11_24_comparison2/JC.raw.input.SE.txt -t 1 -o ../../output_11_24_comparison2/tmp/JC_SE/rMATS_result_P-V.txt -c 0.0001

~/rmats-turbo-master/rMATS_C/rMATSexe: error while loading shared libraries: libgfortran.so.3: cannot open shared object file: No such file or directory
[Inferior 1 (process 1718094) exited with code 0177]

Is there any solution to this on HPC?

Thank you!

@EricKutschera
Contributor

From ~/rmats-turbo-master/rMATS_C/rMATSexe: error while loading shared libraries: libgfortran.so.3: cannot open shared object file: No such file or directory
It looks like you were able to compile rMATSexe and it is looking for a specific version of libgfortran.so. The build uses -lgfortran and I think it would have needed to find libgfortran.so somewhere on your system for the build to succeed: https://github.com/Xinglab/rmats-turbo/blob/v4.3.0/rMATS_C/Makefile#L12
Maybe you already have libgfortran.so.3 somewhere on the system and just need to set LD_LIBRARY_PATH so that the loader can find it.

You could try installing gfortran and then building from the source code again: https://gcc.gnu.org/wiki/GFortran
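
The LD_LIBRARY_PATH suggestion above could be tried along the following lines. This is only a minimal sketch: `find_lib_dir` and `prepend_ld_library_path` are hypothetical helper names, not part of rMATS, and the directories to search are assumptions that will differ from cluster to cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of rMATS): search the given directories for a
# shared library and print the first directory that contains it.
find_lib_dir() {
    lib_name=$1
    shift
    for dir in "$@"; do
        # -L follows symlinks, because versioned .so names are often symlinks.
        match=$(find -L "$dir" -name "$lib_name" -print 2>/dev/null | head -n 1)
        if [ -n "$match" ]; then
            dirname "$match"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH
# for the current shell session only.
prepend_ld_library_path() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# Example usage (the search directories are assumptions; adjust for your site):
#   lib_dir=$(find_lib_dir libgfortran.so.3 /usr/lib /usr/lib64 "$HOME") \
#       && prepend_ld_library_path "$lib_dir"
#   ldd ~/rmats-turbo-master/rMATS_C/rMATSexe | grep 'not found'
```

If the final `ldd` line prints nothing, the loader can resolve every shared library the binary asks for. If libgfortran.so.3 is not found anywhere, rebuilding against the gfortran that is available is probably the more reliable route.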

@FarzanehRah
Author

Thank you so much for your quick reply, I really appreciate it. I will try installing gfortran and building rmats from source code again.

@FarzanehRah
Author

Hi Eric,
I was able to resolve the initial error related to libgfortran.so.3 (i.e., the library loading issue), but I am still encountering a segmentation fault when I try to run rmats.

I’ve tried debugging the issue using gdb and valgrind, but I haven’t been able to resolve it. valgrind suggests that there might be memory-related issues, but I cannot pinpoint the exact cause. The output is below. Could this be a memory-handling problem or an incompatibility with the HPC environment?

gdb /cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Compiler/gcc9/rmats/4.1.2/rMATS_C/rMATSexe
run ~/easybuild/software/2020/avx2/Compiler/gcc9/rmats/4.1.2/rMATS_C/rMATSexe -i ../a_test/JC.raw.input.SE.txt -o ../a_test/tmp/JC_SE/rMATS_result_P-V.txt -c 0.0001
[Thread debugging using libthread_db enabled]
Using host libthread_db library "~/gentoo/2020/lib64/libthread_db.so.1".
number of thread=1; input file=../a_test/JC.raw.input.SE.txt; output folder=../a_test/tmp/JC_SE/rMATS_result_P-V.txt; cutoff=0.0001;
Testing 189

Program received signal SIGSEGV, Segmentation fault.
0x000015554ba5f431 in mkl_blas_def_xdcopy () from ~/easybuild/software/2020/Core/imkl/2020.1.217/mkl/lib/intel64/libmkl_def.so

valgrind --leak-check=full --track-origins=yes --show-reachable=yes --verbose ~/easybuild/software/2020/avx2/Compiler/gcc9/rmats/4.1.2/rMATS_C/rMATSexe -i ../a_test/JC.raw.input.SE.txt -o ../a_test/tmp/JC_SE/rMATS_result_P-V.txt -c 0.0001
valgrind: m_mallocfree.c:305 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 112, hi = 3545286390305929524.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata.  If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away.  Please try that before reporting this as a bug.

host stacktrace:
--2476336-- VALGRIND INTERNAL ERROR: Valgrind received a signal 7 (SIGBUS) - exiting
--2476336-- si_code=128;  Faulting address: 0x0;  sp: 0x1008ca87a0

valgrind: the 'impossible' happened:
   Killed by fatal signal

host stacktrace:
Segmentation fault (core dumped)

Any advice or possible fixes would be greatly appreciated.
Thank you

@EricKutschera
Contributor

The error message from gdb shows a failure inside a BLAS library call, but it doesn't show where in the rmats code that call happens:

Program received signal SIGSEGV, Segmentation fault.
0x000015554ba5f431 in mkl_blas_def_xdcopy () from ~/easybuild/software/2020/Core/imkl/2020.1.217/mkl/lib/intel64/libmkl_def.so

You could try compiling with debugging info: https://github.com/Xinglab/rmats-turbo/blob/v4.3.0/rMATS_C/Makefile#L13

Change -O2 to -O0 -ggdb on that line and recompile.

Then if you run in gdb like before, when it shows the segfault you can run bt to get a back trace. Maybe you'll also be able to get gdb to show the variable values in the line of rmats code that made the library call
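
The debug-build steps above could be scripted roughly as follows. This is a sketch under assumptions: `enable_debug_flags` and `run_with_backtrace` are hypothetical helper names, the `sed` expression replaces every `-O2` in the file it is given (not only the linked Makefile line), and the rebuild step itself is left to whatever build procedure was used before.

```shell
#!/bin/sh
# Hypothetical helper: replace -O2 with -O0 -ggdb in a Makefile.
# A copy of the original is kept next to it with a .bak suffix.
enable_debug_flags() {
    makefile=$1
    sed -i.bak 's/-O2/-O0 -ggdb/g' "$makefile"
}

# Hypothetical helper: run a program under gdb non-interactively and print a
# backtrace once it stops (for example on SIGSEGV).
run_with_backtrace() {
    gdb -batch -ex run -ex bt --args "$@"
}

# Example usage (paths taken from the commands earlier in this thread):
#   enable_debug_flags ~/rmats-turbo-master/rMATS_C/Makefile
#   ...rebuild rMATSexe the same way as before...
#   run_with_backtrace ~/rmats-turbo-master/rMATS_C/rMATSexe \
#       -i ../a_test/JC.raw.input.SE.txt \
#       -o ../a_test/tmp/JC_SE/rMATS_result_P-V.txt -c 0.0001
```

With debugging info compiled in, the frames above `mkl_blas_def_xdcopy` in the backtrace should include file names and line numbers from the rmats sources, which would narrow down which call passes the bad arguments.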

@FarzanehRah
Author

I was not able to figure out the issue on the HPC clusters, but I successfully ran rMATS on a Mac.
Thanks again for your time and help!
