Submitting multi-threaded jobs using slurm-drmaa #2

Closed

BrunoGrandePhD opened this issue Jul 17, 2017 · 7 comments

I realize it's strange to ask a question like this on a repository, but I've spent the past hour trying to figure it out on my own to no avail. I thought that you might be able to answer it in 30 seconds. I would greatly appreciate any help!

In essence, how do you submit multi-threaded jobs using slurm-drmaa? To be clear, I want the job to run on one node (i.e. --ntasks=1). I use the --cpus-per-task option with srun or sbatch, but this option isn't available in the native specification for slurm-drmaa.

I've tried different combinations of --mincpus, --nodes, --ntasks-per-node and --ntasks, but they either allow jobs to be split across multiple nodes or they fail. I've looked through the code for galaxyproject/galaxy and galaxyproject/pulsar, but I couldn't find any hints.
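For context, here is a minimal sketch of how a job like this is submitted through the DRMAA Python bindings (the route Galaxy's DRMAA runner takes); the script name, memory value, and native specification are placeholder assumptions, not options the thread has settled on yet:

import drmaa

# Minimal sketch: submit a placeholder script with a native specification
# string that slurm-drmaa passes through to Slurm.
with drmaa.Session() as session:
    jt = session.createJobTemplate()
    jt.remoteCommand = "./run_test.sh"                 # hypothetical multi-threaded script
    jt.nativeSpecification = "--ntasks=1 --mem=16000"  # placeholder options
    job_id = session.runJob(jt)
    print("Submitted job", job_id)
    session.deleteJobTemplate(jt)

The question in the rest of the thread is which options can go into that nativeSpecification string so the job gets several CPUs on a single node.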

natefoo (Owner) commented Jul 20, 2017

I use --nodes=1 --ntasks=N; does this work for you?

@BrunoGrandePhD (Author)

Unfortunately, those parameters don't prevent jobs from being split across multiple nodes, at least on our SLURM cluster. This is despite explicitly specifying --nodes=1. I'm not sure what to make of this. Do you have any ideas?

JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
32070	 all run_test  bgrande  RUNNING       0:47 50-00:00:00      6 n[106,317-321]
32071	 all run_test  bgrande  RUNNING       0:47 50-00:00:00      8 n[311-315,330-332]
32072	 all run_test  bgrande  RUNNING       0:17 50-00:00:00     12 n[109,123,141,143,145,209,211,223,227,243-244,324]
32073	 all run_test  bgrande  PENDING       0:00 50-00:00:00      1 (Priority)
32074	 all run_test  bgrande  PENDING       0:00 50-00:00:00      1 (Priority)
32075	 all run_test  bgrande  PENDING       0:00 50-00:00:00      1 (Priority)
32076	 all run_test  bgrande  PENDING       0:00 50-00:00:00      1 (Priority)
32077	 all run_test  bgrande  PENDING       0:00 50-00:00:00      1 (Priority)
32078	 all run_test  bgrande  PENDING       0:00 50-00:00:00      1 (Priority)
32079	 all run_test  bgrande  PENDING       0:00 50-00:00:00      1 (Priority)

kevins-repo commented Aug 26, 2017

Hi, I ran into the same issue and then tried this DRMAA-supported option:
--mincpus=<n>  Minimum number of logical processors (threads) per node
as documented here:
http://apps.man.poznan.pl/trac/slurm-drmaa/wiki/WikiStart#native_specification

Submitting with the native specification
-N 1 -n 1 --mincpus=4 --mem=16000
results in the following allocation:

NumNodes=1 NumCPUs=4 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0::
TRES=cpu=4,mem=16000M,node=1

Has there been any update or ideas to add support for --cpus-per-task?
Thanks.
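Relative to the sketch after the original question, only the native specification changes for this workaround; a hedged, compact version (same placeholder script, with 4 CPUs and 16 GB as example values) would be:

import drmaa

# Sketch of the --mincpus workaround described above; values are examples only.
with drmaa.Session() as session:
    jt = session.createJobTemplate()
    jt.remoteCommand = "./run_test.sh"   # hypothetical multi-threaded script
    jt.nativeSpecification = "-N 1 -n 1 --mincpus=4 --mem=16000"
    print("Submitted job", session.runJob(jt))
    session.deleteJobTemplate(jt)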

natefoo (Owner) commented Sep 5, 2017

I believe --cpus-per-task was added in 8acc159; have you tested it? You can grab a development "release" tarball that includes it from the releases tab.
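Assuming a build that includes that commit, the native specification from the sketches above could presumably use the new option directly (values are again just examples):

import drmaa

# Sketch: --cpus-per-task via slurm-drmaa, assuming a build that includes 8acc159.
with drmaa.Session() as session:
    jt = session.createJobTemplate()
    jt.remoteCommand = "./run_test.sh"   # hypothetical multi-threaded script
    jt.nativeSpecification = "--ntasks=1 --cpus-per-task=4 --mem=16000"
    print("Submitted job", session.runJob(jt))
    session.deleteJobTemplate(jt)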

natefoo (Owner) commented Nov 15, 2017

@BrunoGrandePhD I realized I never followed up on your question. I believe it works for me because I have MaxNodes=1 set on my partition.
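For reference, a partition capped at one node per job might be defined in slurm.conf roughly like this (the partition name and node list are made up for illustration):

# Hypothetical slurm.conf excerpt: MaxNodes=1 keeps every job in this
# partition on a single node.
PartitionName=galaxy Nodes=n[101-132] MaxNodes=1 Default=YES State=UP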

natefoo (Owner) commented Nov 15, 2017

It should be possible to use --nodes=1-1 for this, but slurm-drmaa doesn't currently support it. I've created issue #4 to implement it.

natefoo (Owner) commented Nov 16, 2017

--nodes=1-1 works now (it was actually already implemented, but as --nodes=<minnodes[=maxnodes]>). I created a new release with the corrected delimiter.
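Putting the thread's conclusion together, a hedged sketch of the final form (using the corrected --nodes=1-1 syntax together with the --cpus-per-task support added earlier; 4 CPUs is an example value):

import drmaa

# Sketch: pin the job to a single node and request several CPUs for one task.
# Assumes a slurm-drmaa release that includes both --cpus-per-task and the
# corrected --nodes=<min>-<max> delimiter discussed above.
with drmaa.Session() as session:
    jt = session.createJobTemplate()
    jt.remoteCommand = "./run_test.sh"   # hypothetical multi-threaded script
    jt.nativeSpecification = "--nodes=1-1 --ntasks=1 --cpus-per-task=4"
    print("Submitted job", session.runJob(jt))
    session.deleteJobTemplate(jt)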

@natefoo natefoo closed this as completed Nov 16, 2017
wm75 added a commit to AG-Boerries/MIRACUM-Pipe-Galaxy that referenced this issue Sep 23, 2020
The version of the drmaa library used by Galaxy 19.01 does not yet
support the --cpus-per-task option.

Compare: natefoo/slurm-drmaa#2