Check resource queue size before submitting #5

Open
soichih opened this issue Sep 28, 2017 · 0 comments

soichih commented Sep 28, 2017

PBS has a ridiculously small job queue.

From Jeff Gronek

For the preempt queue, maximum per user is 200.
For normal/serial there's a maximum of 2000 total, 400 per user.

I can keep track of how many jobs brlife has submitted, but not the queue-wide total that counts against the absolute max (2000).
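A minimal sketch of what such a pre-submit check could look like, using the limits quoted above. The function name `can_submit`, the queue names used as dictionary keys, and the reading of "2000 total" as a per-queue cap are all assumptions for illustration; none of this is amaretti's actual code. Because the queue-wide total may not be known, `total_jobs` is optional and the total check is skipped when it is missing.

```python
from typing import Optional

# Limits as quoted from Jeff Gronek. The key names are assumed queue names,
# and "2000 total" is assumed to apply to each of normal and serial separately.
QUEUE_LIMITS = {
    "preempt": {"per_user": 200, "total": None},
    "normal": {"per_user": 400, "total": 2000},
    "serial": {"per_user": 400, "total": 2000},
}


def can_submit(queue: str, user_jobs: int, total_jobs: Optional[int] = None) -> bool:
    """Return True if one more job appears to fit under the known limits.

    user_jobs  -- jobs this user currently has in the queue
    total_jobs -- jobs from all users in the queue, or None if unknown
    """
    limits = QUEUE_LIMITS[queue]

    # Per-user limit: this is the part we can track ourselves.
    if user_jobs >= limits["per_user"]:
        return False

    # Queue-wide limit: only checked when both the limit and the count are known.
    if limits["total"] is not None and total_jobs is not None:
        return total_jobs < limits["total"]

    return True
```

With only the per-user count available, `can_submit("normal", 399)` passes while `can_submit("normal", 400)` does not; the queue-wide cap can still reject the job at submit time.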

I believe the abcd hook installed on each resource should do this check, but amaretti currently doesn't retry starting a job when the start hook fails (the job is set to failed). Maybe I should reconsider this and make it keep retrying? If we do, we won't need any queue size checking; it will just keep retrying qsub until it succeeds.

I need to think through the side effects of retrying the start hook indefinitely, however. I feel it could create more problems than it solves.
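One way to limit those side effects would be to cap the number of attempts and back off between them, so a persistently failing start hook still ends in a failed job instead of retrying forever. The sketch below is an assumption-laden illustration: `start_with_retry`, `start_job`, `max_attempts`, and `base_delay` are hypothetical names, and `start_job` stands in for whatever runs the start hook and reports success.

```python
import time
from typing import Callable


def start_with_retry(
    start_job: Callable[[], bool],
    max_attempts: int = 5,
    base_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call start_job until it succeeds or max_attempts is reached.

    start_job -- hypothetical callable; returns True when the start hook succeeds
    sleep     -- injectable so the wait can be replaced in tests
    Returns True on success, False if every attempt failed.
    """
    for attempt in range(max_attempts):
        if start_job():
            return True
        # Wait longer after each failure (exponential backoff),
        # but do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

A caller would mark the job as failed only when this returns False, which keeps the current failure behavior as the final fallback.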
