system scheduler doesn't stop on "exit 0" #3822

Closed
jaychris opened this issue Jan 31, 2018 · 2 comments

jaychris commented Jan 31, 2018

If filing a bug please include the following:

Nomad version

Nomad v0.5.6

Operating system and Environment details

Container Linux by CoreOS stable (1353.6.0)

Issue

The system scheduler appears to enter a restart loop when a task exits with status 0. For example, I'm using the system scheduler to run a small raw_exec script on every node in my cluster (the batch scheduler makes it difficult to target ALL nodes in the cluster, and the system scheduler doesn't support periodic jobs).

The script runs fine and exits with status 0. However, the job stays in the pending state because Nomad restarts the task over and over.

The expectation is that the script runs once and exits, and that the Nomad job then transitions to a stopped/dead state.

Reproduction steps

  1. Write a job using the system scheduler
  2. Use raw_exec to do something simple and exit 0
  3. The job enters a restart loop and sits in "pending" status
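
For illustration, the steps above might correspond to a job file along these lines. This is only a sketch, not the actual job from this report: the job, group, and task names and the CPU/memory figures are taken from the allocation status in this issue, while the datacenter name "dc1" and the command are assumptions standing in for the real script.

```hcl
# Hypothetical minimal reproduction; not the reporter's actual job file.
job "docker-custodian" {
  datacenters = ["dc1"]   # assumed datacenter name
  type        = "system"  # one allocation on every eligible node

  group "servers" {
    task "docker-cleanup" {
      driver = "raw_exec"

      config {
        # Stand-in for the real cleanup script: exit immediately with status 0.
        command = "/bin/sh"
        args    = ["-c", "exit 0"]
      }

      resources {
        cpu    = 100 # MHz, as shown under Task Resources
        memory = 256 # MiB, as shown under Task Resources
      }
    }
  }
}
```

With a job like this, every "Terminated  Exit Code: 0" event would be followed by a "Restarting" event, which matches the Recent Events shown in this issue.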

Nomad Client logs (if appropriate)

ID                  = 7741410d
Eval ID             = d23dd8ac
Name                = docker-custodian.servers[0]
Node ID             = ec2cc329
Job ID              = docker-custodian
Client Status       = pending
Client Description  = <none>
Desired Status      = run
Desired Description = <none>
Created At          = 01/31/18 12:19:51 PST

Task "docker-cleanup" is "pending"
Task Resources
CPU      Memory   Disk     IOPS  Addresses
100 MHz  256 MiB  300 MiB  0

Recent Events:
Time                   Type        Description
01/31/18 12:21:07 PST  Restarting  Task restarting in 16.303683234s
01/31/18 12:21:07 PST  Terminated  Exit Code: 0
01/31/18 12:21:07 PST  Started     Task started by client
01/31/18 12:20:51 PST  Restarting  Task restarting in 15.285776237s
01/31/18 12:20:51 PST  Terminated  Exit Code: 0
01/31/18 12:20:51 PST  Started     Task started by client
01/31/18 12:20:25 PST  Restarting  Exceeded allowed attempts, applying a delay - Task restarting in 26.729869459s
01/31/18 12:20:25 PST  Terminated  Exit Code: 0
01/31/18 12:20:24 PST  Started     Task started by client
01/31/18 12:20:09 PST  Restarting  Task restarting in 15.309343829s
@chelseakomlo
Contributor

Hi, thanks for the question and the detailed information. Batch system jobs are not currently supported, but they are on Nomad's future roadmap. See #2527 for further reference.


github-actions bot commented Dec 4, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 4, 2022