system scheduler doesn't stop on "exit 0" #3822

jaychris · 2018-01-31T21:03:21Z

Nomad version

Nomad v0.5.6

Operating system and Environment details

Container Linux by CoreOS stable (1353.6.0)

Issue

The system scheduler appears to enter a restart loop when a task exits with status 0. For example, I'm using the system scheduler to run a small raw_exec script on every node in my cluster, because the batch scheduler makes it difficult to target ALL nodes in the cluster and the system scheduler doesn't support periodic jobs.

The script executes fine and exits with status 0. However, the job sits in the pending state because Nomad restarts the task over and over.

The expectation is that the script should run once and exit, and the Nomad job should then transition to a stopped/dead state.

Reproduction steps

1. Write a job using the system scheduler.
2. Use raw_exec to do something simple and exit 0.
3. The job enters a restart loop and sits in "pending" status.
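
The steps above could be sketched as a minimal job file. This is only an illustration under stated assumptions, not the reporter's actual job: the job, group, and task names and the resource values are taken from the allocation status shown in this report, while the datacenter name and the command are hypothetical placeholders.

```hcl
# Hypothetical job file sketching the reproduction steps; not the reporter's actual job.
job "docker-custodian" {
  # Assumption: the datacenter name is a placeholder.
  datacenters = ["dc1"]

  # The system scheduler places one allocation on every eligible client node.
  type = "system"

  group "servers" {
    task "docker-cleanup" {
      driver = "raw_exec"

      config {
        # Assumption: stand-in for the real cleanup script; it exits with status 0 immediately.
        command = "/bin/sh"
        args    = ["-c", "exit 0"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 256 # MiB
      }
    }
  }
}
```

Running a job like this requires the raw_exec driver to be enabled on the client nodes, since it is disabled by default.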

Nomad Client logs (if appropriate)

ID                  = 7741410d
Eval ID             = d23dd8ac
Name                = docker-custodian.servers[0]
Node ID             = ec2cc329
Job ID              = docker-custodian
Client Status       = pending
Client Description  = <none>
Desired Status      = run
Desired Description = <none>
Created At          = 01/31/18 12:19:51 PST

Task "docker-cleanup" is "pending"
Task Resources
CPU      Memory   Disk     IOPS  Addresses
100 MHz  256 MiB  300 MiB  0

Recent Events:
Time                   Type        Description
01/31/18 12:21:07 PST  Restarting  Task restarting in 16.303683234s
01/31/18 12:21:07 PST  Terminated  Exit Code: 0
01/31/18 12:21:07 PST  Started     Task started by client
01/31/18 12:20:51 PST  Restarting  Task restarting in 15.285776237s
01/31/18 12:20:51 PST  Terminated  Exit Code: 0
01/31/18 12:20:51 PST  Started     Task started by client
01/31/18 12:20:25 PST  Restarting  Exceeded allowed attempts, applying a delay - Task restarting in 26.729869459s
01/31/18 12:20:25 PST  Terminated  Exit Code: 0
01/31/18 12:20:24 PST  Started     Task started by client
01/31/18 12:20:09 PST  Restarting  Task restarting in 15.309343829s

chelseakomlo · 2018-01-31T21:25:10Z

Hi, thanks for the question and the detailed information. Batch system jobs are not currently supported, but they are on Nomad's future roadmap. For further reference, see #2527.

github-actions · 2022-12-04T02:16:39Z

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

chelseakomlo closed this as completed Jan 31, 2018

github-actions bot locked as resolved and limited conversation to collaborators Dec 4, 2022
