
Use of reserved_ports causes job-register evaluations to fail without a reason #1617

Closed
bagelswitch opened this issue Aug 18, 2016 · 23 comments

Comments

@bagelswitch

Nomad version

Nomad v0.4.1-rc1 ('c7bbcb56905c90a7567183c0c6dbffc050f52567+CHANGES')

Operating system and Environment details

Ubuntu 14.04 LTS, kernel 3.13.0-93-generic on AWS t2.medium VMs

Issue

I believe I am seeing the same behavior described in #1046 while running v0.4.1-rc1

In a small cluster with 3 servers and 5 clients, while running a system job, if a client's config specifies any reserved ports, the job-register evaluation fails with no allocation on that particular client; successful allocations still occur on the other clients that don't specify reserved ports. This occurs for service jobs as well, but is easier to see with system jobs due to their predictable placement.

Reproduction steps

Create one or more server nodes with config like:

consul {
    "server_service_name" = "nomad"
    "server_auto_join" = true
    "client_service_name" = "nomad-client"
    "client_auto_join" = true
    "address" = "192.168.110.71:8500"
}

server {
    enabled = true
    bootstrap_expect = 3 
    rejoin_after_leave = true
}

log_level = "DEBUG"
bind_addr = "192.168.110.71"
data_dir = "/usr/local/nomad/data"

Create one or more client nodes and configure at least one to specify reserved ports:

consul {
    "server_service_name" = "nomad"
    "server_auto_join" = true
    "client_service_name" = "nomad-client"
    "client_auto_join" = true
    "address" = "192.168.110.55:8500"
}


client {
    enabled = true
    node_class = "runner"
    network_interface = "eth0"
    reserved {
        reserved_ports = "20000-21999,23000-65535"
    }   
    meta {
        "runner" = true
    }
}

log_level = "DEBUG"
bind_addr = "192.168.110.55"
data_dir = "/usr/local/nomad/data"

Create a job specifying any network resources (problem occurs with either static or dynamic ports, in my case I'm trying a job with only static ports), see job file below.

Run the job. In this case I have 5 client nodes; 3 are of the appropriate node class for the job, and one of those 3 has reserved ports configured:

ncohensm@base-tempregistry-553-awsphoenix:~$ nomad run forwardproxy.nomad 
==> Monitoring evaluation "3be6a17e"
    Evaluation triggered by job "forwardproxy"
    Allocation "1b9540ef" created: node "425ff758", group "forwardproxy"
    Allocation "f573d240" created: node "426db31e", group "forwardproxy"
    Evaluation status changed: "pending" -> "failed"
==> Evaluation "3be6a17e" finished with status "failed" but failed to place all allocations:
    Task Group "forwardproxy" (failed to place 2 allocations):
      * Class "edge" filtered 1 nodes
      * Constraint "${node.class} = runner" filtered 1 nodes

Note the 2 rather than the expected 3 allocations. Job status shows:

ncohensm@base-tempregistry-553-awsphoenix:~$ nomad status -verbose forwardproxy
ID          = forwardproxy
Name        = forwardproxy
Type        = system
Priority    = 50
Datacenters = dc1
Status      = running
Periodic    = false

Summary
Task Group    Queued  Starting  Running  Failed  Complete  Lost
forwardproxy  2       1         1        0       0         0

Evaluations
ID                                    Priority  Triggered By    Status    Placement Failures
3be6a17e-3b64-2cc7-2030-05172c42c5aa  50        job-register    failed    true

Allocations
ID                                    Eval ID                               Node ID                               Task Group    Desired  Status
1b9540ef-c49d-2009-cd95-4307bcd14c90  3be6a17e-3b64-2cc7-2030-05172c42c5aa  425ff758-d590-c676-5601-951b005be986  forwardproxy  run      pending
f573d240-b4c2-2c84-e91c-a1bb8bf6762c  3be6a17e-3b64-2cc7-2030-05172c42c5aa  426db31e-062b-7b6b-50b4-56630f4bc3a0  forwardproxy  run      running

The leader server log is below; logs on the client with reserved ports show nothing at all.
If the reserved port config is removed from this client and it is restarted, the same job can be run and will receive an allocation, with no other changes to the config or job definition.
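
A minimal sketch of that workaround (the restart command is an assumption; use whatever manages the agent on your hosts):

# remove the reserved { reserved_ports = "..." } block from the client config,
# then restart the agent and re-run the job
sudo service nomad restart      # assumption: adjust to however the agent is managed
nomad run forwardproxy.nomad    # the client now receives an allocation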

Nomad Server logs (if appropriate)

Leader server log:

   2016/08/18 17:48:10.816035 [DEBUG] worker: dequeued evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:10.836757 [DEBUG] http: Request /v1/jobs?region=global (125.013563ms)
    2016/08/18 17:48:10.876944 [DEBUG] http: Request /v1/evaluation/631bec5d-6a6d-fdc7-0cf3-ceca885cd266?region=global (37.042277ms)
    2016/08/18 17:48:10.897251 [DEBUG] http: Request /v1/evaluation/631bec5d-6a6d-fdc7-0cf3-ceca885cd266/allocations?region=global (16.950503ms)
    2016/08/18 17:48:10.937825 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: allocs: (place 3) (update 2) (migrate 0) (stop 0) (ignore 0) (lost 0)
    2016/08/18 17:48:10.937993 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: 2 in-place updates of 2
    2016/08/18 17:48:11.165491 [DEBUG] worker: submitted plan for evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:11.166079 [DEBUG] worker: refreshing state to index 5964 for "631bec5d-6a6d-fdc7-0cf3-ceca885cd266"
    2016/08/18 17:48:11.323357 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: refresh forced
    2016/08/18 17:48:11.324035 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: allocs: (place 3) (update 0) (migrate 0) (stop 0) (ignore 2) (lost 0)
    2016/08/18 17:48:11.424904 [DEBUG] worker: submitted plan for evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:11.424957 [DEBUG] worker: refreshing state to index 5964 for "631bec5d-6a6d-fdc7-0cf3-ceca885cd266"
    2016/08/18 17:48:11.424994 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: refresh forced
    2016/08/18 17:48:11.425069 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: allocs: (place 3) (update 0) (migrate 0) (stop 0) (ignore 2) (lost 0)
    2016/08/18 17:48:11.528182 [DEBUG] worker: submitted plan for evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:11.607839 [DEBUG] worker: refreshing state to index 5964 for "631bec5d-6a6d-fdc7-0cf3-ceca885cd266"
    2016/08/18 17:48:11.608210 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: refresh forced
    2016/08/18 17:48:11.608572 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: allocs: (place 3) (update 0) (migrate 0) (stop 0) (ignore 2) (lost 0)
    2016/08/18 17:48:11.753930 [DEBUG] worker: submitted plan for evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:11.753993 [DEBUG] worker: refreshing state to index 5964 for "631bec5d-6a6d-fdc7-0cf3-ceca885cd266"
    2016/08/18 17:48:11.754043 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: refresh forced
    2016/08/18 17:48:11.754130 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: allocs: (place 3) (update 0) (migrate 0) (stop 0) (ignore 2) (lost 0)
    2016/08/18 17:48:11.794423 [DEBUG] worker: submitted plan for evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:11.799002 [DEBUG] worker: refreshing state to index 5964 for "631bec5d-6a6d-fdc7-0cf3-ceca885cd266"
    2016/08/18 17:48:11.799252 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: refresh forced
    2016/08/18 17:48:11.799580 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: allocs: (place 3) (update 0) (migrate 0) (stop 0) (ignore 2) (lost 0)
    2016/08/18 17:48:11.819371 [DEBUG] worker: submitted plan for evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:11.819904 [DEBUG] worker: refreshing state to index 5964 for "631bec5d-6a6d-fdc7-0cf3-ceca885cd266"
    2016/08/18 17:48:11.820273 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: refresh forced
    2016/08/18 17:48:11.820679 [DEBUG] sched: <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>: setting status to failed
    2016/08/18 17:48:11.834137 [DEBUG] worker: updated evaluation <Eval '631bec5d-6a6d-fdc7-0cf3-ceca885cd266' JobID: 'forwardproxy'>
    2016/08/18 17:48:11.835118 [DEBUG] worker: ack for evaluation 631bec5d-6a6d-fdc7-0cf3-ceca885cd266
    2016/08/18 17:48:12.069488 [DEBUG] http: Request /v1/evaluation/631bec5d-6a6d-fdc7-0cf3-ceca885cd266?region=global (168.629863ms)
    2016/08/18 17:48:12.131386 [DEBUG] http: Request /v1/evaluation/631bec5d-6a6d-fdc7-0cf3-ceca885cd266/allocations?region=global (1.030509ms)

Nomad Client logs (if appropriate)

Job file (if appropriate)

job "forwardproxy" {
    type = "system"

    region = "global"
    datacenters = ["dc1"]

    meta {
        domain = "phoenix.koalabait.com"
        consuldc = "awsphoenix"
    }

    update {
        stagger = "10s"
        max_parallel = 1
    }

    group "forwardproxy" {
        count = 1 

        restart {
            attempts = 10
            interval = "5m"
            delay = "25s"
            mode = "delay"
        }

        # Define a task to run
        task "forwardproxy" {
            constraint {
                attribute = "${node.class}"
                value = "runner"
            }

            driver = "docker"

            # Configure Docker driver with the image
            config {
                image = "base-tempregistry-553-awsphoenix-admin.koalabait.com:5000/forwardproxy"
                network_mode = "host"
            }

            service {
                name = "${TASKGROUP}"
                tags = ["dev", "forwardproxy", "http"]
                port = "http"
                check {
                        name = "forwardproxy-alive"
                        type = "script"
                        command = "/check.sh"
                        interval = "10s"
                        timeout = "2s"
                }
            }

            service {
                name = "${TASKGROUP}-admin"
                tags = ["dev", "forwardproxy", "http", "admin"]
                port = "admin"
                check {
                        name = "forwardproxy-admin-alive"
                        path = "/"
                        type = "http"
                        protocol = "http"
                        interval = "10s"
                        timeout = "2s"
                }
            }

            resources {
                cpu = 1000 
                memory = 2048 
                network {
                    mbits = 100 
                    port "http" {
                        static = 80
                    }
                    port "admin" {
                        static = 9090
                    }
                }
            }
        }
    }
}
@dadgar dadgar added this to the v0.5.0 milestone Aug 19, 2016
@schmichael schmichael self-assigned this Aug 31, 2016
@parasyte
Contributor

parasyte commented Sep 8, 2016

We hit this issue when trying to configure reserved_ports on 0.4.1. The behavior was as described above.

@jshaw86

jshaw86 commented Sep 8, 2016

Here is the exact error from our application code:

Error: listen EADDRINUSE :::45437
    at Object.exports._errnoException (util.js:907:11)
    at exports._exceptionWithHostPort (util.js:930:20)
    at Server._listen2 (net.js:1250:14)
    at Server._listen2 (/usr/lib/pn-blocks/src/node_modules/continuation-local-storage/node_modules/async-listener/index.js:51:23)
    at listen (net.js:1286:10)
    at Server.listen (net.js:1382:5)
    at Object.<anonymous> (/usr/lib/pn-blocks/src/main.js:343:8)
    at Module._compile (module.js:409:26)
    at Object.Module._extensions..js (module.js:416:10)
    at Module.load (module.js:343:32)

@parasyte
Contributor

parasyte commented Sep 9, 2016

Some background:

We schedule services with dynamic port allocation on the exec driver. The stack trace pasted by @jshaw86 is from the service failing to listen on the port dynamically allocated by Nomad. The port is already in use by another process, most likely another instance of this service that has opened an outbound socket. The kernel allocates local ports for outbound sockets from the same port range that Nomad uses for allocating dynamic ports.

With enough services running, the chances of local port collisions get pretty high. We were hoping the reserved_ports config would help us resolve this, but after further investigation (and this bug), that may not actually be the right solution.

(FWIW, we are considering setting up a secondary interface for all outbound sockets. It should be a usable workaround in the meantime.)
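
A quick way to check whether the kernel's ephemeral (outbound) port range overlaps the range Nomad hands out for dynamic ports (a sketch; the values shown are assumptions, verify them on your own hosts):

# the kernel assigns local ports for outbound sockets from this range
cat /proc/sys/net/ipv4/ip_local_port_range
# e.g. 32768  61000

# if that range overlaps Nomad's dynamic port range, EADDRINUSE errors like the
# trace above become possible; moving the ephemeral range is one way to avoid it
sudo sysctl -w net.ipv4.ip_local_port_range="49152 65535"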

@dadgar
Contributor

dadgar commented Sep 19, 2016

@bagelswitch @parasyte Could you give a runnable job file and client configs that will reproduce this (hopefully just using the Vagrant box)? I could not reproduce it with three nodes, one having reserved ports and the others not, while running a system job.

@kaskavalci
Contributor

We run CentOS on AWS and encountered the same problem when using ReservedPorts. Could it be related to AWS?

@jshaw86

jshaw86 commented Sep 21, 2016

@kaskavalci the above example is on Vagrant, so it's unrelated to AWS. It's a function of the total number of jobs scheduled and the total amount of network traffic on the Nomad agent. If you don't do much network traffic, you can schedule more jobs before you see the issue; if you do a lot of network traffic, you can schedule fewer jobs before you see it.

@dadgar
Contributor

dadgar commented Sep 21, 2016

@jshaw86 That actually looks like a different issue from what @bagelswitch reported. His problem was that the scheduler wouldn't even create the allocation; this is a failure once it is on the client. Could you create a new issue with the details you provided, titled "Port conflict on client with high network utilization"?

@dadgar
Contributor

dadgar commented Sep 21, 2016

@kaskavalci Which problem did you hit: the one @jshaw86 showed, or the original problem? I have a good idea about @jshaw86's, but I'm still trying to reproduce the original one!

@kaskavalci
Contributor

I do not have logs since the cluster is now terminated, but I can describe our issue. When Nomad clients are configured with reserved_ports on AWS, Nomad will not return an evaluation. There are no informative messages such as an exhausted disk or anything, just maximum attempts exceeded (or something like that). We deleted the reserved ports and restarted the Nomad clients, and we were able to get evaluations and allocations. I hope this helps.
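
For reference, a failed evaluation can be inspected directly over the HTTP API once you have its ID (a sketch; substitute your server address and evaluation ID, and adjust the port if ports.http is overridden):

curl -s http://<server>:4646/v1/evaluation/<eval-id> | python -m json.tool
curl -s http://<server>:4646/v1/evaluation/<eval-id>/allocations | python -m json.tool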

@jshaw86

jshaw86 commented Sep 21, 2016

@dadgar Sure, no problem. Here you go: #1728

@parasyte
Contributor

Yes, there are two different issues; the port collision in #1728 is different.

We also experienced this issue (no allocations are created when reserved_ports is configured) on AWS. It was not reproducible in vagrant with the -dev flag. The dev mode flag is the only notable difference in our tests, at least as far as we could tell.

@bagelswitch
Author

FWIW, my case was also on AWS.

Re: the job file, I can repro with just the example job created by nomad init; it doesn't matter whether the job specifies static or dynamic ports. Client/server configs are as originally posted, although I'm reasonably certain nothing matters other than the use of reserved port ranges in the client config.
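
A repro sketch along those lines (the contents of the generated job don't matter much; the failure follows from the client's reserved_ports):

nomad init                # writes example.nomad to the current directory
nomad run example.nomad   # evaluation fails against the client with reserved_ports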

@dadgar
Contributor

dadgar commented Sep 22, 2016

@bagelswitch Can you give your client config?

@kaskavalci
Contributor

Here is the allocation log from my side; it goes on like this. Nothing on the Nomad servers at INFO level, unfortunately. If you need DEBUG traces, I can turn them on.

   {
        "AnnotatePlan": false,
        "BlockedEval": "3913a02f-1f02-d193-cc09-d89a63dfbc2c",
        "ClassEligibility": null,
        "CreateIndex": 24,
        "EscapedComputedClass": true,
        "FailedTGAllocs": {
            "vora-dlog-servers": {
                "AllocationTime": 34847,
                "ClassExhausted": null,
                "ClassFiltered": null,
                "CoalescedFailures": 1,
                "ConstraintFiltered": {
                    "distinct_hosts": 2,
                    "${node.unique.name} regexp ^ec2-1234.compute-1.amazonaws.com$|^ec2-5678.compute-1.amazonaws.com$": 2
                },
                "DimensionExhausted": null,
                "NodesAvailable": {
                    "dc1": 4
                },
                "NodesEvaluated": 4,
                "NodesExhausted": 0,
                "NodesFiltered": 4,
                "Scores": null
            }
        },
        "ID": "6f87198b-fdab-53f6-a4a5-f8b7c5ef4837",
        "JobID": "vora-dlog",
        "JobModifyIndex": 10,
        "ModifyIndex": 27,
        "NextEval": "",
        "NodeID": "",
        "NodeModifyIndex": 0,
        "PreviousEval": "175b5bdc-8472-cf84-66a0-86c80e2a1915",
        "Priority": 50,
        "SnapshotIndex": 24,
        "Status": "failed",
        "StatusDescription": "maximum attempts reached (5)",
        "TriggeredBy": "max-plan-attempts",
        "Type": "service",
        "Wait": 0
    }

Client configuration:

{
  "enable_syslog":true,
  "leave_on_terminate":true,
  "bind_addr":"0.0.0.0",
  "advertise":{
    "http":"10.0.0.89:5646"
  },
  "leave_on_interrupt":true,
  "data_dir":"/tmp/client",
  "name":"ec2-1234.compute-1.amazonaws.com",
  "ports":{
    "http":5646
  },
  "client":{
    "reserved":{
      "reserved_ports":"0-19999,30000-39999,40000"
    },
    "enabled":true,
    "options":{
      "driver.raw_exec.enable":"1"
    }
  },
  "disable_update_check":true,
  "consul":{
    "address":"localhost:8500"
  }
}

@bagelswitch
Author

My client config:

consul {
    "server_service_name" = "nomad"
    "server_auto_join" = true
    "client_service_name" = "nomad-client"
    "client_auto_join" = true
    "address" = "192.168.110.55:8500"
}


client {
    enabled = true
    node_class = "runner"
    network_interface = "eth0"
    reserved {
        reserved_ports = "20000-21999,23000-65535"
    }   
    meta {
        "runner" = true
    }
}

log_level = "DEBUG"
bind_addr = "192.168.110.55"
data_dir = "/usr/local/nomad/data"

@schmichael
Member

schmichael commented Nov 2, 2016

I failed to reproduce this on nomad 0.5-rc1. I posted my client configs in a gist.

Everything worked as expected:

  • client0 did not run the task because it lacked the proper node class
  • client1 did run the task
  • client2 did run the task -- despite having reserved ports configured

I unset the region, removed your custom service checks, and switched the docker container to redis -- hopefully none of those could possibly trigger what you saw.

Let me know if you see any other differences between my setup and yours. I'll try to downgrade my cluster to 0.4.1 and try again.

This is a Debian Jessie (8.6) cluster in GCE on n1-standard-2 vms (3 servers + 3 clients). Kernel 3.16.0 and docker 1.12.3.

@schmichael
Member

Started with a fresh 0.4.1 cluster, removed TLS settings, and still could not reproduce the bug.

nomad run produces normal output and I verified the job ran on the 2 nodes with the runner node class:

schmichael@server0:~$ nomad run job.hcl
==> Monitoring evaluation "12bf83f2"
    Evaluation triggered by job "forwardproxy"
    Allocation "071cbf26" created: node "fea8bee0", group "forwardproxy"
    Allocation "7d57ada8" created: node "03c933f9", group "forwardproxy"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "12bf83f2" finished with status "complete" but failed to place all allocations:
    Task Group "forwardproxy" (failed to place 1 allocation):
      * Constraint "${node.class} = runner" filtered 1 nodes

Are you still encountering this issue reliably? Can you try upgrading to 0.5-rc1, on the off chance the behavior has at least changed, to give us some idea of where the problem might lie?

@bagelswitch
Author

@schmichael - note that I can only repro this on AWS; I believe that may be the case for the others reporting in this thread as well, based on the comments above.

I do not see the problem in an otherwise identical setup, running on an internal OpenStack environment.

@schmichael
Member

@bagelswitch Argh, shoot. VPC or EC2 classic?

@bagelswitch
Author

VPC

@dadgar
Contributor

dadgar commented Nov 7, 2016

Hey,

Reproduced this. Will try to get a fix out soon

@parasyte
Contributor

parasyte commented Nov 8, 2016

Hooray! Thanks for the update, Alex. We have a hack in our local Nomad build to work around it for the moment. Looking forward to the fix.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 18, 2022