Networking seems broken when using exec driver. #1123

Closed
c4milo opened this issue Apr 23, 2016 · 10 comments

Comments

@c4milo
Contributor

c4milo commented Apr 23, 2016

Nomad version

Nomad v0.3.2-rc2 ('8411b0b3bc167dde5d0edc20220266609d6d93bb')

Operating system and Environment details

CoreOS stable

Issue

I'm trying to launch Vault using the exec driver, but it fails when trying to resolve a domain name.

caused by: Head https://hooklift-vault.s3.amazonaws.com/: dial tcp: lookup hooklift-vault.s3.amazonaws.com on [::1]:53: read udp [::1]:37340->[::1]:53: read: connection refused
Error initializing backend of type s3: unable to access bucket 'hooklift-vault': RequestError: send request failed
caused by: Head https://hooklift-vault.s3.amazonaws.com/: dial tcp: lookup hooklift-vault.s3.amazonaws.com on [::1]:53: read udp [::1]:35858->[::1]:53: read: connection refused
Error initializing backend of type s3: unable to access bucket 'hooklift-vault': RequestError: send request failed
caused by: Head https://hooklift-vault.s3.amazonaws.com/: dial tcp: lookup hooklift-vault.s3.amazonaws.com on [::1]:53: read udp [::1]:54317->[::1]:53: read: connection refused
Error initializing backend of type s3: unable to access bucket 'hooklift-vault': RequestError: send request failed
caused by: Head https://hooklift-vault.s3.amazonaws.com/: dial tcp: lookup hooklift-vault.s3.amazonaws.com on [::1]:53: read udp [::1]:50965->[::1]:53: read: connection refused

Reproduction steps

Just try to launch Vault using the exec driver on CoreOS, ideally on AWS using IAM roles, with a configuration similar to the following:

backend "s3" {
    bucket = "hooklift-vault"
    access_key = ""
    secret_key = ""
}

ha_backend "consul" {
    address = "127.0.0.1:8500"
    path = "vault"
    advertise_addr = "v1.hooklift.io:443"
}

listener "tcp" {
    address = "0.0.0.0:443"
    tls_disable = "true"
    tls_min_version = "tls12"
    tls_key_file = "vault.key"
    tls_cert_file = "vault.crt"
}

telemetry {
    statsite_address = "127.0.0.1:8125"
    disable_hostname = false
}

Job file

job "vault" {
    datacenters = ["us-east-1"]

    type = "service"
    priority = 1

    constraint {
        attribute = "${attr.kernel.name}"
        value = "linux"
    }

    group "instances" {
        count = 2

        constraint {
            distinct_hosts = true
        }

        restart {
            interval = "1m"
            attempts = 2
            delay = "15s"
            mode = "delay"
        }

        task "server" {
            driver = "exec"

            config {
                command = "vault"
                args = [
                    "server",
                    "-config=${NOMAD_TASK_DIR}/vault.hcl",
                    "-log-level=info"
                ]
            }

            artifact {
                source = "my-bucket.s3.amazonaws.com/containers/vault.tar.gz"
            }

            artifact {
                source = "my-bucket.s3.amazonaws.com/certificates/vault.zip"
            }

            service {
                name = "vault"
                tags = ["vault", "web", "platform"]
                port = "https"
                check {
                    type = "http"
                    path = "/sys/health"
                    interval = "10s"
                    timeout = "2s"
                }
            }

            resources {
                cpu = 1024 # MHz
                memory = 1024 # MB
                network {
                    mbits = 100
                    port "https" {
                        static = 443
                    }
                }
            }
        }
    }
}

Desired result

I was expecting the exec driver to use the host's network stack.

@dadgar
Contributor

dadgar commented Apr 23, 2016

It does use host networking. Can you try running it with the raw_exec driver? It looks like a DNS resolution error.
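
A minimal sketch of that change, assuming raw_exec is enabled on the client; the option name below matches the 0.3-era client config and is illustrative, not taken from this thread:

# Nomad client config: raw_exec must be enabled explicitly.
client {
    options {
        "driver.raw_exec.enable" = "1"
    }
}

# Job file: only the task's driver line changes; the rest of the stanza stays the same.
task "server" {
    driver = "raw_exec"
    # config, artifact, service, and resources blocks as in the job file above
}

Because raw_exec runs the task directly on the host without a chroot, it picks up the host's /etc/resolv.conf, which makes it a useful comparison point for DNS issues.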

@c4milo
Contributor Author

c4milo commented Apr 23, 2016

I did run it on the host and it worked well.

@dadgar
Contributor

dadgar commented Apr 24, 2016

@c4milo Thanks for reporting. I'll look into it; my guess is that the resolv.conf file is not in the chroot.
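
One way to verify that guess from the host is to look inside the allocation's task directory, which is where the exec driver builds its chroot. The data-dir path and layout below are assumptions; adjust for your client's data_dir:

# On the Nomad client host, for the allocation running the "server" task:
ls -l /var/lib/nomad/alloc/<alloc-id>/server/etc/resolv.conf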

@c4milo
Contributor Author

c4milo commented Apr 24, 2016

It would be very helpful if Nomad provided an easy way of entering a container to inspect it further when issues like this show up.

@diptanu
Contributor

diptanu commented Apr 24, 2016

@c4milo If you are using the exec driver, you can use the nomad fs suite of commands to introspect the file system of the chroot.
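
For example, something along these lines; the exact fs subcommands and task-relative paths varied a bit between Nomad versions, so treat this as a sketch:

nomad status vault                               # find the allocation ID for the task group
nomad fs ls <alloc-id> server/etc                # list /etc inside the "server" task's chroot
nomad fs cat <alloc-id> server/etc/resolv.conf   # check whether resolv.conf made it in

If resolv.conf is missing or empty there, that would confirm the DNS theory above.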

@c4milo
Contributor Author

c4milo commented Apr 24, 2016

I had something different in mind: being able to run one-off containers attached to the terminal, for example to run DB migrations, test networking, test container isolation, and so on.

sean- added and then removed the sync label Apr 29, 2016
@diptanu
Contributor

diptanu commented Jun 12, 2016

@c4milo Can you please tell us if this has been resolved, and if resolv.conf is present in the chroot?

@c4milo
Contributor Author

c4milo commented Jun 14, 2016

Last time I checked, yes, I was still experiencing the issue. I couldn't check whether resolv.conf was present; I changed strategy and ended up writing a custom driver because of this issue and several other custom needs.

@diptanu
Contributor

diptanu commented Aug 11, 2016

@c4milo We have recently merged #1518, which allows you to specify the host directories you want in your chroot. Please configure the chroot_env block in your Nomad client config to add resolv.conf to your environment.
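
A minimal sketch of that client configuration, assuming you want the usual binaries and libraries plus the host's resolv.conf in the chroot; the exact directory list is illustrative:

client {
    chroot_env {
        "/bin"             = "/bin"
        "/etc/resolv.conf" = "/etc/resolv.conf"
        "/lib"             = "/lib"
        "/lib64"           = "/lib64"
        "/sbin"            = "/sbin"
        "/usr"             = "/usr"
    }
}

Note that specifying chroot_env replaces the default chroot contents rather than adding to them, so the directories your task needs have to be listed explicitly.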

Closing the ticket; please re-open it if that doesn't solve your needs.

diptanu closed this as completed Aug 11, 2016
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Dec 20, 2022