address_mode=driver and group-level networks #8615

Closed
mswart opened this issue Aug 9, 2020 · 4 comments


@mswart

mswart commented Aug 9, 2020

Nomad version

Nomad v0.12.1 (14a6893a250fc5f652eee4c7b52d8f95c3185aef)

Operating system and Environment details

Ubuntu 18.04.4 LTS and Ubuntu 20.04.1 LTS
docker version 19.03.6 and 19.03.8

Issue

Define a service or system job that uses a group-level network declaration, and have Nomad register its services with address_mode = driver.

Expectation: Service is registered with internal network address (like from a CNI plugin).

Observation: Service is registered without address (as in address_mode = host).

Alternative possible scenario: Display an error message / warning that address_mode = driver is not supported anymore.

Motivation: Avoid the deprecated way of declaring networks, use CNI plugins and other runtime features, and still support direct communication.

Steps to reproduce

  1. Start nomad with consul integration
  2. Run job
  3. Check consul like:
> for type in group-driver group-host task-driver; do curl --silent http://127.0.0.1:8500/v1/catalog/service/bug-$type | jq --raw-output --arg type "$type" '"\($type): \(.[0].ServiceAddress):\(.[0].ServicePort)"'; done
group-driver: 127.0.0.1:5678
group-host: 127.0.0.1:23346
task-driver: :5678
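
For anyone who wants to script this check without jq, here is a minimal Go sketch of the formatting step. It assumes only that each element of the catalog response carries ServiceAddress and ServicePort, as the command above relies on. The names catalogEntry and formatFirstEntry are hypothetical helpers, not part of Nomad or Consul.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// catalogEntry holds the two fields of a /v1/catalog/service/<name>
// response element that the check above prints. All other fields are ignored.
type catalogEntry struct {
	ServiceAddress string
	ServicePort    int
}

// formatFirstEntry (hypothetical helper) decodes a catalog response body and
// renders its first entry as "<name>: <address>:<port>".
func formatFirstEntry(name string, body []byte) (string, error) {
	var entries []catalogEntry
	if err := json.Unmarshal(body, &entries); err != nil {
		return "", fmt.Errorf("decoding catalog response for %s: %w", name, err)
	}
	if len(entries) == 0 {
		return "", fmt.Errorf("service %s is not registered", name)
	}
	e := entries[0]
	return fmt.Sprintf("%s: %s:%d", name, e.ServiceAddress, e.ServicePort), nil
}
```

An empty ServiceAddress renders as a bare ":<port>", which is how the task-driver line above shows up.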

Example job file

job "bug" {
  datacenters = ["dc1"]
  type = "service"

  group "copies" {
    network {
      mode = "bridge"

      port "port" {}
    }

    service {
      name = "${JOB}-group-host"
      address_mode = "host"
      port = "port"
    }

    service {
      name = "${JOB}-group-driver"
      address_mode = "driver"
      port = "5678"
    }

    task "app" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo"
        args = ["-text=\"hello world\""]
      }

      service {
        name = "${JOB}-task-driver"
        address_mode = "driver"
        port = "5678"
      }
    }
  }
}

Experiment

I tested the following patch against today's master:

diff --git a/client/allocrunner/networking_cni.go b/client/allocrunner/networking_cni.go
index 8dfea82c7..bc421b535 100644
--- a/client/allocrunner/networking_cni.go
+++ b/client/allocrunner/networking_cni.go
@@ -89,8 +89,8 @@ func (c *cniNetworkConfigurator) Setup(ctx context.Context, alloc *structs.Alloc
        const retry = 3
        var firstError error
        for attempt := 1; ; attempt++ {
-               //TODO eventually returning the IP from the result would be nice to have in the alloc
-               if _, err := c.cni.Setup(ctx, alloc.ID, spec.Path, cni.WithCapabilityPortMap(getPortMapping(alloc, c.ignorePortMappingHostIP))); err != nil {
+               result, err := c.cni.Setup(ctx, alloc.ID, spec.Path, cni.WithCapabilityPortMap(getPortMapping(alloc, c.ignorePortMappingHostIP)));
+               if err != nil {
                        c.logger.Warn("failed to configure network", "err", err, "attempt", attempt)
                        switch attempt {
                        case 1:
@@ -103,6 +103,7 @@ func (c *cniNetworkConfigurator) Setup(ctx context.Context, alloc *structs.Alloc
                        time.Sleep(time.Second + (time.Duration(c.rand.Int63n(1000)) * time.Millisecond))
                        continue
                }
+               alloc.AllocatedResources.Shared.Networks[0].IP = result.Interfaces["eth0"].IPConfigs[0].IP.String()
                break
        }

This improves things, at least for this use case:

> for type in group-driver group-host task-driver; do curl --silent http://127.0.0.1:8500/v1/catalog/service/bug-$type | jq --raw-output --arg type "$type" '"\($type): \(.[0].ServiceAddress):\(.[0].ServicePort)"'; done
group-driver: 172.26.64.9:5678
group-host: 172.26.64.9:28530
task-driver: :5678

I would prepare a PR, but I do not understand Nomad's architecture (and how information is passed around) well enough to identify the correct fix.

@Legogris

@nickethier Could this be another aspect of the reports in #8432?

@tgross
Member

tgross commented Oct 26, 2020

Another case, with a repro, in #9177 (comment)

@mswart
Author

mswart commented Dec 10, 2020

My issue/use case was duplicated in #8801, which has been fixed in v1.0.0. Apparently I should have described the high-level use case rather than how it was done in older versions. I will close this issue to reflect that.
@tgross @nickethier I am not sure what your plan for this issue was (it is assigned, but nothing visible has happened in four months). I am also not sure whether #9177 is still an issue or has been fixed; based on my understanding, it is about something else. I leave it to you or the original reporter to reopen it if needed.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 27, 2022