Allow setting consul client url in mesh-init #91

Closed
guoqikai opened this issue May 23, 2022 · 12 comments

@guoqikai

guoqikai commented May 23, 2022

Description

Currently, for the mesh-init command, the Consul client URL is always set to 127.0.0.1:8500.
However, in our use case the client is listening on a different IP address.

Use Cases

Sharing a Consul client across multiple ECS tasks

Alternative Solutions

There are no alternative solutions.

Additional context

IMO, simply adding cfg.Address = config.ClientAddress on line 64 and updating the connect envoy command accordingly should do the job.
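
The precedence implied by this proposal can be sketched as follows. This is only an illustration in Python, not the mesh-init source (which is Go): `resolve_consul_addr` and its parameters are hypothetical names, `client_address` stands in for the proposed `config.ClientAddress` setting, and the fallback to the `CONSUL_HTTP_ADDR` environment variable is an assumption based on the maintainer's later comment in this thread.

```python
import os

# Address the issue says mesh-init currently always uses.
DEFAULT_CONSUL_ADDR = "127.0.0.1:8500"


def resolve_consul_addr(client_address=None, env=None):
    """Pick the Consul client address to connect to.

    Assumed order: explicit setting (the proposed ClientAddress),
    then the CONSUL_HTTP_ADDR environment variable, then the default.
    """
    env = os.environ if env is None else env
    if client_address:
        return client_address
    return env.get("CONSUL_HTTP_ADDR") or DEFAULT_CONSUL_ADDR


# An explicit setting wins over the environment in this sketch.
print(resolve_consul_addr("172.17.0.1:8500", env={"CONSUL_HTTP_ADDR": "10.0.0.5:8500"}))
```

The same address would also need to be passed to the Envoy bootstrap step, as the issue notes; that part is not shown here.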

@pglass

pglass commented May 23, 2022

Hi, @guoqikai! Thanks for the issue.

Can you share more about your configuration? Are you using the EC2 or Fargate launch type with ECS? Where are you running the Consul client?

guoqikai reopened this May 23, 2022
@guoqikai
Author

Hi @pglass, thanks for following up!

We are using the EC2 launch type, and the Consul client is running as a daemon service in host network mode. The Consul client is listening on the docker0 interface, so other containers should be able to reach it via docker.internal.host:8500.

@pglass

pglass commented May 23, 2022

@guoqikai Thanks! You can try setting the CONSUL_HTTP_ADDR environment variable to docker.internal.host:8500 in the container definition for mesh-init in the meantime. The mesh-init command should respect this environment variable.

However, if you are using the Terraform mesh-task module, there is currently no setting for this, so let us know if that applies to you.
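
As a rough illustration of the workaround above, the environment variable would sit in the `environment` list of the mesh-init container definition. The sketch below builds only that fragment as a plain Python dict in the shape of an ECS container definition; the helper name `mesh_init_container` is hypothetical, and required fields such as the image and command are deliberately left out because they are not given in this thread.

```python
import json


def mesh_init_container(consul_http_addr):
    """Return a partial ECS container definition for mesh-init.

    Only the fields relevant to this thread are included; a real
    definition also needs an image, command, and so on.
    """
    return {
        "name": "mesh-init",
        "environment": [
            {"name": "CONSUL_HTTP_ADDR", "value": consul_http_addr},
        ],
    }


# Address taken from the comment above; adjust to wherever the shared client listens.
print(json.dumps(mesh_init_container("docker.internal.host:8500"), indent=2))
```

With AWS CDK, the equivalent would presumably be the container's environment setting, but that is not confirmed here.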


I also have a few more questions.

  • What version of mesh-init are you using (e.g. v0.4.1)?
  • How are you deploying ECS tasks? Are you using HashiCorp's mesh-task Terraform module?
  • You are using host network mode for the Consul client containers. What network mode are you using for the service containers (i.e. host, bridge, or awsvpc)?
  • Our currently supported architecture for Consul on ECS requires the awsvpc network mode and requires running one Consul client container in each task. We chose this architecture to support both the EC2 and Fargate launch types in the same way. Are you able to adopt this architecture, or could you explain why not?

Thank you!

@guoqikai
Author

guoqikai commented May 23, 2022

@pglass Thanks!

  1. We are using v0.4.1.
  2. We are using AWS CDK to deploy tasks, so I guess we can just try setting the environment variable.
  3. We cannot always use awsvpc network mode since we have some streaming services that require lots of ports (~10000) and therefore can only be run in host network mode. In addition, we have many fairly lightweight services, and we don't feel it's ideal to have a 'dedicated' Consul agent running for every task in a container instance.

@pglass

pglass commented May 23, 2022

@guoqikai

we don't feel it's ideal to have a 'dedicated' Consul agent running for every task in a container instance

👍

We cannot always use awsvpc network mode since we have some streaming services that require lots of ports (~10000) and therefore can only be run in host network mode.

I'm wondering if you could explain this a little more: Why is host network mode necessary? Are you running one task which listens on ~10000 ports? Or are you running ~10000 tasks which each listen on a unique port?

The awsvpc network mode provides each task a unique ENI with a unique IP. Compared to host network mode, each task in awsvpc mode has a unique port space, so you wouldn't have to worry about port collisions on the EC2 host. Otherwise, the networking performance should be similar.

Thanks for all the info, too. This is really helpful for us.

@guoqikai
Author

guoqikai commented May 23, 2022

@pglass The streaming task needs to open ~10000 ports. The awsvpc network mode still requires us to expose all the required ports in the port mappings, and there's a limit on how many ports a single task can expose, while host mode exposes all ports implicitly. You can check out this issue for more details: aws/containers-roadmap#194

@pglass

pglass commented May 23, 2022

@guoqikai

Ah, so it's my understanding that if your service binds to the task IP address (or 0.0.0.0), then you don't need any port mappings in AWS VPC mode. (If the service listens on only localhost, then the port mappings are required.)

This is how we start both Envoy and the Consul client in AWS VPC mode, by binding directly to the task IP (terraform example: there are no port mappings listed and, although it's not obvious, Envoy binds to 0.0.0.0:20000 for its public_listener).

@guoqikai
Author

guoqikai commented May 23, 2022

@pglass
Sorry, I think I forgot to clarify that the streaming service uses the 10000 ports to forward media traffic (i.e. RTP packets) from one client to another, so these 10000 ports need to be reachable outside the task. The Envoy proxy is used for HTTP traffic only. Therefore, we need to expose port 20000 plus the other 10000 ports in this case.

@pglass

pglass commented May 23, 2022

these 10000 ports need to be reachable outside the task

Right, understood. To clarify, if your service binds to the task IP (or 0.0.0.0) in AWS VPC mode, then any port the service listens on is reachable outside the task, without needing port mappings in the task definition.
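
To make the bind-address distinction concrete, here is a minimal Python sketch. It assumes the media ports are UDP (typical for RTP, but not stated in this thread), and `open_udp_listener` is a hypothetical helper. A socket bound to 0.0.0.0 accepts traffic on every interface of the task, while one bound to 127.0.0.1 only accepts traffic from inside the task.

```python
import socket


def open_udp_listener(host, port=0):
    """Bind a UDP socket to host:port (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


# 0.0.0.0 listens on all interfaces (including the task ENI in awsvpc mode);
# 127.0.0.1 listens on loopback only.
for host in ("0.0.0.0", "127.0.0.1"):
    with open_udp_listener(host) as sock:
        addr, port = sock.getsockname()
        print(f"bound to {addr}:{port}")
```

Whether traffic actually reaches the port from outside still depends on the security groups attached to the task ENI or EC2 host.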

@guoqikai
Author

Wow, thanks! I've never tried this, and there are tons of posts suggesting that only ports specified in the port mappings are reachable... I was following the ecs-consul-mesh-extension implementation, and they include port 20000 in the port mappings of the Envoy proxy.

@pglass

pglass commented May 23, 2022

Glad to be able to help!

That AWS CDK construct is based on an older version of our mesh-task Terraform module, which had those port mappings early on, but we removed them after figuring this out.

For the future, we're looking at alleviating the need for a Consul client container per task.


Back to the original issue, let us know how setting CONSUL_HTTP_ADDR works for you.

@guoqikai
Author

Gotcha! Thanks for the support!
