node unable to join cluster #2182
first thought: both Kong nodes think they run on 127.0.0.1, albeit the Docker one is abstracted through an overlay/NAT network
@Tieske yes they are. But the config was just fine in 0.9.8. Should I change anything in 0.10's config? Anyway, I will try. Thanks a lot!
you probably need to tweak the cluster listen and cluster advertise properties, see https://getkong.org/docs/0.10.x/network/
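For reference, those 0.10 properties can also be set through environment variables; a minimal sketch, assuming placeholder addresses (substitute each node's real routable IP):

```shell
# Minimal sketch (addresses are placeholders): bind the cluster port on all
# interfaces, but advertise the node's routable address explicitly so peers
# don't see 127.0.0.1.
export KONG_CLUSTER_LISTEN="0.0.0.0:7946"
export KONG_CLUSTER_ADVERTISE="10.0.0.5:7946"
```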
@Tieske ok, problem solved. I just figured out that the …
The …
The command is from this answer on Stack Overflow.
I just ran into this testing 0.10 as well. The docs and Kong's behavior are out of sync; I'm not sure which is correct.
Specifically, "the first local, non-loopback IPv4 address will be advertised". However, with cluster_advertise unset, the node is advertising itself as 127.0.0.1 (I can see it in the "nodes" DB table). My nodes are VMs, not Docker containers, and each has a 10.x.x.x IP available, but that is not the IP Kong picked to advertise. As @tumluliu said, this exact config did the right thing in terms of clustering in 0.9.8.
@jhenry82 I had exactly the same idea as you just 5 minutes ago, then I saw your reply 😆 I really think this is a bug that has existed since 0.10 rc3, as #2037 mentions. I did a little bit of digging, but could not find where this …
BTW, in 0.9.8 I don't have this issue.
Just noticed that there is a lua-ip project from your team to get the first non-loopback IPv4 address. Why not use it in Kong?
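The selection rule the docs describe ("first local, non-loopback IPv4 address") can be mimicked with a little awk; a sketch over fabricated sample output in the shape of `ip -4 -o addr show` (the interfaces and addresses below are made up for illustration):

```shell
# Fabricated sample output shaped like `ip -4 -o addr show`.
sample='1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 10.0.0.5/24 scope global eth0
3: eth1    inet 192.168.1.7/24 scope global eth1'

# Pick the first non-loopback IPv4 address, i.e. what cluster_advertise
# is documented to default to.
first_ip=$(printf '%s\n' "$sample" \
  | awk '$3 == "inet" && $4 !~ /^127\./ {split($4, a, "/"); print a[1]; exit}')
echo "$first_ip"   # prints 10.0.0.5
```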
I believe this behavior came directly from Serf, and Kong 0.10.0 bumps the Serf version from 0.8.0 to 0.8.1, so that could be the reason why. More details: …
This is legacy, untested, and also undesired, as it introduces one more of those C module dependencies we are trying to get rid of.
@thibaultcha thanks a lot for your information. For the moment, I can only provide the …
In the docker container of the …
Unfortunately, I haven't found an appropriate machine to test 0.9.8. I tried installing it on my MacBook with the pkg package, but Serf cannot find the private IP address and fails with this error message: …
BTW, the Serf version for my Kong 0.10 is 0.8.0, while for Kong 0.9.8 it is 0.7.0. When I find another machine that can test 0.9.8, I will post the …
I have got the …
The IPv4 address …
Hi everyone. I'm using docker-compose for local dev, and we use AWS ECS as our production Docker clustering solution. This makes working out the cluster_listen address problematic. Until this bug is fixed, we've come up with the following workaround: a thin wrapper that runs before we start Kong. It requires the net-tools package on top of the official Kong image (yum install -y net-tools) and works out the correct address at runtime:

```shell
#!/bin/sh
# Resolve the container's eth0 address and use it for clustering.
IP_ADDR=`ifconfig eth0 | awk '$1 == "inet" {gsub(/\/.*$/, "", $2); print $2}'`
echo "SETTING IP_ADDR FOR KONG CLUSTERING TO: ${IP_ADDR}"
export KONG_CLUSTER_LISTEN="${IP_ADDR}:7946"
<start kong>
```
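When the routable address is known up front (or injected by the orchestrator), an alternative to computing it at runtime is pinning the advertise address on the container directly. A hypothetical `docker run` sketch; the image tag, database host, and 10.0.0.x addresses are assumptions, not taken from this thread:

```
docker run -d --name kong \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=10.0.0.2" \
  -e "KONG_CLUSTER_LISTEN=0.0.0.0:7946" \
  -e "KONG_CLUSTER_ADVERTISE=10.0.0.5:7946" \
  -p 8000:8000 -p 8001:8001 \
  kong:0.10
```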
Append solution on Kong/kong#2182 for clustering...
I think hashicorp/memberlist#102 is the underlying issue, and Serf 0.8.2 should address this.
As we're planning to remove the Serf dependency altogether, we're now leaning towards reverting Serf back to version 0.7 for the next release (0.10.1).
@Tieske what are you planning to replace Serf with? Hitting the database more often? I am wondering because we are looking into deploying Kong for more endpoints.
@TransactCharlie's trick worked. Though, since I'm using it inside an AWS AMI, I found …
EDIT: On ECS, use 0.10.1, and set the network mode to …
Leaving here as a note for other Docker users. Most of the time, Kong would not properly start as soon as we introduced a persistent Postgres volume. Long story short, we were also publishing a port on the Docker service, which causes the service containers to have two networks: the user overlay network and the default ingress network. With both networks attached, the container has at least two interfaces with non-default IPv4 addresses, which causes Serf to randomly pick one of them on startup. Keeping it short, this sometimes resulted in the ingress address being selected, which led to even weirder behavior (e.g. Kong startup would just hang). Anyway, in the Docker world we were able to use the following solution (without resorting to net-tools and grep): …
The above was added to a custom …
Considering this resolved, as Kong 0.10.1 shipped with a Serf downgrade back to 0.7.0. Future versions of Kong will not even need Serf anymore :) Thanks!
@thibaultcha This is reproducible with Kong 0.9.9, which comes with Serf 0.7.0. Faced this clustering issue (Kong/docker-kong#93) with Kong in Docker Swarm. The workaround in #2182 (comment) by @saamalik fixed it. Thanks!
Summary
After upgrading to 0.10, the two nodes cannot see each other in one cluster :(
Steps To Reproduce

- Installed Kong 0.10 from the deb package on one node (let me call it the master node, although I know there are no master/slave differences in Kong)
- Started Kong 0.10 in a Docker container on the other node (the slave node)

Additional Details & Logs
Both nodes are healthy, and I can see them in the database's nodes table as follows: …

But they just cannot see each other. When executing curl http://127.0.0.1:8001/cluster, I got the following for each of them:

master node: …
slave node: …

curl http://127.0.0.1:8001/cluster/nodes/ returns nothing but …

And kong cluster members returns only one alive node on both of them.

The config related to clustering on the master node: …

The docker start command for the slave node: …

What's the possible problem? Any ideas? Thanks a lot!
BTW, they could surely see each other in 0.9.8 before this upgrade.