-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support setting LocalAddr in peer communication - with e2e tests #17661
Conversation
Hi @flawedmatrix. Thanks for your PR. I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
1ddb114
to
8c46359
Compare
/ok-to-test |
Thank you @ahrtr for the very thorough review. I've made the suggested changes to my PR. |
cc @fuweid @jmhbnz @serathius PTAL |
/retest |
@vivekpatani @ivanvc @fuweid @jmhbnz @serathius Please take a look at this PR. It resolve a real production issue #17068 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @flawedmatrix and @ahrtr. Some questions below.
server/etcdmain/help.go
Outdated
@@ -107,6 +107,8 @@ Member: | |||
Clustering: | |||
--initial-advertise-peer-urls 'http://localhost:2380' | |||
List of this member's peer URLs to advertise to the rest of the cluster. | |||
--set-member-localaddr 'false' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note so we don't forget to do follow-up pr for website for this new clustering flag in https://etcd.io/docs/v3.6/op-guide/configuration.
server/etcdmain/help.go
Outdated
@@ -107,6 +107,8 @@ Member: | |||
Clustering: | |||
--initial-advertise-peer-urls 'http://localhost:2380' | |||
List of this member's peer URLs to advertise to the rest of the cluster. | |||
--set-member-localaddr 'false' | |||
Enable to have etcd use the first specified and non-loopback host from initial-advertise-peer-urls as the local address when communicating with a peer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense to say local address
rather than host
? Basing this on the test case above where local IPv4 is prioritised over hostname.
Enable to have etcd use the first specified and non-loopback host from initial-advertise-peer-urls as the local address when communicating with a peer. | |
Enable to have etcd use the first specified and non-loopback local address from initial-advertise-peer-urls as the local address when communicating with a peer. |
Alternatively we might just want to state what the priority order is so users aren't left to figure this out on their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's just me, but I feel like mentioning etcd in this case is redundant (without addressing the priority order) something like:
Enable using the first specified and non-loopback local address from initial-advertise-peer-urls as the local address when communicating with a peer.
Sounds more straightforward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for invoking me. I left a couple of comments. I hope they're relevant ✌️
server/etcdmain/help.go
Outdated
@@ -107,6 +107,8 @@ Member: | |||
Clustering: | |||
--initial-advertise-peer-urls 'http://localhost:2380' | |||
List of this member's peer URLs to advertise to the rest of the cluster. | |||
--set-member-localaddr 'false' | |||
Enable to have etcd use the first specified and non-loopback host from initial-advertise-peer-urls as the local address when communicating with a peer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's just me, but I feel like mentioning etcd in this case is redundant (without addressing the priority order) something like:
Enable using the first specified and non-loopback local address from initial-advertise-peer-urls as the local address when communicating with a peer.
Sounds more straightforward.
5a19043
to
4ea35aa
Compare
Thanks all for the suggestions. I've made the changes, added more unit test cases and added some logging to |
/retest |
d5f26d0
to
39453ca
Compare
Hi @ahrtr, I just rebased again to fix the merge conflict. Could you please take a look? |
Hi @flawedmatrix, I suggest squashing the commits in preparation for merging this PR. |
Hi @ivanvc, I've squashed the commits that were my contributions. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. If we are considering backporting this PR, I think it will be easier if you squash the commits into one. You can still co-author the commit with the original contributor (refer to https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors). Thanks, @flawedmatrix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - Thanks for your work on this @flawedmatrix
tests/e2e/etcd_config_test.go
Outdated
// node 0 (127.0.0.1) does not set localaddr, while nodes 1 and nodes 2 (both use host's IP) do. | ||
// The other two nodes will reject connections from node 0 warning that node 0's certificate is valid only for | ||
// 127.0.0.1, not the host IP, since node 0 will try to connect to the other peers with the host IP | ||
// as the client address. | ||
// Node 0 will not reject connections from the other nodes since they will | ||
// use the host's IP to connect (due to --experimental-set-member-localaddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// node 0 (127.0.0.1) does not set localaddr, while nodes 1 and nodes 2 (both use host's IP) do. | |
// The other two nodes will reject connections from node 0 warning that node 0's certificate is valid only for | |
// 127.0.0.1, not the host IP, since node 0 will try to connect to the other peers with the host IP | |
// as the client address. | |
// Node 0 will not reject connections from the other nodes since they will | |
// use the host's IP to connect (due to --experimental-set-member-localaddr) | |
// node 0 (127.0.0.1) does not set `--experimental-set-member-localaddr`, | |
// while nodes 1 and nodes 2 do. | |
// | |
// node 0's peer certificate is signed on 127.0.0.1, but it uses host IP | |
// (by default) to communicate with peers, so they don't match, accordingly | |
// other peers will reject connections from node 0. | |
// | |
// Both node 1 and node 2's peer certificates are signed on host IP, and | |
// they also communicate with peers using host IP (explicitly set using | |
// flag `--experimental-set-member-localaddr`), so there is no issue. | |
// | |
// Refer to https://github.com/etcd-io/etcd/issues/17068. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've updated the comment now.
Overall looks good to me with a minor comment, to ensure the e2e test is easier to understand. |
Which sets the LocalAddr to an IP address from --initial-advertise-peer-urls. Also adds e2e test that requires this flag to succeed. Co-authored-by: HighPon <s.shiraki.business@gmail.com> Signed-off-by: Edwin Xie <edwin.xie@broadcom.com>
Thanks @ivanvc. I didn't initially intend on backporting this change, but to future-proof this PR I've decided to squash the commits anyways. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for the fix.
…s https/unixs Backports a657f06, 22f20a8, 497f1a4 and 3e86af6 from etcd-io#18186. Also backports required test util elements of 4c77726 from etcd-io#17661. Signed-off-by: James Blair <mail@jamesblair.net>
…s https/unixs Backports a657f06, 22f20a8, 497f1a4 and 3e86af6 from etcd-io#18186. Also backports required test util elements of 4c77726 from etcd-io#17661. Signed-off-by: James Blair <mail@jamesblair.net>
…s https/unixs Backports a657f06, 22f20a8, 497f1a4 and 3e86af6 from etcd-io#18186. Also backports required test util elements of 4c77726 from etcd-io#17661. Signed-off-by: James Blair <mail@jamesblair.net>
…s https/unixs Backports a657f06, 22f20a8, 497f1a4 and 3e86af6 from etcd-io#18186. Also backports required test util elements of 4c77726 from etcd-io#17661. Signed-off-by: James Blair <mail@jamesblair.net>
This PR builds upon #17085 in addressing issue #17068.
The e2e test added includes a positive test that --set-member-localaddr is needed for an etcd node to come up, and a test that without it, etcd will fail to connect to peers due to the certificate being issued for a different IP.
I did have to add in some cert generation to create certificates for the host IP where the test is running.
There is an implementation detail here that
--set-member-localaddr
will set the LocalAddr only if there is a specified non-loopback IP address in the--initial-advertise-peer-urls
list. But I don't know if this is too restrictive.