Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semi-automatic egress IP #16561

Merged
merged 3 commits into from
Sep 30, 2017

Conversation

danwinship
Copy link
Contributor

This is the first half of the automatic egress IP work; it handles routing traffic through egress IPs, but not the assigning of IPs to nodes, which must be done manually by the administrator at this point (by editing the HostSubnet records; in the future the SDN master will do this automatically based on annotations on the Node records).

The set of Egress IPs active on a node was stored in HostSubnet rather than Node because (a) there are no upstream compatibility issues to deal with that way, and (b) nodes already have a HostSubnet watch, but not a Node watch. (And Node statuses change constantly, while HostSubnets are much more low-bandwidth.)

Oh, the other thing that's there for future expansion is that a NetNamespace can have an array of EgressIPs, but only the first one gets used. In the future we will likely support multiple IPs, either at the same time, or sequentially failing over.

@openshift/sig-networking PTAL

@openshift-ci-robot openshift-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 26, 2017
Copy link
Contributor

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff Dan. It's clear and well written. Thanks

otx.AddFlow("table=100, priority=%d,tcp,tcp_dst=53,nw_dst=%s,actions=output:2", networkapi.EgressNetworkPolicyMaxRules+1, nodeIP)
otx.AddFlow("table=100, priority=%d,udp,udp_dst=53,nw_dst=%s,actions=output:2", networkapi.EgressNetworkPolicyMaxRules+1, nodeIP)
// Table 100: egress routing; edited by UpdateNamespaceEgressRules()
// eg, FIXME
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor omission

@knobunc
Copy link
Contributor

knobunc commented Sep 27, 2017

/test all

@knobunc
Copy link
Contributor

knobunc commented Sep 27, 2017

@liggitt, @smarterclayton: Can I get an API review on this one too please?

@knobunc
Copy link
Contributor

knobunc commented Sep 27, 2017

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 27, 2017

for i, ip := range netnamespace.EgressIPs {
if net.ParseIP(ip) == nil {
allErrs = append(allErrs, field.Invalid(field.NewPath("egressIPs").Index(i), ip, "invalid IP address"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this allowed to be localhost? Multicast? Any implications of those? (extra validation is needed only if it there is no clear use case for ever allowing those)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no use case for setting it to either of those, but there's no danger either. (It would just fail, just like if they tried to claim a random far-away IP address; EgressIPs have to be on the same subnet as some node's primary IP.) We don't check for semantically bad IPs in validation anywhere else, so I think just letting it through here and then failing later makes sense.

@knobunc
Copy link
Contributor

knobunc commented Sep 27, 2017

@smarterclayton since you've looked over it, can you approve it if it meets with your approval? Thanks


if nodeIP == eip.localIP {
if err := eip.claimEgressIP(ip, hex); err != nil {
glog.Errorf("Error claiming Egress IP %q: %v", ip, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we continue at this point? Or update the rules anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update the rules, but we should update them to drop the packets (so the code is currently wrong)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(fixed)

@soltysh
Copy link
Member

soltysh commented Sep 28, 2017

/unassign

@eparis
Copy link
Member

eparis commented Sep 28, 2017

/retest

@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2017
@danwinship
Copy link
Contributor Author

@smarterclayton the bot added needs-api-review back after you removed it...

@eparis
Copy link
Member

eparis commented Sep 28, 2017

/lgtm
the 'needs review' doesn't matter. You have the approval.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 28, 2017
@eparis eparis added the kind/bug Categorizes issue or PR as related to a bug. label Sep 28, 2017
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 28, 2017
@openshift-merge-robot openshift-merge-robot removed lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 28, 2017
@eparis
Copy link
Member

eparis commented Sep 28, 2017

/lgtm
after the rebases...

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eparis, knobunc

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@danwinship
Copy link
Contributor Author

/retest

1 similar comment
@eparis
Copy link
Member

eparis commented Sep 29, 2017

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@danwinship
Copy link
Contributor Author

/retest

1 similar comment
@danwinship
Copy link
Contributor Author

/retest

@@ -68,6 +68,9 @@ type HostSubnet struct {
HostIP string `json:"hostIP" protobuf:"bytes,3,opt,name=hostIP"`
// Subnet is the CIDR range of the overlay network assigned to the node for its pods
Subnet string `json:"subnet" protobuf:"bytes,4,opt,name=subnet"`

// EgressIPs is the list of automatic egress IP addresses currently hosted by this node
EgressIPs []string `json:"egressIPs" protobuf:"bytes,5,rep,name=egressIPs"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omitempty?


// EgressIPs is a list of reserved IPs that will be used as the source for external traffic coming from pods in this namespace. (If empty, external traffic will be masqueraded to Node IPs.)
// +optional
EgressIPs []string `json:"egressIPs" protobuf:"bytes,4,rep,name=egressIPs"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same omitempty?

for i, egressIP := range hs.EgressIPs {
if net.ParseIP(egressIP) == nil {
allErrs = append(allErrs, field.Invalid(field.NewPath("egressIPs").Index(i), egressIP, "invalid IP address"))
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect egress IP to work for IPV6 as well? If not, may be we should add ipv4 check here and in netnamespace.EgressIPs validation.

}
}

func (eip *egressIPWatcher) Start(networkClient networkclient.Interface, iptables *NodeIPTables) error {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this method will not return an error may be this method can have no return value.

for _, link := range links {
addrs, err := netlink.AddrList(link, syscall.AF_INET)
if err != nil {
return fmt.Errorf("could not get addresses of interface %q while adding egress IP: %v", link.Attrs().Name, err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failed link may not have desired IP address, we could log an error and continue

iptables *NodeIPTables

// from HostSubnets
nodesByNodeIP map[string]*nodeEgress

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already know the nodeIP (map key), don't need to duplicate in the map value
nodesByNodeIP map[string]*sets.String ?


// from HostSubnets
nodesByNodeIP map[string]*nodeEgress
nodesByEgressIP map[string]*nodeEgress

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly this could be nodesByEgressIP map[string]string

nodesByEgressIP map[string]*nodeEgress

// From NetNamespaces
namespacesByVNID map[uint32]*namespaceEgress

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vnid may be duplicate here but considering most of the fields are also used by namespacesByEgressIP, I think this is a good compromise.

}

if ns, exists := eip.namespacesByEgressIP[ip]; exists {
if ns.assignedIP == "" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If assignedIP is already set, then drop the traffic for this namespace?

}
ns = &namespaceEgress{vnid: vnid}
eip.namespacesByVNID[vnid] = ns
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When namespace is deleted, where are we deleting vnid from namespacesByVNID map?

@danwinship
Copy link
Contributor Author

/retest

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue.

@openshift-merge-robot openshift-merge-robot merged commit 060b43b into openshift:master Sep 30, 2017
@danwinship danwinship deleted the auto-egress-ip branch October 2, 2017 20:43
@@ -332,7 +335,9 @@ func (node *OsdnNode) Start() error {
if err != nil {
return err
}

if err = node.egressIP.Start(node.networkClient, nodeIPTables); err != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not call this for single-tenant/subnet plugin as it depends on NetNamespaces.
I think, we should move this to node.policy.StartEgressIP?

@danwinship danwinship mentioned this pull request Oct 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. needs-api-review queue/multiple-rebases size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants