-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EVPN Gateway API #276
EVPN Gateway API #276
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution.
I’m not commenting on content yet.
From style perspective put several comments inside. We are compliant with AIP. When you run make, the linters will fail. Please check the comments and run linters to verify.
rpc CreateLB(LogicalBridgeInfo) returns (CreateLBResult); | ||
rpc ListLB(google.protobuf.Empty) returns (ListLBResult); | ||
rpc GetLB(LogicalBridgeId) returns (GetLBResult); | ||
rpc DeleteLB(LogicalBridgeId) returns (DeleteLBResult); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty result is expected from delete https://google.aip.dev/135
} | ||
|
||
message ListLBResult { | ||
repeated LogicalBridgeInfo logical_bridges = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The repeated name is not consistent with message name
|
||
service LogicalBridgeService { | ||
// Management of LogicalBridge Resources | ||
rpc CreateLB(LogicalBridgeInfo) returns (CreateLBResult); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The object itself and not result is expected from create method see https://google.aip.dev/133
// Management of LogicalBridge Resources | ||
rpc CreateLB(LogicalBridgeInfo) returns (CreateLBResult); | ||
rpc ListLB(google.protobuf.Empty) returns (ListLBResult); | ||
rpc GetLB(LogicalBridgeId) returns (GetLBResult); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The object itself and not result is expected from get method see https://google.aip.dev/131
@@ -0,0 +1,155 @@ | |||
/*- | |||
* ============LICENSE_START======================================================= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use short spdx license style one liner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use github mermaid to create a diagram inside markdown
for example https://mermaid.js.org/intro/#class-diagram
network/evpn-gw/README.md
Outdated
|
||
The API works on four object classes: VRF (Virtual Routing Funtions), SVI (Switched Virtual Interface), LogicalBridge and BridgePort. Their relationship is illustrated in the API data model below. | ||
|
||
<p align="center"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Embedded htlm is not allowed. You can insert and image in md style.
@@ -0,0 +1,155 @@ | |||
/*- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IPU is not appropriate in the name for the proto file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this comment, I would ask that we hold all of the checkins to the same threshold where a single DPU or IPU is not correct. It should be applied evenly across.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cloud APIs as it is defined today, allows a multi-tenant BGP Based route/mac distribution and define following objects:
- SVI object, defined as subnet object
- VRF object, defined as VPC object
- BP object, defined as VNIC (and port) objec that can belong to a subnet/VPC
- Finally logical bridge is no different from subnet object if l3 routing is not enabled
Furthermore BGP object defines peering with the adjacent router (or RR) to distribute BGP routes via EVPN (type2, type3) routes.
Having said that I am not sure excactly how this alings with K8s work happening in parallel.
|
||
EVPN is one of the most well known technologies when it comes to data center networks. It is a Service Provider VPN Technology defined in RFC 8365 as an Internet Standard by the IETF's "BGP Enabled Services" working group. Its flexibility and scalability characteristics allow the interconnection of geographically dispersed sites using layer 2 and layer 3 networking instances. | ||
|
||
We present a lean and tailor-made EVPN Gateway API for provisioning integrated L2 and L3 network instances on an xPU, with the option to connect these to L2- and/or L3-VPN instances through an EVPN GW function hosted on the xPU. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A picture would be good here describing the deployment and use case on how EVPN is deployed with xPUs in the data center. I could guess this happening on the top of racks and switching devices trying to do L2/L3 forwarding within the DC (using overlays on top of an L3 network). In line wit that, would this use case move the EVPN boundary to the server building an iBGP running on each DPU to RRs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Picture has been provided with extensive explanation.
|
||
Tenant networks are provisioned on the xPU through the EVPN Gateway API, which is realized through gRPC and specified through the Protobuf definitions. | ||
|
||
The EVPN Gateway API in the Kubernetes context is used for network provisioning before and after the cluster instantiation, as well as interface configuration during pod creation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EVPN with K8s - do you know how it would interact/work with popular CNIs? Would this be similar to what Calico uses BGP to distribute pod routes via BGP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this has been answered allready in the a previous use case and API working group meeting. Also detailed explanation has been added together with the architecture design image. If more explanation is needed let me know.
|
||
* **Bridge Port (BP)** | ||
|
||
A bridge port connects a tenant to one or more Logical Bridges. The model supports both access and trunk ports. An access port receives untagged frames from tenants and maps frames to the configured VLAN of a single logical bridge. Trunk ports forward tagged frames transparently for a set of specified VLANs. Within the xPU the packets are forwarded according to their VLAN tag. A trunk port is hence associated with multiple Logical Bridges. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By tenant
you mean a pod
or a vm
? e.g. what is the other end of the access port you are alluding to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the context of K8s we are refering to Pods were the CNI creates the bridge ports. Of course the same concept can be used in VMs too outside of K8s context where a BridgePort is created using the evp gw API the same way that we use it to create VRFs, LogicalBridges etc...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe everything is explained to the different presentations in the use case and API workgroup meetings and also in the extensive explanation of the Arch picture that has been added to this README. If you have more questions please let me know
@jainvipin maybe if you can add link to the proto file, it would be helpful |
maybe this kinda picture @jainvipin had in mind that Intel and Ericsson presented in their demo: |
this pic is very helpful in understanding the implementation, we should add this in the PR. I went over the demo recording for this and there was also a use case slide that is worth included for anyone wanting to get more context on this use case. |
VPC (aka vrf object): https://github.com/opiproject/opi-api/blob/main/network/cloud/v1alpha1/vpc.proto Cloud objects do have a whole bunch of other things that are perhaps not needed in this PR. So we can make a call if we want to keep it separate for now and look at common elements later, or do it now. |
My opinion - we need to at least try to find common objects now and split instead of duplicating. @mardim91 can you please look at @jainvipin links and see if those can fit your needs instead of defining a new ones... ? |
Hello @glimchb and @jainvipin it will take some time to review the cloud API and find common ground (if any) with our EVPN GW API. Is there any recording or slideck that the cloud API has been presented so we can get more context on how this API is implemented, what is the use case in mind and how the different objects are getting created and connected to eachother ? |
A couple of things to consider:
Is it possible to leverage the applicable models to align with existing solutions that exist today for Network configuration operations. |
// SPDX-License-Identifier: Apache-2.0 | ||
// Copyright (C) 2023 Nordix Foundation. | ||
|
||
syntax = "proto3"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file should move inside v1alpha1 folder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glimchb yes I know. The only reason I have putted it there for now is to make the linter run only in the l2_xpu file and not l3_xpu so I do not get overwhelmed from the linting errors.
Please take a look to the l2_xpu_infra_mgr.proto to check if all is ok with this file at least. Now is free from linitng errors
dbc326c
to
8cc5c42
Compare
@sandersms We for sure would like in the future to take a look and study the openconfig part and see if we can align our API with the openconfig standards. For the meanwhile we would like to merge this API as it is and then study openconfig and see if we can align. I hope this is ok with you. {Update} Openconfig is a standaridized model for configuring network devices such as routers or switches but doesn't fit very well with the model of OPI. The grpc reource based approach that is followed by the OPI where the different networking objects are getting created by well defined grpc CRUD calls (e.g. create LogicalBridge, create VRF etc...) doesn't fit very well with the gNMI generic GRPC approach where the different gNMI calls are happening by using generic requests to specific openconfig paths. Also the CNI that will be used in order to execute calls towards the EVPN-GW API in the context of Kubernetes is not at all alligned with the gNMI basics but is more alligned to a grpc CRUD operation basics. We think that is an overkill to try to allign our EVPN-GW API to the openconfig standards currenlty but if anybody from the OPI community would like to run this we are not gonna raise any objections as openconfig is generally supported by many vendors and that can add some value at the end of the day. |
@mardim91 please note MD linter errors |
<p align="center"> | ||
<img width="500" height="400" src="./images/data-model-evpn-gw-api.png"> | ||
</p> | ||
```mermaid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excellent!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @mardim91 for submitting this PR. I have put in some comments here. we should discuss some of them a bit more. There are some small comments but the big ones are:
- Lack of
Spec
andStatus
is very clearly seen as fields are getting mixed up - Opaque keys (aka IDs in various other OPI object definitions) alllows identification and reference to the objects very cleanly. This is not the case here, in some cases field values are used as refrence, in some cases names. Not only this lacks consistency but breaks abstraction. Moving to a consistent key-ids can fix this issue.
- There definitely seems to be a lot of side-band fixup needed for these APIs to work i.e. we'll need to introduce some more objects, especially route-table (to control linux routing perhaps), BGP (no RR and/or local/bgp-peer configuration), underlay configuration is not there, VNIC (as vport_id is very opaque w.r.t. what user really wants e.g. VF or netdev virtual interface), etc.
- Finally, underlying implementation being linux (vs. if it is P4) is evident from this API. I was hoping GRPC layer abstracts out the implementation.
} | ||
|
||
// LogicalBridge structure | ||
message LogicalBridge { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I am curious, why not call it just a Bridge
(vs LocalBridge
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason that we have called it LogicalBridge is because each LogicalBridge forms a L2Forwarding Domain based on VLANs so we think that the term LogicalBridge is more suitable in this case.
option (google.api.resource) = { | ||
type: "library.googleapis.com/LogicalBridge" | ||
pattern: "logicalBridges/logicalBridge" | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the object identifier i.e. unique way of identifying the object? should it have a key (like, other objects) to help identify the object instance?
pattern: "logicalBridges/logicalBridge" | ||
}; | ||
// Free text description | ||
string name = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is a description, maybe just rename as description.
// VXLAN VNI for L2 EVPN. Also used as EVPN route target | ||
// (-- api-linter: core::0141::forbidden-types=disabled | ||
// aip.dev/not-precedent: vni cannot be negative number. --) | ||
uint32 vni = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VNI suggests that this is a distributed bridge and not lcoal bridge - goes with previous comment about not naming calling this object localBridge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are calling it logicalBridge not localbridge. Why are you reffering to a localBridge ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, logicalBridge is good!
// Key | ||
// (-- api-linter: core::0141::forbidden-types=disabled | ||
// aip.dev/not-precedent: vlan cannot be negative number. --) | ||
uint32 vlan_id = 2 [(google.api.field_behavior) = REQUIRED]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if vlan_id can be key, because this will be overlapping across multiple VNIs - I mean clearly there are more than 4k bridges in the entire multi-tenant network. Then, GetLogicalBridgeRequest() can't be using just the vlan-id to identify the bridge.
Also, can some bridges be just configured with VNIs and not have vlans at all. Ultimately vlan is only significant if traffic from the workload (e.g. a K8s pod's VF) is tagged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Vlan id has only local signification. Will not span accross different nodes. This mean that different Vlans in different nodes can belong to the same VNI but because VLAN has Local significance then vlan_id can be used as key.
// IPv4 or IPv6 loopback address. Also serves as basis for RD in FRR | ||
string loopback_ip = 4 [(google.api.field_behavior) = REQUIRED]; | ||
// IPv4 or IPv6 IP address for the VXLAN TEP | ||
string vtep_ip = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should VTEP/Tunnel be a different object - I mean why combine it with VRF object? Can there be multiple VTEPs per VRF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a reference to an allready existing Tunnel endpoint that is configured on the xPU. This is outside the scope of the current API.
// Read-only: Local AS configured for VRF | ||
// (-- api-linter: core::0141::forbidden-types=disabled | ||
// aip.dev/not-precedent: local_as cannot be negative number. --) | ||
uint32 local_as = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
local_as
, rd
are BGP attributes - shoudl bgp attributes be separated (and also add remainder of bgp configuration such as local bgp and peer bgp (or RR) configuration. This would allow any controller based EVPN implementation to work with this too where there is no explicit BGP configuration is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not configuring them. We can put them in the Status fields.
uint32 local_as = 6; | ||
// Read-only: Route distinguisher chosen by FRR | ||
string rd = 7; | ||
// Read-only: Router MAC address of the VRF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read only object => Status
// Read-only: List of connected LogicalBridges | ||
// (-- api-linter: core::0141::forbidden-types=disabled | ||
// aip.dev/not-precedent: vlan cannot be negative number. --) | ||
repeated uint32 vlan_ids = 11; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of LocalBridges
, shouldn't this be a reference to SVI objects instead. I mean VRF is a routing construct, and without SVIs this is meaningless. Where as SVI is truly a routed interface and therefore this should be what we refer here. In that sense, it would be nice to have SVI object referred to using key-ids (like rest of the OPI objects) which then can be referrerd to here.
repeat comment - read only field can be in status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can put it in the Status and we can remove the Vlan ids and replace it with SVI Ids.
// (-- api-linter: core::0141::forbidden-types=disabled | ||
// aip.dev/not-precedent: vlan cannot be negative number. --) | ||
uint32 vlan_id = 2 [(google.api.field_behavior) = REQUIRED]; | ||
// Use "aa:bb:cc:dd:ee:ff" format. Randomly assigned if not specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the router MAC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the MAC of the SVI interface
|
e731721
to
5903666
Compare
// holds the type of the bridge port | ||
BridgePortType ptype = 4 [(google.api.field_behavior) = REQUIRED]; | ||
// The resource name of the Bridge Port. | ||
// Format: bridge_ports/{bridge_port} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can also add a bit more text here, something like this for example:
// name is an opaque object handle that is not user settable.
// name will be returned with created object
// user can only set {resource}_id on the Create request object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
a9e6fcd
to
7abebe2
Compare
EVPN Gateway API for provisioning integrated L2 and L3 network instances on an xPU Signed-off-by: Dimitrios Markou <dimitrios.markou@est.tech>
Signed-off-by: Dimitrios Markou <dimitrios.markou@est.tech>
Signed-off-by: Dimitrios Markou <dimitrios.markou@est.tech>
Signed-off-by: Dimitrios Markou <dimitrios.markou@est.tech>
Signed-off-by: Dimitrios Markou <dimitrios.markou@est.tech>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates look good in my review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usually the URLs and references are not case sensitive....making them case sensitive can create incompatibilities. Suggest changing all camelCase nameing to all lowercase naming.
I have just tried to follow the google standards. You can see in the image below that the resource reference has been written with a capital letter Also when it comes to collections IDs in the URLs I believe the below rules applies the "logicalBridges" for instance is a collectios ID and I believe according to the above explanation lowercamelCase is allowed https://cloud.google.com/apis/design/resource_names#collection_id |
LGTM @mardim91 !! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per API meeting 7/13/2023 approved and merged
EVPN Gateway API for provisioning integrated L2 and L3 network instances on an xPU