-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Antrea secondary network implementation for SRIOV #2651
Antrea secondary network implementation for SRIOV #2651
Conversation
build/yamls/antrea.yml
Outdated
@@ -3246,6 +3246,11 @@ metadata: | |||
app: antrea | |||
name: antrea-agent | |||
rules: | |||
- apiGroups: ["k8s.cni.cncf.io"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antoninbas , @jianjuns Yaml file patch using generate-manifest.sh is not working good for RBAC changes. I would like to discuss with both of you to see how we can handle the antea.yml changes required for k8s.cni.cncf.io apiGroups access (required for net-attach-def access from client side)
@@ -0,0 +1,12 @@ | |||
apiVersion: rbac.authorization.k8s.io/v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch needs review and get it addressed as in a way to patch properly at antrea.yml. (As of now, it replaces entire RBAC section for antrea-agent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this doesn't need to be done conditionally though patch, we can give the required permissions (to access Network Attachment Definitions) to the Agent by modifying: https://github.com/antrea-io/antrea/blob/main/build/yamls/base/agent-rbac.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. thanks for the suggestion. taken care.
|
||
set -u -e | ||
|
||
# Inspired by: https://github.com/intel/multus-cni/blob/83556f49bd6706a885eda847210b542669279cd0/images/entrypoint.sh#L161-L222 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required to install kubeconfig for whereabouts to talk to API server. this script will be called as part of init_container scope to be installed at the defined location (as per whereabouts requirements).
NOTE: Script copied/taken from whereabouts-cni repo.
cmd/antrea-agent/agent.go
Outdated
@@ -309,6 +322,30 @@ func run(o *Options) error { | |||
|
|||
go networkPolicyController.Run(stopCh) | |||
|
|||
if features.DefaultFeatureGate.Enabled(features.SecondaryInterfaceSupport) { | |||
//TBD: Initialize Secondary Network Infra here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Secondary network infra code changes will be merged separate (which is responsible to create the OVS bridges and config - not required for SRIOV based secondary network config).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this block to a separate func.
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
package cnipodcache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CNI cacheStore implementation to cache CNIserver information of a pod, which is required to be used by podwatch pkg while configuring secondary network
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
package ipamglobal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Global IPAM implementation to interface with whereabouts-cni config and assign Cluster wide unique IP address (maintained by whereabouts at api server or etcd - configurable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just name the package "ipam", and the file ipam_delegator.go? I feel no need to emphasize on "global".
pkg/agent/cniserver/server.go
Outdated
@@ -105,7 +107,7 @@ type CNIServer struct { | |||
hostProcPathPrefix string | |||
kubeClient clientset.Interface | |||
containerAccess *containerAccessArbitrator | |||
podConfigurator *podConfigurator | |||
PodConfigurator *podConfigurator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a temporary/interim change until we redo the cniserver package and split them into 3 different pkg, make it reusable for primary and secondary interface needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid touch too many code, could you add a func like GetPodConfigurator() to CNIServer instead?
pkg/agent/cniserver/server.go
Outdated
@@ -461,6 +463,15 @@ func (s *CNIServer) CmdAdd(ctx context.Context, request *cnipb.CniCmdRequest) (* | |||
klog.Infof("CmdAdd for container %v succeeded", cniConfig.ContainerId) | |||
// mark success as true to avoid rollback | |||
success = true | |||
|
|||
if features.DefaultFeatureGate.Enabled(features.SecondaryInterfaceSupport) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is where we cache CNI server provided pod information into cnipodcache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add and pass a boolean like secInterfaceEnabled to CNIServer, instead of checking the feature gate inside the CNIServer code? I would avoid featuregate in all code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or based on whether the cniPodInfoStore argument passed to Initialize() is nil, we can save the value to a boolean flag of CNIServer struct, like cacheCNIPodInfo, and check s.cacheCNIPodInfo here.
c977d6f
to
ec21933
Compare
Codecov Report
@@ Coverage Diff @@
## main #2651 +/- ##
==========================================
- Coverage 59.23% 50.37% -8.86%
==========================================
Files 302 300 -2
Lines 25873 35722 +9849
==========================================
+ Hits 15325 17994 +2669
- Misses 8892 15993 +7101
- Partials 1656 1735 +79
Flags with carried forward coverage won't be shown. Click here to find out more.
|
ec21933
to
42f0692
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some initial comments
@@ -0,0 +1,12 @@ | |||
apiVersion: rbac.authorization.k8s.io/v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this doesn't need to be done conditionally though patch, we can give the required permissions (to access Network Attachment Definitions) to the Agent by modifying: https://github.com/antrea-io/antrea/blob/main/build/yamls/base/agent-rbac.yml
- '*' | ||
verbs: | ||
- '*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should specify just the resource type we need (network-attachment-definition
) and just the permissions / verbs we need
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, Taken care (I used "*" only for dev needs to ensure that initial agent able to fetch the net-attach-def CR info from APIserver).
WHEREABOUTS_KUBECONFIG_LITERAL=$(echo "$WHEREABOUTS_KUBECONFIG" | sed -e s'|/host||') | ||
|
||
# ------------------------------- Generate a "kube-config" | ||
SERVICE_ACCOUNT_PATH=/var/run/secrets/kubernetes.io/serviceaccount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will use the antrea-agent Pod's ServiceAccount, which has a lot of permissions. We should restrict the whereabouts permissions to what's actually needed (access to resources in the whereabouts.cni.cncf.io
API group). Maybe @tnqn will have a suggestion on what's the best way to generate and install that Kubeconfig file (which is consumed by the whereabouts plugin).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yet to take care on this concern and I fully agree with the suggestion. found new issue with whereabouts IP config and working on further to debug and address it.
cmd/antrea-agent/agent.go
Outdated
if features.DefaultFeatureGate.Enabled(features.SecondaryInterfaceSupport) { | ||
//TBD: Initialize Secondary Network Infra here. | ||
|
||
//Create NetworkAttachmentDefinitions client handle to access secondary network object definition from API Server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we typically leave a space between the //
and the actual comment
informerFactory, | ||
nodeConfig.Name, | ||
cniPodInfoStore, | ||
cniServer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should change that, there should be no need for the "pod watcher" to access the instance of the cniServer. I believe you are already working on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes certainly. I'm using it as temporary solution for this PR and I will take care of this as part of CNIserver pkg refactor PR.
return errors.New("Secondary network annotation parsing failed") | ||
} else { | ||
for _, netinfo := range networklist { | ||
klog.Infof("Secondary Network Information: %+v", netinfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have transitioned to structured logging function (klog.InfoS
, klog.ErrorS
) for all new logs. Please update the PR accordingly. You can find some examples in the rest of the PR.
@@ -0,0 +1,80 @@ | |||
// Copyright 2021 Antrea Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to suggest a different package name instead of pkg/agent/cnf
. Maybe pkg/agent/secondarynetworks
?
cc @jianjuns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Secondary interfaces might be used for other cases besides CNF.
How about secondaryinterfaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also feel we can just name this file "cache.go", as it is already in the "cnipodcache" package.
//If container information is not available yet in the CNICache, return error to requeue work | ||
podCniInfo := podWatcher.podCache.GetCniConfigInfoPerPod(pod.Name, pod.Namespace) | ||
if len(podCniInfo) == 0 { | ||
return errors.New("CNI Cache Information not found") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we will hit this "normal" case pretty often since there will be some delay between the Pod object being created in the k8s API and the CNI being invoked. We should avoid logging this error as it will be misleading to the user, but we also need to ensure that the Pod is added back to the workqueue so that there can be a retry later.
//TBD after secondary network specific OVS init code merge. | ||
klog.Errorf("VethPair not supported. Failed to configure interfaces for container %s: %v", containerInfo.ContainerId, err) | ||
} | ||
} //containerInfo loop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't typically include such comments in the Antrea codebase. If there is too much indentation, it would probably make sense to move that logic to its own function.
pkg/features/antrea_features.go
Outdated
FlowExporter: {Default: false, PreRelease: featuregate.Alpha}, | ||
NetworkPolicyStats: {Default: true, PreRelease: featuregate.Beta}, | ||
NodePortLocal: {Default: false, PreRelease: featuregate.Alpha}, | ||
SecondaryInterfaceSupport: {Default: false, PreRelease: featuregate.Alpha}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest SecondaryNetworks
as the Feature Gate name
In any case "Support" is not needed, it's implied for Feature Gates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about SecondaryInterfaces? I just feel it is more about creating secondary interfaces, and we not necessarily handle networks.
} | ||
|
||
//Add cniPodInfo to local cache store. | ||
func (cni *cniPodInfoCache) AddCniConfigInfo(cniConfig *CNIConfigInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cni -> CNI. We use all upper case for abbrs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change all occurrences.
cmd/antrea-agent/agent.go
Outdated
} | ||
|
||
} else { | ||
err = cniServer.Initialize(ovsBridgeClient, ofClient, ifaceStore, entityUpdates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can reuse the same Initialize() func, but based on whether the cniPodInfoStore argument is nil, the func can know it needs to initialize CNI cache or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
introduce another argument (and passing it nil for non secondary interface flow) impact significant of cniserver specific existing test code (also calls these Initialize function). To avoid additional changes, I spilt it separate function with cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not get why that is a problem. You can just pass false/nil for existing callers inc. test code. Why it is significant impact?
cmd/antrea-agent/agent.go
Outdated
if err != nil { | ||
return fmt.Errorf("error initializing CNI server: %v", err) | ||
|
||
cniPodInfoStore := cnipodcache.NewCniPodInfoStore() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably assign cniPodInfoStore to cnipodcache.NewCniPodInfoStore inside the "if features.DefaultFeatureGate.Enabled(features.SecondaryInterfaceSupport)" block.
cmd/antrea-agent/agent.go
Outdated
@@ -309,6 +322,30 @@ func run(o *Options) error { | |||
|
|||
go networkPolicyController.Run(stopCh) | |||
|
|||
if features.DefaultFeatureGate.Enabled(features.SecondaryInterfaceSupport) { | |||
//TBD: Initialize Secondary Network Infra here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this block to a separate func.
} | ||
|
||
//Add cniPodInfo to local cache store. | ||
func (cni *cniPodInfoCache) AddCniConfigInfo(cniConfig *CNIConfigInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change all occurrences.
@@ -0,0 +1,80 @@ | |||
// Copyright 2021 Antrea Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Secondary interfaces might be used for other cases besides CNF.
How about secondaryinterfaces?
@@ -0,0 +1,59 @@ | |||
// Copyright 2021 Antrea Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel no need to add a separate file, and the func can be in podwatch_controller.go. But no strong opinion.
pkg/agent/cniserver/server.go
Outdated
@@ -105,7 +107,7 @@ type CNIServer struct { | |||
hostProcPathPrefix string | |||
kubeClient clientset.Interface | |||
containerAccess *containerAccessArbitrator | |||
podConfigurator *podConfigurator | |||
PodConfigurator *podConfigurator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid touch too many code, could you add a func like GetPodConfigurator() to CNIServer instead?
pkg/agent/cniserver/server.go
Outdated
@@ -461,6 +463,15 @@ func (s *CNIServer) CmdAdd(ctx context.Context, request *cnipb.CniCmdRequest) (* | |||
klog.Infof("CmdAdd for container %v succeeded", cniConfig.ContainerId) | |||
// mark success as true to avoid rollback | |||
success = true | |||
|
|||
if features.DefaultFeatureGate.Enabled(features.SecondaryInterfaceSupport) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add and pass a boolean like secInterfaceEnabled to CNIServer, instead of checking the feature gate inside the CNIServer code? I would avoid featuregate in all code.
pkg/agent/cniserver/server.go
Outdated
cniPodInfoStore cnipodcache.CniPodInfoStore, | ||
) error { | ||
var err error | ||
s.PodConfigurator, err = newPodConfiguratorWithCniCache( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, could we just add cniPodInfoStore argument to newPodConfigurator() instead of adding new func with code duplication?
pkg/features/antrea_features.go
Outdated
FlowExporter: {Default: false, PreRelease: featuregate.Alpha}, | ||
NetworkPolicyStats: {Default: true, PreRelease: featuregate.Beta}, | ||
NodePortLocal: {Default: false, PreRelease: featuregate.Alpha}, | ||
SecondaryInterfaceSupport: {Default: false, PreRelease: featuregate.Alpha}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about SecondaryInterfaces? I just feel it is more about creating secondary interfaces, and we not necessarily handle networks.
c0b4929
to
658c301
Compare
WHEREABOUTS_KUBECONFIG_LITERAL=$(echo "$WHEREABOUTS_KUBECONFIG" | sed -e s'|/host||') | ||
|
||
# ------------------------------- Generate a "kube-config" | ||
SERVICE_ACCOUNT_PATH=/var/run/secrets/kubernetes.io/serviceaccount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yet to take care on this concern and I fully agree with the suggestion. found new issue with whereabouts IP config and working on further to debug and address it.
build/yamls/antrea-aks.yml
Outdated
@@ -3393,6 +3393,27 @@ rules: | |||
- get | |||
- watch | |||
- list | |||
- apiGroups: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make manifest (after updating base rbac file) updated all the antrea-*.yml file. But, not sure if we really wanted to enable secondarynetwork config on all the variants?. if not, I can remove these changes.
@@ -0,0 +1,12 @@ | |||
apiVersion: rbac.authorization.k8s.io/v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. thanks for the suggestion. taken care.
- '*' | ||
verbs: | ||
- '*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, Taken care (I used "*" only for dev needs to ensure that initial agent able to fetch the net-attach-def CR info from APIserver).
cmd/antrea-agent/agent.go
Outdated
} | ||
|
||
} else { | ||
err = cniServer.Initialize(ovsBridgeClient, ofClient, ifaceStore, entityUpdates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
introduce another argument (and passing it nil for non secondary interface flow) impact significant of cniserver specific existing test code (also calls these Initialize function). To avoid additional changes, I spilt it separate function with cache.
informerFactory, | ||
nodeConfig.Name, | ||
cniPodInfoStore, | ||
cniServer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes certainly. I'm using it as temporary solution for this PR and I will take care of this as part of CNIserver pkg refactor PR.
pkg/agent/cniserver/server.go
Outdated
@@ -461,6 +468,15 @@ func (s *CNIServer) CmdAdd(ctx context.Context, request *cnipb.CniCmdRequest) (* | |||
klog.Infof("CmdAdd for container %v succeeded", cniConfig.ContainerId) | |||
// mark success as true to avoid rollback | |||
success = true | |||
|
|||
if s.secNetworkEnabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
took care as suggested by @jianjuns (do not use feature gate all over the code. used bool variable to decide whether we need to cache the cni info or not)
@@ -0,0 +1,59 @@ | |||
// Copyright 2021 Antrea Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jianjuns separate file for init was introduced based on secondary network infra (OVS) init needs. I will revisit on this file needs while pulling in the OVS bridge init changes in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it should not be called "podwatch_controller_init.go", but maybe just "init.go" under "/secondarynetwork"? For now, I still suggest to move the code to "controller.go".
pkg/features/antrea_features.go
Outdated
FlowExporter: {Default: false, PreRelease: featuregate.Alpha}, | ||
NetworkPolicyStats: {Default: true, PreRelease: featuregate.Beta}, | ||
NodePortLocal: {Default: false, PreRelease: featuregate.Alpha}, | ||
SecondaryNetwork: {Default: true, PreRelease: featuregate.Alpha}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed as SecondaryNetwork (more relevant to state it as network instead of interface, in my opinion - as we do configure secondary network to address CNF use cases with this feature gate)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be SecondaryNetworks
...
658c301
to
a9e2b0a
Compare
cmd/antrea-agent/agent.go
Outdated
@@ -309,6 +322,30 @@ func run(o *Options) error { | |||
|
|||
go networkPolicyController.Run(stopCh) | |||
|
|||
if features.DefaultFeatureGate.Enabled(features.SecondaryNetwork) { | |||
// TBD: Initialize Secondary Network Infra here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, suggest to add a func for this block.
return nil | ||
} | ||
|
||
//get POD/Container information (container id, namespace, etc.,) from CNI info cache (added by CNI Server CmdAdd upon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a space after "//".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POD -> Pod
} | ||
|
||
//get POD/Container information (container id, namespace, etc.,) from CNI info cache (added by CNI Server CmdAdd upon | ||
//successful pod and primary network creation). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pod -> Pod
var ret bool | ||
pod := obj.(*corev1.Pod) | ||
|
||
secondaryNetwork, ret := checkForPodSecondaryNetworkAttachement(pod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case, you missed the previous comments:
You can check the old version of the Pod obj to know if the CNF annotation got removed from the Pod.
Check: https://github.com/antrea-io/antrea/blob/main/pkg/agent/nodeportlocal/k8s/npl_controller.go#L211
networklist, err := parsePodSecondaryNetworkAnnotation(secondaryNetwork) | ||
if err != nil { | ||
return errors.New("Secondary network annotation parsing failed") | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you can remove the "else {}".
for _, netinfo := range networklist { | ||
klog.InfoS("Secondary Network Information:", netinfo) | ||
if len(netinfo.NetworkName) > 0 { | ||
netDefCRD, err := podWatcher.netAttachDefClient.NetworkAttachmentDefinitions(pod.Namespace).Get(context.TODO(), netinfo.NetworkName, metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not support changes of NetworkAttachmentDefinition, or NetworkAttachmentDefinition created after Pods?
for _, containerInfo := range podCNIInfo { | ||
cmdArgs = &invoke.Args{Command: string("ADD"), ContainerID: containerInfo.ContainerId, | ||
NetNS: containerInfo.ContainerNetNS, IfName: containerInfo.ContainerIfDev, | ||
Path: string("/opt/cni/bin/")} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define a constant for this path.
podSriovVFDeviceID, err := cniserver.GetPodContainerDeviceIDs(pod.Name, pod.Namespace) | ||
if err != nil { | ||
klog.ErrorS(err, "handleAddUpdatePod: GetPodContainerDeviceIDs failed for ", pod.Name, pod.Namespace) | ||
//return nil here. we do not want to retry/requeue work if VF id fetch fails (user to check if VFs are created at node). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a space after //.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we -> We
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node -> Node
|
||
ipamResult, err = ipam.GetIPAMGlobalAddress(cniconfig, cmdArgs) | ||
if err != nil { | ||
return errors.New("Secondary network global IPAM failed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question - how Multus handles such errors. Does it report the failure back to users via some ways (e.g. annotation, events, etc.)?
) | ||
|
||
const ( | ||
ipamGlobal = "whereabouts" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we remove "Global" from (variable, func) names? It will be confusing, as we will support IPAM in Antrea too (see here: #2740).
If you think that is better, we can call it ipamWhereabouts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RawExec: &invoke.RawExec{Stderr: os.Stderr}, | ||
} | ||
|
||
func delegateCommon(delegatePlugin string, exec invoke.Exec, cniPath string) (string, invoke.Exec, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is a fair amount of code duplication with /agent/cniserver/ipam/ipam_delegator.go
, is there an option to reuse the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apologies for the delayed reply on this comment (though I communicate with Jianjun/Antonin over Slack on the inputs for their comments).
The reason I didn't reuse the existing implementation for ipam was due to,
- Antonin and Jianjun suggested to keep the secondary interface changes as isolated as feasible.
- during initial development, I thought we will have more to be done w.r.t whereabouts CNI parsing the retrieved IPAM context from networkAttachementDefinition. But, that wasn't the case when I finalized the implementation.
Indeed, I see that we would be able to reuse the IPAM package with changes/modifications to the existing implementation. If this is fine, I can get it addressed asap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, from what I saw a few funcs here are exactly the same as those in cniserver/ipam, so probably we should reuse them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, you plan to refactor this part and share code with /agent/cniserver/ipam/ipam_delegator.go
in a later PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
49f43cb
to
9b0c0a0
Compare
WHEREABOUTS_KUBECONFIG_LITERAL=$(echo "$WHEREABOUTS_KUBECONFIG" | sed -e s'|/host||') | ||
|
||
# ------------------------------- Generate a "kube-config" | ||
SERVICE_ACCOUNT_PATH=/var/run/secrets/whereabouts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whereabouts restricted to use separate serviceaccount/secrets.
9b0c0a0
to
81f4b7b
Compare
} | ||
|
||
// Add CNIPodInfo to local cache store. | ||
func (pod *CNIPodInfoCache) AddCNIConfigInfo(CNIConfig *CNIConfigInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why you use "pod" for short name of CNIPodInfoCache, but not "cache" or just "c"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CNIConfig -> cniConfig (as it is not a global variable).
package cnipodcache | ||
|
||
type CNIConfigInfo struct { | ||
CniVersion string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CniVersion -> CNIVersion
We use all upper cases for abbrs.
CniVersion string | ||
PodName string | ||
PodNameSpace string | ||
ContainerId string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ContainerId -> ContainerID.
RawExec: &invoke.RawExec{Stderr: os.Stderr}, | ||
} | ||
|
||
func delegateCommon(delegatePlugin string, exec invoke.Exec, cniPath string) (string, invoke.Exec, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
netDev = "eth1" | ||
) | ||
|
||
type PodWatchController struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember Antonin suggested to rename it to PodController.
And in my mind, we should put this file under /podwatch directly (remove /k8s), and rename it to "controller.go".
@@ -0,0 +1,59 @@ | |||
// Copyright 2021 Antrea Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it should not be called "podwatch_controller_init.go", but maybe just "init.go" under "/secondarynetwork"? For now, I still suggest to move the code to "controller.go".
81f4b7b
to
54b9c12
Compare
Type string `json:"type,omitempty"` | ||
IPAM IPAMConfig `json:"ipam,omitempty"` | ||
// Options to be passed in by the runtime. | ||
//RuntimeConfig RuntimeConfig `json:"runtimeConfig"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you just remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do this clean-up as part of another PR which does veth pair and OVS init handle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but why not remove these commented lines in this PR?
@@ -0,0 +1,12 @@ | |||
apiVersion: rbac.authorization.k8s.io/v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this file needs to be removed? Comments have been addressed and it doesn't seem to be used anywhere anymore.
build/yamls/base/kustomization.yml
Outdated
- whereabouts-rbac.yml | ||
- agent.yml | ||
- whereabouts-secret.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we discussed that there should be a flag for generate-manifest.sh
, that would take care of applying all the yaml patches (except changes to agent-rbac.yml). Wasn't that the case @jianjuns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I had a flag defined for whereabouts (as part of rbac changes). But, since I got inputs to put them directly on the agent-rbac file, I thought we are doing the same for other changes as well.
Is that ok, If I do it in separate PR right after this one?.
cmd/antrea-agent/agent.go
Outdated
@@ -309,6 +321,29 @@ func run(o *Options) error { | |||
|
|||
go networkPolicyController.Run(stopCh) | |||
|
|||
if features.DefaultFeatureGate.Enabled(features.SecondaryNetwork) { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary empty line
cmd/antrea-agent/agent.go
Outdated
// Create NetworkAttachmentDefinitions client handle to access secondary network object definition from API Server. | ||
netAttachDefClient, err := k8s.CreateNetworkAttachDefClient(o.config.ClientConnection, o.config.KubeAPIServerOverride) | ||
if err != nil { | ||
return fmt.Errorf("networkAttachmentDefinition creation failed. %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return fmt.Errorf("networkAttachmentDefinition creation failed. %v", err) | |
return fmt.Errorf("networkAttachmentDefinition client creation failed: %v", err) |
cmd/antrea-agent/agent.go
Outdated
@@ -309,6 +321,29 @@ func run(o *Options) error { | |||
|
|||
go networkPolicyController.Run(stopCh) | |||
|
|||
if features.DefaultFeatureGate.Enabled(features.SecondaryNetwork) { | |||
|
|||
// Create NetworkAttachmentDefinitions client handle to access secondary network object definition from API Server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Create NetworkAttachmentDefinitions client handle to access secondary network object definition from API Server. | |
// Create the NetworkAttachmentDefinitions client which handles access to secondary network object definitions from the API Server. |
podSriovVFDeviceID, err := cniserver.GetPodContainerDeviceIDs(pod.Name, pod.Namespace) | ||
if err != nil { | ||
klog.ErrorS(err, "handleAddUpdatePod: GetPodContainerDeviceIDs failed for ", pod.Name, pod.Namespace) | ||
// return nil here. we do not want to retry/requeue work if VF id fetch fails (user to check if VFs are created at node). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/node/Node
I'm not sure I see the harm in requeueing here. Yes, we will keep logging the error periodically, but maybe it's not bad. And if VFs are eventually created, then we will be able to recover. Otherwise, the used as to delete the Pod and recreate it, or something similar.
|
||
ipamResult, err = ipam.GetIPAMSubnetAddress(cniconfig, cmdArgs) | ||
if err != nil { | ||
return errors.New("Secondary network global IPAM failed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we do not capitalize error messages like this one
for _, ip := range result.IPs { | ||
ip.Interface = current.Int(1) | ||
} | ||
// If secondary interface name not provided in pod annotation, assign default name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/pod/Pod
if len(netinfo.NetworkName) > 0 { | ||
netDefCRD, err := pc.netAttachDefClient.NetworkAttachmentDefinitions(pod.Namespace).Get(context.TODO(), netinfo.NetworkName, metav1.GetOptions{}) | ||
if err != nil { | ||
klog.Errorf("NetworkAttachmentDefinitions for %s returns err: %s", netinfo.NetworkName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use structured logging
} | ||
cniconfig, err := netdefutils.GetCNIConfig(netDefCRD, "") | ||
if err != nil { | ||
klog.Errorf("net-attach-def: CNI config spec read error %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
54b9c12
to
ee696e3
Compare
b075a3a
to
e9853b7
Compare
) | ||
|
||
const ( | ||
controllerName = "AntreaAgentPodController" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one more comment: not a big fan of this name, it doesn't imply that this is specific to secondary networks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"SecondaryNetworkController"?
@arunvelayutham Our jenkins CI jobs are currently broken (at leasts some of them are). However, I see that you have some Github CI jobs currently failing for this PR. |
c766b26
to
042049f
Compare
/test-all |
042049f
to
e9e90f8
Compare
/test-all |
e9e90f8
to
aae7453
Compare
/test-all |
aae7453
to
f3a8068
Compare
/test-all |
f3a8068
to
67ef0b8
Compare
/test-all |
/test-all-features-conformance |
/test-conformance |
/test-all |
1 similar comment
/test-all |
Signed-off-by: arunvelayutham <arunkumar.velayutham@intel.com>
/test-all |
67ef0b8
to
7e0a14e
Compare
/test-all |
/test-all-features-conformance |
/test-integration |
/test-all |
/test-integration |
/test-e2e |
1 similar comment
/test-e2e |
Implement SR-IOV secondary network for Antrea CNI configured Pods This commit implements SR-IOV secondary network that creates Pod secondary network interfaces with SR-IOV VFs. At this moment, the implementation supports only baremetal Nodes, not SR-IOV interfaces on VM Nodes. Signed-off-by: arunvelayutham <arunkumar.velayutham@intel.com>
Implement secondary network for Antrea CNI configured pods using SRIOV interface. (CNF use case)
Signed-off-by: root arunkumar.velayutham@intel.com