
USHIFT-2186: Multus CNI for MicroShift #1545

Merged 25 commits into openshift:master on Feb 19, 2024

Conversation

pmtk
Member

@pmtk pmtk commented Jan 23, 2024

No description provided.

@openshift-ci-robot

openshift-ci-robot commented Jan 23, 2024

@pmtk: This pull request references USHIFT-2186 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 23, 2024
Originally that RPM was meant for Podman networking, but the Podman shipped with RHEL 9 no longer uses these plugins.
This means they exist only for compatibility and are not actively maintained.
The OpenShift networking team (with whom we have ongoing cooperation), on the other hand, actively
maintains these binaries and quickly addresses any CVEs or bugs.
Contributor

It's important to mention here that the binaries delivered by the networking team come in a container image as part of the OCP payload.

@pmtk
Member Author

pmtk commented Jan 25, 2024

/uncc @knobunc @travier

"mac": "b2:6a:5c:ba:34:9a",
"dns": {},
"gateway": [
"\u003cnil\u003e"
Contributor

This line doesn't translate to any meaningful gateway value, shall it be updated to a real gateway address?

Member Author

TBH I don't know. It looks funky to me but this is what I got from the annotations when using Multus...

Contributor

@s1061123 this sounds like a bug?

Contributor

\u003c is < and \u003e is > so it looks like something is turning <nil> into a unicode byte string when it renders the annotation as JSON.
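The escaping can be reproduced with Go's `encoding/json`, which HTML-escapes `<`, `>`, and `&` by default; this is an illustrative sketch of a plausible source of the value (Multus and the network-status writers are Go programs), not code taken from Multus itself:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// escapeDemo marshals a string with encoding/json's default settings,
// which HTML-escape <, >, and & into \u003c, \u003e, and \u0026.
func escapeDemo(s string) string {
	b, _ := json.Marshal(s)
	return string(b)
}

func main() {
	// A nil gateway rendered as the Go string "<nil>" comes out escaped.
	fmt.Println(escapeDemo("<nil>")) // "\u003cnil\u003e"
}
```

An encoder configured with `SetEscapeHTML(false)` would emit the literal `<nil>` instead.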

Contributor

This happens because the host-local IPAM plugin is used with the wrong configuration. As described at https://www.cni.dev/plugins/current/ipam/host-local/, the gateway configuration should be inside the ranges section.

With the following net-attach-def,

---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-conf-1
spec:
  config: '{
            "cniVersion": "0.3.1",
            "type": "bridge",
            "bridge": "testbr1",
            "ipam": {
                "type": "host-local",
                "ranges": [
                    [
                        {
                            "subnet": "10.10.0.0/16",
                            "rangeStart": "10.10.1.20",
                            "rangeEnd": "10.10.3.50",
                            "gateway": "10.10.0.254"
                        }
                    ]
                ],
                "dataDir": "/var/lib/cni/testbr1"
            }
        }'

Then you get the following result (the gateway is configured on eth0, hence no gateway on net1):

    annotations:
      k8s.v1.cni.cncf.io/network-status: |-
        [{
            "name": "cbr0",
            "interface": "eth0",
            "ips": [
                "10.244.1.2"
            ],
            "mac": "5a:a3:ef:74:d1:36",
            "default": true,
            "dns": {},
            "gateway": [
                "10.244.1.1"
            ]
        },{
            "name": "default/bridge-conf-1",
            "interface": "net1",
            "ips": [
                "10.10.1.20"
            ],
            "mac": "7e:07:16:b2:ef:c5",
            "dns": {}
        }]

Member Author

Thanks! I used the snippet to update the enhancement

"mac": "7a:ca:cf:19:64:e8",
"dns": {},
"gateway": [
"\u003cnil\u003e"
Contributor

update to a real gateway value?

have increased CNI failure rate.

#### Support Procedures

Contributor

One other support procedure might be to configure Multus to redirect its log to a local file.
Multus in thin plugin mode doesn't provide stdout logs in its container the way other long-running CNIs (e.g. ovn-kubernetes) do. If there is a failure in the multus binary itself, the output can only be retrieved by updating the Multus CNI config to redirect the log to a local file: https://github.com/openshift/multus-cni/blob/master/docs/configuration.md#writing-to-a-log-file
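For illustration, a fragment of a Multus CNI config with the logging options described in the linked document; the file path here is an arbitrary example, not a MicroShift default:

```json
{
  "cniVersion": "0.3.1",
  "name": "multus-cni-network",
  "type": "multus",
  "logFile": "/var/log/multus.log",
  "logLevel": "debug"
}
```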

Contributor

Multus messages in Pod events are mainly for users (they are just 'events'). More Multus messages are captured in the crio/kubelet journal logs, so please describe that in this section.

Contributor

Example

[fedora@kube-node-1 ~]$ journalctl -u kubelet | tail
(snip)
Feb 06 14:27:31 kube-node-1 kubelet[13551]: E0206 14:27:31.893101   13551 kuberuntime_manager.go:1172] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to create pod network sandbox k8s_fedora_default_228fdcfa-dfc3-46de-b0b5-d996457f7d7e_0(dae451e8aaf3e9bff94ce8c1e2dc3918b46db2334f88f028c5e5d59b8156bf06): error adding pod default_fedora to CNI network \"multus-cni-network\": plugin type=\"multus\" name=\"multus-cni-network\" failed (add): [default/fedora/228fdcfa-dfc3-46de-b0b5-d996457f7d7e:macvlan-conf-1]: error adding container to network \"macvlan-conf-1\": Link not found" pod="default/fedora"
Feb 06 14:27:31 kube-node-1 kubelet[13551]: E0206 14:27:31.893286   13551 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"fedora_default(228fdcfa-dfc3-46de-b0b5-d996457f7d7e)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"fedora_default(228fdcfa-dfc3-46de-b0b5-d996457f7d7e)\\\": rpc error: code = Unknown desc = failed to create pod network sandbox k8s_fedora_default_228fdcfa-dfc3-46de-b0b5-d996457f7d7e_0(dae451e8aaf3e9bff94ce8c1e2dc3918b46db2334f88f028c5e5d59b8156bf06): error adding pod default_fedora to CNI network \\\"multus-cni-network\\\": plugin type=\\\"multus\\\" name=\\\"multus-cni-network\\\" failed (add): [default/fedora/228fdcfa-dfc3-46de-b0b5-d996457f7d7e:macvlan-conf-1]: error adding container to network \\\"macvlan-conf-1\\\": Link not found\"" pod="default/fedora" podUID="228fdcfa-dfc3-46de-b0b5-d996457f7d7e"

[fedora@kube-node-1 ~]$ journalctl -u crio | tail
(snip)
Feb 06 14:28:44 kube-node-1 crio[11778]: 2024-02-06T14:28:44+09:00 [verbose] Del: default:fedora:228fdcfa-dfc3-46de-b0b5-d996457f7d7e:cbr0:eth0 {"cniVersion":"0.3.1","name":"cbr0","plugins":[{"delegate":{"hairpinMode":true,"isDefaultGateway":true},"type":"flannel"},{"capabilities":{"portMappings":true},"type":"portmap"}]}
Feb 06 14:28:44 kube-node-1 crio[11778]: 2024-02-06T14:28:44+09:00 [debug] conflistDel: &{1ae5a8f8270afbe7faa5e54ed636692bae72fb73ab9ca0132e8e66307b34b696 /var/run/netns/bc624229-b577-40e3-ad41-7af830a909bc eth0 [[IgnoreUnknown true] [K8S_POD_NAMESPACE default] [K8S_POD_NAME fedora] [K8S_POD_INFRA_CONTAINER_ID 1ae5a8f8270afbe7faa5e54ed636692bae72fb73ab9ca0132e8e66307b34b696] [K8S_POD_UID 228fdcfa-dfc3-46de-b0b5-d996457f7d7e] [IgnoreUnknown 1] [K8S_POD_NAMESPACE default] [K8S_POD_NAME fedora] [K8S_POD_INFRA_CONTAINER_ID 1ae5a8f8270afbe7faa5e54ed636692bae72fb73ab9ca0132e8e66307b34b696] [K8S_POD_UID 228fdcfa-dfc3-46de-b0b5-d996457f7d7e]] map[] }, {"cniVersion":"0.3.1","name":"cbr0","plugins":[{"delegate":{"hairpinMode":true,"isDefaultGateway":true},"type":"flannel"},{"capabilities":{"portMappings":true},"type":"portmap"}]}

Member Author

Thanks, added kubelet, but I don't see these logs in crio even though I configured log_level = "trace".
I think it might be worth it after all to enable logging in the multus itself (--multus-log-file, --multus-log-level), WDYT?

Contributor

  • As far as I checked upstream, crio captures the logs as above. I don't know why. Does MicroShift use a special crio?
  • It is a trade-off between disk space and debuggability. I don't have a clear picture of how to troubleshoot in MicroShift. If you have such a troubleshooting process, it might be added.

Member Author

Added the log file as an open question.
I don't know about the crio logs - I tried setting the log level and playing with different log settings for Multus and still didn't get that kind of log. Maybe I'll find something during the implementation phase.

NetworkAttachmentDefinition is incorrect or when the Pod's annotation references a NAD that does not exist.
In such cases, the user needs to verify their manifests.

Pods without Multus' Annotation will be set up with the default CNI (ovn-kubernetes) and should not
Contributor

Pods without Multus' Annotation will be set up with the default CNI (ovn-kubernetes)

Let me double-check: Pods without a Multus annotation still go through Multus, but Multus only invokes ovn-kubernetes, so it should be okay - such Pods do NOT use ovn-kubernetes directly.
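For context, a sketch of how the top-level Multus config delegates the default network; the `clusterNetwork` and `kubeconfig` paths here are assumed examples, not taken from MicroShift:

```json
{
  "cniVersion": "0.3.1",
  "name": "multus-cni-network",
  "type": "multus",
  "clusterNetwork": "/etc/cni/net.d/10-ovn-kubernetes.conf",
  "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig"
}
```

Every Pod sandbox passes through Multus; for Pods without the networks annotation it simply delegates to the clusterNetwork entry (ovn-kubernetes) and attaches nothing else.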

@dhellmann dhellmann added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Feb 7, 2024
"mac": "b2:6a:5c:ba:34:9a",
"dns": {},
"gateway": [
"\u003cnil\u003e"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\u003c is < and \u003e is > so it looks like something is turning <nil> into a unicode byte string when it renders the annotation as JSON.

enhancements/microshift/multus-cni-for-microshift.md Outdated Show resolved Hide resolved
enhancements/microshift/multus-cni-for-microshift.md Outdated Show resolved Hide resolved
to allow for multi-node clusters. For this reason, during implementation of this enhancement, no
assumption that MicroShift will always run as a single-node should be made and potential multi-node
should be kept in mind. We will not build the solution to support multi-node, but want to avoid
making decisions that make it harder to do so in the future.
Contributor

Good to know! So there is a possibility that Whereabouts is needed in the future when multi-node is introduced.

@dhellmann
Contributor

#1555 is changing the enhancement template in a way that will cause the header check in the linter job to fail for existing PRs. If this PR is merged within the development period for 4.16 you may override the linter if the only failures are caused by issues with the headers (please make sure the markdown formatting is correct). If this PR is not merged before 4.16 development closes, please update the enhancement to conform to the new template.

Contributor

@s1061123 s1061123 left a comment

It looks good to me.

@pmtk
Member Author

pmtk commented Feb 15, 2024

/assign @dhellmann

Contributor

@pacevedom pacevedom left a comment

/lgtm

Comment on lines +410 to +417
Considering only the manifests, we know that on each start MicroShift will apply manifests forcefully,
overwriting any differences. However, MicroShift does not have any uninstall capabilities.
If manifests ever change, for example some ConfigMap is renamed, the old parts will
keep existing in the database. This could have the undesirable consequence of two Multus DaemonSets
existing if the name or the namespace of the original DaemonSet changes. To make the transition to thick Multus
smooth, we should not deviate from the resource names already present in OpenShift's Multus
manifests. This problem is not Multus specific - it is how MicroShift works, and addressing this
shortcoming is not part of this enhancement.
Contributor

Not for this enhancement, but maybe we should start thinking about marking MicroShift-owned resources so that we can perform checks at startup and remove any discrepancies.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 16, 2024

Both of these images will be part of the OpenShift payload which MicroShift references during rebase procedure.

### Risks and Mitigations
Member

I think it might be possible to create a NAD that impacts the microshift host's networking, e.g.

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: troublemaker-nad
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "macvlan", 
      "master": "enp1s0",  # Shares the host's main interface
      "ipam": {
          "type": "host-local",
          "subnet": "10.42.0.0/24",  # Overlaps with pod network range
          "routes": [
              {"dst": "0.0.0.0/0", "gw": "10.42.0.100/24"}  # Arbitrary default route
          ] 
      }
   }'

If so, this risk presumably also exists for a standard OpenShift cluster. To what extent are we going to rely on OpenShift's Multus documentation?

Member Author

Good catch, thanks! Added to the risks.
In general I think we want to reuse as much as possible of what's relevant to MicroShift.


@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 19, 2024
@dhellmann
Contributor

With +1 from @pacevedom, @jerpeter1, @s1061123, and @zshi-redhat, it looks like we are ready to approve this.

Thank you, everyone!

/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 19, 2024
Contributor

openshift-ci bot commented Feb 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dhellmann, s1061123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 19, 2024
Contributor

openshift-ci bot commented Feb 19, 2024

@pmtk: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 5627f14 into openshift:master Feb 19, 2024
2 checks passed
jupierce pushed a commit to jupierce/enhancements that referenced this pull request Feb 23, 2024
* multus CNI for microshift

* review changes

* race condition risk between multus readiness and workload start

* add intro to race condition between MicroShift and other networking init procedures

* explicitly mention /opt/cni/bin

* NetPol and webhook are non-goals

* add goal about providing IPAMs and CNIs

* improve workflows

- added explicit RHEL (rpm) sections
- explicitly mention NAD before the App using NAD

* `vlan` -> `ipvlan`; images will be part of OCP payload

* reword origin of thick plugin

* add "longer startup of Pods" drawback

* fix trailing whitespace

* extra networks will not support K8s' Service API

* supportability: kubelet & crio logs

* add open question about --namespace-isolation

* Remove mention of `default-network` annotation

It's not intended to be used in normal mode of operation.

* mention dhcp server

* test day 2 installation

* alternative: Cluster Network Operator based operator

* fix multus.kubeconfig path

* OpenShift uses thick plugin since 4.14

* fix NAD example with nil gateway

* Q: logging to file?

* update template

* risk: NAD reusing MicroShift's CIDR

Currently MicroShift ships the [ovn-kubernetes](https://github.com/openshift/ovn-kubernetes) (ovn-k)
CNI, responsible for connectivity within and outside the cluster.
There are users with needs beyond what ovn-k offers, like adding more interfaces to Pods.


Is there a limit of interfaces per pod?

Member Author

I'm not aware of any. It wasn't mentioned in any Multus-related documentation.

## Motivation

Providing Multus CNI for MicroShift will help users:
- wanting to integrate MicroShift into existing environments


Meaning that we don't necessarily have to revise our network architecture for MicroShift?

Member Author

Multus is something extra; everything we have so far about ovn-k still applies. But if you could point me to where the current network architecture documentation resides, I'd be happy to double-check.


Providing Multus CNI for MicroShift will help users:
- wanting to integrate MicroShift into existing environments
- wanting to slowly, step by step, migrate to MicroShift without being required to refactor everything at once


How does Multus allow a slow migration?

Member Author

If a customer has apps running under Podman that use some CNI plugin, they can transform the apps into Pods and run them on MicroShift. Instead of throwing everything away and switching to the Kubernetes way of exposing services, they can use Multus to keep using that CNI, so the rest of the environment can stay the same. Otherwise the whole network might need to be reconfigured, and applications on the other side would need to call these migrated apps in a different way (different port, interface, etc.).


* As a MicroShift admin, I want to add additional interfaces to certain Pods so that I can
make them accessible over networks that should not be available to rest of the cluster.
* As a MicroShift admin, I want to add additional interfaces to certain Pods so that I can


Can Multus also use OVN-K as a plugin, or are they mutually exclusive? (Since we don't use CNO, can MicroShift use OVN-K with a manifest we provide?)

Member Author

I believe that's possible. Actually, there's nothing we can do to prohibit it if the NetworkAttachmentDefinition is created like that. I don't know the details of how that would work, but it would be an additional interface - the "original" Kubernetes interface from MicroShift's ovn-k network will continue to be the primary interface.

Some example requirements are connecting Pods to the host's bridge interface or setting up complex networking based on VLAN.
This functionality is Multus' trademark - adding additional interfaces to Pods.

This enhancement explores providing Multus CNI as an optional component to MicroShift.


Can Multus be used with RHEL for Edge images? If yes, can it be used in disconnected clusters, or only fully connected? How do we add the info to the blueprint?

Member Author

Yes, it can. RPM-wise, it's the same as e.g. OLM:

[[packages]]
name = "microshift-multus"
version = "*"

It can be used in disconnected clusters - image references will be in microshift-multus-release-info RPM.

(see alternatives for more information).
Because of the differences with OpenShift (which uses thick Multus plugin), existing OpenShift
manifests will require changes to make them suitable for MicroShift. These manifests will reside
in MicroShift repository.


Are these user-configurable, or do they need to be used as provided?

Member Author

It's like OLM - we provide an RPM with manifests, and there's no way to customize them further at a later stage. If someone wants to change the behavior, they can fork, change, and rebuild.

Supporting that would require Helm in addition to kustomize, or some MicroShift-specific Helm-like mechanism to customize the kustomize manifests.

configure the host's networking.
Taking `bridge` CNI as an example: when the bridge interface does not exist, it will be created
when a Pod requiring that interface is created.
If the user expects something else to create the interface, they will need to configure the system to start


Would this be a boot-order type of config done on RHEL? Is there a generic link we can provide for users?

Member Author

Yes, this is system (RHEL) level configuration. More specifically, it's systemd - they would create a service that runs Before=microshift.service.
There's RHEL documentation about creating systemd service files: Using systemd unit files to customize and optimize your system > 1. Working with systemd unit files > 1.4. Important [Unit] section options (Before is mentioned in the context of After).
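A minimal sketch of such a unit, assuming a hypothetical unit name and bridge name:

```ini
# /etc/systemd/system/create-testbr1.service (hypothetical)
[Unit]
Description=Create bridge interface before MicroShift starts
Before=microshift.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Create the bridge and bring it up before microshift.service is ordered to start
ExecStart=/usr/sbin/ip link add name testbr1 type bridge
ExecStart=/usr/sbin/ip link set testbr1 up

[Install]
WantedBy=multi-user.target
```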


There may be a race condition between Multus on MicroShift and other services on the host that also
configure the host's networking
Taking `bridge` CNI as an example: when the bridge interface does not exist, it will be created


What makes the pod "know" it needs that bridge? Can the bridge be created when it wasn't intended?

Member Author

By Pod, do you mean the actual Pod or the application running inside it?
If the Pod: you need to add an annotation so that Multus knows which NetworkAttachmentDefinitions it should use.
If the app: the app needs to be written in such a way that it knows there's some extra interface (or it can just serve on all interfaces).

Can the bridge be created when it wasn't intended?

The bridge will be created when a Pod whose annotation mentions a NAD that uses the bridge CNI (and specifies some bridge name) is being created.
It wouldn't happen any other way, like accidentally.
Pod is being created -> it wants the bridge NAD -> bridge CNI runs -> bridge interface doesn't exist -> bridge interface is created
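The annotation-driven flow can be sketched with a hypothetical Pod referencing the bridge-conf-1 NAD shown earlier in the conversation:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sample-pod          # hypothetical name
  annotations:
    # Tells Multus which NetworkAttachmentDefinition(s) to attach as net1, net2, ...
    k8s.v1.cni.cncf.io/networks: bridge-conf-1
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/ubi9/ubi-minimal  # placeholder image
    command: ["sleep", "infinity"]
```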


Another existing risk is users creating NetworkAttachmentDefinitions that want to reuse MicroShift's
networks such as `10.42.0.0/16`, `10.43.0.0/16`, `10.44.0.0/16`, or `169.254.169.1/29`.
We should warn users they should not use them unless they reconfigure MicroShift's networking to use


What happens if users don't add other networks and reuse the default settings?

Member Author

Hard for me to say exactly; someone from the networking team would know better.
But if a user reuses 10.42.0.0/16, which is used by the primary ovn-k CNI, then I would expect problems with routing - the Pod wouldn't be able to access the API server (if the second CNI overwrote ovn-k's routing), or some Services if another of those CIDRs is reused.


## Operational Aspects of API Extensions

If Multus (or any delegate CNIs it executes) fails, a new Pod will be stuck in "ContainerCreating" status


Perhaps a good troubleshooting step?

Member Author

Well, that's just it: the Pod doesn't start because of the error, the user runs `oc describe pod $POD`, and in the events there's a message from Multus.
