Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support setting up Openvswitch (Ovs) Bridge network #2

Merged
merged 1 commit into from
Jul 29, 2024

Conversation

coiby
Copy link
Member

@coiby coiby commented Apr 23, 2024

Resolves: https://issues.redhat.com/browse/RHEL-33465

This patch supports setting up an Ovs bridge in kdump initrd. An Ovs
bridge is similar to a classic Linux bridge but we use ovs-vsctl to find
out the Ethernet device (having the MAC address as the bridge) added to
an Ovs bridge. Once we copy all the needed NetworkManager (NM) connection
profiles to kdump initrd and all the necessary files, NM will create an Ovs bridge
automatically in kdump initrd.

In the case of OpenShift Container Platform (OCP),
ovs-configuration.service [1] is responsible for setting up an Ovs bridge.
In theory, we can also try to bring up the original physical network
interface before ovs-configuration.service. But this approach is
cumbersome because it breaks our assumption that we should bring up the
same network in kdump intrd as in 1st kernel (establishing the same network
in kdump initrd only needs to copy the needed NM connection profiles
thus we don't need to learn how different network setup work under the
hood).

How to test this patch with the help of configure-ovs.sh?
=========================================================

1. Extract configure-ovs.sh from [2]

2. Install necessary packages for configure-ovs.sh
    dnf install openvswitch -yq
    dnf install NetworkManager-ovs nmap-ncat -yq

    systemctl enable --now openvswitch

    # restart NM so the ovs plugin can be activated
    systemctl restart NetworkManager

3. Assume the network interface used for creating an Ovs bridge is
   "ens2", use configure-ovs.sh to create an Ovs bridge,

    interface=ens2
    mkdir -p /etc/ovnk
    echo $interface > /etc/ovnk/iface_default_hint
    bash configure-ovs.sh OVNKubernetes

4. (Optional) If you want to make the created Ovs bridge survive a
   reboot, simply make the created NM connections created by
   configure-ovs.sh persist,

    cp /run/NetworkManager/system-connections/ovs-* /etc/NetworkManager/system-connections/

If you need to create an Ovs bridge on top of a bonding network, use the
following commands for step 3,

    nmcli con add type bond ifname bond0
    nmcli con add type ethernet ifname eth0 master bond0
    nmcli con add type ethernet ifname eth1 master bond0

    echo bond0 > /etc/ovnk/iface_default_hint
    bash configure-ovs.sh OVNKubernetes

Note

  1. For RHEL, openvswitch3.3 may be installed so we need to get the
    package name by "rpm -qf /usr/lib/systemd/system/openvswitch.service"

  2. For RHEL9, openvswitch package needs to installed from another repo,
    cat << 'EOF' > /etc/yum.repos.d/ovs.repo
    [rhosp-rhel-9-fdp-cdn]
    name=Red Hat Enterprise Linux Fast Datapath $releasever - $basearch cdn
    baseurl=http://rhsm-pulp.corp.redhat.com/content/dist/layered/rhel9/$basearch/fast-datapath/os/
    enabled=1
    gpgcheck=0
    EOF

    dnf install openvswitch3.3 -yq

[1] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/ovs-configuration.service.yaml
[2] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/configure-ovs-network.yaml

Signed-off-by: Coiby Xu coxu@redhat.com

@coiby coiby changed the title Install the nmconnection file for the physical interface of Ovs bridge Install the nmconnection file for the physical interface of Openvswitch bridge Apr 23, 2024
@coiby
Copy link
Member Author

coiby commented Apr 23, 2024

Hi @jbtrystram, can you take a look at this patch set? Thank you! Btw, if this PR gets merged, openshift/machine-config-operator#4213 will no longer be needed.

Copy link
Collaborator

@prudo1 prudo1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Coiby,

personally I would merge the two commits as I don't see much value in having the first one. But I don't have a strong opinion on that.

If possible I would like to simplify the second commit. Having ovs_phy_if as global variable and the extra kdump_install_ovs_nmconnection function seems over complicated to me on first view.

Thanks
Philipp

dracut-module-setup.sh Outdated Show resolved Hide resolved
dracut-module-setup.sh Outdated Show resolved Hide resolved
dracut-module-setup.sh Outdated Show resolved Hide resolved
@jbtrystram
Copy link

jbtrystram commented Apr 25, 2024

Hi @jbtrystram, can you take a look at this patch set? Thank you!

TBH i don't know this code at all so a review would be unhelpful..

Btw, if this PR gets merged, openshift/machine-config-operator#4213 will no longer be needed.

Thanks for letting me know ! I'll update the MCO afterwards then

@coiby coiby force-pushed the ovs_fix branch 3 times, most recently from 471c0d9 to eb6c659 Compare May 2, 2024 14:18
@coiby
Copy link
Member Author

coiby commented May 2, 2024

Hi Coiby,

Hi Phillipp,

personally I would merge the two commits as I don't see much value in having the first one. But I don't have a strong opinion on that.

Previously, I used two commits because it may be easier for the OpenShift team to review. But since configure-ovs.sh contains all the info I need, now I squash two commits together in the new version.

This new version switches to a new approach i.e. we now try to bring up the same network in kdump initrd as in the 1st kernel. The reason is I find a Ovs bridge can be created on top of another network like bonding network. And it's cumbersome to support this case using the previous approach.

If possible I would like to simplify the second commit. Having ovs_phy_if as global variable and the extra kdump_install_ovs_nmconnection function seems over complicated to me on first view.

Thanks Philipp

@coiby
Copy link
Member Author

coiby commented May 2, 2024

Hi @jbtrystram, can you take a look at this patch set? Thank you!

TBH i don't know this code at all so a review would be unhelpful..

configure-ovs.sh has all the info I need now, thanks anyways!

Btw, if this PR gets merged, openshift/machine-config-operator#4213 will no longer be needed.

Thanks for letting me know ! I'll update the MCO afterwards then

You are welcome!

@coiby coiby changed the title Install the nmconnection file for the physical interface of Openvswitch bridge Support setting up Openvswitch (Ovs) Bridge network May 2, 2024
@jbtrystram
Copy link

jbtrystram commented Jun 6, 2024

Anything blocking that PR ? We've have had reports of issues that could be fixed by this :)

@kenneth-dsouza
Copy link

Hello team, any update here?

Copy link
Contributor

@daveyoung daveyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[drop this comment] added another one to the first commit inline

@coiby
Copy link
Member Author

coiby commented Jul 15, 2024

Hi @jbtrystram and @kenneth-dsouza,

Currently, the progress is blocked by code review. It seems you Cores team asks someone else to review patches about Openvswitch network. Can I ask this person to help this PR as well?

Comment on lines 617 to 650
kdump_install_ovs_deps() {
[[ $has_ovs_bridge == yes ]] || return 0
inst_multiple -o $(rpm -ql NetworkManager-ovs) $(rpm -ql openvswitch) /usr/bin/uuidgen /usr/bin/hostname /usr/bin/touch /usr/bin/expr /usr/bin/id /usr/bin/install /usr/bin/setpriv /usr/bin/nice /usr/bin/df
# By default, ovsdb-server.service runs as USRE=openvswitch, However
# openvswitch doesn't have the permission to write to /tmp in kdump initrd
# and ovsdb-service.servie will failure with the error
# "ovs-ctl[1190]: ovsdb-server: failed to create temporary file (Permission denied)"
# So run ovsdb-server.service as root instead
mkdir -p "${initdir}/etc/sysconfig/system.conf.d"
echo "root:root" >"${initdir}/etc/sysconfig/openvswitch"
# Bypass the error "referential integrity violation: Table Port column interfaces row"
# caused by changing the connection profiles
echo "OPTIONS=\"--ovsdb-server-options='--disable-file-column-diff'\"" >>"${initdir}/etc/sysconfig/openvswitch"

KDUMP_DROP_IN_DIR="${initdir}/etc/systemd/system/nm-initrd.service.d"
mkdir -p $KDUMP_DROP_IN_DIR
printf "[Unit]\nAfter=openvswitch.service\n" >$KDUMP_DROP_IN_DIR/01-after-ovs.conf

$SYSTEMCTL -q --root "$initdir" enable openvswitch.service
$SYSTEMCTL -q --root "$initdir" add-wants basic.target openvswitch.service
}

Copy link
Collaborator

@prudo1 prudo1 Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really ugly. I'm fine in adding it for a short term fix. But in the mid/long term this needs to be moved to a proper ovs dracut module we can simply depend on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @prudo1! Yes, it's better to move it to dracut or even this kdump dracut module in the long term. Unfortunately, this may need more code-refactoring than we have expected. In order for our kdump dracut module tell dracut to include the future ovs dracut module, we need to move the logic of detecting if it's an OVS bdrige to either the check function or mkdumprd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jbtrystram! How soon do you expect RHEL-33465 to be resolved? It's ideal to have an upstream ovs dracut module but it will takes much more time. So we have a short term fix firstly if that's what you prefer.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coiby thanks for the ping ! Having a short term fix and a tracking issue to make sure we adjust when the dracut module lands is fine with me !
Thanks for the work

@jbtrystram
Copy link

jbtrystram commented Jul 18, 2024

Currently, the progress is blocked by code review. It seems you Cores team asks someone else to review patches about Openvswitch network. Can I ask this person to help this PR as well?

I am not sure who you are referring to ? There is no harm in asking for a review !
I forwarded this to the openvswitch folks :)

kdump_install_ovs_deps() {
[[ $has_ovs_bridge == yes ]] || return 0
inst_multiple -o $(rpm -ql NetworkManager-ovs) $(rpm -ql openvswitch) /usr/bin/uuidgen /usr/bin/hostname /usr/bin/touch /usr/bin/expr /usr/bin/id /usr/bin/install /usr/bin/setpriv /usr/bin/nice /usr/bin/df
# By default, ovsdb-server.service runs as USRE=openvswitch, However
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: USER

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this typo!

# "ovs-ctl[1190]: ovsdb-server: failed to create temporary file (Permission denied)"
# So run ovsdb-server.service as root instead
mkdir -p "${initdir}/etc/sysconfig/system.conf.d"
echo "root:root" >"${initdir}/etc/sysconfig/openvswitch"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be OVS_USER_ID="root:root"?

One other option to running OVS as root could be to set TMPDIR to a writable directory, if one exists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, OVS_USER_ID="root:root" is what I meant. But your comment made me realize my understanding is incorrect because echo "root:root" >"${initdir}/etc/sysconfig/openvswitch" also works. So actually ovsdb-server.service by default run as root unless /etc/sysconfig/openvswitc intructs ovsdb-server.service to run as openvswitch.

@JM1
Copy link

JM1 commented Jul 19, 2024

@coiby Could you please explain, how your code sets up the Open vSwitch bridges in the initrd? What bridges get created? What OCP release did you test your changes with?

What did nmcli -t -f device,filename connection show --active (in kdump_install_nmconnections()) return in your tests?

@JM1
Copy link

JM1 commented Jul 19, 2024

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

How does your code restore OCP's br-ex bridge?

@coiby
Copy link
Member Author

coiby commented Jul 22, 2024

@coiby Could you please explain, how your code sets up the Open vSwitch bridges in the initrd? What bridges get created? What OCP release did you test your changes with?

Hi @JM1! Thanks for raising this question. The code simply copy the NetworkManager connection files and the supporting files to kdump initrd and then NM will set up the same Open vSwitch bdrige to the 1st kernel automatically. I forgot which OCP version I tested early. But later I realize this not limited to OCP so I test it in RHEL9 and Fedora40. If you are interested in more details, I've included the info about how I set up a testing environment in the commit message.

What did nmcli -t -f device,filename connection show --active (in kdump_install_nmconnections()) return in your tests?

ovs-if-br-ex        de17dd7e-dd41-4865-a99e-60077f0837ae  ovs-interface  br-ex  
br-ex               c5662992-05c0-4b05-9302-a554c705ef6a  ovs-bridge     br-ex  
ovs-if-phys0        ddcae60d-7d24-46b2-86bc-c326b6688cc9  ethernet       eno1   
ovs-port-br-ex      3ffa5d8b-5ae6-45c6-9fd6-04e5f4b2eeaa  ovs-port       br-ex  
ovs-port-phys0      aadfdac7-fb7f-4aa0-9180-0eb64263f722  ovs-port       eno1   

Btw, I've also tested the case of Open vSwitch bridge over a bonding network.

@coiby
Copy link
Member Author

coiby commented Jul 22, 2024

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?

How does your code restore OCP's br-ex bridge?

This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.

@coiby coiby force-pushed the ovs_fix branch 2 times, most recently from bb0a37f to d2be59c Compare July 22, 2024 09:16
@JM1
Copy link

JM1 commented Jul 22, 2024

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?

Your previous commit had this line:

_save_kdump_netifs "ovs-system"

But it is gone in later commits. Hence, solved by removal 😄

How does your code restore OCP's br-ex bridge?

This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.

Ok. If I understand your code correctly, in kdump_collect_netif_usage it will collect the netdev (br-ex) that would be used when accessing the remote fs (e.g. via ssh). It would also identify the corresponding physical NIC such as enp1s0 with the help of kdump_setup_ovs and ovs_find_phy_if and add both (br-ex and enp1s0) to unique_netifs. Later, in kdump_install_nmconnections it would retrieve all NM connections associated with those two devices. I did not test it but the approach seems reasonable.

However, I am wondering why you do not simply try to copy and restore all NetworkManager connections? Wouldn't that reduce the amount of code here dramatically?

@coiby
Copy link
Member Author

coiby commented Jul 23, 2024

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?

Your previous commit had this line:

_save_kdump_netifs "ovs-system"

But it is gone in later commits. Hence, solved by removal 😄

Oh my poor memory! This reminds me to do another test on OCP. Or if you think there is no need to do it, please let me know.

How does your code restore OCP's br-ex bridge?

This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.

Ok. If I understand your code correctly, in kdump_collect_netif_usage it will collect the netdev (br-ex) that would be used when accessing the remote fs (e.g. via ssh). It would also identify the corresponding physical NIC such as enp1s0 with the help of kdump_setup_ovs and ovs_find_phy_if and add both (br-ex and enp1s0) to unique_netifs. Later, in kdump_install_nmconnections it would retrieve all NM connections associated with those two devices. I did not test it but the approach seems reasonable.

However, I am wondering why you do not simply try to copy and restore all NetworkManager connections? Wouldn't that reduce the amount of code here dramatically?

Good suggestion! --ovsdb-server-options='--disable-file-column-diff' also prompt me to think if I can simply copy all active NetworkManager connections to initrd at once. But there are three cases (currently solutions need changing the NM connection profiles) needing further investigation and different solutions,

  • Some environments like some Azure machine doesn't use persistent NIC name (the current solution is to modify a NM connection profile to match a device by MAC address, for details check commit 568623e)
  • If a NIC has an IPv4 or IPv6 address, set the corresponding may-fail property to no. Otherwise, dumping vmcore over IPv6 could fail because only IPv4 network is ready or vice versa (current solution is to disable IPv6 if only IPv4 is used and vice versa, for details check commit 9dfcacf)
  • Some NICs need longer connection.wait-device-timeout otherwise the connection will fail to be established (commit 6b586a9).

@JM1
Copy link

JM1 commented Jul 23, 2024

... if I can simply copy all active NetworkManager connections to initrd at once. But there are three cases (currently solutions need changing the NM connection profiles) needing further investigation and different solutions,

  • Some environments like some Azure machine doesn't use persistent NIC name (the current solution is to modify a NM connection profile to match a device by MAC address, for details check commit 568623e)

  • If a NIC has an IPv4 or IPv6 address, set the corresponding may-fail property to no. Otherwise, dumping vmcore over IPv6 could fail because only IPv4 network is ready or vice versa (current solution is to disable IPv6 if only IPv4 is used and vice versa, for details check commit 9dfcacf)

  • Some NICs need longer connection.wait-device-timeout otherwise the connection will fail to be established (commit 6b586a9).

Please, document this ^ in the commit message (or your code). You put a lot of thoughts into the why's ( 👍 👍 👍 ), so help us or a future engineer with understanding your reasoning.

I am not sure if copying configure-ovs.sh to initrd and executing it there (maybe wrapped somehow), might not still be less hassle in the long run. But your approach also works (i assume, did not test it), so please go ahead and merge ☺️

Resolves: https://issues.redhat.com/browse/RHEL-33465

This patch supports setting up an Ovs bridge in kdump initrd. An Ovs
bridge is similar to a classic Linux bridge but we use ovs-vsctl to find
out the Ethernet device (having the MAC address as the bridge) added to
an Ovs bridge. Once we copy all the needed NetworkManager (NM) connection
profiles to kdump initrd and all the necessary files, NM will create an Ovs bridge
automatically in kdump initrd.

In the case of OpenShift Container Platform (OCP),
ovs-configuration.service [1] is responsible for setting up an Ovs bridge.
In theory, we can also try to bring up the original physical network
interface before ovs-configuration.service. But this approach is
cumbersome because it breaks our assumption that we should bring up the
same network in kdump intrd as in 1st kernel (establishing the same network
in kdump initrd only needs to copy the needed NM connection profiles
thus we don't need to learn how different network setup work under the
hood).

How to test this patch with the help of configure-ovs.sh?
=========================================================

1. Extract configure-ovs.sh from [2]

2. Install necessary packages for configure-ovs.sh
    dnf install openvswitch -yq
    dnf install NetworkManager-ovs nmap-ncat -yq

    systemctl enable --now openvswitch

    # restart NM so the ovs plugin can be activated
    systemctl restart NetworkManager

3. Assume the network interface used for creating an Ovs bridge is
   "ens2", use configure-ovs.sh to create an Ovs bridge,

    interface=ens2
    mkdir -p /etc/ovnk
    echo $interface > /etc/ovnk/iface_default_hint
    bash configure-ovs.sh OVNKubernetes

4. (Optional) If you want to make the created Ovs bridge survive a
   reboot, simply make the created NM connections created by
   configure-ovs.sh persist,

    cp /run/NetworkManager/system-connections/ovs-* /etc/NetworkManager/system-connections/

If you need to create an Ovs bridge on top of a bonding network, use the
following commands for step 3,

    nmcli con add type bond ifname bond0
    nmcli con add type ethernet ifname eth0 master bond0
    nmcli con add type ethernet ifname eth1 master bond0

    echo bond0 > /etc/ovnk/iface_default_hint
    bash configure-ovs.sh OVNKubernetes

Note
1. For RHEL, openvswitch3.3 may be installed so we need to get the
   package name by "rpm -qf /usr/lib/systemd/system/openvswitch.service"

2. For RHEL9, openvswitch package needs to installed from another repo,
    cat << 'EOF' > /etc/yum.repos.d/ovs.repo
    [rhosp-rhel-9-fdp-cdn]
    name=Red Hat Enterprise Linux Fast Datapath $releasever - $basearch cdn
    baseurl=http://rhsm-pulp.corp.redhat.com/content/dist/layered/rhel9/$basearch/fast-datapath/os/
    enabled=1
    gpgcheck=0
    EOF

    dnf install openvswitch3.3 -yq

3.  We instruct ovsdb-server to ignore NM connection files changes by
    "--ovsdb-server-options='--disable-file-column-diff'". In the
    future, this may not be needed if we simply copy all active NM
    connection profiles to kdump initrd without changing them after
    coming up with different solutions for the following cases,
    1. Some environments like some Azure machine doesn't use persistent
       NIC name. Current solution is to modify a NM connection
       profile to match a device by MAC address, for details check
       commit 568623e)

    2. If a NIC has an IPv4 or IPv6 address, set the corresponding
       may-fail property to no. Otherwise, dumping vmcore over IPv6
       could fail because only IPv4 network is ready or vice versa. Current
       solution is to disable IPv6 if only IPv4 is used and vice versa,
       for details check commit 9dfcacf,

    3. Some NICs need longer connection.wait-device-timeout otherwise
       the connection will fail to be established (commit 6b586a9).

[1] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/ovs-configuration.service.yaml
[2] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/configure-ovs-network.yaml

Signed-off-by: Coiby Xu <coxu@redhat.com>
@coiby
Copy link
Member Author

coiby commented Jul 23, 2024

... if I can simply copy all active NetworkManager connections to initrd at once. But there are three cases (currently solutions need changing the NM connection profiles) needing further investigation and different solutions,

  • Some environments like some Azure machine doesn't use persistent NIC name (the current solution is to modify a NM connection profile to match a device by MAC address, for details check commit 568623e)
  • If a NIC has an IPv4 or IPv6 address, set the corresponding may-fail property to no. Otherwise, dumping vmcore over IPv6 could fail because only IPv4 network is ready or vice versa (current solution is to disable IPv6 if only IPv4 is used and vice versa, for details check commit 9dfcacf)
  • Some NICs need longer connection.wait-device-timeout otherwise the connection will fail to be established (commit 6b586a9).

Please, document this ^ in the commit message (or your code). You put a lot of thoughts into the why's ( 👍 👍 👍 ), so help us or a future engineer with understanding your reasoning.

I've documented it the commit message! Thanks for the excellent suggestion!

I am not sure if copying configure-ovs.sh to initrd and executing it there (maybe wrapped somehow), might not still be less hassle in the long run. But your approach also works (i assume, did not test it), so please go ahead and merge ☺️

Thanks for reviewing the patches and endorsing this approach! As for running configure-ovs.sh in kdump initrd, configure-ovs.sh assumes there is an active NM connection first. For kdump initrd, this active NM doesn't exist (on the other hand if it exists, there is no need to use configure-ovs.sh).

@coiby coiby merged commit 224d310 into rhkdump:main Jul 29, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants