Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICCP Bridgeport removal failure when MCLAG interface Bridgeport exist #2535

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

prasanna-cls
Copy link
Contributor

What I did
while processing the bridge port removal, altered the invocation order
from,

  1. Remove bridge port in SAI (sai_bridge_api->remove_bridge_port())
  2. Notify about the bridge port deletion to Isolation group module (orchagent)

to,

  1. Notify about the bridge port deletion to Isolation group module (orchagent)
  2. Remove bridge port in SAI (sai_bridge_api->remove_bridge_port())

Why I did it
When Bridge removal is triggered before clearing the isolation group reference, then SAI returns error stating the bridge port reference count is still non-zero. reversing the invocation order will solve the issue and avoid the SAI errors for Bridgeport deletion.

How I verified it

  1. Deletion of ICCP Bridge port , followed by MCLAG interface Bridgeport - PASS
  2. Deletion of MCLAG interface Bridge port , followed by ICCP Bridgeport - PASS
  3. Recreating the MCLAG interface bridge port followed by ICCP interface bridge port - PASS
  4. Recreating the ICCP interface bridge port followed by MCLAG interface bridge port - PASS

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205

Description for the changelog

Fix for ICCP Bridgeport removal failure when MCLAG interface bridgeport exist

…eted

(removal of last vlan membership), when the MCLAG interface bridgeport instance still exist,
then, SAI returns error since the bridge port is still
referenced by the port isolation group object.
Hence by clearing the ICCP bridge port object reference prior to deleting the
bridge port at SAI level, this error can be avoided.
@prasanna-cls prasanna-cls changed the title In the MCLAG scenario, if the ICCP peer link bridge port is being del… ICCP Bridgeport removal failure when MCLAG interface Bridgeport exist Nov 18, 2022
@prasanna-cls
Copy link
Contributor Author

request to retest please.

@@ -4984,6 +4984,10 @@ bool PortsOrch::removeBridgePort(Port &port)
gFdbOrch->flushFDBEntries(port.m_bridge_port_id, SAI_NULL_OBJECT_ID);
SWSS_LOG_INFO("Flush FDB entries for port %s", port.m_alias.c_str());

/* Notify about bridgeport deletion */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could cause race-conditions. The idea of notifying other subscribers is to let them know BridgePort deletion is complete. This flow doesn't honor that. Please propose a different solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
From SAI point of view, the ISOLATION_GROUP_TABLE:MCLAG_ISO_GRP object depends upon the bridge port as a dependent object.
For this reason, when SWSS creates a bridge port, MCLAG is notified after the bridge port is created in SAI.
Similarly, during the bridge port deletion scenario, MCLAG has to be notified (to clean-up ISOLATION_GROUP) , before SAI deletes that bridge port.

And the MCLAG(for ISOLATION_GROUP_TABLE) is the only subscriber listening to this bridge port changes. and this changes are done as part of handling MCLAG in this pull request.

do you suggest to have a different message type, something like "SUBJECT_TYPE_BRIDGE_PORT_REMOVAL_HANDLE" that can be called before sai_bridge_api->remove_bridge_port() , instead of "SUBJECT_TYPE_BRIDGE_PORT_CHANGE", which could be retained to be called after SAI brdige port remove ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @prsunny, @prasanna-cls, can we fix this issue as follows

This way there is no change in notification and Isolation group also removed.

diff --git a/orchagent/portsorch.cpp b/orchagent/portsorch.cpp [9/1964]
index 376328f..804f356 100755
--- a/orchagent/portsorch.cpp
+++ b/orchagent/portsorch.cpp
@@ -4,6 +4,7 @@
#include "neighorch.h"
#include "gearboxutils.h"
#include "vxlanorch.h"
+#include "isolationgrouporch.h"
#include "directory.h"
#include "subintf.h"

@@ -49,6 +50,7 @@ extern NeighOrch *gNeighOrch;
extern CrmOrch *gCrmOrch;
extern BufferOrch *gBufferOrch;
extern FdbOrch *gFdbOrch;
+extern IsoGrpOrch gIsoGrpOrch;
extern Directory<Orch
> gDirectory;
extern sai_system_port_api_t *sai_system_port_api;
extern string gMySwitchType;
@@ -4984,6 +4986,9 @@ bool PortsOrch::removeBridgePort(Port &port)
gFdbOrch->flushFDBEntries(port.m_bridge_port_id, SAI_NULL_OBJECT_ID);
SWSS_LOG_INFO("Flush FDB entries for port %s", port.m_alias.c_str());

  • PortUpdate update = { port, false };

  • gIsoGrpOrch->update (SUBJECT_TYPE_BRIDGE_PORT_CHANGE, static_cast<void *>(&update));

  • /* Remove bridge port */
    status = sai_bridge_api->remove_bridge_port(port.m_bridge_port_id);
    if (status != SAI_STATUS_SUCCESS)
    @@ -5000,7 +5005,6 @@ bool PortsOrch::removeBridgePort(Port &port)
    port.m_bridge_port_id = SAI_NULL_OBJECT_ID;

    /* Remove bridge port */

  • PortUpdate update = { port, false };
    notify(SUBJECT_TYPE_BRIDGE_PORT_CHANGE, static_cast<void *>(&update));

    SWSS_LOG_NOTICE("Remove bridge port %s from default 1Q bridge", port.m_alias.c_str());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @kselvaraj2 , this looks ok to me. @prsunny is this approach fine ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny can I update the review with the approach suggested by @kselvaraj2 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants