-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FDB] Fix fbdorch to properly handle syncd FDB FLUSH Notif #2254
Conversation
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…ic-swss into fdb_remove_vlan
@vivekreddynv on which branch this fix should go? master only? |
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…ic-swss into fdb_remove_vlan
/azpw run Azure.sonic-swss |
/AzurePipelines run Azure.sonic-swss |
Azure Pipelines successfully started running 1 pipeline(s). |
/azpw run Azure.sonic-swss |
/AzurePipelines run Azure.sonic-swss |
Azure Pipelines successfully started running 1 pipeline(s). |
orchagent/fdborch.cpp
Outdated
|
||
/* | ||
Handles the SAI_FDB_EVENT_FLUSHED notification recieved from syncd | ||
Ref: https://github.com/opencomputeproject/SAI/blob/master/inc/saifdb.h#L223 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pleaser remove reference to code. This will not be useful when the header file gets changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled
orchagent/fdborch.cpp
Outdated
if (entry->bv_id && | ||
!m_portsOrch->getPort(entry->bv_id, vlan)) | ||
{ | ||
SWSS_LOG_ERROR("FdbOrch notification type %d: Failed to locate vlan port from bv_id 0x%" PRIx64, type, entry->bv_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a NOTICE since the flow is in notification. The sonic-mgmt tests is now strict on what scenarios the ERR logs are being generated. There are currently some cases in fdb notif flow which is generating ERR which we are planning to revisit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled
orchagent/fdborch.cpp
Outdated
@@ -493,86 +594,21 @@ void FdbOrch::update(sai_fdb_event_t type, | |||
} | |||
case SAI_FDB_EVENT_FLUSHED: | |||
|
|||
SWSS_LOG_INFO("FDB Flush event received: [ %s , 0x%" PRIx64 " ], \ | |||
bridge port ID: 0x%" PRIx64 ".", | |||
SWSS_LOG_NOTICE("FDB Flush event received: [ %s , 0x%" PRIx64 " ], \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change it to INFO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled
orchagent/fdborch.cpp
Outdated
const string& fdb_type, | ||
const MacAddress& mac) | ||
{ | ||
MacAddress flush_mac(CONSOLIDATED_FLUSH_MAC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default assignment should be 0. I don't think we should define a new variable. You can give a comment that flush event shall have zero mac
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled
orchagent/fdborch.cpp
Outdated
@@ -973,17 +1009,22 @@ void FdbOrch::doTask(NotificationConsumer& consumer) | |||
for (uint32_t i = 0; i < count; ++i) | |||
{ | |||
sai_object_id_t oid = SAI_NULL_OBJECT_ID; | |||
sai_int32_t fdb_entry_type = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest assign to 0.
orchagent/fdborch.cpp
Outdated
for (auto itr = m_entries.begin(); itr != m_entries.end();) | ||
{ | ||
auto curr = itr++; | ||
if (curr->second.type.find(fdb_type) != std::string::npos) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this check is redundant. afaik, the fdb_type is always populated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These entries might be static as well, so i think it's required to check the fdb_type before clearing them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so you are not checking for static vs dynamic here. my point is, one or other is always true and never be npos, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"find" returns the index of first occurence, so this logic holds true, npos indicates a mismatch. I've used specifically thus because the type stored in internal cache can be of static, dynamic or dynamic_local. whereas the type received from the flush will either be static or dynamic.
Thus i've used this instead of a simple equality check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be i'm missing something. what do we miss with the following?
if (curr->second.type.find(fdb_type) != std::string::npos)
{
if (curr->first.mac == mac || mac == flush_mac)
{
clearFdbEntry(curr->first.mac, curr->first.bv_id, fdb_type, curr->first.port_name);
}
}
to
if (curr->first.mac == mac || mac == flush_mac)
{
clearFdbEntry(curr->first.mac, curr->first.bv_id, fdb_type, curr->first.port_name);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. In the flush, can user specify fdb_type? If not, is it correct that flush notification would be for all types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Link User can specify the flush type. Looks like there can also be a SAI_FDB_FLUSH_ENTRY_TYPE_ALL along with SAI_FDB_FLUSH_ENTRY_TYPE_STATIC/SAI_FDB_FLUSH_ENTRY_TYPE_DYNAMIC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update to handle the SAI_FDB_FLUSH_ENTRY_TYPE_ALL case as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sry, the flush notif will have either STATIC or DYNAMIC, so "all" case need not be handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: Updated, remove reading the SAI_FDB_ENTRY_ATTR_TYPE attribute and cleared only non-static entries
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
/azpw run Azure.sonic-swss |
/AzurePipelines run Azure.sonic-swss |
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…ic-swss into fdb_remove_vlan
@vivekreddynv Please raise a separate PR for 202111 as there are conflicts |
…#2254) Vlan delete couldn't be handled by OA when there is fdb learnt on the member and when the member is deleted This inability of handling APPL_DB notif is affecting warm-restart. FDB Entry from State DB is not removed. OA doesn't have the logic to handle consolidate flush notif coming from syncd FdbOrch doesn't have logic to clear internal cache and decrement corresponding fdb counters during a flush notification Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…#2254) Vlan delete couldn't be handled by OA when there is fdb learnt on the member and when the member is deleted This inability of handling APPL_DB notif is affecting warm-restart. FDB Entry from State DB is not removed. OA doesn't have the logic to handle consolidate flush notif coming from syncd FdbOrch doesn't have logic to clear internal cache and decrement corresponding fdb counters during a flush notification Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…2401) * [FDB] Fix fbdorch to properly handle syncd FDB FLUSH Notif (#2254) Vlan delete couldn't be handled by OA when there is fdb learnt on the member and when the member is deleted This inability of handling APPL_DB notif is affecting warm-restart. FDB Entry from State DB is not removed. OA doesn't have the logic to handle consolidate flush notif coming from syncd FdbOrch doesn't have logic to clear internal cache and decrement corresponding fdb counters during a flush notification Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…onic-net#2401) * [FDB] Fix fbdorch to properly handle syncd FDB FLUSH Notif (sonic-net#2254) Vlan delete couldn't be handled by OA when there is fdb learnt on the member and when the member is deleted This inability of handling APPL_DB notif is affecting warm-restart. FDB Entry from State DB is not removed. OA doesn't have the logic to handle consolidate flush notif coming from syncd FdbOrch doesn't have logic to clear internal cache and decrement corresponding fdb counters during a flush notification Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
What I did
Fixes sonic-net/sonic-buildimage#10605
Why I did it
How I verified it
Checked this config and verified the remove vlan passed though
Unit Tests:
Details if related