Don't handle buffer pool watermark during warm reboot reconciling (#1987)

- What I did
Don't handle buffer pool watermark during warm reboot reconciling

- Why I did it
This is to fix the community issue sonic-net/sonic-sairedis#862 and #8722

- How I verified it
Perform a warm reboot and check that:

1. Buffer pool watermark handling is skipped during reconciling and performed after it.
2. Other watermark handling is still performed during reconciling, as before.

- Details if related
The warm reboot flow is as follows:

1. The system starts. Orchagent fetches the items that were stored in the database before the warm reboot and pushes them into m_toSync of all orchagents. This is done by bake, which a sub-orchagent can override.
2. All sub-orchagents handle the items in m_toSync. At this point, any notification from redis-db is blocked.
3. Warm reboot converges.
4. Orchagent starts to handle notifications from redis-db.

The fix is as follows: in FlexCounterOrch::bake, the buffer pool watermark handling is skipped.
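
The ordering in the flow above can be modeled with a small, self-contained sketch. It is only an illustration under stated assumptions: `MiniOrch`, `drain`, `onNotification`, `converge`, and the `blocked` queue are hypothetical names, not part of the swss API, and the real orchagent does not necessarily buffer notifications this way. The point is only that baked entries are handled before anything that arrives from redis-db.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for an orchagent; this is NOT the real swss::Orch.
struct MiniOrch
{
    std::deque<std::string> toSync;    // models m_toSync
    std::deque<std::string> blocked;   // notifications held back while reconciling
    std::vector<std::string> handled;  // keys in the order they were processed
    bool reconciling = true;           // true until warm reboot converges

    // Step 1: push the entries saved before the warm reboot into toSync.
    void bake(const std::vector<std::string> &saved)
    {
        for (const auto &key : saved)
        {
            toSync.push_back(key);
        }
    }

    // Step 2: handle everything currently queued in toSync.
    void drain()
    {
        while (!toSync.empty())
        {
            handled.push_back(toSync.front());
            toSync.pop_front();
        }
    }

    // While reconciling, a notification is held back instead of being handled.
    void onNotification(const std::string &key)
    {
        if (reconciling)
        {
            blocked.push_back(key);
            return;
        }
        toSync.push_back(key);
        drain();
    }

    // Steps 3 and 4: converge, then handle the notifications held back so far.
    void converge()
    {
        reconciling = false;
        while (!blocked.empty())
        {
            toSync.push_back(blocked.front());
            blocked.pop_front();
        }
        drain();
    }
};
```

In this model, an entry that bake leaves out is simply not processed in step 2; it is only handled once it shows up as a notification after convergence.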

Signed-off-by: Stephen Sun <stephens@nvidia.com>
stephenxs authored Nov 24, 2021
1 parent 16d4bcd commit fb0a5fd
Showing 2 changed files with 46 additions and 1 deletion.
44 changes: 43 additions & 1 deletion orchagent/flexcounterorch.cpp
@@ -1,5 +1,4 @@
#include <unordered_map>
#include "flexcounterorch.h"
#include "portsorch.h"
#include "fabricportsorch.h"
#include "select.h"
@@ -49,6 +48,7 @@ unordered_map<string, string> flexCounterGroupMap =

FlexCounterOrch::FlexCounterOrch(DBConnector *db, vector<string> &tableNames):
    Orch(db, tableNames),
    m_flexCounterConfigTable(db, CFG_FLEX_COUNTER_TABLE_NAME),
    m_flexCounterDb(new DBConnector("FLEX_COUNTER_DB", 0)),
    m_flexCounterGroupTable(new ProducerTable(m_flexCounterDb.get(), FLEX_COUNTER_GROUP_TABLE))
{
@@ -188,3 +188,45 @@ bool FlexCounterOrch::getPortBufferDropCountersState() const
{
    return m_port_buffer_drop_counter_enabled;
}

bool FlexCounterOrch::bake()
{
    /*
     * bake is called during warmreboot reconciling procedure.
     * By default, it should fetch items from the tables the sub agents listen to,
     * and then push them into m_toSync of each sub agent.
     * The motivation is to make sub agents handle the saved entries first and then handle the upcoming entries.
     */

    std::deque<KeyOpFieldsValuesTuple> entries;
    vector<string> keys;
    m_flexCounterConfigTable.getKeys(keys);
    for (const auto &key: keys)
    {
        if (!flexCounterGroupMap.count(key))
        {
            SWSS_LOG_NOTICE("FlexCounterOrch: Invalid flex counter group input %s is skipped during reconciling", key.c_str());
            continue;
        }

        if (key == BUFFER_POOL_WATERMARK_KEY)
        {
            SWSS_LOG_NOTICE("FlexCounterOrch: Do not handle any FLEX_COUNTER table for %s update during reconciling",
                            BUFFER_POOL_WATERMARK_KEY);
            continue;
        }

        KeyOpFieldsValuesTuple kco;

        kfvKey(kco) = key;
        kfvOp(kco) = SET_COMMAND;

        if (!m_flexCounterConfigTable.get(key, kfvFieldsValues(kco)))
        {
            continue;
        }
        entries.push_back(kco);
    }
    Consumer* consumer = dynamic_cast<Consumer *>(getExecutor(CFG_FLEX_COUNTER_TABLE_NAME));
    return consumer->addToSync(entries);
}
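
The key selection inside bake can be isolated into a standalone function for illustration. This is a minimal sketch, not code from this commit: `selectKeysToBake` and its parameters are hypothetical, a plain `std::unordered_set` stands in for `flexCounterGroupMap`, and the strings used with it are placeholder group names rather than values read from CONFIG_DB.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper mirroring the two "continue" branches in bake.
// The real code reads keys from a swss::Table and checks flexCounterGroupMap.
std::vector<std::string> selectKeysToBake(
    const std::vector<std::string> &keys,
    const std::unordered_set<std::string> &knownGroups,
    const std::string &deferredKey)
{
    std::vector<std::string> selected;
    for (const auto &key : keys)
    {
        if (knownGroups.count(key) == 0)
        {
            continue; // unknown flex counter group: skipped
        }
        if (key == deferredKey)
        {
            continue; // left out of reconciling, to be handled afterwards
        }
        selected.push_back(key);
    }
    return selected;
}
```

Called with keys `PORT`, `BOGUS`, `BUFFER_POOL_WATERMARK`, `QUEUE`, known groups `PORT`, `QUEUE`, `BUFFER_POOL_WATERMARK`, and `BUFFER_POOL_WATERMARK` as the deferred key, the sketch keeps only `PORT` and `QUEUE`, in their original order.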
3 changes: 3 additions & 0 deletions orchagent/flexcounterorch.h
@@ -4,6 +4,7 @@
#include "orch.h"
#include "port.h"
#include "producertable.h"
#include "table.h"

extern "C" {
#include "sai.h"
@@ -17,12 +18,14 @@ class FlexCounterOrch: public Orch
    virtual ~FlexCounterOrch(void);
    bool getPortCountersState() const;
    bool getPortBufferDropCountersState() const;
    bool bake() override;

private:
    std::shared_ptr<swss::DBConnector> m_flexCounterDb = nullptr;
    std::shared_ptr<swss::ProducerTable> m_flexCounterGroupTable = nullptr;
    bool m_port_counter_enabled = false;
    bool m_port_buffer_drop_counter_enabled = false;
    Table m_flexCounterConfigTable;
};

#endif
