[FC] process FC after apply view #3326

Open
wants to merge 11 commits into
base: master

stepanblyschak
Contributor

What I did

Simplified the approach to delaying counters on warm boot and fast boot. Removed FLEX_COUNTER_DELAY_STATUS_FIELD; instead, all FC (flex counter) processing is postponed until after APPLY_VIEW so that it does not delay data plane configuration.

CONFIG_DB no longer needs to be updated at runtime for counters to be delayed.
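
To illustrate the idea of postponing FC processing until after APPLY_VIEW, here is a minimal sketch. It is not the code in this PR: `DeferredFcProcessor`, `submit`, and `onApplyViewDone` are hypothetical names, and the real orch works on consumers and tables rather than plain callables.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the orch: holds back flex counter (FC) tasks
// until apply view has completed, then runs them in submission order.
class DeferredFcProcessor
{
public:
    // Run the task right away if apply view is done, otherwise queue it.
    void submit(std::function<void()> task)
    {
        assert(task != nullptr);
        if (m_applyViewDone)
        {
            task();
        }
        else
        {
            m_pending.push_back(std::move(task));
        }
    }

    // Called once apply view has finished: flush everything held back so far.
    void onApplyViewDone()
    {
        m_applyViewDone = true;
        std::vector<std::function<void()>> pending;
        pending.swap(m_pending);
        for (auto &task : pending)
        {
            task();
        }
    }

    std::size_t pendingCount() const { return m_pending.size(); }

private:
    bool m_applyViewDone = false;
    std::vector<std::function<void()>> m_pending;
};
```

With this shape, nothing has to be written to CONFIG_DB to hold counters back: the gate is purely in-process state.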

Why I did it

To address sonic-net/sonic-buildimage#20302.

How I verified it

Ran warm boot and confirmed that FC orch runs only after APPLY_VIEW.

Details if related

@stepanblyschak
Contributor Author

/azpw run

@mssonicbld
Collaborator

/AzurePipelines run

Azure Pipelines successfully started running 1 pipeline(s).

@stepanblyschak
Contributor Author

Putting back the 60-second delay, as we found some cases where oper state update handling is delayed by FC configuration after APPLY_VIEW.

    m_bufferQueueConfigTable(db, CFG_BUFFER_QUEUE_TABLE_NAME),
    m_bufferPgConfigTable(db, CFG_BUFFER_PG_TABLE_NAME),
    m_deviceMetadataConfigTable(db, CFG_DEVICE_METADATA_TABLE_NAME)
{
    SWSS_LOG_ENTER();
    m_delayTimer = new SelectableTimer(timespec{.tv_sec = FLEX_COUNTER_DELAY_SEC, .tv_nsec = 0});
    if (WarmStart::isWarmStart())
Collaborator

Just to confirm: will this also handle fast-reboot?

Contributor Author

Yes
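
For readers following this thread, a rough sketch of the boot-type decision being discussed: the delay applies to warm and fast boot, and cold boot starts with the delay already treated as expired. `BootType`, `counterDelayFor`, and `countersMayStart` are hypothetical names, and the 60-second value is an assumption taken from the comment above, not read from `FLEX_COUNTER_DELAY_SEC`.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical boot types; the thread only distinguishes warm/fast from cold.
enum class BootType { Cold, Warm, Fast };

// Assumed value, based on the "60-second delay" mentioned in this thread.
constexpr std::chrono::seconds kFlexCounterDelay{60};

// Delay to wait before processing counters for a given boot type.
std::chrono::seconds counterDelayFor(BootType boot)
{
    return (boot == BootType::Warm || boot == BootType::Fast)
        ? kFlexCounterDelay
        : std::chrono::seconds{0};
}

// True once the time elapsed since the timer was armed covers the delay.
bool countersMayStart(BootType boot, std::chrono::seconds elapsed)
{
    assert(elapsed.count() >= 0);
    return elapsed >= counterDelayFor(boot);
}
```

For cold boot the delay is zero, which corresponds to setting the expired flag immediately in the `else` branch of the diff.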

    SWSS_LOG_ENTER();

    SWSS_LOG_NOTICE("Processing counters");
    m_delayTimer->stop();
Collaborator

How about the following?

if (!m_delayTimerExpired)
{
    m_delayTimer->stop();
    m_delayTimerExpired = true;
}

Contributor Author

Done
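
The guard suggested in this thread makes the timer handler safe to call more than once. A self-contained sketch of that pattern follows; `FakeTimer` and `CounterDelayHandler` are hypothetical stand-ins for illustration, not the swss `SelectableTimer` or the real orch class.

```cpp
#include <cassert>

// Hypothetical stand-in for a selectable timer; it only counts stop() calls.
struct FakeTimer
{
    int stopCalls = 0;
    void stop() { ++stopCalls; }
};

// Hypothetical handler holding the timer and the "expired" flag.
class CounterDelayHandler
{
public:
    explicit CounterDelayHandler(FakeTimer *timer) : m_timer(timer)
    {
        assert(m_timer != nullptr);
    }

    // Safe to call repeatedly: the timer is stopped only on the first call.
    void onTimer()
    {
        if (!m_delayTimerExpired)
        {
            m_timer->stop();
            m_delayTimerExpired = true;
        }
    }

    bool expired() const { return m_delayTimerExpired; }

private:
    FakeTimer *m_timer;
    bool m_delayTimerExpired = false;
};
```

Calling `onTimer()` twice leaves `stopCalls` at 1, so a repeated invocation does not touch the timer again.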

@@ -254,6 +258,15 @@ void FlexCounterOrch::doTask(Consumer &consumer)
}
}

void FlexCounterOrch::doTask(SelectableTimer &timer)
Collaborator

Should we delete the timer here?

Contributor Author

Done
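
One way to make "delete the timer once it has fired" hard to get wrong is to hold it in a `std::unique_ptr` and reset it in the handler. The sketch below only illustrates that ownership pattern under the assumption that the handler class owns the timer; `TrackedTimer` and `OneShotDelay` are hypothetical, and the actual change may release the timer differently.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical timer type that records when it is destroyed.
struct TrackedTimer
{
    explicit TrackedTimer(bool &destroyed) : m_destroyed(destroyed) {}
    ~TrackedTimer() { m_destroyed = true; }
    void stop() {}

private:
    bool &m_destroyed;
};

// Hypothetical owner of a one-shot delay timer.
class OneShotDelay
{
public:
    explicit OneShotDelay(std::unique_ptr<TrackedTimer> timer)
        : m_timer(std::move(timer))
    {
        assert(m_timer != nullptr);
    }

    // Stop and release the timer on the first call; later calls do nothing.
    void onTimer()
    {
        if (m_timer != nullptr)
        {
            m_timer->stop();
            m_timer.reset();
        }
    }

    bool hasTimer() const { return m_timer != nullptr; }

private:
    std::unique_ptr<TrackedTimer> m_timer;
};
```

Here the null pointer doubles as the "expired" flag, so no separate boolean is needed in this sketch.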

    }
    else
    {
        m_delayTimerExpired = true;
Collaborator

We used to delay counter polling for all reboot types, and now it turns out that counter polling will start right after APPLY_VIEW for cold reboot. Although we don't have a hard requirement for cold reboot time, there is still a question: could you please share some data on how this change affects cold reboot time?

Contributor Author

This is mainly because the VS tests expect FC counters to appear immediately and would fail unless some delays were added. Therefore I limited this change to fast/warm reboot only. I am looking for a way to skip the counter delay on VS only; then I can apply this delay to all reboot types.

Contributor

@stepanblyschak Since the change is limited to warm-reboot now, may I know how much time can be saved in warm-reboot after delaying flex counter?

Contributor Author

@bingwang-ms No time savings; the delay was already limited to fast and warm boot. In cold boot there is no delay if FCs are already present in CONFIG_DB.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
This reverts commit c8e8b41.
@bingwang-ms
Contributor

@wen587 Can you help review?
