Warmboot Manager HLD for warm reboot improvement #1485
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,280 @@ | ||
# NSF Manager HLD | ||
|
||
|
||
## Table of Contents | ||
|
||
|
||
- [Revision](#revision) | ||
- [Scope](#scope) | ||
- [Definitions/Abbreviations](#definitions-abbreviations) | ||
- [Overview](#overview) | ||
- [Requirements](#requirements) | ||
- [Architecture Design](#architecture-design) | ||
Review comment: Also add a section covering where the manager runs and how it is started. Will it run in a Docker container or on the host image?
Reply: Added details in the Architecture Design section. |
||
- [High-Level Design](#high-level-design) | ||
* [NSF Details Registration](#nsf-details-registration) | ||
* [Shutdown Orchestration](#shutdown-orchestration) | ||
+ [Phase 1: Freeze Components & Wait for Switch Quiescence](#phase-1--freeze-components---wait-for-switch-quiescence) | ||
+ [Phase 2: State Verification (Optional)](#phase-2--state-verification--optional-) | ||
+ [Phase 3: Trigger Checkpointing](#phase-3--trigger-checkpointing) | ||
+ [Phase 4: Prepare and Perform Reboot](#phase-4--prepare-and-perform-reboot) | ||
Review comment: Should there be sections on the transition of applications from the old to the new reboot types, and on backward compatibility?
Reply: Added a Backward Compatibility section with these details. |
||
+ [Application Shutdown Optimization](#application-shutdown-optimization) | ||
* [Reconciliation Monitoring](#reconciliation-monitoring) | ||
* [Component Warmboot States](#component-warmboot-states) | ||
- [SAI API](#sai-api) | ||
- [Configuration and management](#configuration-and-management) | ||
- [Warmboot and Fastboot Design Impact](#warmboot-and-fastboot-design-impact) | ||
- [Restrictions/Limitations](#restrictions-limitations) | ||
- [Testing Requirements/Design](#testing-requirements-design) | ||
- [Open/Action items - if any](#open-action-items---if-any) | ||
|
||
Review comment: A couple of sections could be added: one on the use cases covered (LACP etc.), and one on which NPU types are tested or supported after testing.
Reply: Added NPU type details in the Testing Requirements/Design section. |
||
|
||
|
||
|
||
|
||
### Revision | ||
|
||
|
||
<table> | ||
<tr> | ||
<td>Rev | ||
</td> | ||
<td>Rev Date | ||
</td> | ||
<td>Author(s) | ||
</td> | ||
<td>Change Description | ||
</td> | ||
</tr> | ||
<tr> | ||
<td>v0.1 | ||
</td> | ||
<td>9/28/2023 | ||
</td> | ||
<td>Google | ||
</td> | ||
<td>Initial version | ||
</td> | ||
</tr> | ||
</table> | ||
|
||
|
||
|
||
### Scope | ||
|
||
This document captures the high-level design for NSF Manager - a new daemon that is responsible for shutdown orchestration and reconciliation monitoring during warm reboot, and the location of this daemon in SONiC. | ||
|
||
|
||
### Definitions/Abbreviations | ||
|
||
|
||
<table> | ||
<tr> | ||
<td><strong>NSF</strong> | ||
</td> | ||
<td>Non-Stop Forwarding | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><strong>Applications</strong> | ||
</td> | ||
<td>Higher-layer components that may or may not be running depending on the switch role e.g. Teamd, BGP, P4RT etc. | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><strong>Infrastructure Components</strong> | ||
</td> | ||
<td>Switch stack components that are essential for switch operation irrespective of switch role e.g. Orchagent, Syncd and transceiver daemon | ||
</td> | ||
</tr> | ||
</table> | ||
|
||
|
||
|
||
### Overview | ||
|
||
SONiC uses the [fast-reboot](https://github.com/sonic-net/sonic-utilities/blob/master/scripts/fast-reboot) and [finalize\_warmboot.sh](https://github.com/sonic-net/sonic-buildimage/blob/master/files/image_config/warmboot-finalizer/finalize-warmboot.sh) scripts for warm reboot shutdown orchestration and reconciliation monitoring. The former script is responsible for preparing the platform and database for reboot and orchestrating switch shutdown. The latter script is responsible for monitoring the reconciliation status of a fixed set of switch stack components and eventually removing the warm-boot flag from Redis DB. Using bash scripts for this has several drawbacks: | ||
|
||
|
||
|
||
* Separate binaries need to be developed for each component to send shutdown related notifications. | ||
Review comment: Does this mean adding new scripts to handle shutdown per service? The current approach does not require a separate script for every service. There is a templated way for all services: the build process auto-generates the scripts per service, and a new script is needed only when a service requires special handling for the shutdown path. Templates: Services only need to add a soft reference to this template; a separate binary is not needed. Eg.;
Reply: This refers to the separate binaries that we have to send shutdown notifications to different components, for example orchagent_restart_check for the Orchagent freeze notification and syncd_request_shutdown for Syncd notifications. We aren't proposing separate service management scripts. |
||
* The rich set of Redis DB features provided by swss-common library cannot be utilized. | ||
Review comment: Please elaborate here, perhaps with some examples?
Reply: swss-common enables a pub/sub model in which we can send notifications and subscribe to warm-boot state updates from multiple applications in Redis DB, as opposed to polling for this information for individual components. |
||
* A framework to send notifications to components and wait for updates cannot be developed. | ||
Review comment: Is this not already happening in the current bash script? Notifications are sent and waited upon with a timeout. I do not fully understand the concern here; could you elaborate on how the current script limits us from managing the shutdown process?
Reply: The shutdown state machine for different components (proposed in this HLD) can be better monitored by a daemon that leverages the swss-common libraries than by a bash script that polls the warm-boot states of individual components. |
||
|
||
Additionally, the current SONiC warm shutdown algorithm has custom shutdown-related notifications for different components that trigger their warm shutdown logic. For example, Orchagent uses a _freeze_ notification and reports its ready status via a notification channel. Syncd uses _pre-shutdown_ and _shutdown_ notifications and reports its status via Redis DB. | ||
|
||
The proposal is to introduce a new daemon called NSF Manager that will be responsible for both shutdown orchestration and reconciliation monitoring during warm reboot. It will leverage Redis DB features provided by swss-common to create a common framework for warm-boot orchestration. As a result, there will be a unified framework for warm-boot orchestration for all switch stack components. | ||
Review comment: Does this mean everything that happens today will still happen with this new design, only that a wrapper around the current process will unify/abstract the different ways in which notifications are sent and received today?
Reply: This framework will co-exist with the existing shutdown orchestration. It introduces additional notification channels for the newer unified shutdown notifications, so components can choose to take actions based on the notification channel from which they receive the shutdown notification. Our proposal is to be compatible with the existing SONiC warm-boot orchestration. |
||
|
||
|
||
### Requirements | ||
|
||
The current design covers the warm reboot orchestration daemon along with the warm shutdown and bootup sequences. It also covers the additional warmboot states introduced by the warm shutdown sequence. It does not cover fast reboot. | ||
|
||
|
||
### Architecture Design | ||
|
||
_NA_ | ||
|
||
|
||
### High-Level Design | ||
|
||
|
||
|
||
![alt_text](img/warm-reboot-overall.png) | ||
Review comment: Minor: can you add
Reply: Updated freeze and reconciliation images. Hope they are readable now. |
||
|
||
|
||
|
||
NSF Manager will replace the existing [fast-reboot](https://github.com/sonic-net/sonic-utilities/blob/master/scripts/fast-reboot) and [finalize\_warmboot.sh](https://github.com/sonic-net/sonic-buildimage/blob/master/files/image_config/warmboot-finalizer/finalize-warmboot.sh) scripts to perform shutdown orchestration and reconciliation monitoring during warm reboot. It will use a registration-based mechanism to determine the switch stack components that need to perform an orchestrated shutdown and bootup. This keeps the NSF Manager design generic and flexible, and avoids hardcoding component information in NSF Manager. | ||
Review comment: Just calling out the concern that was raised in the meeting: we would still like to keep the ability to fix the shutdown path on the fly. The script-based design enabled us to add ad hoc fixes (after testing) at runtime.
Reply: Can you please elaborate on the type of fixes that are introduced at runtime? |
||
|
||
|
||
#### NSF Details Registration | ||
|
||
|
||
``` | ||
Component = <name> | ||
Docker = <name> | ||
Freeze = <True>/<False> | ||
Checkpoint = <True>/<False> | ||
Reconciliation = <True>/<False> | ||
``` | ||
|
||
|
||
Components that want to participate in warm-boot orchestration need to register the above details with NSF Manager. NSF Manager will use these registration details to determine which components participate in the orchestrated shutdown sequence and whose reconciliation status it monitors during bootup. If a component doesn’t register with NSF Manager, it will continue to operate normally until it is shut down in [Phase 4](#phase-4-prepare-and-perform-reboot). Components that modify the switch state should register with NSF Manager because they can change the state of other components, such as Orchagent and Syncd, that participate in the warm reboot orchestration, and can thus impact the warm reboot process. | ||
|
||
Components that want to participate in an orchestrated shutdown during warm reboot need to set _freeze = true_. NSF Manager will wait for the quiescence of all components that have _freeze = true_ in their registration. NSF Manager will wait for a component to complete checkpointing only if that component sets _checkpoint = true_. Components that want NSF Manager to monitor their reconciliation status need to set _reconciliation = true_. | ||
Review comment: There should be a formal definition of flags used here -
Reply: These couple of lines provide the brief definition of these flags. Do more details need to be added? |
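
As a rough illustration of how the registration flags select which components NSF Manager waits on, the record and the selection rule can be sketched in Python. This is only a sketch under assumptions: the class name `WarmbootRegistration`, the helper `components_to_wait_for`, and the sample entries (including their flag values) are hypothetical and not part of the HLD; only the five field names come from the registration template above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarmbootRegistration:
    """One component's registration record (hypothetical type; fields mirror the template)."""
    component: str
    docker: str
    freeze: bool = False
    checkpoint: bool = False
    reconciliation: bool = False


def components_to_wait_for(registrations, phase):
    """Return the names of components the manager waits on for the given phase flag."""
    if phase not in ("freeze", "checkpoint", "reconciliation"):
        raise ValueError(f"unknown phase: {phase}")
    return [r.component for r in registrations if getattr(r, phase)]


# Illustrative entries only; the flag values are assumptions, not taken from the HLD.
registrations = [
    WarmbootRegistration("orchagent", "swss", freeze=True, checkpoint=True, reconciliation=True),
    WarmbootRegistration("teamd", "teamd", freeze=True, checkpoint=False, reconciliation=True),
    WarmbootRegistration("example_monitor", "example", freeze=False),
]
```

With these sample entries, every component would still receive the freeze notification, but only the first two would be waited on for quiescence, and only the first for checkpointing.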
||
|
||
|
||
#### Shutdown Orchestration | ||
|
||
During warm reboot, NSF Manager will orchestrate switch shutdown in a multi-phased approach that ensures that the switch stack is in a stable state before the intent/state is saved. It will perform shutdown in the following phases: | ||
|
||
|
||
|
||
* Phase 1: Freeze components & wait for switch quiescence | ||
* Phase 2: Perform state verification (optional) | ||
* Phase 3: Trigger checkpointing | ||
* Phase 4: Prepare and perform reboot | ||
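
The phase ordering above, with an abort on the first failed phase, could be sketched as follows. This is a minimal illustration rather than the proposed implementation: `run_warm_shutdown` and its callable parameters are hypothetical names, and each phase is reduced to a function that returns `True` on success.

```python
def run_warm_shutdown(freeze_and_wait, verify_state, checkpoint_and_wait,
                      prepare_and_reboot, state_verification_enabled=True):
    """Run the shutdown phases in order.

    Returns the name of the phase that failed (warm reboot aborted),
    or None if all phases passed and the reboot step was invoked.
    """
    phases = [("freeze", freeze_and_wait)]
    if state_verification_enabled:
        # Phase 2 is optional and only runs when enabled.
        phases.append(("state_verification", verify_state))
    phases.append(("checkpoint", checkpoint_and_wait))

    for name, step in phases:
        if not step():
            return name  # abort: later phases and the reboot are skipped

    prepare_and_reboot()  # Phase 4
    return None
```

Passing the phases in as callables keeps the ordering logic separate from how notifications are actually sent and how component states are observed.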
|
||
|
||
##### Phase 1: Freeze Components & Wait for Switch Quiescence | ||
|
||
|
||
|
||
![alt_text](img/freeze.png) | ||
|
||
NSF Manager will send a freeze notification to all registered components and will wait for the quiescence of only those components that have set _freeze = true_. Upon receiving the freeze notification, a component will finish processing its current request queue and stop generating new intents. Stopping new intents from being generated means that boundary components should stop processing requests from external components (external events) and all components should stop their periodic timers that generate new requests (internal events). For example: | ||
Review comment: Doing this simultaneously to all registered components may not be a good idea. We do want some components to keep operating:
Reply: Freeze notification means that components should stop generating events for which they are the source. All components will continue to serve requests received by them but will stop generating events that they originate. This is similar to saying "I will not ask questions, but I will continue to answer questions". Components will therefore continue to serve internal requests. After some time, all components will stop generating events and the switch will become quiescent. During the freeze phase, Orchagent and Syncd will continue to serve requests from other components such as Teamd.
Review comment: Do we expect all components to finish Phase 1 before NSF Manager triggers Phase 2?
Reply: No, boundary applications that are independent of state changes to other switch components, such as UMF and P4RT, can aggregate Phases 1, 2 and 3 as part of the freeze notification (see the Application Shutdown Optimization section). But this applies only to a certain subset of the switch stack components. NSF Manager needs to wait for the entire switch to become quiescent before a global checkpointing can take place. For example, Syncd should only save internal states after all intent has been programmed, i.e. the switch is quiescent. |
||
|
||
|
||
|
||
* UMF will stop listening to new gRPC requests from the controller. | ||
* P4RT will stop listening to new gRPC requests from the controller and stop packet I/O. | ||
Review comment: Please add Step 4 under the P4RT block stating that the controller needs to be informed that a warmboot has been initiated. Also, can you please add details on how and what needs to be communicated to the P4RT client from the P4RT server when warmboot is initiated?
Reply: This design introduces the high-level principles for the new warm-boot orchestration framework, and I wanted to limit its scope to these core principles. That is why this section describes the actions of different components at a very high level, giving a brief idea of stopping self-sourced events. From the P4RT perspective, the P4Runtime spec doesn't talk about reboots in general. During cold reboot, the client observes the same behavior as a server shutdown. Ideally, the client shouldn't observe different behavior during warm reboot, because through the P4Runtime lens the server was shut down (and thus unavailable for some time) and came back up in the same state as before, maintaining the read-write symmetry. I would be happy to discuss this further in the System Orchestration workgroup or in a separate thread. |
||
* xcvrd will stop listening to transceiver updates such as module presence. | ||
* Syncd will stop listening to link state events from the BCM chip. | ||
* Orchagent will stop periodic internal timers. | ||
* BGP will stop exchanging packets with the peers. | ||
* Teamd will stop exchanging LACP PDUs with the peers. | ||
Review comment: IMO, this is not a good idea. Having teamd stop exchanging LACPDUs in Phase 1 would put a lot of pressure on reconciling within the standard LAG session timeout of 90s. If we spend too much time in Phase 2 and later, LAGs are guaranteed to flap in the recovery path.
Review comment: Same for syncd: since we want teamd to continue sending LACPDUs for as long as possible, we should not request syncd to reach quiescence at the same time as Orchagent.
Review comment: Thus, I don't think we can have all components reaching
Reply: I understand the concern here. There are multiple ways to tackle this problem:
|
||
|
||
After all components have been frozen, the switch will eventually reach a state in which no component generates new events, and thus the switch becomes quiescent. This is because: | ||
|
||
|
||
|
||
* Switch boundaries that generate new events have been stopped. | ||
* All timers that generate new events have been stopped. | ||
* All components have completed processing their pending requests and thus there are no in-flight messages. | ||
|
||
After receiving the freeze notification, components will update their quiescent state in STATE DB both when they receive a new request (i.e. they are no longer quiescent) and when they finish processing their current request queue (i.e. they become quiescent). NSF Manager will monitor the quiescent state of all components in STATE DB. Once all components are in the quiescent state, NSF Manager will declare that the switch has become quiescent, meaning that it has attained its final state and no further state changes will occur. If the switch does not become quiescent within a timeout period, NSF Manager will determine that the warm reboot failed and abort the warm reboot operation. | ||
Review comment: The design should also account for restoring component state (unfreeze) in the event of failure.
Reply: Does SONiC currently support unfreezing applications in case of failures? |
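
The quiescence wait with a timeout described in this phase can be sketched as below. Note the assumptions: `wait_for_quiescence` and `read_states` are hypothetical names, the state strings are assumed to be lowercase, treating a `failed` state as an immediate abort is an assumption, and the loop polls a snapshot function purely to keep the example self-contained, whereas the proposal is to subscribe to STATE DB updates through swss-common.

```python
import time


def wait_for_quiescence(read_states, components, timeout_s,
                        poll_interval_s=0.1,
                        clock=time.monotonic, sleep=time.sleep):
    """Wait until every listed component reports 'quiescent'.

    read_states() returns a dict of component name -> current warmboot state.
    Returns True if the switch became quiescent, False on failure or timeout
    (in which case the caller would abort the warm reboot).
    """
    deadline = clock() + timeout_s
    while True:
        states = read_states()
        if any(states.get(c) == "failed" for c in components):
            return False
        # A component may bounce between 'frozen' and 'quiescent', so the
        # check is made against a fresh snapshot on every iteration.
        if all(states.get(c) == "quiescent" for c in components):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

Only components registered with _freeze = true_ would be passed in `components`; the same pattern would apply to waiting for checkpoint completion in Phase 3.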
||
|
||
|
||
##### Phase 2: State Verification (Optional) | ||
|
||
|
||
|
||
![alt_text](img/state-verification.png) | ||
|
||
Since the switch is in quiescent state, this will be the final state of the switch before reboot. NSF Manager will trigger state verification to ensure that the switch is in a consistent state. Reconciling from an inconsistent state might cause traffic loss and thus it is important to ensure that the switch is in a consistent state before warm rebooting. NSF Manager will abort warm reboot operation if state verification fails. | ||
Review comment: How is this phase different from what was done as the last step in Phase 1? Also, please define
Reply: Phase 1 ensures that the switch reaches a stable state. Phase 2 (optional) audits the switch state. State verification enables auditing the state of different components and ensuring that they are in an expected state. Consistency depends on the component. It can mean that the internal cache matches the switch state, or that the switch intent matches the switch state, etc. For example, BGP kernel routes match APPL DB (or ASIC DB) information, or the oper-status in the Linux host interface matches APPL DB information. |
||
|
||
|
||
##### Phase 3: Trigger Checkpointing | ||
Review comment: Can NSF Manager merge Phase 2 and Phase 3, letting each component complete state verification and save its internal state together, so that we reduce the turnaround time of an additional phase?
Reply: This is an interesting point :) Some context: With this background, if we support state verification and checkpointing as a single phase, then Syncd can exit after checkpointing while state verification fails for some other application. In such a case, we wouldn't be able to do freeze --> state verification failure --> unfreeze unless we ensure that checkpointing isn't a point of no return for a switch stack component. So we need to analyze the pros and cons of merging these two phases. Let me discuss it internally and get back. |
||
|
||
|
||
|
||
![alt_text](img/checkpoint.png) | ||
|
||
NSF Manager will send a checkpoint notification to all registered components and wait only for those components that set _checkpoint = true_ to complete checkpointing, i.e. to save their internal states to either the DB or the persistent file system. Checkpointing only after state verification succeeds ensures that the switch reconciles from a consistent state. If the components do not update STATE DB with their checkpointing status within a timeout period, NSF Manager will abort the warm reboot operation. | ||
|
||
|
||
##### Phase 4: Prepare and Perform Reboot | ||
|
||
|
||
|
||
![alt_text](img/phase-4-shutdown.png) | ||
|
||
At this point, the DB will be backed up, containers will be shut down, the platform will be prepared for reboot and the switch will be rebooted. | ||
|
||
|
||
##### Application Shutdown Optimization | ||
|
||
|
||
|
||
![alt_text](img/shutdown-optimization.png) | ||
|
||
Higher-layer applications are generally independent, and their shutdown does not depend on the quiescence of other switch stack components. For example, upon receiving a freeze notification Teamd can stop exchanging LACP packets, perform state verification (optional) and checkpoint LACP states, i.e. transition from Phase 1 to Phase 3 without waiting for NSF Manager to send further notifications. The design allows applications to transition through these phases independently as long as they continue to update their warmboot state in STATE DB. NSF Manager will monitor these states and will handle the applications’ phase transitions. In such a scenario, applications need to set _freeze = true_ and _checkpoint = false_, so that the freeze notification alone causes the application to transition from Phase 1 to Phase 3. | ||
Review comment: But in warm shutdown we do need dependent transition.
Reply: Phase 1 ensures that the switch stack components continue to serve requests from other components. Also, having a state machine during shutdown helps alleviate the race conditions. I would be happy to look at individual examples of race conditions and check whether this design is able to handle them. |
||
|
||
|
||
#### Reconciliation Monitoring | ||
|
||
|
||
![alt_text](img/reconciliation.png) | ||
|
||
Upon bootup, components will detect the warm-boot flag in STATE DB and reconcile to the state before reboot. Components that registered _reconciliation = true_ will update their warm-boot state to RECONCILED in STATE DB after they have completed reconciliation. NSF Manager will monitor the reconciliation status of these registered components during warm bootup. Subsequently, the components will start their normal operation and listen for new requests. The proposed reconciliation monitoring mechanism is similar to the existing mechanism, except that the set of monitored components is based on the NSF details registration as opposed to a hardcoded list of components. | ||
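
The registration-driven check described above can be sketched as a small helper. The function name `pending_reconciliation`, the dictionary shapes, and the lowercase state strings are assumptions made for illustration; the point is only that the monitored set comes from the registrations rather than from a hardcoded list.

```python
def pending_reconciliation(registered, states):
    """Return the components still being waited on during warm bootup.

    registered: dict of component name -> its 'reconciliation' flag.
    states:     dict of component name -> current warmboot state.
    """
    return sorted(
        name for name, wants_monitoring in registered.items()
        if wants_monitoring and states.get(name) != "reconciled"
    )
```

An empty result would mean that every component that asked to be monitored has reported the reconciled state.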
|
||
|
||
#### Component Warmboot States | ||
|
||
Based on the shutdown and reconciliation sequence mentioned above, a component will transition through the following states during warm reboot: | ||
|
||
|
||
|
||
* Frozen (warm shutdown) | ||
* Quiescent (warm shutdown) | ||
* Checkpointed (optional, warm shutdown) | ||
* Initialized (warm bootup) | ||
* Reconciled (warm bootup) | ||
* Failed (warm shutdown or bootup) | ||
|
||
|
||
|
||
![alt_text](img/warmboot-states.png) | ||
|
||
Upon receiving a freeze notification, a component will transition to the _frozen_ state when it has stopped generating new intents for which it is the source. Subsequently, it will transition to the _quiescent_ state when it has completed processing all requests. If it receives a new request from another switch component while in freeze mode, it will move back to _frozen_ and return to _quiescent_ once it has processed all requests again. If a component fails to perform its freeze routine, i.e. fails to stop generating new intents for which it is the source, it will transition to the _failed_ state. Upon receiving a checkpoint notification, a component will transition to the _checkpointed_ state after it has completed checkpointing, or to the _failed_ state if checkpointing failed. | ||
Review comment: Is the failed warmboot state just a state, or will it trigger an event to restart?
Reply: Since this is an initial NSF Manager design, the current proposal is to abort warm reboot if an application reports failed warm-boot state. However, we are exploring |
||
|
||
After warm reboot, a component will transition to _initialized_ state after it has completed its initialization routine. It will transition to _reconciled_ state after it has completed reconciliation. It will transition to _failed_ state if its initialization or reconciliation fails. | ||
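
The transitions described in the two paragraphs above can be written down as a small transition table. This is one reading of the prose, not a transcription of the state diagram: the names `WarmbootState`, `ALLOWED` and `is_allowed` are hypothetical, `None` stands for "no warmboot state reported yet", and only transitions that the text states explicitly are included.

```python
from enum import Enum


class WarmbootState(Enum):
    FROZEN = "frozen"
    QUIESCENT = "quiescent"
    CHECKPOINTED = "checkpointed"
    INITIALIZED = "initialized"
    RECONCILED = "reconciled"
    FAILED = "failed"


S = WarmbootState

# current state -> states it may move to, as read from the text above.
ALLOWED = {
    None: {S.FROZEN, S.INITIALIZED, S.FAILED},       # freeze / init routine outcome
    S.FROZEN: {S.QUIESCENT},                         # request queue drained
    S.QUIESCENT: {S.FROZEN, S.CHECKPOINTED, S.FAILED},
    S.INITIALIZED: {S.RECONCILED, S.FAILED},
    S.CHECKPOINTED: set(),
    S.RECONCILED: set(),
    S.FAILED: set(),
}


def is_allowed(current, new):
    """Return True if the text above describes a transition current -> new."""
    return new in ALLOWED.get(current, set())
```

A table like this could be used in component tests to reject state updates that skip a step, for example a jump from _frozen_ straight to _reconciled_.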
|
||
Components will update their state in STATE DB using [setWarmStartState()](https://github.com/sonic-net/sonic-swss-common/blob/master/common/warm_restart.cpp#L223) API during the different warm reboot stages. NSF Manager will monitor these NSF states in STATE DB to determine whether it needs to proceed with the next phase of the warm-boot orchestration or not. | ||
|
||
|
||
### SAI API | ||
|
||
Review comment: Could you add a note that the feature and new changes are independent of any SAI changes? There could be SAI changes needed for further improvement.
Reply: Added a note in the SAI API section. |
||
_NA_ | ||
|
||
|
||
### Configuration and management | ||
|
||
_NA_ | ||
|
||
|
||
### Warmboot and Fastboot Design Impact | ||
|
||
Users of the existing warmboot and fastboot mechanism will not be impacted. All components will be compatible with the existing SONiC design. An additional flag will be added in CONFIG DB to indicate whether state verification is required during warm-boot orchestration or not. | ||
|
||
|
||
### Restrictions/Limitations | ||
|
||
|
||
### Testing Requirements/Design | ||
|
||
NSF Manager will have unit and component tests to verify shutdown orchestration and reconciliation monitoring functionality. Component tests will be added to all switch stack components that will register with NSF Manager to ensure that they process notifications from NSF Manager and update STATE DB correctly. | ||
|
||
|
||
### Open/Action items - if any | ||
|
||
NOTE: All the sections and sub-sections given above are mandatory in the design document. Users can add additional sections/sub-sections if required. |
Could you prevent using the terminology "NSF" in this design doc? There is another active HLD relating to the gNOI API, and we are discussing using the gNOI NSF terminology to represent SONiC fast-reboot instead of SONiC warm-reboot, since there is another gNOI WARM already. ref: https://github.com/openconfig/gnoi/blob/98d6b81c6dfe3c5c400209f82a228cf3393ac923/system/system.proto#L128
May I propose "Warm Reboot Manager" in this HLD? #Closed
I also added a comment in that design: I am not sure why WARM and NSF reboot methods are defined separately in the gNOI spec. In my opinion they both refer to the same operation. I would propose adding a new reboot method for fast reboot since it doesn't map to any of the current gNOI reboot methods.
Could you prevent using the terminology "NSF" in this design doc?
I somehow prefer using the gNOI WARM terminology over NSF. Would you mind changing all NSF -> WARM in this HLD, including the PR title and markdown headers? I will not block this PR if WARM is used in the HLD.
Replaced NSF with Warmboot. Updated the document title, headings and diagrams.
Thanks!