From 46d8e0a351f46632a202a4ec6a03397b0c4cbbe7 Mon Sep 17 00:00:00 2001
From: Vasundhara Volam
Date: Thu, 16 May 2024 19:42:42 +0000
Subject: [PATCH 01/14] Smart Switch reboot high level design

This is initial draft
---
 .../reboot/images/dpu-reboot-seq.svg          |   4 +
 .../reboot/images/smartswitch-reboot-seq.svg  |   4 +
 doc/smart-switch/reboot/reboot-hld.md         | 251 ++++++++++++++++++
 3 files changed, 259 insertions(+)
 create mode 100644 doc/smart-switch/reboot/images/dpu-reboot-seq.svg
 create mode 100644 doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
 create mode 100644 doc/smart-switch/reboot/reboot-hld.md

diff --git a/doc/smart-switch/reboot/images/dpu-reboot-seq.svg b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
new file mode 100644
index 0000000000..d4e4758ced
--- /dev/null
+++ b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
@@ -0,0 +1,4 @@
+
+
+
+
[Figure text extracted from dpu-reboot-seq.svg (DPU reboot sequence diagram). Labels, in extraction order: DPU ASIC Host; Switch ASIC (NPU) Host; reboot CLI with DPU ID; GNOI Reboot API; Graceful services shutdown (except GNMI server); Detach DPU PCI vendor API; Platform vendor API - reboot DPU; reboot; return; Rescan PCIe vendor API; PCI-E Bridge; GNOI RebootStatus API; GNOI RebootStatusResponse.]
diff --git a/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
new file mode 100644
index 0000000000..0c88bf72d5
--- /dev/null
+++ b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
@@ -0,0 +1,4 @@
+
+
+
+
[Figure text extracted from smartswitch-reboot-seq.svg (Smart Switch reboot sequence diagram). Labels, in extraction order: Switch ASIC (NPU) Host; PCI-E bridge; reboot CLI; GNOI Reboot API to all DPUs; Graceful services shutdown (except GNMI server); Detach PCI vendor API for all DPUs; Platform vendor API - reboot DPU; reboot; return; reboot; Rescan PCIe vendor API; DPUn ASIC Host; GNOI RebootStatusResponse; GNOI RebootStatus API.]
diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md
new file mode 100644
index 0000000000..a952c83397
--- /dev/null
+++ b/doc/smart-switch/reboot/reboot-hld.md
@@ -0,0 +1,251 @@
+# Smart Switch Reboot High Level Design
+
+## Table of Contents ##
+
+- [Smart Switch Reboot Design](#smart-switch-reboot-design)
+  - [Table of Contents](#table-of-contents)
+  - [Revision](#revision)
+  - [Glossory](#glossary)
+  - [Overview](#overview)
+  - [Assumptions](#assumptions)
+  - [Requirements](#requirements)
+  - [Methods of Switch and DPU Reboot](#methods-of-switch-and-dpu-reboot)
+  - [DPU reboot sequence](#dpu-reboot-sequence)
+  - [Switch reboot sequence](#switch-reboot-sequence)
+  - [High Level Design](#high-level-design)
+    - [ModuleBase Class API enhancement](#modulebase-class-api)
+    - [ModuleBase Class new APIs](#modulebase-class-new-apis)
+    - [NPU platform.json](#npu-platformjson)
+    - [GNOI API implementation](#gnoi-api-implementation)
+    - [reboot.py script modifications](#rebootpy-script-modifications)
+  - [Test plan](#test-plan)
+
+## Revision ##
+
+| Rev | Date | Author | Change Description |
+| --- | ---- | ------ | ------------------ |
+| 0.1 | 05/16/2024 | Vasundhara Volam | Initial version |
+
+## Glossory ##
+
+| Term  | Meaning                                   |
+| ----- | ----------------------------------------- |
+| ASIC  | Application-Specific Integrated Circuit   |
+| DPU   | Data Processing Unit                      |
+| GNMI  | gRPC Network Management Interface         |
+| GNOI  | gRPC Network Operations Interface         |
+| NPU   | Network Processing Unit                   |
+| PCI-E | Peripheral Component Interconnect Express |
+
+## Overview ##
+
+Smart Switch aims to provide a full suite of network functionality, like traditional network devices, but with the flexibility
+and scalability of cloud-based services. It consists of one switch ASIC (NPU) and multiple DPUs. The DPU ASICs are only
+connected to the NPU, and all front panel ports are connected to the NPU. The DPU also connects to the SmartSwitch CPU via PCI-E
+interfaces, allowing the Switch CPU to control DPUs through these interfaces.
+
+Each DPU will have one internal management IP which is used for internal communications, such as Redis database and zmq. This
+internal communication is also used between NPU and DPU, and between DPUs.
+
+This document provides high level design of reboot sequence of a SmartSwitch with multiple DPUs and single DPU reboot sequence
+through GNOI API.
+
+## Assumptions ##
+
+Smart Switch supports only cold-reboot and does not support warm-reboot as of today.
+
+## Requirements ##
+
+1. NPU host is running GNMI service to communicate with DPU.
+2. DPU host is running GNMI server to listen to GNOI client requests.
+3. Each DPU is assigned an IP address to communicate from NPU.
+
+## Methods of Switch and DPU Reboot ##
+
+The switch or DPU can be rebooted using either the CLI or during an image upgrade. The reboot can be initiated in two ways.
+
+1. Performing a complete SmartSwitch reboot, which restarts the NPU and all DPUs.
+2. Rebooting a specific DPU by specifying its ID.
+
+In addition to the aforementioned causes of graceful reboots, a switch or DPU could be rebooted due to events such as power failures, kernel panics, etc.
+
+## DPU reboot sequence ##
+
+

+
+DPUs are internally connected to the NPU via PCI-E bridge. Below is the reboot sequence for rebooting a specific DPU:
+
+* Upon receiving a reboot CLI command to restart a particular DPU, the NPU transmits a GNOI Reboot API signal with reboot method set to ‘HALT’, instructing
+the DPU to terminate all services.
+
+* Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except GNMI and database
+service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS.
+Until the services are terminated gracefully, DPU response RebootStatusResponse with STATUS_RETRIABLE_FAILURE status.
+
+* Subsequently, the NPU detaches the DPU PCI. Detachment can be achieved either by a vendor specific API or via sysfs
+(echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/remove).
+
+* Next, the NPU triggers a platform vendor API to initiate the reboot process for the DPU.
+
+* The NPU either immediately rescans the PCI upon return or after a timeout period. Rescan of the PCI could be achieved either by calling vendor specific
+API or via sysfs (echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/rescan).
+
+## Switch reboot sequence ##
+
+
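As a reference point before the whole-switch sequence, the single-DPU steps described in the preceding section (HALT request, RebootStatus polling, PCI detach, vendor reboot, rescan) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the design's implementation: the function names `wait_for_services_halted`, `sysfs_pci_write` and `reboot_single_dpu`, the injected callables, and the `sysfs_root` parameter are all hypothetical, and the gNOI calls and vendor APIs are passed in as plain callables rather than invoked directly.

```python
import time
from pathlib import Path

# Status strings mirror the RebootStatus.Status enum names quoted in this document.
STATUS_SUCCESS = "STATUS_SUCCESS"
STATUS_RETRIABLE_FAILURE = "STATUS_RETRIABLE_FAILURE"


def wait_for_services_halted(get_status, timeout_s, poll_interval_s=1.0, sleep=time.sleep):
    """Poll a RebootStatus-like callable until STATUS_SUCCESS or until timeout.

    get_status is a hypothetical callable returning a status string; in a real
    system it would wrap the gNOI RebootStatus RPC. Returns True when the DPU
    reports STATUS_SUCCESS, False when the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == STATUS_SUCCESS:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)


def sysfs_pci_write(bdf, attr, sysfs_root="/sys"):
    """Write '1' to <sysfs_root>/bus/pci/devices/<bdf>/<attr> (the sysfs fallback).

    attr must be 'remove' or 'rescan'. sysfs_root is an assumption added so the
    helper can be exercised against a temporary directory.
    """
    if attr not in ("remove", "rescan"):
        raise ValueError(f"unsupported PCI sysfs attribute: {attr}")
    path = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf / attr
    path.write_text("1")
    return path


def reboot_single_dpu(send_halt, get_status, detach, vendor_reboot, rescan,
                      halt_timeout_s, poll_interval_s=1.0, sleep=time.sleep):
    """Run the single-DPU steps in the order listed in the document.

    Every argument except the timeouts is a hypothetical callable standing in
    for a gNOI RPC, a vendor API, or the sysfs fallback.
    """
    send_halt()  # gNOI Reboot with method HALT
    halted = wait_for_services_halted(get_status, halt_timeout_s, poll_interval_s, sleep)
    # The document states the NPU proceeds with the sequence even if the DPU
    # does not confirm within the timeout, so the remaining steps always run.
    detach()         # vendor API, or sysfs 'remove'
    vendor_reboot()  # platform vendor API - reboot DPU
    rescan()         # vendor API, or sysfs 'rescan'
    return halted
```

Passing the detach and rescan steps in as callables keeps the choice between a vendor-specific API and the sysfs fallback outside the orchestration logic; a caller could, for example, pass `lambda: sysfs_pci_write(bdf, "remove")` with a BDF string of its own.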

+
+The following outlines the reboot procedure for the entire Smart Switch:
+
+* When the NPU receives a reboot command via the CLI to restart the SmartSwitch, it initiates the reboot sequence.
+
+* The NPU sends a GNOI Reboot API signal to all connected DPUs. This signal instructs the DPUs to gracefully terminate all services, excluding the GNMI
+server, in preparation for the reboot.
+
+* Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except GNMI and database
+service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS.
+Until the services are terminated gracefully, DPU response RebootStatusResponse with STATUS_RETRIABLE_FAILURE status.
+
+* Following the confirmation from the DPUs, the NPU proceeds to detach the PCI devices associated with the DPUs. This detachment is achieved either by calling
+vendor specific API or by issuing a command through the sysfs interface, specifically by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/remove file
+for each DPU.
+
+* With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs.
+
+* After initiating the reboot process for the DPUs, the NPU proceeds to reboot itself to complete the overall reboot procedure.
+
+* Upon successful reboot, the NPU resumes operation. As part of the post-reboot process, the NPU may choose to rescan the PCI devices. This rescan operation,
+performed either by invoking vendor API or by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/rescan file, ensures that all PCI devices are properly
+recognized and initialized.
+
+## High-Level Design ##
+
+### ModuleBase Class API ###
+
+reboot(self, reboot_type):
+```
+Define new reboot_type as MODULE_REBOOT_DPU for DPU only reboot and MODULE_REBOOT_SMARTSWITCH for entire switch reboot.
+```
+
+### ModuleBase Class new APIs ###
+
+detach_dpu(self):
+
+Detach the DPU midplane.
+
+reattach_dpu(self):
+
+Rescan the midplane and attach it back.
+
+### NPU platform.json ###
+
+Introduce a new parameter, 'dpu_killservices_timeout', to specify the duration(in secs) for waiting for the DPU to terminate all services, as defined by
+the platform vendor. If the DPU fails to respond within this timeout, the NPU will proceed with the reboot sequence. If no timeout is explicitly
+defined, a default timeout will be used.
+
+```json
+{
+    .
+    .
+    "dpu_killservices_timeout" : ""
+    .
+    .
+}
+```
+
+### GNOI API implementation ###
+
+According to the RebootRequest protocol outlined below, we will utilize the HALT command in the RebootMethod to terminate services on the DPU.
+When the NPU sends the RebootRequest with the HALT RebootMethod to the DPU, it will kill all services except GNMI and database services.
+
+```
+*Arguments*: type of reboot (cold, warm, etc.), delay before issuing reboot, string describing reason for reboot, option to force reboot if sanity checks fail.
+
+rpc Reboot(RebootRequest) returns (RebootResponse) {}
+
+message RebootRequest {
+  RebootMethod method = 1;
+  // Delay in nanoseconds before issuing reboot.
+  uint64 delay = 2;
+  // Informational reason for the reboot.
+  string message = 3;
+  // Optional sub-components to reboot.
+  repeated types.Path subcomponents = 4;
+  // Force reboot if sanity checks fail. (ex. uncommited configuration)
+  bool force = 5;
+}
+
+message RebootResponse {
+}
+
+enum RebootMethod {
+  UNKNOWN = 0; // Invalid default method.
+  COLD = 1; // Shutdown and restart OS and all hardware.
+  POWERDOWN = 2; // Halt and power down, if possible.
+  *HALT = 3;* // Halt, if possible.
+  WARM = 4; // Reload configuration but not underlying hardware.
+  NSF = 5; // Non-stop-forwarding reboot, if possible.
+  // RESET method is deprecated in favor of the gNOI FactoryReset.Start().
+  reserved 6;
+  POWERUP = 7; // Apply power, no-op if power is already on.
+}
+```
+
+Upon sending the RebootRequest RPC to the DPU, the NPU will commence polling using RebootStatusRequest. If the DPU has effectively terminated the
+services, it responds with STATUS_SUCCESS set in the RebootStatusResponse. Otherwise, it will send the response with STATUS_RETRIABLE_FAILURE status.
+
+```
+rpc RebootStatus(RebootStatusRequest) returns (RebootStatusResponse) {}
+
+message RebootStatusRequest {
+  repeated types.Path subcomponents = 1; // optional sub-component.
+}
+
+message RebootStatusResponse {
+  bool active = 1; // If reboot is active.
+  uint64 wait = 2; // Time left until reboot.
+  uint64 when = 3; // Time to reboot in nanoseconds since the epoch.
+  string reason = 4; // Reason for reboot.
+  uint32 count = 5; // Number of reboots since active.
+  RebootMethod method = 6; // Type of reboot.
+  RebootStatus status = 7; // Applicable only when active = false.
+}
+
+message RebootStatus {
+  enum Status {
+    STATUS_UNKNOWN = 0;
+    STATUS_SUCCESS = 1;
+    STATUS_RETRIABLE_FAILURE = 2;
+    STATUS_FAILURE = 3;
+  }
+  Status status = 1;
+  string message = 2;
+}
+```
+
+### reboot.py script modifications ###
+
+* Within the reboot() function, incorporate a verification step to invoke is_smartswitch(). Should is_smartswitch() yield false, proceed with the current
+implementation. However, if is_smartswitch() returns true, invoke the new reboot_smartswitch() function, passing a parameter to specify whether it's
+a complete switch reboot or targeting a specific DPU.
+
+* If the reboot_type is ‘REBOOT_TYPE_WARM’ and is_smartswitch is true, return a warning that this type of reboot is not supported.
+
+* Add a new reboot_smartswitch() function to reboot either the entire switch or a particular DPU, which takes DPU ID as an argument that
+needs a reboot.
+
+## Test plan ##
+
+Presented below is the test plan within the ```sonic-mgmt``` framework for the smart switch reboot.
+
+
+| Event                                     | NPU reboot sequence | DPU reboot sequence |
+| ----------------------------------------- | ------------------- | ------------------- |
+| Power-On                                  | Graceful boot       | Graceful boot       |
+| Planned cold reboot of Smart Switch       | Graceful reboot     | Graceful reboot     |
+| Planned cold reboot of DPU                | -                   | Graceful reboot     |
+| Planned power-cycle of Smart Switch       | Graceful reboot     | Graceful reboot     |
+| Planned power-cycle of DPU                | -                   | Graceful reboot     |
+| Unplanned DPU power failure               | -                   | Ungraceful reboot   |
+| Unplanned Smart Switch power failure      | Ungraceful reboot   | Ungraceful reboot   |
+| Unplanned Smart Switch System Crash       | Ungraceful reboot   | Ungraceful reboot   |
+| Unplanned DPU System Crash                | -                   | Ungraceful reboot   |

From 2af68806add124f3c05fa74e0d97f7ddbc00ea4f Mon Sep 17 00:00:00 2001
From: Vasundhara Volam
Date: Fri, 17 May 2024 18:43:51 +0000
Subject: [PATCH 02/14] Update HLD with modified APIs and images

---
 .../reboot/images/dpu-reboot-seq.svg          |   2 +-
 .../reboot/images/smartswitch-reboot-seq.svg  |   2 +-
 doc/smart-switch/reboot/reboot-hld.md         | 107 ++++++++++++++++--
 3 files changed, 97 insertions(+), 14 deletions(-)

diff --git a/doc/smart-switch/reboot/images/dpu-reboot-seq.svg b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
index d4e4758ced..ba52fd8c1f 100644
--- a/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
+++ b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
@@ -1,4 +1,4 @@
-
[Figure text extracted from the previous revision of dpu-reboot-seq.svg, removed by this patch. Labels, in extraction order: DPU ASIC Host; Switch ASIC (NPU) Host; reboot CLI with DPU ID; GNOI Reboot API; Graceful services shutdown (except GNMI server); Detach DPU PCI vendor API; Platform vendor API - reboot DPU; reboot; return; Rescan PCIe vendor API; PCI-E Bridge; GNOI RebootStatus API; GNOI RebootStatusResponse.]
+
[Figure text extracted from the new revision of dpu-reboot-seq.svg, added by this patch. Labels, in extraction order: DPU ASIC Host; Switch ASIC (NPU) Host; reboot CLI with DPU ID; GNOI RebootResponse; Graceful services shutdown (except GNMI server); Detach DPU PCI vendor API; Platform vendor API - reboot DPU; reboot; return; Rescan PCIe vendor API; PCI-E Bridge; GNOI RebootStatus API; GNOI RebootStatusResponse; GNOI Reboot API.]
diff --git a/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
index 0c88bf72d5..d6e676f87f 100644
--- a/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
+++ b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
@@ -1,4 +1,4 @@
-
[Figure text extracted from the previous revision of smartswitch-reboot-seq.svg, removed by this patch. Labels, in extraction order: Switch ASIC (NPU) Host; PCI-E bridge; reboot CLI; GNOI Reboot API to all DPUs; Graceful services shutdown (except GNMI server); Detach PCI vendor API for all DPUs; Platform vendor API - reboot DPU; reboot; return; reboot; Rescan PCIe vendor API; DPUn ASIC Host; GNOI RebootStatusResponse; GNOI RebootStatus API.]
+
[Figure text extracted from the new revision of smartswitch-reboot-seq.svg, added by this patch. Labels, in extraction order: Switch ASIC (NPU) Host; PCI-E bridge; reboot CLI; GNOI Reboot API to all DPUs; Graceful services shutdown (except GNMI server); Detach PCI vendor API for all DPUs; Platform vendor API - reboot DPU; reboot; return; reboot; Rescan PCIe vendor API; DPUn ASIC Host; GNOI RebootStatusResponses; GNOI RebootStatus API to all DPUs; GNOI RebootResponses.]
diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index a952c83397..bfdf9da565 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -18,13 +18,15 @@ - [NPU platform.json](#npu-platformjson) - [GNOI API implementation](#gnoi-api-implementation) - [reboot.py script modifications](#rebootpy-script-modifications) + - [Error handling and exception scenarios](#error-handling-and-exception-scenarios) - [Test plan](#test-plan) ## Revision ## -| Rev | Date | Author | Change Description | -| --- | ---- | ------ | ------------------ | -| 0.1 | 05/16/2024 | Vasundhara Volam | Initial version | +| Rev | Date | Author | Change Description | +| --- | ---------- | ---------------- | ------------------ | +| 0.1 | 05/16/2024 | Vasundhara Volam | Initial version | +| 0.2 | 05/29/2024 | Vasundhara Volam | Update images and APIs | ## Glossory ## @@ -98,8 +100,8 @@ The following outlines the reboot procedure for the entire Smart Switch: * When the NPU receives a reboot command via the CLI to restart the SmartSwitch, it initiates the reboot sequence. -* The NPU sends a GNOI Reboot API signal to all connected DPUs. This signal instructs the DPUs to gracefully terminate all services, excluding the GNMI -server, in preparation for the reboot. +* The NPU sends a GNOI Reboot API signal to all connected DPUs in parallel using multiple threads. This signal instructs the DPUs to gracefully terminate all +services, excluding the GNMI server, in preparation for the reboot. * Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except GNMI and database service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS. @@ -111,7 +113,7 @@ for each DPU. 
* With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs. -* After initiating the reboot process for the DPUs, the NPU proceeds to reboot itself to complete the overall reboot procedure. +* After receiving the response from the DPUs, the NPU proceeds to reboot itself to complete the overall reboot procedure. * Upon successful reboot, the NPU resumes operation. As part of the post-reboot process, the NPU may choose to rescan the PCI devices. This rescan operation, performed either by invoking vendor API or by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/rescan file, ensures that all PCI devices are properly @@ -126,19 +128,37 @@ reboot(self, reboot_type): Define new reboot_type as MODULE_REBOOT_DPU for DPU only reboot and MODULE_REBOOT_SMARTSWITCH for entire switch reboot. ``` +This API is defined in [smartswitch-pmon.md](https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/pmon/smartswitch-pmon.md#:~:text=reboot(self%2C%20reboot_type)%3A) + ### ModuleBase Class new APIs ### -detach_dpu(self): +pci_detach_dpu(self): +``` +Detaches the DPU PCI device from the NPU. In the case of non-smart-switch chassis, no action is taken. + +Returns: + True +``` + +pci_reattach_dpu(self): +``` +Rescans the PCI bus and attach the DPU back to NPU. In the case of non-smart-switch chassis, no action is taken. -Detach the DPU midplane. +Returns: + True +``` -reattach_dpu(self): +get_dpu_bus_info(self, dpu_id): +``` +For a given DPU id, retrieve the PCI bus information. In the case of non-smart-switch chassis, no action is taken. -Rescan the midplane and attach it back. 
+Returns: + Returns the PCI bus information in BDF format like "[DDDD:]:BB:SS.F" +``` ### NPU platform.json ### -Introduce a new parameter, 'dpu_killservices_timeout', to specify the duration(in secs) for waiting for the DPU to terminate all services, as defined by +Introduce a new parameter, 'dpu_halt_services_timeout', to specify the duration(in secs) for waiting for the DPU to terminate all services, as defined by the platform vendor. If the DPU fails to respond within this timeout, the NPU will proceed with the reboot sequence. If no timeout is explicitly defined, a default timeout will be used. @@ -146,7 +166,27 @@ defined, a default timeout will be used. { . . - "dpu_killservices_timeout" : "" + "dpu_halt_services_timeout" : "TBD" + + "DPUs" : [ + { + "dpu0": { + "bus_info" : "[DDDD:]BB:SS.F" + } + }, + { + "dpu1": { + "bus_info" : "[DDDD:]BB:SS.F" + } + }, + . + . + { + "dpuX": { + "bus_info" : "[DDDD:]BB:SS.F" + } + } + ] . . } @@ -222,6 +262,23 @@ message RebootStatus { } ``` +### reboot CLI modifications ### + +Introduce a new parameter '-d' to the reboot command for specifying the DPU ID requiring a reboot. If the chassis is +not a smart switch, this action will have no effect. If the reboot command is executed without specifying any '-d' option, the entire switch will +be rebooted. + +``` +Usage /usr/local/bin/reboot [options] + Request rebooting the device. Invoke platform-specific tool when available. + This script will shutdown syncd before rebooting. + + Available options: + -h, -? : getting this help + -f : execute reboot force + -d : DPU ID +``` + ### reboot.py script modifications ### * Within the reboot() function, incorporate a verification step to invoke is_smartswitch(). Should is_smartswitch() yield false, proceed with the current @@ -233,6 +290,28 @@ a complete switch reboot or targeting a specific DPU. 
* Add a new reboot_smartswitch() function to reboot either the entire switch or a particular DPU, which takes DPU ID as an argument that needs a reboot. +``` +def reboot_smartswitch(duthost, localhost, reboot_type='cold', reboot_dpu='false', dpu_id='0') + """ + reboots SmartSwitch or a DPU + :param duthost: DUT host object + :param localhost: local host object + :param reboot_type: reboot type (cold) + :param reboot_dpu: reboot dpu or switch (true, false) + :param dpu_id: reboot the dpu with id, valid only if reboot_dpu is true. +``` + +### Error handling and exception scenarios ### + +* If the GNMI service is not operational on the DPU or DPU is unreachable for any reason, detach the PCI, and proceed with the reboot after a timeout +upon receiving an acknowledgment. + +* After the DPU reboots, if the DPU PCI fails to reconnect for any reason, an error-handling mechanism should be in place to restore the DPU. + +* If a DPU fails to reboot during a switch reboot, the NPU should attempt to recover the DPU and log any errors that occur. + +* In the event of power failure, a power-cycle due to a kernel panic, or any other unknown reason, both the DPU and NPU will undergo an ungraceful reboot. + ## Test plan ## Presented below is the test plan within the ```sonic-mgmt``` framework for the smart switch reboot. 
@@ -249,3 +328,7 @@ Presented below is the test plan within the ```sonic-mgmt``` framework for the s | Unplanned Smart Switch power failure | Ungraceful reboot | Ungraceful reboot | | Unplanned Smart Switch System Crash | Ungraceful reboot | Ungraceful reboot | | Unplanned DPU System Crash | - | Ungraceful reboot | + +### Test case details ### + +In progress From 7539f7a5f725108be125099e7ad2653ebd829f92 Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Thu, 30 May 2024 03:02:20 +0000 Subject: [PATCH 03/14] Minor update to test plan --- doc/smart-switch/reboot/images/dpu-reboot-seq.svg | 2 +- doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg | 2 +- doc/smart-switch/reboot/reboot-hld.md | 5 ++--- 3 files changed, 4 insertions(+), 5 deletions(-) diff --git a/doc/smart-switch/reboot/images/dpu-reboot-seq.svg b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg index ba52fd8c1f..401877928a 100644 --- a/doc/smart-switch/reboot/images/dpu-reboot-seq.svg +++ b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg @@ -1,4 +1,4 @@ -
[Figure text extracted from the previous revision of dpu-reboot-seq.svg, removed by this patch. Labels, in extraction order: DPU ASIC Host; Switch ASIC (NPU) Host; reboot CLI with DPU ID; GNOI RebootResponse; Graceful services shutdown (except GNMI server); Detach DPU PCI vendor API; Platform vendor API - reboot DPU; reboot; return; Rescan PCIe vendor API; PCI-E Bridge; GNOI RebootStatus API; GNOI RebootStatusResponse; GNOI Reboot API.]
+
[Figure text extracted from the new revision of dpu-reboot-seq.svg, added by this patch; the text labels are unchanged from the removed revision. Labels, in extraction order: DPU ASIC Host; Switch ASIC (NPU) Host; reboot CLI with DPU ID; GNOI RebootResponse; Graceful services shutdown (except GNMI server); Detach DPU PCI vendor API; Platform vendor API - reboot DPU; reboot; return; Rescan PCIe vendor API; PCI-E Bridge; GNOI RebootStatus API; GNOI RebootStatusResponse; GNOI Reboot API.]
diff --git a/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
index d6e676f87f..219d8488aa 100644
--- a/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
+++ b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
@@ -1,4 +1,4 @@
-
[Figure text extracted from the previous revision of smartswitch-reboot-seq.svg, removed by this patch. Labels, in extraction order: Switch ASIC (NPU) Host; PCI-E bridge; reboot CLI; GNOI Reboot API to all DPUs; Graceful services shutdown (except GNMI server); Detach PCI vendor API for all DPUs; Platform vendor API - reboot DPU; reboot; return; reboot; Rescan PCIe vendor API; DPUn ASIC Host; GNOI RebootStatusResponses; GNOI RebootStatus API to all DPUs; GNOI RebootResponses.]
+
[Figure text extracted from the new revision of smartswitch-reboot-seq.svg, added by this patch; the text labels are unchanged from the removed revision. Labels, in extraction order: Switch ASIC (NPU) Host; PCI-E bridge; reboot CLI; GNOI Reboot API to all DPUs; Graceful services shutdown (except GNMI server); Detach PCI vendor API for all DPUs; Platform vendor API - reboot DPU; reboot; return; reboot; Rescan PCIe vendor API; DPUn ASIC Host; GNOI RebootStatusResponses; GNOI RebootStatus API to all DPUs; GNOI RebootResponses.]
diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index bfdf9da565..65b78d4259 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -329,6 +329,5 @@ Presented below is the test plan within the ```sonic-mgmt``` framework for the s | Unplanned Smart Switch System Crash | Ungraceful reboot | Ungraceful reboot | | Unplanned DPU System Crash | - | Ungraceful reboot | -### Test case details ### - -In progress +The test scenarios above ensure that both the NPU and all DPUs are fully operational following any type of reboot. Furthermore, the tests verify the +functionality of PCI communication between NPU and DPUs. From 24c47fb005237d439213576123a19a2e6a2c508c Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Mon, 10 Jun 2024 23:39:41 +0000 Subject: [PATCH 04/14] Minor changes based on discussion with the community --- doc/smart-switch/reboot/reboot-hld.md | 28 +++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index 65b78d4259..16a665de7c 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -13,10 +13,11 @@ - [DPU reboot sequence](#dpu-reboot-sequence) - [Switch reboot sequence](#switch-reboot-sequence) - [High Level Design](#high-level-design) - - [ModuleBase Class API enhancement](#modulebase-class-api) + - [ModuleBase Class API](#modulebase-class-api) - [ModuleBase Class new APIs](#modulebase-class-new-apis) - [NPU platform.json](#npu-platformjson) - [GNOI API implementation](#gnoi-api-implementation) + - [reboot CLI modifications](#reboot-cli-modifications) - [reboot.py script modifications](#rebootpy-script-modifications) - [Error handling and exception scenarios](#error-handling-and-exception-scenarios) - [Test plan](#test-plan) @@ -27,6 +28,7 @@ | --- | ---------- | ---------------- | ------------------ | | 0.1 | 
05/16/2024 | Vasundhara Volam | Initial version | | 0.2 | 05/29/2024 | Vasundhara Volam | Update images and APIs | +| 0.3 | 06/10/2024 | Vasundhara Volam | Minor changes based on discussion with the community | ## Glossory ## @@ -69,7 +71,7 @@ The switch or DPU can be rebooted using either the CLI or during an image upgrad 1. Performing a complete SmartSwitch reboot, which restarts the NPU and all DPUs. 2. Rebooting a specific DPU by specifying its ID. -In addition to the aforementioned causes of graceful reboots, a switch or DPU could be rebooted due to events such as power failures, kernel panics, etc. +In addition to the previously mentioned causes of graceful reboots, a switch or DPU may also reboot due to events such as power failures during DPU power-up, kernel panics, and other similar incidents. ## DPU reboot sequence ## @@ -84,13 +86,13 @@ the DPU to terminate all services. service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS. Until the services are terminated gracefully, DPU response RebootStatusResponse with STATUS_RETRIABLE_FAILURE status. -* Subsequently, the NPU detaches the DPU PCI. Detachment can be achieved either by a vendor specific API or via sysfs +* Subsequently, the NPU detaches the DPU PCI with a vendor defined API. If a vendor specific API is not defined, detachment is done via sysfs (echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/remove). -* Next, the NPU triggers a platform vendor API to initiate the reboot process for the DPU. +* Next, the NPU triggers a platform vendor reboot API to initiate the reboot process for the DPU. -* The NPU either immediately rescans the PCI upon return or after a timeout period. Rescan of the PCI could be achieved either by calling vendor specific -API or via sysfs (echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/rescan). +* The NPU either immediately rescans the PCI upon return or after a timeout period. 
Rescan of the PCI is achieved by vendor defined API. If vendor specific API +is not defined, then rescan is done via sysfs (echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/rescan). ## Switch reboot sequence ## @@ -111,9 +113,9 @@ Until the services are terminated gracefully, DPU response RebootStatusResponse vendor specific API or by issuing a command through the sysfs interface, specifically by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/remove file for each DPU. -* With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs. +* With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs. Vendor API reboots a single DPU, but the NPU spawns multiple threads to reboot DPUs in parallel. -* After receiving the response from the DPUs, the NPU proceeds to reboot itself to complete the overall reboot procedure. +* DPUs will send an acknowledgment to the NPU and then undergo a reboot. After receiving the acknowledgment from the DPUs, the NPU will proceed to reboot itself to complete the overall reboot procedure. The vendor-specific reboot API should include an error handling mechanism to manage DPU reboot failures. Additionally log all the failures. DPUs will be in DPU_READY state, if the reboot happened successfully. * Upon successful reboot, the NPU resumes operation. As part of the post-reboot process, the NPU may choose to rescan the PCI devices. This rescan operation, performed either by invoking vendor API or by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/rescan file, ensures that all PCI devices are properly @@ -148,9 +150,9 @@ Returns: True ``` -get_dpu_bus_info(self, dpu_id): +get_dpu_bus_info(self, dpu_module_name): ``` -For a given DPU id, retrieve the PCI bus information. In the case of non-smart-switch chassis, no action is taken. +For a given DPU module name, retrieve the PCI bus information. 
In the case of non-smart-switch chassis, no action is taken. Returns: Returns the PCI bus information in BDF format like "[DDDD:]:BB:SS.F" @@ -264,7 +266,7 @@ message RebootStatus { ### reboot CLI modifications ### -Introduce a new parameter '-d' to the reboot command for specifying the DPU ID requiring a reboot. If the chassis is +Introduce a new parameter '-d' to the reboot command for specifying the DPU module name requiring a reboot. If the chassis is not a smart switch, this action will have no effect. If the reboot command is executed without specifying any '-d' option, the entire switch will be rebooted. @@ -276,7 +278,7 @@ Usage /usr/local/bin/reboot [options] Available options: -h, -? : getting this help -f : execute reboot force - -d : DPU ID + -d : DPU module name ``` ### reboot.py script modifications ### @@ -303,6 +305,8 @@ def reboot_smartswitch(duthost, localhost, reboot_type='cold', reboot_dpu='false ### Error handling and exception scenarios ### +The following are specific error scenarios where the DPU state will not be DPU_READY. + * If the GNMI service is not operational on the DPU or DPU is unreachable for any reason, detach the PCI, and proceed with the reboot after a timeout upon receiving an acknowledgment. 
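The detach and rescan fallback described in this patch (use the vendor API when one is defined, otherwise write `1` to a sysfs control file) can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform implementation: `detach_dpu_pci`, `rescan_pci`, the `vendor_detach` and `vendor_rescan` callables, and the `sysfs_root` parameter are hypothetical names introduced here. The sketch also assumes the rescan is written to the bus-level `rescan` file, because a removed device no longer has a per-device sysfs directory.

```python
import os

# Assumed default location of the PCI bus in sysfs on Linux.
DEFAULT_SYSFS_PCI_ROOT = "/sys/bus/pci"


def _write_one(path):
    """Write '1' to a sysfs control file (equivalent to `echo 1 > path`)."""
    with open(path, "w") as f:
        f.write("1")


def detach_dpu_pci(bdf, vendor_detach=None, sysfs_root=DEFAULT_SYSFS_PCI_ROOT):
    """Detach the DPU at PCI address `bdf` (e.g. "0000:03:00.0") from the NPU.

    `vendor_detach` is a hypothetical hook standing in for a vendor API.
    When it is not supplied, fall back to the sysfs `remove` file.
    """
    if vendor_detach is not None:
        return bool(vendor_detach(bdf))
    _write_one(os.path.join(sysfs_root, "devices", bdf, "remove"))
    return True


def rescan_pci(vendor_rescan=None, sysfs_root=DEFAULT_SYSFS_PCI_ROOT):
    """Rescan the PCI bus so that a rebooted DPU is re-enumerated.

    `vendor_rescan` is a hypothetical hook standing in for a vendor API.
    When it is not supplied, fall back to the bus-level sysfs `rescan` file.
    """
    if vendor_rescan is not None:
        return bool(vendor_rescan())
    _write_one(os.path.join(sysfs_root, "rescan"))
    return True
```

On a real system these writes require root privileges, and the `bdf` argument would come from the bus information returned by `get_dpu_bus_info()`. Passing `sysfs_root` explicitly makes the fallback path testable against a temporary directory.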
From 94dec18612766f16b8ec88f6fe642c1d2b639b3b Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Tue, 11 Jun 2024 17:19:04 +0000 Subject: [PATCH 05/14] Address review comments --- doc/smart-switch/reboot/reboot-hld.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index 16a665de7c..7284c298f8 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -5,7 +5,7 @@ - [Smart Switch Reboot Design](#smart-switch-reboot-design) - [Table of Contents](#table-of-contents) - [Revision](#revision) - - [Glossory](#glossary) + - [Glossary](#glossary) - [Overview](#overview) - [Assumptions](#assumptions) - [Requirements](#requirements) @@ -36,8 +36,8 @@ | ----- | ----------------------------------------- | | ASIC | Application-Specific Integrated Circuit | | DPU | Data Processing Unit | -| GNMI | gRPC Network Management Interface | -| GNOI | gRPC Network Operations Interface | +| gNMI | gRPC Network Management Interface | +| gNOI | gRPC Network Operations Interface | | NPU | Network Processing Unit | | PCI-E | Peripheral Component Interconnect Express | @@ -60,8 +60,8 @@ Smart Switch supports only cold-reboot and does not support warm-reboot as of to ## Requirements ## -1. NPU host is running GNMI service to communicate with DPU. -2. DPU host is running GNMI server to listen to GNOI client requests. +1. NPU host is running gNMI/gNOI server to communicate with DPU. +2. DPU host is running gNOI server to listen to gNOI client requests. 3. Each DPU is assigned an IP address to communicate from NPU. ## Methods of Switch and DPU Reboot ## @@ -79,10 +79,10 @@ In addition to the previously mentioned causes of graceful reboots, a switch or DPUs are internally connected to the NPU via PCI-E bridge. 
Below is the reboot sequence for rebooting a specific DPU: -* Upon receiving a reboot CLI command to restart a particular DPU, the NPU transmits a GNOI Reboot API signal with reboot method set to ‘HALT’, instructing +* Upon receiving a reboot CLI command to restart a particular DPU, the NPU transmits a gNOI Reboot API signal with reboot method set to ‘HALT’, instructing the DPU to terminate all services. -* Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except GNMI and database +* Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except gNOI and database service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS. Until the services are terminated gracefully, DPU response RebootStatusResponse with STATUS_RETRIABLE_FAILURE status. @@ -102,8 +102,8 @@ The following outlines the reboot procedure for the entire Smart Switch: * When the NPU receives a reboot command via the CLI to restart the SmartSwitch, it initiates the reboot sequence. -* The NPU sends a GNOI Reboot API signal to all connected DPUs in parallel using multiple threads. This signal instructs the DPUs to gracefully terminate all -services, excluding the GNMI server, in preparation for the reboot. +* The NPU sends a gNOI Reboot API signal to all connected DPUs in parallel using multiple threads. This signal instructs the DPUs to gracefully terminate all +services, excluding the gNOI server and also database, in preparation for the reboot. * Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except GNMI and database service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS. 
@@ -307,7 +307,7 @@ def reboot_smartswitch(duthost, localhost, reboot_type='cold', reboot_dpu='false The following are specific error scenarios where the DPU state will not be DPU_READY. -* If the GNMI service is not operational on the DPU or DPU is unreachable for any reason, detach the PCI, and proceed with the reboot after a timeout +* If the gNOI service is not operational on the DPU or DPU is unreachable for any reason, detach the PCI, and proceed with the reboot after a timeout upon receiving an acknowledgment. * After the DPU reboots, if the DPU PCI fails to reconnect for any reason, an error-handling mechanism should be in place to restore the DPU. From c050f48d3490b67a41118e4b41b3c740f7cab058 Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Wed, 26 Jun 2024 18:59:02 +0000 Subject: [PATCH 06/14] Minor correction to pci rescan information --- doc/smart-switch/reboot/reboot-hld.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index 7284c298f8..1d2eda9558 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -92,7 +92,7 @@ Until the services are terminated gracefully, DPU response RebootStatusResponse * Next, the NPU triggers a platform vendor reboot API to initiate the reboot process for the DPU. * The NPU either immediately rescans the PCI upon return or after a timeout period. Rescan of the PCI is achieved by vendor defined API. If vendor specific API -is not defined, then rescan is done via sysfs (echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/rescan). +is not defined, then rescan is done via sysfs (echo 1 > /sys/bus/pci/rescan). ## Switch reboot sequence ## @@ -118,7 +118,7 @@ for each DPU. * DPUs will send an acknowledgment to the NPU and then undergo a reboot. After receiving the acknowledgment from the DPUs, the NPU will proceed to reboot itself to complete the overall reboot procedure. 
The vendor-specific reboot API should include an error handling mechanism to manage DPU reboot failures. Additionally log all the failures. DPUs will be in DPU_READY state, if the reboot happened successfully. * Upon successful reboot, the NPU resumes operation. As part of the post-reboot process, the NPU may choose to rescan the PCI devices. This rescan operation, -performed either by invoking vendor API or by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/rescan file, ensures that all PCI devices are properly +performed either by invoking vendor API or by echoing '1' to the /sys/bus/pci/rescan file, ensures that all PCI devices are properly recognized and initialized. ## High-Level Design ## From a0c94127a6219ee0307f9862fe46b066b3730bc0 Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Thu, 18 Jul 2024 21:56:37 +0000 Subject: [PATCH 07/14] Update reboot mechanism of the DPU and pcie daemon changes --- doc/smart-switch/reboot/reboot-hld.md | 62 ++++++++++++++++++++++----- 1 file changed, 52 insertions(+), 10 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index 1d2eda9558..b7594f03bc 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -18,9 +18,12 @@ - [NPU platform.json](#npu-platformjson) - [GNOI API implementation](#gnoi-api-implementation) - [reboot CLI modifications](#reboot-cli-modifications) - - [reboot.py script modifications](#rebootpy-script-modifications) + - [reboot script modifications](#reboot-script-modifications) + - [PCIe daemon modifications](#pcie-daemon-modifications) + - [Hardware watchdog on DPU](#hardware-watchdog-on-dpu) - [Error handling and exception scenarios](#error-handling-and-exception-scenarios) - [Test plan](#test-plan) + - [References](#references) ## Revision ## @@ -29,6 +32,7 @@ | 0.1 | 05/16/2024 | Vasundhara Volam | Initial version | | 0.2 | 05/29/2024 | Vasundhara Volam | Update images and APIs | | 0.3 | 06/10/2024 
| Vasundhara Volam | Minor changes based on discussion with the community | +| 0.4 | 07/29/2024 | Vasundhara Volam | Add PCIe daemon changes | ## Glossory ## @@ -160,9 +164,9 @@ Returns: ### NPU platform.json ### -Introduce a new parameter, 'dpu_halt_services_timeout', to specify the duration(in secs) for waiting for the DPU to terminate all services, as defined by -the platform vendor. If the DPU fails to respond within this timeout, the NPU will proceed with the reboot sequence. If no timeout is explicitly -defined, a default timeout will be used. +Introduce a new parameter, 'dpu_halt_services_timeout', to specify the duration(in secs) for waiting for the DPU to terminate all services, +as defined by the platform vendor. If the DPU fails to respond within this timeout, the NPU will proceed with the reboot sequence. If no timeout is explicitly defined, +a default timeout will be used. ```json { @@ -196,8 +200,10 @@ defined, a default timeout will be used. ### GNOI API implementation ### -According to the RebootRequest protocol outlined below, we will utilize the HALT command in the RebootMethod to terminate services on the DPU. -When the NPU sends the RebootRequest with the HALT RebootMethod to the DPU, it will kill all services except GNMI and database services. +According to the RebootRequest protocol outlined below, we will utilize the 'HALT' in the RebootMethod to terminate services on the DPU. When the NPU sends the RebootRequest +with the HALT RebootMethod to the DPU, it will invoke the /usr/local/bin/reboot script to stop all the services except GNMI server and database services. Refer to the +[gNOI reboot HLD](#https://github.com/sonic-net/SONiC/blob/master/doc/warm-reboot/Warmboot_Manager_HLD.md) for design information of gNOI reboot flow to invoke the +reboot script in SONiC host services. ``` *Arguments*: type of reboot (cold, warm, etc.), delay before issuing reboot, string describing reason for reboot, option to force reboot if sanity checks fail. 
@@ -232,8 +238,8 @@ enum RebootMethod { } ``` -Upon sending the RebootRequest RPC to the DPU, the NPU will commence polling using RebootStatusRequest. If the DPU has effectively terminated the -services, it responds with STATUS_SUCCESS set in the RebootStatusResponse. Otherise, it will send the response with STATUS_RETRIABLE_FAILURE status. +After receiving the acknowledgement for RebootRequest RPC from the DPU, the NPU starts polling with RebootStatusRequest. If the DPU has effectively terminated +the services, it responds with STATUS_SUCCESS set in the RebootStatusResponse. Otherise, it will send the response with STATUS_RETRIABLE_FAILURE status. ``` rpc RebootStatus(RebootStatusRequest) returns (RebootStatusResponse) {} @@ -263,7 +269,6 @@ message RebootStatus { string message = 2; } ``` - ### reboot CLI modifications ### Introduce a new parameter '-d' to the reboot command for specifying the DPU module name requiring a reboot. If the chassis is @@ -279,9 +284,10 @@ Usage /usr/local/bin/reboot [options] -h, -? : getting this help -f : execute reboot force -d : DPU module name + -p : pre-shutdown ``` -### reboot.py script modifications ### +### reboot script modifications ### * Within the reboot() function, incorporate a verification step to invoke is_smartswitch(). Should is_smartswitch() yield false, proceed with the current implementation. However, if is_smartswitch() returns true, invoke the new reboot_smartswitch() function, passing a parameter to specify whether it's @@ -303,6 +309,35 @@ def reboot_smartswitch(duthost, localhost, reboot_type='cold', reboot_dpu='false :param dpu_id: reboot the dpu with id, valid only if reboot_dpu is true. ``` +* NPU invokes the reboot script with "-p" option on the DPU via GNOI API to reboot the DPU. When reboot script is invoked with "-p" option, +execute all the steps except the actual reboot at the end of the script. 
+ +* When a DPU module is requested for a reboot, the reboot script will update StateDB with the reboot information according to the schema defined +below. Define a new function named update_dpu_reboot_info() for this purpose. Additionally, if the entire smart switch is undergoing a reboot, +update the same information for all the DPUs. Once the DPU is rebooted and the PCIe device is reattached, the StateDB entry will be updated accordingly. + +#### REBOOT_INFO schema in StateDB + +``` +"REBOOT_INFO|DPU_0": { + "value": { + "id": "1", + "dpu_state": "rebooting", + "bus_info" : "[DDDD:]BB:SS.F" + } +} +``` + +### PCIe daemon modifications ### +The PCIe daemon will be updated to avoid logging "PCIe Device: Not Found" messages when DPUs are undergoing a reboot, as this is a +user-initiated action. + +In the [check_pci_devices()](#https://github.com/sonic-net/sonic-platform-daemons/blob/bf865c6b711833347d3c57e9d84cd366bcd1b776/sonic-pcied/scripts/pcied#L155) function, +read the State DB for the REBOOT_INFO and suppress the "device not found" warning logs during a DPU reboot when the device is intentionally detached. + +### Hardware watchdog on DPU ### +TBD + ### Error handling and exception scenarios ### The following are specific error scenarios where the DPU state will not be DPU_READY. @@ -335,3 +370,10 @@ Presented below is the test plan within the ```sonic-mgmt``` framework for the s The test scenarios above ensure that both the NPU and all DPUs are fully operational following any type of reboot. Furthermore, the tests verify the functionality of PCI communication between NPU and DPUs. 
+ +## References + +- [Openconfig system.proto](#https://github.com/openconfig/gnoi/blob/main/system/system.proto) +- [Warmboot Manager HLD](#https://github.com/sonic-net/SONiC/blob/master/doc/warm-reboot/Warmboot_Manager_HLD.md) +- [gNOI reboot HLD](#https://github.com/sonic-net/SONiC/blob/master/doc/warm-reboot/Warmboot_Manager_HLD.md) +- [PCIe daemon](#https://github.com/sonic-net/sonic-platform-daemons/blob/master/sonic-pcied/scripts/pcied) From 6b165b204883e030e7254f30dfcb36866ef3255f Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Tue, 30 Jul 2024 01:10:28 +0000 Subject: [PATCH 08/14] Minor changes --- doc/smart-switch/reboot/reboot-hld.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index b7594f03bc..95d926e8ac 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -20,7 +20,6 @@ - [reboot CLI modifications](#reboot-cli-modifications) - [reboot script modifications](#reboot-script-modifications) - [PCIe daemon modifications](#pcie-daemon-modifications) - - [Hardware watchdog on DPU](#hardware-watchdog-on-dpu) - [Error handling and exception scenarios](#error-handling-and-exception-scenarios) - [Test plan](#test-plan) - [References](#references) @@ -335,9 +334,6 @@ user-initiated action. In the [check_pci_devices()](#https://github.com/sonic-net/sonic-platform-daemons/blob/bf865c6b711833347d3c57e9d84cd366bcd1b776/sonic-pcied/scripts/pcied#L155) function, read the State DB for the REBOOT_INFO and suppress the "device not found" warning logs during a DPU reboot when the device is intentionally detached. -### Hardware watchdog on DPU ### -TBD - ### Error handling and exception scenarios ### The following are specific error scenarios where the DPU state will not be DPU_READY. @@ -351,6 +347,8 @@ upon receiving an acknowledgment. 
* In the event of power failure, a power-cycle due to a kernel panic, or any other unknown reason, both the DPU and NPU will undergo an ungraceful reboot. +* In the event of a DPU reboot failure, a hardware watchdog is needed to monitor and reset the DPU. This implementation is vendor-specific. + ## Test plan ## Presented below is the test plan within the ```sonic-mgmt``` framework for the smart switch reboot. From a37115ca504ce468d158be0ed877808e0d672a6d Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Tue, 6 Aug 2024 19:25:05 +0000 Subject: [PATCH 09/14] Minor changes --- doc/smart-switch/reboot/reboot-hld.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index 95d926e8ac..6fe8f5641a 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -63,7 +63,7 @@ Smart Switch supports only cold-reboot and does not support warm-reboot as of to ## Requirements ## -1. NPU host is running gNMI/gNOI server to communicate with DPU. +1. NPU host is running gNOI client to communicate with DPU. 2. DPU host is running gNOI server to listen to gNOI client requests. 3. Each DPU is assigned an IP address to communicate from NPU. @@ -173,7 +173,7 @@ a default timeout will be used. . "dpu_halt_services_timeout" : "TBD" - "DPUs" : [ + "DPUS" : [ { "dpu0": { "bus_info" : "[DDDD:]BB:SS.F" @@ -312,16 +312,16 @@ def reboot_smartswitch(duthost, localhost, reboot_type='cold', reboot_dpu='false execute all the steps except the actual reboot at the end of the script. * When a DPU module is requested for a reboot, the reboot script will update StateDB with the reboot information according to the schema defined -below. Define a new function named update_dpu_reboot_info() for this purpose. Additionally, if the entire smart switch is undergoing a reboot, +below. Define a new function named update_dpu_pcie_info() for this purpose. 
Additionally, if the entire smart switch is undergoing a reboot, update the same information for all the DPUs. Once the DPU is rebooted and the PCIe device is reattached, the StateDB entry will be updated accordingly. -#### REBOOT_INFO schema in StateDB +#### PCIE_DETACH_INFO schema in StateDB ``` -"REBOOT_INFO|DPU_0": { +"PCIE_DETACH_INFO|DPU_0": { "value": { "id": "1", - "dpu_state": "rebooting", + "dpu_state": "detaching", "bus_info" : "[DDDD:]BB:SS.F" } } @@ -332,7 +332,7 @@ The PCIe daemon will be updated to avoid logging "PCIe Device: Not user-initiated action. In the [check_pci_devices()](#https://github.com/sonic-net/sonic-platform-daemons/blob/bf865c6b711833347d3c57e9d84cd366bcd1b776/sonic-pcied/scripts/pcied#L155) function, -read the State DB for the REBOOT_INFO and suppress the "device not found" warning logs during a DPU reboot when the device is intentionally detached. +read the State DB for the PCIE_DETACH_INFO and suppress the "device not found" warning logs during a DPU reboot when the device is intentionally detached. ### Error handling and exception scenarios ### From 442e8a7836659fbbb24d085268e5b620c2ada3db Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Fri, 9 Aug 2024 17:32:40 +0000 Subject: [PATCH 10/14] Made a minor change to dup_id based on get_dpu_id() update in https://github.com/sonic-net/sonic-platform-common/pull/454 --- doc/smart-switch/reboot/reboot-hld.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index 6fe8f5641a..e8858446c7 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -320,7 +320,7 @@ update the same information for all the DPUs. 
Once the DPU is rebooted and the P ``` "PCIE_DETACH_INFO|DPU_0": { "value": { - "id": "1", + "dpu_id": "0", "dpu_state": "detaching", "bus_info" : "[DDDD:]BB:SS.F" } From f7ca496f0fd77d40df8e3a770b3c7e0076c5f4f1 Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Tue, 24 Sep 2024 22:42:59 +0000 Subject: [PATCH 11/14] Add some enhancements --- doc/smart-switch/reboot/reboot-hld.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index e8858446c7..a770275fa3 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -86,13 +86,14 @@ DPUs are internally connected to the NPU via PCI-E bridge. Below is the reboot s the DPU to terminate all services. * Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except gNOI and database -service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS. -Until the services are terminated gracefully, DPU response RebootStatusResponse with STATUS_RETRIABLE_FAILURE status. +service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS and 'active' +will be set to false in the RebootStatusResponse. Until the services are terminated gracefully, 'active' will be '1' in the RebootStatusResponse. * Subsequently, the NPU detaches the DPU PCI with a vendor defined API. If a vendor specific API is not defined, detachment is done via sysfs (echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/remove). -* Next, the NPU triggers a platform vendor reboot API to initiate the reboot process for the DPU. +* Next, the NPU triggers a platform vendor reboot API to initiate the reboot process for the DPU. 
If the DPU is stuck or unresponsive, the DPU reboot platform API should +attempt a cold boot or power cycle to recover it. * The NPU either immediately rescans the PCI upon return or after a timeout period. Rescan of the PCI is achieved by vendor defined API. If vendor specific API is not defined, then rescan is done via sysfs (echo 1 > /sys/bus/pci/rescan). @@ -109,14 +110,14 @@ The following outlines the reboot procedure for the entire Smart Switch: services, excluding the gNOI server and also database, in preparation for the reboot. * Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except GNMI and database -service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS. -Until the services are terminated gracefully, DPU response RebootStatusResponse with STATUS_RETRIABLE_FAILURE status. +service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS and 'active' +will be set to false in the RebootStatusResponse. Until the services are terminated gracefully, 'active' will be '1' in the RebootStatusResponse. * Following the confirmation from the DPUs, the NPU proceeds to detach the PCI devices associated with the DPUs. This detachment is achieved either by calling vendor specific API or by issuing a command through the sysfs interface, specifically by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/remove file for each DPU. -* With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs. Vendor API reboots a single DPU, but the NPU spawns multiple threads to reboot DPUs in parallel. +* With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs. 
Vendor API reboots a single DPU, but the NPU spawns multiple threads to reboot DPUs in parallel. If any of the the DPU is stuck or unresponsive, the DPU reboot platform API should attempt a cold boot or power cycle to recover it. * DPUs will send an acknowledgment to the NPU and then undergo a reboot. After receiving the acknowledgment from the DPUs, the NPU will proceed to reboot itself to complete the overall reboot procedure. The vendor-specific reboot API should include an error handling mechanism to manage DPU reboot failures. Additionally log all the failures. DPUs will be in DPU_READY state, if the reboot happened successfully. @@ -238,7 +239,8 @@ enum RebootMethod { ``` After receiving the acknowledgement for RebootRequest RPC from the DPU, the NPU starts polling with RebootStatusRequest. If the DPU has effectively terminated -the services, it responds with STATUS_SUCCESS set in the RebootStatusResponse. Otherise, it will send the response with STATUS_RETRIABLE_FAILURE status. +the services, it responds with STATUS_SUCCESS and 'active' will be set to false in the RebootStatusResponse. Until the services are terminated gracefully, +'active' will be '1' in the RebootStatusResponse. ``` rpc RebootStatus(RebootStatusRequest) returns (RebootStatusResponse) {} From 605c3a56ac2717dbbb638433e7bb13054fc05a31 Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Mon, 30 Sep 2024 18:27:22 +0000 Subject: [PATCH 12/14] Minor change to new APIs --- doc/smart-switch/reboot/reboot-hld.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index a770275fa3..977afc28d5 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -138,7 +138,7 @@ This API is defined in [smartswitch-pmon.md](https://github.com/sonic-net/SONiC/ ### ModuleBase Class new APIs ### -pci_detach_dpu(self): +pci_detach(self,
module_name): ``` Detaches the DPU PCI device from the NPU. In the case of non-smart-switch chassis, no action is taken. @@ -146,7 +146,7 @@ Returns: True ``` -pci_reattach_dpu(self): +pci_reattach(self, module_name): ``` Rescans the PCI bus and attach the DPU back to NPU. In the case of non-smart-switch chassis, no action is taken. @@ -154,7 +154,7 @@ Returns: True ``` -get_dpu_bus_info(self, dpu_module_name): +get_bus_info(self, module_name): ``` For a given DPU module name, retrieve the PCI bus information. In the case of non-smart-switch chassis, no action is taken. From 04240e7397ac369b9fcd86281ae114b197fd05ba Mon Sep 17 00:00:00 2001 From: Vasundhara Volam Date: Wed, 9 Oct 2024 17:30:41 +0000 Subject: [PATCH 13/14] Address review comments --- doc/smart-switch/reboot/reboot-hld.md | 37 ++++++++++++++++++--------- 1 file changed, 25 insertions(+), 12 deletions(-) diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md index 977afc28d5..c285f8a134 100644 --- a/doc/smart-switch/reboot/reboot-hld.md +++ b/doc/smart-switch/reboot/reboot-hld.md @@ -66,6 +66,7 @@ Smart Switch supports only cold-reboot and does not support warm-reboot as of to 1. NPU host is running gNOI client to communicate with DPU. 2. DPU host is running gNOI server to listen to gNOI client requests. 3. Each DPU is assigned an IP address to communicate from NPU. +4. SONiC host services on both the NPU and DPU should undergo a graceful shutdown during reboot. ## Methods of Switch and DPU Reboot ## @@ -82,11 +83,10 @@ In addition to the previously mentioned causes of graceful reboots, a switch or DPUs are internally connected to the NPU via PCI-E bridge. Below is the reboot sequence for rebooting a specific DPU: -* Upon receiving a reboot CLI command to restart a particular DPU, the NPU transmits a gNOI Reboot API signal with reboot method set to ‘HALT’, instructing -the DPU to terminate all services. 
+* Upon receiving a [reboot](https://github.com/sonic-net/sonic-utilities/blob/master/scripts/reboot) CLI command to restart a particular DPU, the NPU transmits a gNOI Reboot RPC signal with RebootMethod set to ‘HALT’, instructing the DPU to terminate all services. -* Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except gNOI and database -service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS and 'active' +* Upon dispatching the gNOI Reboot RPC, the NPU issues the gNOI RebootStatus RPC to monitor whether the DPU has terminated all services except gNOI and database +service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the gNOI RebootStatus RPC with STATUS_SUCCESS and 'active' will be set to false in the RebootStatusResponse. Until the services are terminated gracefully, 'active' will be '1' in the RebootStatusResponse. * Subsequently, the NPU detaches the DPU PCI with a vendor defined API. If a vendor specific API is not defined, detachment is done via sysfs @@ -104,13 +104,13 @@ is not defined, then rescan is done via sysfs (echo 1 > /sys/bus/pci/rescan). The following outlines the reboot procedure for the entire Smart Switch: -* When the NPU receives a reboot command via the CLI to restart the SmartSwitch, it initiates the reboot sequence. +* When the NPU receives a [reboot](https://github.com/sonic-net/sonic-utilities/blob/master/scripts/reboot) command via the CLI to restart the SmartSwitch, it initiates the reboot sequence. -* The NPU sends a gNOI Reboot API signal to all connected DPUs in parallel using multiple threads. This signal instructs the DPUs to gracefully terminate all +* The NPU sends a gNOI Reboot RPC signal to all connected DPUs in parallel using multiple threads. 
This signal instructs the DPUs to gracefully terminate all services, excluding the gNOI server and also database, in preparation for the reboot. -* Upon dispatching the Reboot API, the NPU issues the RebootStatus API to monitor whether the DPU has terminated all services except GNMI and database -service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the RebootStatus API with STATUS_SUCCESS and 'active' +* Upon dispatching the gNOI Reboot RPC, the NPU issues the gNOI RebootStatus RPC to monitor whether the DPU has terminated all services except GNMI and database +service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the gNOI RebootStatus RPC with STATUS_SUCCESS and 'active' will be set to false in the RebootStatusResponse. Until the services are terminated gracefully, 'active' will be '1' in the RebootStatusResponse. * Following the confirmation from the DPUs, the NPU proceeds to detach the PCI devices associated with the DPUs. This detachment is achieved either by calling @@ -119,7 +119,7 @@ for each DPU. * With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs. Vendor API reboots a single DPU, but the NPU spawns multiple threads to reboot DPUs in parallel. If any of the the DPU is stuck or unresponsive, the DPU reboot platform API should attempt a cold boot or power cycle to recover it. -* DPUs will send an acknowledgment to the NPU and then undergo a reboot. After receiving the acknowledgment from the DPUs, the NPU will proceed to reboot itself to complete the overall reboot procedure. The vendor-specific reboot API should include an error handling mechanism to manage DPU reboot failures. Additionally log all the failures. DPUs will be in DPU_READY state, if the reboot happened successfully. 
+* After all the DPUs have rebooted and responded to the platform's reboot vendor API, the NPU will proceed with its own reboot to complete the overall reboot process. The vendor-specific reboot API should include an error handling mechanism to manage DPU reboot failures. Additionally log all the failures. DPUs will be in DPU_READY state, if the reboot happened successfully. * Upon successful reboot, the NPU resumes operation. As part of the post-reboot process, the NPU may choose to rescan the PCI devices. This rescan operation, performed either by invoking vendor API or by echoing '1' to the /sys/bus/pci/rescan file, ensures that all PCI devices are properly @@ -172,7 +172,7 @@ a default timeout will be used. { . . - "dpu_halt_services_timeout" : "TBD" + "dpu_halt_services_timeout" : "300" "DPUS" : [ { @@ -355,6 +355,14 @@ upon receiving an acknowledgment. Presented below is the test plan within the ```sonic-mgmt``` framework for the smart switch reboot. +### Graceful boot/reboot ### + +A graceful boot refers to a controlled and orderly startup process where the system (whether it is a device, DPU, NPU, or the entire system) powers on or reboots without any unexpected interruptions or failures. During a graceful boot, all components follow a well-defined sequence to ensure system stability and functionality. + +### Ungraceful boot/reboot ### + +An ungraceful boot occurs when the boot process is interrupted, incomplete, or initiated in a hasty or unexpected manner, leading to potential system errors or data corruption. This may result from power loss, forced shutdowns, or reboot failures.
+
 | Event | NPU reboot sequence | DPU reboot sequence |
 | ----------------------------------------- | ------------------- | ------------------- |
@@ -368,11 +376,16 @@ Presented below is the test plan within the ```sonic-mgmt``` framework for the s
 | Unplanned Smart Switch System Crash | Ungraceful reboot | Ungraceful reboot |
 | Unplanned DPU System Crash | - | Ungraceful reboot |
-The test scenarios above ensure that both the NPU and all DPUs are fully operational following any type of reboot. Furthermore, the tests verify the
-functionality of PCI communication between NPU and DPUs.
+The test scenarios described above ensure that both the NPU and all DPUs are fully operational after any type of reboot. Additionally, the tests verify the following post-reboot conditions:
+
+1. DPUs that were UP before the reboot have successfully come back online.
+2. DPUs that were administratively down remain in the down state after the reboot.
+3. PCI communication between the NPU and any DPUs that are online is functioning correctly.
+4. The cause of the reboot is accurately recorded and updated.
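
The four post-reboot conditions listed above could be checked along the following lines. This is a hypothetical sketch, not `sonic-mgmt` code: the `DpuState` record, its fields, and the `post_reboot_failures` helper are invented for illustration, and a real test would have to populate them from the device under test.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DpuState:
    """Hypothetical snapshot of one DPU, taken before or after a reboot."""
    oper_up: bool
    admin_down: bool = False
    pci_ok: bool = False
    reboot_cause: Optional[str] = None


def post_reboot_failures(before: Dict[str, DpuState],
                         after: Dict[str, DpuState],
                         expected_cause: str) -> List[str]:
    """Compare pre- and post-reboot snapshots; return human-readable failures."""
    failures: List[str] = []
    for name, old in before.items():
        new = after.get(name)
        if new is None:
            failures.append(f"{name}: missing after reboot")
            continue
        if old.admin_down:
            # Condition 2: administratively down DPUs must stay down.
            if new.oper_up:
                failures.append(f"{name}: admin-down DPU came up")
            continue
        # Condition 1: DPUs that were UP must be back online.
        if old.oper_up and not new.oper_up:
            failures.append(f"{name}: did not come back online")
        if new.oper_up:
            # Condition 3: PCI communication works for online DPUs.
            if not new.pci_ok:
                failures.append(f"{name}: PCI communication failed")
            # Condition 4: the reboot cause is recorded as expected.
            if new.reboot_cause != expected_cause:
                failures.append(f"{name}: unexpected reboot cause {new.reboot_cause!r}")
    return failures
```

An empty list means every condition held for every DPU in the `before` snapshot; a test case would typically assert on that.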
 ## References
+- [PMON HLD](https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/pmon/smartswitch-pmon.md)
 - [Openconfig system.proto](#https://github.com/openconfig/gnoi/blob/main/system/system.proto)
 - [Warmboot Manager HLD](#https://github.com/sonic-net/SONiC/blob/master/doc/warm-reboot/Warmboot_Manager_HLD.md)
 - [gNOI reboot HLD](#https://github.com/sonic-net/SONiC/blob/master/doc/warm-reboot/Warmboot_Manager_HLD.md)

From 92504ae2a4f997d063f6aac95eb52be14f5ee667 Mon Sep 17 00:00:00 2001
From: Vasundhara Volam
Date: Tue, 22 Oct 2024 19:49:03 +0000
Subject: [PATCH 14/14] Address some review comments

---
 doc/smart-switch/reboot/images/dpu-reboot-seq.svg | 2 +-
 .../reboot/images/smartswitch-reboot-seq.svg | 2 +-
 doc/smart-switch/reboot/reboot-hld.md | 14 ++++----------
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/doc/smart-switch/reboot/images/dpu-reboot-seq.svg b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
index 401877928a..c607b70474 100644
--- a/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
+++ b/doc/smart-switch/reboot/images/dpu-reboot-seq.svg
@@ -1,4 +1,4 @@
-
[SVG markup omitted (old dpu-reboot-seq.svg). Diagram text labels: DPU ASIC Host; Switch ASIC (NPU) Host; PCI-E Bridge; reboot CLI with DPU ID; GNOI Reboot API; GNOI RebootResponse; Graceful services shutdown (except GNMI server); GNOI RebootStatus API; GNOI RebootStatusResponse; Detach DPU PCI vendor API; Platform vendor API - reboot DPU; reboot; return; Rescan PCIe vendor API]
+
[SVG markup omitted (new dpu-reboot-seq.svg). Diagram text labels: DPU ASIC Host; Switch ASIC (NPU) Host; PCI-E Bridge; reboot CLI with DPU ID; GNOI Reboot API; GNOI RebootResponse; Graceful services shutdown (except GNMI server); GNOI RebootStatus API; GNOI RebootStatusResponse; Detach DPU PCI vendor API; Platform vendor API - reboot DPU; reboot; return; Rescan PCIe]
\ No newline at end of file
diff --git a/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
index 219d8488aa..11406fe5b4 100644
--- a/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
+++ b/doc/smart-switch/reboot/images/smartswitch-reboot-seq.svg
@@ -1,4 +1,4 @@
-
[SVG markup omitted (old smartswitch-reboot-seq.svg). Diagram text labels: Switch ASIC (NPU) Host; PCI-E bridge; DPUn ASIC Host; reboot CLI; GNOI Reboot API to all DPUs; GNOI RebootResponses; Graceful services shutdown (except GNMI server); GNOI RebootStatus API to all DPUs; GNOI RebootStatusResponses; Detach PCI vendor API for all DPUs; Platform vendor API - reboot DPU; reboot; return; reboot; Rescan PCIe vendor API]
+
[SVG markup omitted (new smartswitch-reboot-seq.svg). Diagram text labels: Switch ASIC (NPU) Host; PCI-E bridge; DPUn ASIC Host; reboot CLI; GNOI Reboot API to all DPUs; GNOI RebootResponses; Graceful services shutdown (except GNMI server); GNOI RebootStatus API to all DPUs; GNOI RebootStatusResponses; Detach PCI vendor API for all DPUs; Platform vendor API - reboot DPU; reboot; return; reboot]
\ No newline at end of file
diff --git a/doc/smart-switch/reboot/reboot-hld.md b/doc/smart-switch/reboot/reboot-hld.md
index c285f8a134..5a7163c982 100644
--- a/doc/smart-switch/reboot/reboot-hld.md
+++ b/doc/smart-switch/reboot/reboot-hld.md
@@ -89,14 +89,12 @@ DPUs are internally connected to the NPU via PCI-E bridge. Below is the reboot s
 service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the gNOI RebootStatus RPC with STATUS_SUCCESS and 'active'
 will be set to false in the RebootStatusResponse. Until the services are terminated gracefully, 'active' will be '1' in the RebootStatusResponse.
-* Subsequently, the NPU detaches the DPU PCI with a vendor defined API. If a vendor specific API is not defined, detachment is done via sysfs
-(echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/remove).
+* Subsequently, the NPU detaches the DPU PCI device using the sysfs interface by executing `echo 1 > /sys/bus/pci/devices/XXXX:XX:XX.X/remove`. This detachment is currently performed via sysfs, with future plans to implement a vendor-specific API when the DPU bus information cannot have a fixed value.
 * Next, the NPU triggers a platform vendor reboot API to initiate the reboot process for the DPU. If the DPU is stuck or unresponsive, the DPU reboot platform API should attempt a cold boot or power cycle to recover it.
-* The NPU either immediately rescans the PCI upon return or after a timeout period. Rescan of the PCI is achieved by vendor defined API. If vendor specific API
-is not defined, then rescan is done via sysfs (echo 1 > /sys/bus/pci/rescan).
+* The NPU either immediately rescans the PCI bus upon return or after a specified timeout period. This rescan is performed via the sysfs interface by echoing '1' to /sys/bus/pci/rescan.
 ## Switch reboot sequence ##
@@ -113,17 +111,13 @@ services, excluding the gNOI server and also database, in preparation for the re
 service, continuing until the timeout is reached. Once the DPU successfully terminates all services, it responds to the gNOI RebootStatus RPC with STATUS_SUCCESS and 'active'
 will be set to false in the RebootStatusResponse. Until the services are terminated gracefully, 'active' will be '1' in the RebootStatusResponse.
-* Following the confirmation from the DPUs, the NPU proceeds to detach the PCI devices associated with the DPUs. This detachment is achieved either by calling
-vendor specific API or by issuing a command through the sysfs interface, specifically by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/remove file
-for each DPU.
+* Following the confirmation from the DPUs, the NPU proceeds to detach the PCI devices associated with the DPUs. This detachment is achieved through the sysfs interface by echoing '1' to the /sys/bus/pci/devices/XXXX:XX:XX.X/remove file for each DPU. While this detachment is currently performed via sysfs, there are plans to implement a vendor-specific API for cases where the DPU bus information cannot have a fixed value.
 * With the DPUs prepared for reboot, the NPU triggers a platform vendor API to initiate the reboot process for the DPUs. Vendor API reboots a single DPU, but the NPU spawns multiple threads to reboot DPUs in parallel. If any of the the DPU is stuck or unresponsive, the DPU reboot platform API should attempt a cold boot or power cycle to recover it.
 * After all the DPUs have rebooted and responded to the platform's reboot vendor API, the NPU will proceed with its own reboot to complete the overall reboot process. The vendor-specific reboot API should include an error handling mechanism to manage DPU reboot failures. Additionally log all the failures. DPUs will be in DPU_READY state, if the reboot happened successfully.
-* Upon successful reboot, the NPU resumes operation. As part of the post-reboot process, the NPU may choose to rescan the PCI devices. This rescan operation,
-performed either by invoking vendor API or by echoing '1' to the /sys/bus/pci/rescan file, ensures that all PCI devices are properly
-recognized and initialized.
+* After a successful reboot, the NPU resumes its operations, and PCI enumeration occurs as part of the reboot process.
 ## High-Level Design ##
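
The NPU-side steps of the switch reboot sequence described in this patch (poll RebootStatus until `active` is false or the timeout expires, detach each DPU through sysfs, then reboot the DPUs in parallel threads) can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the SONiC implementation: all function names are hypothetical, `poll_active` stands in for a gNOI RebootStatus call that returns the `active` field, and `reboot_one` stands in for the platform vendor reboot API. Only the two sysfs paths are taken from the text.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable

log = logging.getLogger("smartswitch-reboot-sketch")

# sysfs rescan path quoted in the HLD text.
PCI_RESCAN_PATH = "/sys/bus/pci/rescan"


def wait_for_services_halted(poll_active: Callable[[], bool],
                             timeout_s: float,
                             interval_s: float = 1.0,
                             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the DPU reports active == False, or give up at the timeout.

    poll_active is a hypothetical stand-in for issuing gNOI RebootStatus and
    reading 'active' from the RebootStatusResponse.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if not poll_active():
            return True          # services terminated gracefully
        if time.monotonic() >= deadline:
            return False         # timeout reached, caller decides what next
        sleep(interval_s)


def pci_remove_path(bus_info: str) -> str:
    """Build the sysfs 'remove' path for a PCI device such as 0000:03:00.0."""
    return f"/sys/bus/pci/devices/{bus_info}/remove"


def write_sysfs(path: str, value: str = "1") -> None:
    """Equivalent of `echo 1 > path`; requires root on a real system."""
    with open(path, "w") as handle:
        handle.write(value)


def reboot_dpus_in_parallel(dpu_names: Iterable[str],
                            reboot_one: Callable[[str], None]) -> Dict[str, bool]:
    """Run the per-DPU reboot call in one thread per DPU and log failures."""
    names = list(dpu_names)
    if not names:
        return {}

    def safe_reboot(name: str) -> bool:
        try:
            reboot_one(name)     # hypothetical vendor API wrapper
            return True
        except Exception as exc:
            log.error("DPU %s reboot failed: %s", name, exc)
            return False

    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return dict(zip(names, pool.map(safe_reboot, names)))
```

In this sketch the timeout passed to `wait_for_services_halted` would come from the configured `dpu_halt_services_timeout` value, assuming it is expressed in seconds; the patch does not state the unit.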