SONiC Chassis Platform Requirements and Enhancements Analysis #945

abdosi · 2022-02-23T16:59:55Z

Capture Platform specific Requirements for Chassis Platforms.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

…er_ecmp

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

abdosi · 2022-02-23T17:12:14Z

doc/pmon/pmon-chassis-requirements.md

@@ -0,0 +1,37 @@
+Section 1 Requirements that are needed by default:-
+        1. On LC the reboot command should power-cycle the entire LC . Expectation is Peer node should detect link down when reboot is given on LC


SONiC reboot command

The SONiC reboot command that invokes the platform plugin to reboot.

@abdosi to remove "power-cycle" from this point.

abdosi · 2022-02-23T17:20:58Z

doc/pmon/pmon-chassis-requirements.md

@@ -0,0 +1,37 @@
+Section 1 Requirements that are needed by default:-
+        1. On LC the reboot command should power-cycle the entire LC . Expectation is Peer node should detect link down when reboot is given on LC
+		2. On RP the reboot command should reboot the entire system (RP and LC). . Expectation is Peer node should detect link down when reboot is given on RP


Graceful reboot of the LC vs. power-cycling the LC? Open question. Without a graceful restart there is a possibility of SSD corruption.

Basically, a reboot of an LC from the RP is ungraceful.

Scenarios (high to low priority order):

Graceful Supervisor going down and LC ungraceful (default).

Ungraceful Supervisor going down and the LC reacts to this. Needs further discussion. No conclusion yet.

Graceful Supervisor going down and LC graceful (orchestration starts from the RP). Needs further discussion. No conclusion yet.

Conclusion for the ideal case:
Enhance reboot to support a Supervisor-only reboot, with an option to choose the scope (Supervisor vs. entire chassis).

Entire chassis: graceful Supervisor going down and LC ungraceful. This is the default and must-have behavior.

The Supervisor-only option is useful for:

a) Orchestrating a graceful reboot of the entire chassis via an external controller.
b) The dual-Supervisor case.

Needs more discussion: the worst case (the Supervisor simply disappears, e.g. a watchdog is triggered on the Supervisor)
a) Determine what happens today.
b) Enhancements to handle this will be needed.

If the line card detects the Supervisor going down, the LC should be kept in the down state (if the platform has an LC shutdown capability); otherwise the LC self-reboots (this can repeat continuously if the SUP never comes up or stays in a bad state).
If the line card cannot detect the Supervisor going down, then the SUP, after coming up, broadcasts to all LCs to self-reboot.
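
The two rules above can be read as a small decision table. The following is only a sketch of that reading: the function name, the action labels, and the two boolean inputs are hypothetical and do not correspond to any existing SONiC API.

```python
from enum import Enum


class LcAction(Enum):
    STAY_DOWN = "stay_down"                 # LC is held down (platform can shut the LC)
    SELF_REBOOT = "self_reboot"             # LC reboots itself, possibly repeatedly
    WAIT_FOR_SUP_BROADCAST = "wait_for_sup" # SUP asks all LCs to self-reboot once it is back


def lc_action_on_sup_loss(lc_can_detect_sup_down: bool,
                          platform_can_shut_lc: bool) -> LcAction:
    """Map the two proposed rules onto one decision (illustrative only)."""
    if not lc_can_detect_sup_down:
        # The LC never notices the loss, so recovery is driven by the Supervisor.
        return LcAction.WAIT_FOR_SUP_BROADCAST
    if platform_can_shut_lc:
        return LcAction.STAY_DOWN
    return LcAction.SELF_REBOOT
```

Writing the rules this way also makes the open concern visible: the `SELF_REBOOT` branch has no stop condition if the Supervisor never returns.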

if Line-card can not detect sup going down then sup after coming-up broadcast all LC's to self-reboot

This might have an issue! There are chassis/platforms where LCs come up on their own the moment power is turned on to the chassis. So the Supervisor asking for their boot-up may interrupt LCs that are already booting in chassis power-on/reload scenarios. Suggest keeping the Supervisor 'booting up' workflow the same regardless of whether it is a SUP-only reload/boot-up or a chassis reload.

if Line card detect sup going down

This would be over a keepalive/heartbeat exchange between the SUP and the LC.
Suggest adding another scenario, the SUP detecting that an LC is not there, and discussing it.

Summary based on internal discussion:

If the Supervisor goes down unplanned/ungracefully, there is no need to reboot the LC or take any other action.

In the above scenario, the LC should generate syslog complaining about the Supervisor not being reachable (e.g. PMON trying to access Chassis DB to push data).

The above syslog can be used by alerting logic, and the necessary action can be taken from the LC/chassis perspective, such as isolation and a config reload on the LC by an external controller.

Regarding (1), "there is no need to reboot LC or any other action":
There are a number of concerns with letting linecards run headless while a supervisor reboots.

There will be a long window of time where no state exchange between linecards is possible due to the lack of chassis_db.

The software on the linecard now has to deal with live disconnection/reconnection of connectivity to chassis_db or other supervisor

There's no real stated benefit motivating this proposal. A strong working assumption behind the chassis architecture for the current set of use cases is that there is enough redundancy in the network to easily tolerate the entire chassis going down and coming back, especially when it is a rare event like an unplanned supervisor reboot.

In general, it is best to keep failure handling simple and predictable, and avoid divergent flows across different scenarios. So unless there is a specific concrete problem that is solved by keeping the linecards up, it would be strongly preferable to always reboot linecards when a supervisor reboots.

At the very least, the headless operation should be made optional and not mandated.

Option 1 (Preferable)
If the Supervisor goes down (headless operation), then the LC should go down.

Option 2
If LCs are running in headless mode, the LC should be able to send syslog as soon as possible (mid-plane connectivity should get restored) so that the LC can communicate the error to external management.

Supervisor platform services/code need to be healthy for linecards to run smoothly. When the supervisor platform code isn't healthy (e.g. the HW heartbeat or SW heartbeat is down), the supervisor is considered unhealthy, or headless. In this case, the HW vendor has defined what the linecards do: linecards don't operate when the supervisor is detected as down.

rlhui · 2022-03-02T07:01:51Z

Can we rename this PR to better reflect this doc?

abdosi · 2022-03-02T16:38:54Z

Can we rename this PR to better reflect this doc?

@rlhui Updated the PR title.

abdosi · 2022-03-02T16:45:26Z

cc @Staphylo

abdosi · 2022-03-02T16:45:34Z

cc @shyam77git

abdosi · 2022-03-02T16:46:05Z

cc @mprabhu-nokia

abdosi · 2022-03-02T16:56:12Z

doc/pmon/pmon-chassis-requirements.md

+		3. Config shut/unshut of LC will be supported as per the Chassis-d design.
+		4. Generate syslog for all the critical events and share the threshold (for appropriate/needed components)  in documents and recommended for given threshold range.  Expectation is we will bind syslog to our Alert Orchestration system and perform recommnded action based on the documents.
+		5. PCI-e issue of not able to detect FC ASIC’s and LC ASIC’s and syslog for same.
+		Integrate with pcied process in PMON[sonic-platform-daemons/pcied at master · Azure/sonic-platform-daemons (github.com)]. Note: Current PCI daemon polling for pci devices is 60sec which is large poll interval. Does it need optimization ?


Should we make this change generic? https://github.com/Azure/sonic-buildimage/blob/master/files/scripts/swss.sh#L193

This can delay overall SW initialization. It is not backward-compatible as of now and can impact existing running systems.

Generate syslog for all the critical events and share the threshold (for appropriate/needed components) in documents and recommended for given threshold range. Expectation is we will bind syslog to our Alert Orchestration system and perform recommnded action based on the documents.

Per today's chassis workgroup sync-up, can we enhance this point to highlight the following:
a) For now (near-term solution): the external controller would take the recommended action (based on the document) in real time.
b) Enhancing it to a system-driven solution: I suggested having a platform-supplied policy (look-up) file of events and actions. On receiving an event (syslog), SONiC (LC, RP) or the external controller performs a lookup in this policy file and takes the recommended action.
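
To make suggestion (b) concrete, here is a minimal sketch of what a policy lookup could look like. Everything in it is an assumption for illustration: the JSON layout, the `match`/`action` keys, the example syslog patterns, and the action names are not defined anywhere in SONiC or in this PR.

```python
import json
import re

# Hypothetical policy content; a real file would be supplied by the platform vendor.
EXAMPLE_POLICY = json.loads("""
[
  {"match": "PSU .* temperature above high threshold", "action": "isolate_chassis"},
  {"match": "Fabric ASIC .* not detected on PCIe", "action": "reload_fabric_card"}
]
""")


def lookup_action(syslog_message, policy=EXAMPLE_POLICY, default="alert_only"):
    """Return the action of the first policy entry whose regex matches the message."""
    for entry in policy:
        if re.search(entry["match"], syslog_message):
            return entry["action"]
    # Unknown events fall back to alerting without automated action.
    return default


print(lookup_action("Fabric ASIC 3 not detected on PCIe"))
```

The same lookup could run on the LC/RP or in the external controller, which is what would let option (a) evolve into option (b) without changing the vendor-provided event list.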

The document can also state how many FC ASICs are needed to support a scenario with X FCs and Y LCs. Each platform vendor needs to provide this matrix.

abdosi · 2022-03-02T16:57:11Z

doc/pmon/pmon-chassis-requirements.md

+		5. PCI-e issue of not able to detect FC ASIC’s and LC ASIC’s and syslog for same.
+		Integrate with pcied process in PMON[sonic-platform-daemons/pcied at master · Azure/sonic-platform-daemons (github.com)]. Note: Current PCI daemon polling for pci devices is 60sec which is large poll interval. Does it need optimization ?
+		6. Boot-up failure Handling. Need to see the SONiC behaviour from system perspective/docker status/syslog getting generated with required/correct information
+		7. HW-Watchdog adhering to current SONiC behavior. Start before reboot and explicitly disabled post reboot by SONiC (This means SONiC is booted up and Services are fine)


Ref: https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/watchdog-control/watchdog-control.sh
https://github.com/Azure/sonic-buildimage/blob/master/files/image_config/watchdog-control/watchdog-control.service

The watchdog scheme will need enhancement if we want to detect certain faults (e.g. the CPU getting stuck in a running state) on platforms whose HW watchdog can take recovery action in such cases. Currently, since the HW watchdog is disabled by SONiC post boot, such scenarios cannot be handled even if the platform/vendor can support it. Where possible, it can also be used for debugging by collecting dumps.
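
The difference between the current behavior (disarm after boot) and the enhancement being discussed (keep the watchdog running) can be sketched as below. `FakeWatchdog` is a stand-in written for this example, with method names loosely modelled on arm/disarm/is_armed style watchdog interfaces; the 180-second timeout and the `keep_running` switch are arbitrary, hypothetical choices.

```python
class FakeWatchdog:
    """Stand-in for a platform HW watchdog (illustrative only)."""

    def __init__(self):
        self._armed = False
        self._timeout = -1

    def arm(self, seconds):
        # Arming an already-armed watchdog simply restarts the countdown.
        self._armed = True
        self._timeout = seconds
        return seconds

    def disarm(self):
        self._armed = False
        return True

    def is_armed(self):
        return self._armed


def post_boot_watchdog_policy(watchdog, keep_running, timeout_s=180):
    """Disarm after boot (current behavior) or keep the watchdog armed (proposal)."""
    if keep_running:
        # A periodic caller would repeat this call as a keepalive.
        watchdog.arm(timeout_s)
    else:
        watchdog.disarm()
    return watchdog.is_armed()
```

With `keep_running=True`, a stuck CPU stops the keepalive and the hardware can take its recovery action; with `keep_running=False`, that fault goes undetected, which is the gap the comment describes.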

abdosi · 2022-03-02T18:15:14Z

doc/pmon/pmon-chassis-requirements.md

+Section 1 Requirements that are needed by default:-
+        1. On LC the reboot command should power-cycle the entire LC . Expectation is Peer node should detect link down when reboot is given on LC
+		2. On RP the reboot command should reboot the entire system (RP and LC). . Expectation is Peer node should detect link down when reboot is given on RP
+		3. Config shut/unshut of LC will be supported as per the Chassis-d design.


Not all platforms can support LC shut, as there might not be power control on the LC.

@abdosi will update the point from supervisor.

Useful when the LC is not reachable via SSH or console:

These commands are invoked from the supervisor for a given LC:
shut: power shut for the card (platform dependent; if not supported, return "not supported")
unshut: bring power back (platform dependent; if not supported, return an error)
reboot: can be a power-cycle or a CPU reset only (best effort per platform capability; if not supported, return an error)
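
The "platform dependent, else return an error" contract for these three operations could be sketched as follows. The class, its capability flags, and the exception are hypothetical names made up for this example; they are not the SONiC module API.

```python
class NotSupportedError(Exception):
    """Raised when the platform lacks the requested capability."""


class ExampleModule:
    """Hypothetical line-card handle as seen from the supervisor."""

    def __init__(self, name, can_power_control, can_cpu_reset):
        self.name = name
        self.can_power_control = can_power_control
        self.can_cpu_reset = can_cpu_reset
        self.powered = True

    def shut(self):
        if not self.can_power_control:
            raise NotSupportedError(f"{self.name}: shut not supported")
        self.powered = False

    def unshut(self):
        if not self.can_power_control:
            raise NotSupportedError(f"{self.name}: unshut not supported")
        self.powered = True

    def reboot(self):
        # Prefer a full power-cycle, fall back to a CPU reset, otherwise report an error.
        if self.can_power_control:
            return "power-cycle"
        if self.can_cpu_reset:
            return "cpu-reset"
        raise NotSupportedError(f"{self.name}: reboot not supported")
```

A caller on the supervisor would catch `NotSupportedError` and surface it to the operator, so the CLI behaves the same way on platforms with and without LC power control.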

What does reboot-cause show on the LC when the above options are invoked on the supervisor? Need to revisit/discuss.

config chassis startup <module-name>
config chassis shutdown <module-name>
reboot <module-name>

module-name here refers to an LC. Based on the 03/18 update (see below), FCs are also in scope.
Power-cycling an FC can also have implications for the kernel.
Does graceful FC handling also depend on kernel modules?
Do we support FC insertion/removal?

03/18: Update:

Possible steps for RMA of FC:

Isolate the Chassis (No Traffic)

Config chassis shut on FC (Enhancement possible: To see if we can have SONiC Service also gracefully stopped here. Need to check.)

Unplug the FC

Plug new FC (post-RMA)

config chassis startup on FC

Config reload on Supervisor

Steps to reload an FC (link/parity/cell errors are seen and the platform vendor recommends reloading the FC; not at the stage for RMA, but recoverable):

What about a bad LC ASIC? In such a case, action needs to be taken from the LC perspective based on the LC alert.
These are not frequent errors.

If N+1 redundancy:

Config chassis shut on FC (Enhancement possible: To see if we can have SONiC Service also gracefully stopped here. Need to check.)

config chassis startup on FC

else:
See the RMA process above.
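
As a compact restatement of the two procedures, the sketch below returns the ordered steps for a given FC. The function name and the plain-text step strings are illustrative only; the two CLI strings follow the `config chassis shutdown/startup <module-name>` form quoted earlier in this thread, and the module name used in the example call is a placeholder.

```python
def fc_recovery_steps(fc_name, has_n_plus_1_redundancy):
    """Return the ordered steps to reload an FC, or to RMA it when there is no redundancy."""
    if has_n_plus_1_redundancy:
        # Recoverable case: the remaining FCs carry traffic while this one reloads.
        return [
            f"config chassis shutdown {fc_name}",
            f"config chassis startup {fc_name}",
        ]
    # No redundancy: follow the full RMA-style procedure.
    return [
        "isolate the chassis (no traffic)",
        f"config chassis shutdown {fc_name}",
        f"unplug {fc_name}",
        "plug in the replacement FC",
        f"config chassis startup {fc_name}",
        "config reload on the Supervisor",
    ]


for step in fc_recovery_steps("FABRIC-CARD0", has_n_plus_1_redundancy=True):
    print(step)
```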

FC reload/shutdown scenario to be discussed separately.
Besides the above cases (graceful handling of FC shutdown/reload), another case to discuss:
If the chassis is running on fewer FC(s)/FC-NPUs, what is SONiC's expectation? Depending on the bandwidth impact: isolate the entire chassis, isolate (shut down) the impacted FC(s), or replace the impacted FC(s)?

reboot

Similar to this, there should be a 'shutdown' CLI to shut down a specified module (LC/FC).

config chassis startup
config chassis shutdown

The intent/goal of these commands is to do a config shutdown or startup (reload/bring-up) of the specified module.
Wondering why the 'chassis' keyword is there in these commands?

In my opinion, when an FC goes bad/down, a syslog should be generated. For a system that has extra FC redundancy, I think losing one FC does not impact the overall operation of the chassis. So in this case the syslog should raise an alarm so that a maintenance service can be scheduled to replace the FC. If another FC is lost, or the chassis has no FC redundancy, then the FC-loss syslog should clearly indicate that the chassis is running in degraded mode with expected traffic impact. This syslog should cause the alarm service to detect it and trigger mitigation steps (whether admin user intervention or automation to start isolating it). The chassis itself cannot tell whether there is redundancy built into the surrounding network, and if it triggers self-isolation it might cause more customer impact... This is just my own opinion...
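
The distinction drawn here, between a loss that redundancy absorbs and one that degrades the chassis, might be expressed as in the sketch below. The severity labels, message wording, and the notion of a "required" FC count are assumptions for illustration; actual thresholds would come from the platform vendor's matrix.

```python
def fc_loss_syslog(installed_fcs, required_fcs, failed_fcs):
    """Return (severity, message) for a fabric-card failure (illustrative only)."""
    healthy = installed_fcs - failed_fcs
    if healthy >= required_fcs:
        # Redundancy absorbs the loss: alarm for scheduled maintenance only.
        return ("WARNING",
                f"{failed_fcs} FC(s) down, redundancy absorbs the loss; schedule replacement")
    # Fewer healthy FCs than required: say so explicitly so alerting can mitigate.
    return ("CRITICAL",
            f"{failed_fcs} FC(s) down, chassis running in degraded mode; traffic impact expected")
```

Note that the function only reports; consistent with the opinion above, the decision to isolate stays with the alerting service or the administrator, not with the chassis.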

config chassis startup <module-name> config chassis shutdown <module-name> reboot <module-name>

Note that "reboot" is an ungraceful reload of the LC. It should not be expected that the Supervisor will orchestrate a graceful reboot of the LC. The Sup will simply cycle power to the LC.

abdosi · 2022-03-16T16:39:37Z

doc/pmon/pmon-chassis-requirements.md

+		7. HW-Watchdog adhering to current SONiC behavior. Start before reboot and explicitly disabled post reboot by SONiC (This means SONiC is booted up and Services are fine)
+		8. chassisd daemon support on both LC and RP with all fields of table "CHASSIS_MODULE_TABLE|xxxx” correctly populated
+		9. chassisd daemon support populating fields in table "CHASSIS_ASIC_TABLE|xxx", this is used to start swss/syncd in SUP when FABRIC ASIC is ready.
+        10. Slot Nummber in "CHASSIS_MODULE_TABLE|xxxx” need not be unique ? Slot Number is based on physcial layout (Ex: LC can be back facing and can have 0..n and  FC can be Front facing and be 0.n). chassisd can support this model ?


Physical slot ID needs to be unique and can use the sticker/label name defined by the given platform vendor. Use case: a technician identifying a given card by visual inspection.

FYI - this change requires changing the get_slot (in module.py) and get_supervisor_slot and get_my_slot (in chassis.py) PMON APIs to return a string instead of an int.

However, in the platform API sonic-mgmt tests this field is expected to be an int (example: https://github.com/Azure/sonic-mgmt/blob/master/tests/platform_tests/api/test_module.py#L334)

Do we modify the tests to accept either int or str for now, to allow backward compatibility until all vendors implement this enhancement?
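
A backward-compatible check of the kind being asked about could look like the sketch below. The helper name is hypothetical and this is not the code in sonic-mgmt; it only illustrates accepting both the legacy int and a vendor label string.

```python
def is_acceptable_slot(slot):
    """Accept a legacy integer slot or a non-empty sticker/label string."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(slot, bool):
        return False
    if isinstance(slot, int):
        return True
    return isinstance(slot, str) and slot.strip() != ""


for value in (3, "A", "", None):
    print(repr(value), is_acceptable_slot(value))
```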

abdosi · 2022-03-16T17:02:20Z

doc/pmon/pmon-chassis-requirements.md

+		9. chassisd daemon support populating fields in table "CHASSIS_ASIC_TABLE|xxx", this is used to start swss/syncd in SUP when FABRIC ASIC is ready.
+        10. Slot Nummber in "CHASSIS_MODULE_TABLE|xxxx” need not be unique ? Slot Number is based on physcial layout (Ex: LC can be back facing and can have 0..n and  FC can be Front facing and be 0.n). chassisd can support this model ?
+		10. psud power algorithm on supervisor as specified in chassis design document
+		11. PSU LED Status  in the show command of supervisor


Ref for Point 10: https://github.com/Azure/SONiC/blob/master/doc/pmon/pmon-chassis-design.md Please check if there is a platform API for setting the Master LED for the PSU. The API is there: https://github.com/Azure/sonic-platform-common/blob/master/sonic_platform_base/psu_base.py#L226

Ref for Point 11: show platform psustatus should display the current LED color (based on the current running status of the PSU).

abdosi · 2022-03-16T17:14:17Z

doc/pmon/pmon-chassis-requirements.md

+		11. PSU LED Status  in the show command of supervisor
+		12. TEMPERATURE_INFO table update into Chassis State DB from both Supervisor and LC. Local TEMPERATURE_INFO is also available in LC STATE_DB.
+		13. Fan speed algorithm on supervior as specified in chassis design document
+		14. FAN LED Status in the show command of supervisor


Fan tray LED display enhancement. Need to check if SONiC has a component for Fan-Tray/Fan-Drawer.
For now, a given vendor can overload this on the Fan LED.

abdosi · 2022-03-16T17:25:56Z

doc/pmon/pmon-chassis-requirements.md

+		12. TEMPERATURE_INFO table update into Chassis State DB from both Supervisor and LC. Local TEMPERATURE_INFO is also available in LC STATE_DB.
+		13. Fan speed algorithm on supervior as specified in chassis design document
+		14. FAN LED Status in the show command of supervisor
+		15. reboot-cause reason and history is working fine for both RP and LC


Ref: doc/system-telemetry/reboot-cause.md

Platform Reference: https://github.com/Azure/sonic-platform-common/blob/master/sonic_platform_base/chassis_base.py#L92

abdosi · 2022-03-16T17:34:44Z

doc/pmon/pmon-chassis-requirements.md

+		13. Fan speed algorithm on supervior as specified in chassis design document
+		14. FAN LED Status in the show command of supervisor
+		15. reboot-cause reason and history is working fine for both RP and LC
+		16. show commands for mid-plane switch as per Chassis Design Document. Add namespace parameter support for "show chassis midplane-status" command.  


Need to check. show ip interface -n <asic> -d all should display it.

abdosi · 2022-03-16T17:56:12Z

doc/pmon/pmon-chassis-requirements.md

+		4. Generate syslog for all the critical events and share the threshold (for appropriate/needed components)  in documents and recommended for given threshold range.  Expectation is we will bind syslog to our Alert Orchestration system and perform recommnded action based on the documents.
+		5. PCI-e issue of not able to detect FC ASIC’s and LC ASIC’s and syslog for same.
+		Integrate with pcied process in PMON[sonic-platform-daemons/pcied at master · Azure/sonic-platform-daemons (github.com)]. Note: Current PCI daemon polling for pci devices is 60sec which is large poll interval. Does it need optimization ?
+		6. Boot-up failure Handling. Need to see the SONiC behaviour from system perspective/docker status/syslog getting generated with required/correct information


This is more to understand the behavior and identify any test gaps in covering it.

abdosi · 2022-03-18T16:37:45Z

doc/pmon/pmon-chassis-requirements.md

+		16. show commands for mid-plane switch as per Chassis Design Document. Add namespace parameter support for "show chassis midplane-status" command.  
+
+2. Section2: General Chassis Enhancements that are needed:-
+		1. LC/FC Fabric Link down Handling


Data-path component. Not in scope of platform. The expectation is to have at least monitoring and alerts/syslog in such cases, along with the action that needs to be taken in such scenarios.

abdosi · 2022-03-18T16:54:19Z

doc/pmon/pmon-chassis-requirements.md

+
+2. Section2: General Chassis Enhancements that are needed:-
+		1. LC/FC Fabric Link down Handling
+		2. Module/Chassis/Board LED’s .  Need general infra enhancement of led daemon and show commands


A new design document is needed. Need to discuss in the SONiC community.

abdosi · 2022-03-18T17:10:49Z

doc/pmon/pmon-chassis-requirements.md

+		1. LC/FC Fabric Link down Handling
+		2. Module/Chassis/Board LED’s .  Need general infra enhancement of led daemon and show commands
+		3. LC/FC  operation status detection quicker using (get_change_event() notification handling to detect async card up/down events) rather than using current Polling Interval of 10 sec
+		4. Generic console for LC using . Possible using this: https://github.com/Azure/SONiC/blob/master/doc/console/SONiC-Console-Switch-High-Level-Design.md ?


Needs more analysis from platform vendors and will need to be revisited.

abdosi · 2022-03-18T17:13:20Z

doc/pmon/pmon-chassis-requirements.md

+		2. Module/Chassis/Board LED’s .  Need general infra enhancement of led daemon and show commands
+		3. LC/FC  operation status detection quicker using (get_change_event() notification handling to detect async card up/down events) rather than using current Polling Interval of 10 sec
+		4. Generic console for LC using . Possible using this: https://github.com/Azure/SONiC/blob/master/doc/console/SONiC-Console-Switch-High-Level-Design.md ?
+		5. Process for RMA the card (Fabric/LC). This is just a discussion to document correct process for doing so.


Platform vendor recommendation/guidance needed here.

abdosi · 2022-03-23T16:11:06Z

doc/pmon/pmon-chassis-requirements.md

+		3. LC/FC  operation status detection quicker using (get_change_event() notification handling to detect async card up/down events) rather than using current Polling Interval of 10 sec
+		4. Generic console for LC using . Possible using this: https://github.com/Azure/SONiC/blob/master/doc/console/SONiC-Console-Switch-High-Level-Design.md ?
+		5. Process for RMA the card (Fabric/LC). This is just a discussion to document correct process for doing so.
+		6. Monit check on the supervisor to check if the LCs are  reachable. This is to alert if the linecard is down. Do we need Monit here or use above 10 sec polling ?


If CHASSIS_MIDPLANE_TABLE has the information, Monit can read it from there; otherwise we need to see if we can push it to the DB.
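
Assuming the table does carry reachability, the check itself is small. In the sketch below a plain dict stands in for the DB table; the `ip_address` and `access` field names, the `"True"`/`"False"` string values, the module keys, and the addresses are assumptions made for this example and would need to be confirmed against the chassis design document.

```python
def unreachable_modules(midplane_table):
    """Return the sorted names of modules whose midplane access is not reported as 'True'."""
    return sorted(
        name
        for name, fields in midplane_table.items()
        if fields.get("access") != "True"
    )


# Hypothetical snapshot of the table contents.
snapshot = {
    "LINE-CARD0": {"ip_address": "192.0.2.10", "access": "True"},
    "LINE-CARD1": {"ip_address": "192.0.2.11", "access": "False"},
    "LINE-CARD2": {"ip_address": "192.0.2.12"},  # missing field is treated as unreachable
}

print(unreachable_modules(snapshot))
```

A Monit-style check would run something like this periodically on the supervisor and raise an alert when the returned list is non-empty.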

abdosi · 2022-03-23T16:26:54Z

doc/pmon/pmon-chassis-requirements.md

+		4. Generic console for LC using . Possible using this: https://github.com/Azure/SONiC/blob/master/doc/console/SONiC-Console-Switch-High-Level-Design.md ?
+		5. Process for RMA the card (Fabric/LC). This is just a discussion to document correct process for doing so.
+		6. Monit check on the supervisor to check if the LCs are  reachable. This is to alert if the linecard is down. Do we need Monit here or use above 10 sec polling ?
+		7. Handling of parallel reboot of linecard and supervisor. This should not result in the chassis/linecard to go down or unreachable. (Mention by Arvind) . If we follow Section 1 Point 2 this should           be handled ? 


Need to add more test cases to cover the different scenarios here.

abdosi · 2022-03-23T16:28:25Z

doc/pmon/pmon-chassis-requirements.md

+		5. Process for RMA the card (Fabric/LC). This is just a discussion to document correct process for doing so.
+		6. Monit check on the supervisor to check if the LCs are  reachable. This is to alert if the linecard is down. Do we need Monit here or use above 10 sec polling ?
+		7. Handling of parallel reboot of linecard and supervisor. This should not result in the chassis/linecard to go down or unreachable. (Mention by Arvind) . If we follow Section 1 Point 2 this should           be handled ? 
+		8. Mechanism to recover an down/unreachable linecard without power-cycle or reboot of the whole chassis.


config chassis startup <module-name> ==> Power on LC (if platform can do it)
config chassis shutdown <module-name> ==> Power off LC (if platform can do it)
reboot <module-name> ==> Power on/off toggle for LC (if platform can do it) or CPU reset toggle for LC

Worst case, we need to power-cycle the chassis from an external agent.

For clarity to platforms/vendors: though these commands are under "config", they are only executed (and not saved) until config save is issued.

abdosi · 2022-03-23T16:49:34Z

doc/pmon/pmon-chassis-requirements.md

+		6. Monit check on the supervisor to check if the LCs are  reachable. This is to alert if the linecard is down. Do we need Monit here or use above 10 sec polling ?
+		7. Handling of parallel reboot of linecard and supervisor. This should not result in the chassis/linecard to go down or unreachable. (Mention by Arvind) . If we follow Section 1 Point 2 this should           be handled ? 
+		8. Mechanism to recover an down/unreachable linecard without power-cycle or reboot of the whole chassis.
+		9. Enhance "Show chassis module status" command for linecard should display hostname iso of generic names like LINECARD1


"show chassis module status" is a platform-specific command. May need another command/enhancement in SONiC.

abdosi · 2022-03-23T16:54:19Z

doc/pmon/pmon-chassis-requirements.md

+		7. Handling of parallel reboot of linecard and supervisor. This should not result in the chassis/linecard to go down or unreachable. (Mention by Arvind) . If we follow Section 1 Point 2 this should           be handled ? 
+		8. Mechanism to recover an down/unreachable linecard without power-cycle or reboot of the whole chassis.
+		9. Enhance "Show chassis module status" command for linecard should display hostname iso of generic names like LINECARD1
+		10. Support "show system-health detail/monitor-list/summary" commands in RP/LC


Ref: https://github.com/Azure/SONiC/blob/e1744f1ff05916d2b61b429f67e569fb3e29d0a6/doc/system_health_monitoring/system-health-HLD.md

abdosi · 2022-03-23T17:05:05Z

doc/pmon/pmon-chassis-requirements.md

+
+3. Section3 : Enhancements based on Significat Design Changes 
+		1. Auto Handling by Platfrom SW to reboot/shutdown the HW Component when detecting the critical Fault’s.
+		2. Temperature Measuring Category Enhancements. More Granular and Increase Polling Interval for same. Also show command optimize not dump all sesors and filter based on location


Need SONiC Design Document. Cisco can propose something on this.

abdosi · 2022-03-23T17:13:25Z

doc/pmon/pmon-chassis-requirements.md

+3. Section3 : Enhancements based on Significat Design Changes 
+		1. Auto Handling by Platfrom SW to reboot/shutdown the HW Component when detecting the critical Fault’s.
+		2. Temperature Measuring Category Enhancements. More Granular and Increase Polling Interval for same. Also show command optimize not dump all sesors and filter based on location
+		3. Move Voltage and Current sensors support from existing sensorsd/libsensors model to PMON/ thermalCtld model This provide Ability/mechanism in SONiC NOS to poll for board’s Voltage and Current sensors (from platform) for power alogorithm.


Need SONiC Design Document. Cisco can propose something on this.

abdosi · 2022-03-23T17:17:32Z

doc/pmon/pmon-chassis-requirements.md

+		1. Auto Handling by Platfrom SW to reboot/shutdown the HW Component when detecting the critical Fault’s.
+		2. Temperature Measuring Category Enhancements. More Granular and Increase Polling Interval for same. Also show command optimize not dump all sesors and filter based on location
+		3. Move Voltage and Current sensors support from existing sensorsd/libsensors model to PMON/ thermalCtld model This provide Ability/mechanism in SONiC NOS to poll for board’s Voltage and Current sensors (from platform) for power alogorithm.
+        4. Midplane Switch Counters (Debugging) /Modifying QOS Properties if needed (Performance) 


Each platform vendor can provide a document on how to debug midplane drops and any optimization that we need to do.

Based on PR sonic-net/SONiC#945, we should return the sticker/label name on the chassis for the physical slot ID in the get_supervisor_slot PMON API and the 'show chassis module status' command. For Nokia linecards, the sticker label for the supervisor is 'A'. Thus we need to allow a string as a possible return value as well, in addition to int.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

abdosi and others added 8 commits November 8, 2021 22:55

Order ECMP HLD.

8ea680a

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

Added Picture

8ce9566

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

Update ordered_ecmp_next_hop_hld.md

06f3e94

Update ordered_ecmp_next_hop_hld.md

c7da583

Address Review Comment

bfed708

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

Merge branch 'order_ecmp' of https://github.com/abdosi/SONiC into ord…

a18bf56

…er_ecmp

Merge remote-tracking branch 'upstream/master' into order_ecmp

e74fd46

Added Requirements for PMON for chassis based systems.

f206d64

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>


abdosi changed the title ~~PMON~~ SONiC Chassis Platform Requirements and Enhancements Analysis Mar 2, 2022


Update pmon-chassis-requirements.md

68db94a


sanmalho-git mentioned this pull request Apr 15, 2022

Support both str and int type for slot_id received in pmon API sonic-net/sonic-mgmt#5518

Merged

5 tasks

yxieca force-pushed the master branch 2 times, most recently from 8498931 to 8837dc2 Compare April 15, 2022 16:51

abdosi added the chassis label Apr 21, 2022

rlhui assigned abdosi Mar 12, 2023

bmridul mentioned this pull request Oct 26, 2023

Add platform test for kdump sonic-net/sonic-mgmt#10047

Merged

6 tasks

abdosi added 2 commits July 6, 2024 02:35

Updated to MD5 format

8157977

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

More fix

97e178f

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

rlhui approved these changes Jul 6, 2024

View reviewed changes

rlhui merged commit 66af3be into sonic-net:master Jul 6, 2024
1 check passed
