Fast-reboot: add a new flag to ignore ASIC checksum verification failure #1292

Merged 1 commit on Dec 7, 2020

vaibhavhd
Contributor

- What I did
This fixes/enhances the issue sonic-net/sonic-buildimage#5972.

With the force flag (-f), warm-reboot ignores both an ASIC config checksum mismatch and an orchagent RESTARTCHECK failure.
There are use cases where the checksum verification should be ignored but the orchagent pause check should still be enforced.

- How I did it
Added a new option (-i) to the fast-reboot script that ignores ASIC checksum verification failures.
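
To illustrate the intent, here is a minimal sketch of how the two flags could gate the two checks independently. It is not the code from this PR: the function names (`parse_flags`, `handle_asic_check`, `handle_orchagent_check`) and the variable names are hypothetical, and the sketch assumes that only -i suppresses the checksum failure while -f continues to cover the orchagent check. The PR description does not say whether -f still implies ignoring the checksum.

```shell
#!/bin/bash
# Hypothetical sketch; names are illustrative and not taken from the fast-reboot script.

FORCE="no"
IGNORE_ASIC="no"

# Parse only the two flags discussed here: -f (force) and -i (ignore ASIC checksum).
parse_flags() {
    local opt
    OPTIND=1
    FORCE="no"
    IGNORE_ASIC="no"
    while getopts "fi" opt; do
        case "${opt}" in
            f) FORCE="yes" ;;
            i) IGNORE_ASIC="yes" ;;
            *) return 2 ;;
        esac
    done
}

# $1: exit code of the ASIC checksum check (0 means the checksums match).
# Returns 0 to continue the reboot, 1 to cancel it.
# Assumption: only -i suppresses this failure.
handle_asic_check() {
    local rc="$1"
    if [[ "${rc}" -eq 0 ]]; then
        return 0
    fi
    if [[ "${IGNORE_ASIC}" == "yes" ]]; then
        echo "Ignoring ASIC config checksum failure..."
        return 0
    fi
    echo "ASIC config may have changed: errno=${rc}"
    return 1
}

# $1: exit code of the orchagent restart check (0 means success).
# Returns 0 to continue the reboot, 1 to cancel it.
# Assumption: -f suppresses this failure, -i does not.
handle_orchagent_check() {
    local rc="$1"
    if [[ "${rc}" -eq 0 || "${FORCE}" == "yes" ]]; then
        return 0
    fi
    echo "RESTARTCHECK failed: errno=${rc}"
    return 1
}
```

Under these assumptions, `parse_flags -i` lets a checksum mismatch pass while a failed orchagent check still cancels the reboot, which is the use case described above.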

- How to verify it

Reproduced the issue locally:

root@sonic:~# warm-reboot -vvv
Sat 05 Dec 2020 02:14:38 AM UTC Saving counters folder before warmboot...
ASIC config may have changed: errno=1
Sat 05 Dec 2020 02:14:40 AM UTC warm-reboot failure (1) cleanup ...
Sat 05 Dec 2020 02:14:41 AM UTC Cancel warm-reboot: code (1)
root@sonic:~# 

With the local change, passing -i causes the ASIC configuration checksum verification failure to be ignored:

root@sonic:~# warm-reboot -h
Usage: warm-reboot [options]
    -h,-? : get this help
    -v    : turn on verbose
    -f    : force execution
    -i    : ignore MD5-checksum-verification of ASIC configuration files
    -r    : reboot with /sbin/reboot
    -k    : reboot with /sbin/kexec -e [default]
    -x    : execute script with -x flag
    -c    : specify control plane assistant IP list
    -s    : strict mode: do not proceed without:
            - control plane assistant IP list.
root@sonic:~# 
root@sonic:~# warm-reboot -vvvi
Sat 05 Dec 2020 02:14:46 AM UTC Saving counters folder before warmboot...
Sat 05 Dec 2020 02:14:47 AM UTC Ignoring ASIC config checksum failure...
Sat 05 Dec 2020 02:14:48 AM UTC Pausing orchagent ...
Sat 05 Dec 2020 02:14:48 AM UTC Collecting logs to check ssd health before warm-reboot...
Sat 05 Dec 2020 02:14:49 AM UTC Stopping nat ...
Dumping conntrack entries failed
Error response from daemon: Cannot kill container: nat: No such container: nat
Sat 05 Dec 2020 02:14:49 AM UTC Stopped nat ...
Sat 05 Dec 2020 02:14:49 AM UTC Stopping radv service...
Sat 05 Dec 2020 02:14:50 AM UTC Stopped radv service...
Sat 05 Dec 2020 02:14:50 AM UTC Stopping bgp ...
Sat 05 Dec 2020 02:14:52 AM UTC Stopped bgp ...
Sat 05 Dec 2020 02:14:54 AM UTC Stopping swss service ...
Sat 05 Dec 2020 02:15:03 AM UTC Stopped swss service ...
Sat 05 Dec 2020 02:15:03 AM UTC Initialize pre-shutdown ...
Sat 05 Dec 2020 02:15:04 AM UTC Requesting pre-shutdown ...
Sat 05 Dec 2020 02:15:04 AM UTC Waiting for pre-shutdown ...
Sat 05 Dec 2020 02:15:05 AM UTC Pre-shutdown succeeded ...
Sat 05 Dec 2020 02:15:05 AM UTC Backing up database ...
Sat 05 Dec 2020 02:15:07 AM UTC Stopping teamd ...
Sat 05 Dec 2020 02:15:08 AM UTC Stopped teamd ...
Sat 05 Dec 2020 02:15:08 AM UTC Stopping syncd ...
Sat 05 Dec 2020 02:15:20 AM UTC Stopped syncd ...
Sat 05 Dec 2020 02:15:20 AM UTC Stopping all remaining containers ...
Warning: Stopping mgmt-framework.service, but it can still be activated by:
  mgmt-framework.timer
Sat 05 Dec 2020 02:15:22 AM UTC Stopped all remaining containers ...
Sat 05 Dec 2020 02:15:24 AM UTC updating ssd fw forwarm-reboot
Sat 05 Dec 2020 02:15:24 AM UTC Enabling Watchdog before warm-reboot
Watchdog armed for 180 seconds
Sat 05 Dec 2020 02:15:25 AM UTC Running x86_64-dell_s6100_c2538-r0 specific plugin...
Sat 05 Dec 2020 02:15:25 AM UTC Rebooting with /sbin/kexec -e to SONiC-OS-20181130.85 ...
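
For context on what -i skips: the help text describes the check as an MD5 checksum verification of ASIC configuration files. The following is a generic sketch of that kind of check using `md5sum`, not SONiC's actual verification script. The function names and the checksum-file layout are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical sketch: function names and file layout are assumptions,
# not SONiC's actual implementation or paths.

# Record MD5 checksums of the given config files.
# $1: checksum file to write; remaining arguments: config files to record.
save_checksums() {
    local sumfile="$1"
    shift
    md5sum "$@" > "${sumfile}"
}

# Verify the recorded files against their stored checksums.
# Returns 0 if every file still matches, non-zero otherwise.
verify_checksums() {
    local sumfile="$1"
    md5sum --status -c "${sumfile}"
}
```

A caller would record the checksums once, run `verify_checksums` before the reboot, and treat a non-zero exit code as "ASIC config may have changed".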

- Previous command output (if the output of a command-line utility has changed)

- New command output (if the output of a command-line utility has changed)

@vaibhavhd vaibhavhd requested a review from yxieca December 5, 2020 02:42
@vaibhavhd vaibhavhd self-assigned this Dec 5, 2020
@vaibhavhd vaibhavhd merged commit 326e534 into sonic-net:master Dec 7, 2020
@vaibhavhd vaibhavhd deleted the fast-boot-checksum-ignore branch December 7, 2020 17:00
anand-kumar-subramanian pushed a commit to anand-kumar-subramanian/sonic-utilities that referenced this pull request Mar 2, 2021
…on failures (sonic-net#1292)

To fix the issue sonic-net/sonic-buildimage#5972
warm-reboot with force flag ignores ASIC config checksum mismatch along with orchagent RESTARTCHECK failure.
This commit accounts for a use case when checksum-verification should be ignored but orchagent pause check should not be ignored.
The change is to add a new option in fast-reboot script to ignore ASIC checksum verification failures.