-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[config] Adding sanity checks for config reload #1664
Conversation
This PR should fix sonic-net/sonic-buildimage#7132 |
/azp run |
|
||
def _system_running(): | ||
out = clicommon.run_command("sudo systemctl is-system-running", return_cmd=True) | ||
return out.strip() == "running" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While verifying PR sonic-net/sonic-buildimage#8003. I hit this condition where I was I was not able to do "config reload". Even after system is up for a while the systemctl was showing running state as degraded
admin@str-s6000-acs-9:~$ sudo systemctl is-system-running
degraded
admin@str-s6000-acs-9:~$ sudo config reload -y
System is not up. Retry later or use -f to avoid system checks
admin@str-s6000-acs-9:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8baf489477e6 docker-snmp:latest "/usr/local/bin/supe…" 18 minutes ago Up 2 minutes snmp
ad835c8b42ba docker-sonic-mgmt-framework:latest "/usr/local/bin/supe…" 18 minutes ago Up 2 minutes mgmt-framework
930f188f20a7 docker-router-advertiser:latest "/usr/bin/docker-ini…" 20 minutes ago Up 5 minutes radv
15fd0337e6ea docker-dhcp-relay:latest "/usr/bin/docker_ini…" 20 minutes ago Up 5 minutes dhcp_relay
c37a6752f57f docker-lldp:latest "/usr/bin/docker-lld…" 20 minutes ago Up 5 minutes lldp
79cb8660e7d2 docker-syncd-brcm:latest "/usr/local/bin/supe…" 20 minutes ago Up 5 minutes syncd
77e8ca15b220 docker-teamd:latest "/usr/local/bin/supe…" 20 minutes ago Up 5 minutes teamd
ce63284a0f0a docker-orchagent:latest "/usr/bin/docker-ini…" 20 minutes ago Up 5 minutes swss
277e43234314 docker-fpm-frr:latest "/usr/bin/docker_ini…" 20 minutes ago Up 5 minutes bgp
d9c63106202a docker-platform-monitor:latest "/usr/bin/docker_ini…" 21 minutes ago Up 5 minutes pmon
34129d689ebd docker-database:latest "/usr/local/bin/dock…" 21 minutes ago Up 6 minutes database
I have seen cases where system remains in degraded state as may be one of the systemd service is not started. @dgsudharsan do share your observations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@judyjoseph I believe we should try to fix the service that is in the degraded state. The whole point of introducing these checks were to avoid issues around config reload which happen when the system was degraded. Until the faulty services are fixed we can expect config reload without force not to work. However to give a handle we introduced '-f' option which lets user to take own risk and perform config reload when the system may not be in a stable state.
Fixed config reload to add some system sanity checks 1) To check if the system is in running state 2) Check if the services which are grouped under delayed target up 3) Check if swss is running for at least 120 seconds To force config reload and to avoid these checks an extra option -f/--force is added
Updating command reference reflecting the latest config reload behavior #### What I did Update command reference for the config reload behavior. These changes were introduced by #1664
Updating command reference reflecting the latest config reload behavior #### What I did Update command reference for the config reload behavior. These changes were introduced by #1664
Updating command reference reflecting the latest config reload behavior #### What I did Update command reference for the config reload behavior. These changes were introduced by sonic-net/sonic-utilities#1664
DO NOT MERGE UNTIL sonic-net/sonic-buildimage#7846 is merged.
What I did
Fixed config reload to add some system sanity checks
1) To check if the system is in running state
2) Check if the services which are grouped under delayed target up
3) Check if swss is running for at least 120 seconds
To force config reload and to avoid these checks an extra option -f/--force is added
How I did it
Added the above mentioned checks in config reload flow
How to verify it
Perform config reload after reboot, between system up and delayed services up, immediately after another config reload and running two config reload in parallel. In all the cases the config reload should not execute and throw appropriate error messages.
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)
When system is not up
sonic:~#config reload
System is not up
When delayed services are not up
sonic:~#config reload
Services are not up
When swss is not up for at least 120 seconds
sonic:~# config reload -y
SWSS is not ready