-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ntp] enable/disable NTP long jump according to reboot type #4577
Conversation
- Enable NTP long jump after cold reboot. - Disable NTP long jump after warrm/fast reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com>
retest this please |
files/image_config/ntp/ntp-config.sh
Outdated
|
||
function get_database_reboot_type() | ||
{ | ||
SYSTEM_WARM_START=`/usr/bin/redis-cli -n 6 hget "WARM_RESTART_ENABLE_TABLE|system" enable` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use sonic-db-cli for multi-db support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'd suggest to use syslog as addition to all echo output.
- Please check my another comment.
What was exactly the issue caused by NTP long jump? Could that happen even if NTP long jump disabled? I feel we should fix the source of the issue rather than modifying NTP config. |
Unfortunately using monotonic clock when checking time difference is not a well established behavior. We fixed the known issues. But there are open source code that we don't have control over. We are afraid similar issue might happen in the future. |
|
That is fair, but I still failed to understand the issue itself. which daemons caused the issues and what were the symptoms? why cold boot is fine? and why disable NTP long jump is guaranteed to work? Sorry for too many questions. :-) |
This fix/issue: sonic-net/sonic-sairedis#613 |
* [ntp] enable/disable NTP long jump according to reboot type - Enable NTP long jump after cold reboot. - Disable NTP long jump after warrm/fast reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com> * fix typo * further refactoring * use sonic-db-cli instead
* [201911][devices] skip_fancontrol for wedge 100 barefoot platforms (sonic-net#4528) * [device] DellEMC s5232f 50G hwsku support (sonic-net#4525) * [device] DellEmc S5232 support for new hwsku C8D48 8 100G ports and 48 50G ports * 10G ports update for S5232 hwsku-C8D48 Signed-off-by: Srideep Devireddy <srideep_devireddy@dell.com> * DellEMC S6000 updated sensors.conf (sonic-net#4568) Change PSU MAX temperature to 80 degree Change tmp75 sensors default temperature value from 25/50 to 70/80 degree. * [sonic-slave-stretch]: install same version for docker-ce and docker-ce-cli difference versions can cause compatibility issue between the server and client Signed-off-by: Guohan Lu <lguohan@gmail.com> * [baseimage]: install same version for docker-ce and docker-ce-cli Signed-off-by: Guohan Lu <lguohan@gmail.com> * [FRR]: Update frr to latest 7.2.1-s3 (sonic-net#4294) - Updated to latest frr 7.2.1 from the master. - Updated patches accordingly * [sonic-buildimage] updated minigraph for ACL Table data and ACL Interface Binding for Multi-NPU platforms (sonic-net#4491) * [sonic-buildimage] updated minigraph for ACL Table data and ACL Interface binding update for multu-npu platform based on subrole as "Frontend" or "Backend". For backend npu no ACL table is associated. For frontend npu only front-panel interface are associated. Updated with test case and fix typo in sample-mingraph for npu Address Review comments Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Fixed the logic as per preview comment. Interface Filter logic only applies to Everflow/Mirror tables. * Address Review Comments. * Changes for LLDP docker to support multi-npu platforms (sonic-net#4530) * Changes for LLDP for Multi NPU Platoforms:- a) Enable LLDP for Host namespace for Management Port b) Make sure Management IP is avaliable in per asic namespace needed for LLDP Chassis configuration c) Make sure chassis mac-address is correct in per asic namespace d) Do not run lldp on eth0 of per asic namespace and avoid chassis configuration for same e) Use Linux hostname instead from Device Metadata for lldp chassis configuration since in multi-npu platforms device metadata hostname will be differnt Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address Review Comment with following changes: a) Use Device Metadata hostname even in per namespace conatiner. updated minigraph parsing for same to have hostname as system hostname and add new key for asic name b) Minigraph changes to have MGMT_INTERFACE Key in per asic/namespace config also as needed for LLDP for setting chassis management IP. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address Review Comments * Moved utility functions for multi-npu platforms from sonic-utilities to sonic_device_util.py (sonic-net#4559) * Moved utility functions for multi-npu platforms from sonic-utilities config/main.py to here so that they can be used any module * Fix the issue with test run during compilation with acl-uploader PR#908 of sonic-utilities. * Fix get_num_npu as it was retuning string and not int * Address Review Comments * Address Review Comments * Fix for issue where image is compile with flag ENABLE_DHCP_GRAPH_SERVICE (sonic-net#4573) and then we load image and reboot even if there was existing config_db.json we will look for DHCP Service. we should disbale update_graph in such cases. This behaviour is silimar to what we have in 201811 image. * Change to enable redistribute connected on Frontend asics instead of backend asics (sonic-net#4588) Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com> * [DellEMC] S6000 Disable Low power mode by default (sonic-net#4592) * [BFN] Updated Barefoot SDK to 2020-05-07 (sonic-net#4566) Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com> * [minigraph] Add tags for egress mirror tables (sonic-net#4526) Signed-off-by: Danny Allen <daall@microsoft.com> * [Submodule update] sonic-utlities with PR's [201911][show] Fix abbreviations for 'show ip bgp ...' commands (sonic-net#909) Changes to support acl-loader and mirror-session config commands for multi-npu platforms. (sonic-net#908) Changes to commands config reload/load-minigraph (sonic-net#919) Stop/Start restapi server upon config reload (sonic-net#911) [config] Add 'interface transceiver' subgroup with 'lpmode' and 'reset' subcommands (sonic-net#904) * [minigraph] Support FECDisabled in minigraph parser (sonic-net#4556) (sonic-net#4624) Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com> * [ntp] enable/disable NTP long jump according to reboot type (sonic-net#4577) * [ntp] enable/disable NTP long jump according to reboot type - Enable NTP long jump after cold reboot. - Disable NTP long jump after warrm/fast reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com> * fix typo * further refactoring * use sonic-db-cli instead * [arista]: remove the soc property disabling sram scan (sonic-net#4623) * Changes to support config-setup service for multi-npu (sonic-net#4609) * Changes to support config-setup service for multi-npu platforms. For Multi-npu we are not supporting as of now config initializtion and ZTP. It will support creating config db from minigraph or using config db from previous file system Signed-off-by: Abhishek Dosi <abdosi@microsoft.com> * Address Review Comments. * Address Review comments * Address Review Comments of using pyhton based config load_minigraph/ config save/config reload from shell scripts so that we don't duplicate code. Also while running from shell we will skip stop/start services done by those commands. * Updated to use python command so no code duplication. * [config]: Fix the device type and internal bgp session status for multi NPU platforms (sonic-net#4600) * The following changes for multi-npu platforms are done - Set the type in device_metadata for asic configuration to be same as host - Set the admin-status of internal bgp sessions as up Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com> * Adding new BGP peer groups PEER_V4_INT and PEER_V6_INT. (sonic-net#4620) * Adding new BGP peer groups PEER_V4_INT and PEER_V6_INT. The internal BGP sessions will be added to this peer group while the external BGP sessions will be added to the exising PEER_V4 and PEER_V6 peer group. * Check for "ASIC" keyword in the hostname to identify the internal neighbors. * [submodule update] sonic-swss with PR [vnet] Fix IP2ME route creation logic for BITMAP VNET interface (sonic-net#1284) * [submodule update] sonic-util Revert "[config] Add 'interface transceiver' subgroup with 'lpmode' and 'reset' subcommands (sonic-net#904)" Multi-asic changes for config bgp commands and utilities. (sonic-net#910) * [submodule update] sonic-rest API's PR#39 Setup module versioning Add support for get all Vlans (#37) * Update golang version for 1.11.5 to 1.14.2 (sonic-net#4520) Co-authored-by: Myron Sosyak <49795530+msosyak@users.noreply.github.com> Co-authored-by: Srideep <srideep_devireddy@dell.com> Co-authored-by: paavaanan <paavaanan_t_n@dell.com> Co-authored-by: Guohan Lu <lguohan@gmail.com> Co-authored-by: pavel-shirshov <pavelsh@microsoft.com> Co-authored-by: abdosi <58047199+abdosi@users.noreply.github.com> Co-authored-by: arlakshm <55814491+arlakshm@users.noreply.github.com> Co-authored-by: Santhosh Kumar T <53558409+santhosh-kt@users.noreply.github.com> Co-authored-by: Andriy Kokhan <43479230+akokhan@users.noreply.github.com> Co-authored-by: Danny Allen <daall@microsoft.com> Co-authored-by: Abhishek Dosi <abdosi@microsoft.com> Co-authored-by: Qi Luo <qiluo-msft@users.noreply.github.com> Co-authored-by: Ying Xie <yxieca@users.noreply.github.com> Co-authored-by: Samuel Angebault <staphylo@arista.com> Co-authored-by: judyjoseph <53951155+judyjoseph@users.noreply.github.com>
- Why I did it
NTP long jump could still causing unexpected failures. It is more dangerous after fast/warm reboot when reliability is on the line.
- How I did it
Signed-off-by: Ying Xie ying.xie@microsoft.com
- How to verify it
cold reboot:
May 12 04:20:12.791159 str-7260cx3-acs-1 INFO systemd[1]: Starting Update NTP configuration...
May 12 04:20:13.928988 str-7260cx3-acs-1 INFO ntp-config.sh[3829]: Enabling NTP long jump for reboot type cold ...
admin@str-7260cx3-acs-1:$ cat /etc/default/ntp
NTPD_OPTS='-g'
warm reboot:
May 12 04:37:45.971592 str-7260cx3-acs-1 INFO systemd[1]: Starting Update NTP configuration...
May 12 04:37:47.329887 str-7260cx3-acs-1 INFO ntp-config.sh[3933]: Disabling NTP long jump for reboot type warm ...
admin@str-7260cx3-acs-1:$ cat /etc/default/ntp
NTPD_OPTS='-x'
fast reboot:
May 12 04:40:39.773005 str-7260cx3-acs-1 INFO systemd[1]: Starting Update NTP configuration...
May 12 04:40:40.860081 str-7260cx3-acs-1 INFO ntp-config.sh[3787]: Disabling NTP long jump for reboot type fast ...
admin@str-7260cx3-acs-1:$ cat /etc/default/ntp
NTPD_OPTS='-x'