BGP sessions are not established for both IPv4 and IPv6 due to IPv4&6 addresses on all Port channel lost after config reload #3043
Comments
I found something strange when the IP addresses on the PortChannels are lost; see below. In a config reload where the IP addresses are lost, teammgrd is stopped and started twice, which I suspect causes the addresses that were just set to be lost. In a normal config reload, teamd is stopped and started only once, and the IP addresses are not lost. The whole log is attached to the issue on GitHub; the entries below are extracted from it.
TEAMD container stopped
TEAMD container restarting
teammgrd & teamsyncd stopped
teammgrd starting for the first time
intfmgrd setting the ip addresses for PortChannels
Here, teamd is stopped and started a second time, which may cause the addresses to be lost.
After teammgrd starts for the second time, it may re-issue the commands that destroy and re-create the port channels, so anything already set on those port channels, including the IP addresses, is lost.
What the team-related part of config reload looks like in a normal procedure:
team stuff stopped
teammgrd started
intfmgrd started
setting ip addresses
No second teammgrd restart is found.
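To compare a failing run against a normal one, it can help to count how many times each daemon starts in the extracted syslog. The following is only a rough sketch: the regex is a hypothetical placeholder, since the exact wording of the start messages depends on the image, and the helper names are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern: adjust it to whatever your syslog actually prints
# when a daemon starts. It is NOT taken from the attached log.
DEFAULT_START_PATTERN = r"(?P<daemon>teammgrd|teamsyncd|intfmgrd)\b.*\bstart(?:ed|ing)\b"


def count_daemon_starts(log_lines, pattern=DEFAULT_START_PATTERN):
    """Count start events per daemon in an iterable of syslog lines."""
    rx = re.compile(pattern, re.IGNORECASE)
    counts = Counter()
    for line in log_lines:
        match = rx.search(line)
        if match:
            counts[match.group("daemon").lower()] += 1
    return counts


def restarted_more_than_once(log_lines, daemon="teammgrd",
                             pattern=DEFAULT_START_PATTERN):
    """Return True if the given daemon shows more than one start event."""
    return count_daemon_starts(log_lines, pattern)[daemon] > 1
```

Feeding it the lines of one config reload (for example `restarted_more_than_once(open("syslog"))`) would flag the runs where teammgrd starts twice, which is the pattern suspected above.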
Fix in #3114
The issue of some Docker containers being stopped/started twice has been fixed by "[config] Do not stop or restart dependent services when reloading config" #582.
Description
After issuing a "config reload", the IPv4 and IPv6 addresses on all PortChannels are lost in the kernel protocol stack. "ifconfig PortChannelxxx" shows only the IPv6 link-local address.
Everything seems to go well on the SONiC side:
intfmgr has handled the CONFIG_DB update and issued the "ip address add xxx dev PortChannelxxx" command (which adds the address to the kernel). According to the log, no error is reported for this command and no command that deletes an address has been issued.
orchagent and SAI also work well. The newly added IP address has been propagated to APPL_DB and the SAI side.
I guess there might be some race conditions.
probability of reproducing: less than 20%.
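Because the failure is intermittent, a single clean reload says little. As a back-of-the-envelope sketch, assuming each reload fails independently with the same probability p (an assumption, not something the logs establish), the number of reloads n needed to see at least one failure with confidence c follows from 1 - (1 - p)^n >= c. The function name below is made up for illustration.

```python
import math


def reloads_needed(p_fail, confidence=0.95):
    """Smallest n with 1 - (1 - p_fail)**n >= confidence.

    Assumes independent attempts with a constant failure probability,
    which is a simplification for a race condition.
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


print(reloads_needed(0.20))  # 14
print(reloads_needed(0.10))  # 29
```

Since the observed rate is below 20%, 14 reloads is a lower bound under these assumptions; a lower real rate needs more attempts.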
Steps to reproduce the issue:
in testbed with topo t0 or t1-lag, execute:
sudo config reload
the config_db.json is a default config file for t0 or t1-lag topo.
Describe the results you received:
The IP addresses on all PortChannels are lost; only the IPv6 link-local address remains.
admin@mtbc-sonic-01-2410:/proc/net$ sudo ifconfig PortChannel0001
PortChannel0001: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9100
inet6 fe80::9a03:9bff:fef3:f500 prefixlen 64 scopeid 0x20
ether 98:03:9b:f3:f5:00 txqueuelen 1000 (Ethernet)
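A check like the one above can be automated when looping over many reloads. This is a minimal sketch that parses ifconfig-style text of the form shown here (`inet <addr>` / `inet6 <addr>`); the function names are hypothetical, and older net-tools output such as `inet addr:...` is not handled.

```python
import ipaddress
import re


def routable_addresses(ifconfig_output):
    """Return the non-link-local addresses found in ifconfig-style text."""
    found = []
    for match in re.finditer(r"\binet6?\s+(\S+)", ifconfig_output):
        # Drop an optional prefix length or zone id before parsing.
        token = match.group(1).split("/")[0].split("%")[0]
        try:
            addr = ipaddress.ip_address(token)
        except ValueError:
            continue  # not a plain address, skip it
        if not addr.is_link_local:
            found.append(str(addr))
    return found


def addresses_lost(ifconfig_output):
    """True when only link-local addresses (or none at all) remain."""
    return not routable_addresses(ifconfig_output)
```

With the PortChannel0001 output above, `addresses_lost` returns True because `fe80::9a03:9bff:fef3:f500` is link-local and no other address is present.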
Describe the results you expected:
after config reload, the ip addresses should not be lost.
Additional information you deem important (e.g. issue happens only occasionally):
***reproducing log
syslog
related dmesg
admin@mtbc-sonic-01-2410:~$ show version
SONiC Software Version: SONiC.HEAD.7-d67c6d4b
Distribution: Debian 9.9
Kernel: 4.9.0-8-2-amd64
Build commit: d67c6d4
Build date: Sun Jun 16 09:18:35 UTC 2019
Built by: johnar@jenkins-worker-4