docker teamd restart failed to create all port channels #1154
Comments
Unable to reproduce on another device.
During reboot it will also show:
Even with successfully started teamd processes:
Appears to be related to my recent change, which causes all teamd processes to be started simultaneously by supervisor. With enough processes started simultaneously, we can potentially run out of available memory. On my test device, I'm only starting four teamd processes, thus I cannot reproduce it.
Closing, as the offending change was reverted here: #1156. Either teamd will need to be modified to allow starting multiple processes at once without consuming all available memory, in which case we can re-commit the change, or we will need to find a workaround to start the processes sequentially, which is not easy to do with supervisor.
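For illustration only, the "start the processes sequentially" workaround mentioned above could look roughly like the sketch below. It is not the fix that was adopted in the repository. The `/etc/teamd/<name>.conf` path layout, the port channel names, and the one-second delay are assumptions; `-d` (daemonize) and `-f` (config file) are standard teamd options.

```python
import subprocess
import time
from typing import Callable, Iterable, List, Sequence


def teamd_commands(names: Iterable[str], conf_dir: str = "/etc/teamd") -> List[List[str]]:
    """Build one teamd command line per port channel.

    The conf_dir layout (<conf_dir>/<name>.conf) is an assumption.
    """
    return [["teamd", "-d", "-f", f"{conf_dir}/{name}.conf"] for name in names]


def start_sequentially(
    commands: Iterable[Sequence[str]],
    run: Callable[[List[str]], int] = subprocess.call,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Run each command and wait for it to return before starting the next.

    With `teamd -d`, the foreground process returns once the daemon has been
    forked, so at most one instance is in its startup phase at a time.
    Returns the exit code of each command, in order.
    """
    exit_codes: List[int] = []
    for index, command in enumerate(commands):
        if index > 0:
            sleep(delay_s)  # hypothetical pause between launches
        exit_codes.append(run(list(command)))
    return exit_codes
```

Calling `start_sequentially(teamd_commands(["PortChannel01", "PortChannel02"]))` would launch the two daemons one after the other. The `run` and `sleep` parameters exist only so the logic can be exercised without actually starting teamd.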
The current master:
After running `systemctl restart teamd`, not all teamd daemons start. After manually calling the command, it shows the following error:
----------------------------------------------------------------------------------------------------
Update:
The device I am using:
What I see:
Port channels are not started successfully. Sometimes only some of the port channels are created; sometimes none of them are. Inside the docker container, I observe something like the following:
It shows that not all port channels are started. In addition, the port channels that did start are also in a bad state:
By looking at the syslog, I notice:
for all port channels that are not started yet.
How to reproduce this issue
The current topology is t1-lag: there are 8 LAGs in total, each with two member ports.
Repeatedly rebooting the system gets it into this state fairly easily; check the output of the `teamshow` command after each reboot.
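As a rough sketch, the check after each reboot could also be scripted by comparing the expected port channels against the network interfaces that actually exist. This assumes the team devices show up under `/sys/class/net` in the namespace where the script runs. The `PortChannel01` to `PortChannel08` names are hypothetical placeholders for the 8 LAGs in the topology, and this is not a replacement for reading the `teamshow` output.

```python
import os
from typing import Iterable, List


def present_interfaces(sysfs_dir: str = "/sys/class/net") -> List[str]:
    """List network interface names visible in the current namespace."""
    return sorted(os.listdir(sysfs_dir))


def missing_port_channels(expected: Iterable[str], present: Iterable[str]) -> List[str]:
    """Return the expected port channels with no matching interface, in order."""
    present_set = set(present)
    return [name for name in expected if name not in present_set]
```

For the 8-LAG topology, one could build `expected = [f"PortChannel{i:02d}" for i in range(1, 9)]` and call `missing_port_channels(expected, present_interfaces())`; a non-empty result would indicate that some port channels were not created.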