[dockers]: use supervisord dependent startup to start services #4599

Merged
merged 25 commits into from
May 22, 2020

@lguohan lguohan commented May 14, 2020

- Why I did it
To remove the multiple 'supervisorctl start' calls in start.sh, which consume a lot of CPU.

- How I did it
Use the supervisord dependent startup plugin and list the service startup dependencies in the supervisor config file.

other changes:

  • start rsyslogd as the first process so that it captures all logs (use -iNONE to avoid creating rsyslogd.pid)
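
The idea of declaring startup dependencies in the supervisor config can be sketched as follows. This is only an illustration, not the plugin's implementation: the sample config, the command paths, and the helper `start_order` are hypothetical, and the option names `dependent_startup` and `dependent_startup_wait_for` are assumed to match what the dependent startup plugin reads. The sketch parses the dependencies and derives one valid start order, with rsyslogd first.

```python
import configparser
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical supervisor config; option names are assumed to follow
# the dependent startup plugin's conventions.
SAMPLE_CONF = """
[program:rsyslogd]
command=/usr/sbin/rsyslogd -n -iNONE
dependent_startup=true

[program:zebra]
command=/usr/lib/frr/zebra
dependent_startup=true
dependent_startup_wait_for=rsyslogd:running

[program:bgpd]
command=/usr/lib/frr/bgpd
dependent_startup=true
dependent_startup_wait_for=zebra:running
"""


def start_order(conf_text):
    """Return program names in an order that respects the declared waits."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(conf_text)
    graph = {}
    for section in parser.sections():
        if not section.startswith("program:"):
            continue
        name = section.split(":", 1)[1]
        waits = parser.get(section, "dependent_startup_wait_for", fallback="")
        # Each entry looks like "<program>:<state>"; keep the program name only.
        graph[name] = {entry.split(":", 1)[0] for entry in waits.split()}
    return list(TopologicalSorter(graph).static_order())


order = start_order(SAMPLE_CONF)
print(order)
```

With the ordering expressed in the config, start.sh no longer needs one 'supervisorctl start' call per service.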

- How to verify it
Verified by manually modifying the bgp docker (the built image still needs to be tested).

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@lguohan lguohan linked an issue May 14, 2020 that may be closed by this pull request
@lguohan lguohan requested a review from qiluo-msft May 14, 2020 22:19

@qiluo-msft qiluo-msft left a comment

Wondering how cpu usage improved by this PR?

lguohan commented May 15, 2020

Wondering how cpu usage improved by this PR?

I haven't measured it, but according to #4554, supervisorctl start costs 32 seconds in the startup stage. We can ask them to measure again after the change. There are 69 "supervisorctl start" calls under the dockers directory, and only 17 dockers. So we can avoid 69-17=52 python process starts.
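
A count like the one above could be reproduced with a short script. The following is a minimal sketch under assumptions: `count_occurrences` and `saved_process_starts` are hypothetical helpers, and the temporary directory only stands in for the real dockers directory.

```python
import os
import tempfile


def count_occurrences(root, needle="supervisorctl start"):
    """Count how often `needle` appears in all files below `root`."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    total += fh.read().count(needle)
            except OSError:
                continue  # unreadable file: skip it
    return total


def saved_process_starts(call_count, container_count):
    """One start remains per container; the rest are avoided."""
    return call_count - container_count


# Small demo on a throwaway directory standing in for the dockers tree.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "start.sh"), "w") as fh:
        fh.write("supervisorctl start rsyslogd\nsupervisorctl start bgpd\n")
    demo_count = count_occurrences(root)

print(demo_count, saved_process_starts(69, 17))
```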

@lguohan lguohan force-pushed the ordered branch 3 times, most recently from 2416028 to 7db4952 Compare May 16, 2020 18:17
@lguohan lguohan marked this pull request as ready for review May 16, 2020 18:58
@lguohan lguohan force-pushed the ordered branch 2 times, most recently from bcb07bf to 2909659 Compare May 17, 2020 02:32
@lguohan lguohan changed the title [docker-bgp-frr]: use supervisord dependent startup to start services [dockers]: use supervisord dependent startup to start services May 17, 2020

lguohan commented May 17, 2020

retest vsimage please

@lguohan lguohan force-pushed the ordered branch 2 times, most recently from 81c7b97 to 418411a Compare May 17, 2020 11:32

lguohan commented May 22, 2020

retest vsimage please

@stepanblyschak stepanblyschak left a comment

[Image: without-dependent-start]
[Image: with-dependent-start]

Now supervisorctl is no longer at the top of the list. A python process has appeared that takes 17 sec (I assume it is python -m supervisord_dependent_startup); however, that is much less than the 41 sec taken by supervisorctl in the first picture above.

lguohan commented May 22, 2020

Thanks for the update. It appears that we need to subtract the previous python consumption: 17-5=12 seconds, compared with 41 seconds for supervisorctl.
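
The comparison above can be written out as a small calculation. This sketch only restates the numbers quoted in the two comments (17 s, 5 s, and 41 s, as read from the profiler pictures); `net_cost` is a hypothetical helper and the figures are not new measurements.

```python
def net_cost(total_python_seconds, baseline_python_seconds):
    """CPU time attributable to the new startup process alone."""
    return total_python_seconds - baseline_python_seconds


before = 41                 # seconds spent in supervisorctl before the change
after = net_cost(17, 5)     # python time now, minus the earlier python time
saved = before - after
reduction = saved / before

print(f"{after}s vs {before}s: {saved}s saved ({reduction:.0%})")
```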

@lguohan lguohan merged commit ddd6368 into sonic-net:master May 22, 2020
@lguohan lguohan deleted the ordered branch May 22, 2020 18:01
lguohan pushed a commit that referenced this pull request Jul 28, 2020
Copy proper fancontrol config file to the proper destination. Also some minor refactoring for code reuse to help prevent issues like this in the future.

Fixes a bug introduced by #4599
lguohan pushed a commit that referenced this pull request Aug 16, 2020
Copy proper fancontrol config file to the proper destination. Also some minor refactoring for code reuse to help prevent issues like this in the future.

Fixes a bug introduced by #4599
@jleveque jleveque mentioned this pull request Aug 21, 2020
3 tasks
jleveque added a commit that referenced this pull request Aug 21, 2020
**- Why I did it**

PR #4599 introduced two bugs in the startup of the router advertiser container:

1. References to the `wait_for_intf.sh` script were changed to `wait_for_link.sh`, but the actual script was not renamed
2. The `ipv6_found` Jinja2 variable added to the supervisor config file goes out of scope before it is read.

**- How I did it**
1. Rename the `wait_for_intf.sh` script to `wait_for_link.sh`
2. Use the Jinja2 "namespace" construct to fix the scope issue

**- How to verify it**

Ensure all processes in the radv container start properly under the correct conditions (i.e., whether or not there is at least one VLAN with an IPv6 address assigned).
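
The scope issue in item 2 can be reproduced with a few lines. This is a minimal sketch, assuming the third-party `jinja2` package (2.10 or later) is installed; the address list and both template strings are hypothetical and are not the radv supervisor template itself. A plain `set` inside a `for` loop does not survive the loop, while an attribute on a `namespace` object does.

```python
from jinja2 import Template

# Hypothetical VLAN addresses; the second one is IPv6.
ADDRS = ["192.168.0.1/24", "fc02:1000::1/64"]

# Assignment inside the loop is local to the loop body.
PLAIN_SET = (
    "{% set ipv6_found = false %}"
    "{% for addr in addrs %}"
    "{% if ':' in addr %}{% set ipv6_found = true %}{% endif %}"
    "{% endfor %}"
    "{{ ipv6_found }}"
)

# A namespace object carries the value out of the loop.
NAMESPACE_SET = (
    "{% set ns = namespace(ipv6_found=false) %}"
    "{% for addr in addrs %}"
    "{% if ':' in addr %}{% set ns.ipv6_found = true %}{% endif %}"
    "{% endfor %}"
    "{{ ns.ipv6_found }}"
)

plain_result = Template(PLAIN_SET).render(addrs=ADDRS)
namespace_result = Template(NAMESPACE_SET).render(addrs=ADDRS)
print(plain_result, namespace_result)
```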
noaOrMlnx added a commit to noaOrMlnx/sonic-buildimage that referenced this pull request Aug 26, 2020
* [BFN] Add support pcied daemon for Montara and Newport (sonic-net#5199)

Signed-off-by: Petro Bratash <petrox.bratash@intel.com>

* [cfggen] Allow Write To Redis DB With Template/Batch Mode (sonic-net#5203)

Argument to write to config-db is not allowed when using template.
This PR allows cfggen to write to redis db when using template
mode.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

* [submodule]: Advance sonic-snmpagent. (sonic-net#5213)

Update sonic-snmpagent submodule to include below commits:
1a2b62a [Namespace]: Fix SAI_ID key used in cpfcIfTable and csqIfQosGroupStatsTable implementation (sonic-net#138)
d06f00c [pytest/coverage]: add coverage support (sonic-net#156)
90e9f2e [Namespace]: Simplify sync_d functions to use higher order (sonic-net#154)
b5815d9 [LLDP]: Modify OID index of LLDPRemTableUpdater MIB (sonic-net#155)
d5f2b92 [Multiasic]: Provide namespace support for ipNetToMediaPhysAddress (sonic-net#129)
166c221 [Namespace]: Fix interface counters in RFC 1213 (sonic-net#145)

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>

* [cfggen] Conform With Python 3 Syntax (sonic-net#5154)

Preparing sonic-cfggen for migration to Python 3.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

* [redis-dump-load] Update submodule (sonic-net#5215)

* src/redis-dump-load 832a645...7585497 (2):
  > Merge pull request sonic-net#63 from jleveque/update_gitignore
  > Merge pull request sonic-net#59 from breser/redis-load-empty

* [services] Fix Delay Start of SNMP And Telemetry (sonic-net#5211)

SNMP and Telemetry services are not critical to switch startup.
They also cause fast-reboot not to meet timing requirements.
To delay their start, those services are associated with systemd
timer units. However, when hostcfgd initiates a service start, it starts
the service and not the timer. This PR fixes the issue by
starting the timer associated with the systemd unit.
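
The fix described above amounts to choosing the right unit name. A minimal sketch, assuming a hypothetical helper `unit_to_start` and a hypothetical set of delayed services; it is not hostcfgd's actual code.

```python
def unit_to_start(service, delayed_services):
    """Start the .timer for delayed services, the .service otherwise."""
    if service in delayed_services:
        return f"{service}.timer"
    return f"{service}.service"


# Hypothetical set of services whose start is delayed by a timer unit.
DELAYED = {"snmp", "telemetry"}

print(unit_to_start("snmp", DELAYED))
print(unit_to_start("lldp", DELAYED))
```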

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

* [sonic-py-common][multi ASIC] API to get a list of frontend ports (sonic-net#5221)

* [sonic-py-common][multi ASIC] utility to get a list of frontend ports from a given list of ports

* [sonic-config-engine] Update .gitignore (sonic-net#5223)

- Ignore directories generated by building Python wheel package
- Move all sonic-config-engine ignores from the root .gitignore to src/sonic-config-engine/.gitignore

* Advance swss-common submodule. (sonic-net#5222)

9a7c9d Dbconnector namespace support (sonic-net#376)
c32f0b5 add state db entry for fgnhg route entry (sonic-net#374)

* [caclmgrd] Add support for multi-ASIC platforms (sonic-net#5022)

* Support for Control Plane ACLs for multi-ASIC platforms.
The following changes were made:
 1) Moved from using a blocking listen() on Config DB to the select() model
 via python-swsscommon, since we have to wait on events from multiple
 Config DBs
 2) Since python-swsscommon is not available on the host, added libswsscommon and python-swsscommon
    and dependent packages to the base image (host environment)
 3) Made iptables rules be programmed in all namespaces using ip netns exec
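
The move from a blocking listen on one source to a select() over several can be illustrated with the standard library. This sketch deliberately swaps in plain sockets and Python's `select` module rather than python-swsscommon, whose API is not shown in this thread; the names `wait_for_events`, `asic0`, and `asic1` are hypothetical.

```python
import select
import socket


def wait_for_events(sources, timeout=1.0):
    """Return the names of all sources that have data ready to read."""
    by_sock = {sock: name for name, sock in sources.items()}
    readable, _, _ = select.select(list(by_sock), [], [], timeout)
    return sorted(by_sock[sock] for sock in readable)


# Two socket pairs stand in for per-namespace Config DB subscriptions.
asic0_rx, asic0_tx = socket.socketpair()
asic1_rx, asic1_tx = socket.socketpair()

asic1_tx.send(b"ACL_RULE changed")  # only the second source has an event
ready = wait_for_events({"asic0": asic0_rx, "asic1": asic1_rx})
print(ready)

for sock in (asic0_rx, asic0_tx, asic1_rx, asic1_tx):
    sock.close()
```

A single blocking listen on one source would never notice the event on the other, which is why waiting on several databases calls for a select-style loop.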

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comments

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fix Review Comments

* Fix Comments

* Added Change for Multi-asic to have iptables
rules to accept internal docker tcp/udp traffic
needed for syslog and redis-tcp connection.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Fix Review Comments

* Added more comments on logic.

* Fixed all warning/errors reported by http://pep8online.com/
other than line > 80 characters.

* Fix Comment
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Verified with swsscommon package. Fix issue for single asic platforms.

* Moved to new python package

* Address Review Comments.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Address Review Comments.

* Add support to VS platform for platform.json and DPB CLI Tests (sonic-net#5192)

- Reverts commit 457674c
- Creates "platform.json" for vs docker
- Adds test case for port breakout CLI
- Explicitly sets admin status of all the VS interfaces to down to be compatible with SWSS test cases, specifically vnet tests and sflow tests

Signed-off-by: Sangita Maity <sangitamaity0211@gmail.com>

* [iccpd] Fix uninitialized variable. (sonic-net#5112)

Declaring *tb[] without initializing it is very risky: we get an iccpd exception while processing arp/nd events. Initialize it to {0}.

* Fix unwanted python exception in syslog during database container startup (sonic-net#5227)

The exception occurred when doing a redis PING while database_config.json,
which is generated from a jinja2 template, was not yet ready.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* [hostcfgd] Handle Both Service And Timer Units (sonic-net#5228)

Commit e484ae9 introduced systemd .timer units to hostcfgd.
However, when stopping a service that has a timer, it is possible that the
timer is not running, in which case the service would not be stopped. This PR
addresses this situation by handling both .timer and .service units.
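
Handling both unit types on stop can be sketched as below. `units_to_stop` and the set of timer-backed services are hypothetical names used for illustration only, not hostcfgd's actual code.

```python
def units_to_stop(service, services_with_timer):
    """Stop the timer (if the service has one) and always the service."""
    units = []
    if service in services_with_timer:
        units.append(f"{service}.timer")
    units.append(f"{service}.service")
    return units


# Hypothetical set of services that are started through a timer unit.
WITH_TIMER = {"snmp", "telemetry"}

print(units_to_stop("snmp", WITH_TIMER))
print(units_to_stop("lldp", WITH_TIMER))
```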

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

* [arista] Update driver submodules (sonic-net#5147)

- fix watchdog timeout units
- fix import path for thermal_manager
- remove arista bind mounts for docker-snmp
- improve arista bind mounts for pmon

* [docker-radv] Fix startup issues (sonic-net#5230)

**- Why I did it**

PR sonic-net#4599 introduced two bugs in the startup of the router advertiser container:

1. References to the `wait_for_intf.sh` script were changed to `wait_for_link.sh`, but the actual script was not renamed
2. The `ipv6_found` Jinja2 variable added to the supervisor config file goes out of scope before it is read.

**- How I did it**
1. Rename the `wait_for_intf.sh` script to `wait_for_link.sh`
2. Use the Jinja2 "namespace" construct to fix the scope issue

**- How to verify it**

Ensure all processes in the radv container start properly under the correct conditions (i.e., whether or not there is at least one VLAN with an IPv6 address assigned).

* [sonic-utilities] Update submodule (sonic-net#5233)

* src/sonic-utilities d5fdd74...17fb378 (7):
  > [sonic-installer] Import re module (sonic-net#1061)
  > [fast-reboot]: Fix fail to execute fast-reboot problem (sonic-net#1047)
  > [config] Reduce Calls to SONiC Cfggen (sonic-net#1052)
  > [filter-fdb] Call Filter FDB Main From Within Test Code (sonic-net#1051)
  > [sflow_test.py]: Fix show sflow display. (sonic-net#1054)
  > Change fast-reboot script to use swss and radv service script (sonic-net#1036)
  > Common functions for show CLI support on multi ASIC (sonic-net#999)

* [sonic-host-service]: Add SONiC Host Services infrastructure (sonic-net#4840)

- Why I did it

When SONiC is configured with the management framework and/or telemetry services, the applications running inside those containers need to access some functionality on the host system. The following is a non-exhaustive list of such functionality:

  • Image management
  • Configuration save and load
  • ZTP enable/disable and status
  • Show tech support

- How I did it

The host service is a Python process that listens for requests via D-Bus. It will then service those requests and send a response back to the requestor.

This PR only introduces the host service infrastructure. Applications that need access to the host services must add applets that will register on D-Bus endpoints to service the appropriate functionality.

- How to verify it

- Description for the changelog

Add SONiC Host Service for container to execute select commands in host

Signed-off-by: Nirenjan Krishnan <Nirenjan.Krishnan@dell.com>

* Add common functions applicable to single/multi asic platforms (sonic-net#5224)

* Add common functions applicable to single/multi asic platforms
* Raise exception if invalid namespace is given as input.

* [sonic-swss] Update submodule (sonic-net#5231)

* src/sonic-swss d2bab10...c4949a2 (34):
  > [dvs] Add new common issues and TOC to DVS README (sonic-net#1405)
  > Avoid adding loopback interface (ip link add) when setting nat zone on loopback interface (sonic-net#1411)
  > [portsorch] add buffer drop FC group (sonic-net#1368)
  > [dvs/chassis] Bring up SONiC interfaces in virtual chassis (sonic-net#1410)
  > [chassis/dvs] Add support for virtual chassis to DVS testbed (sonic-net#1345)
  > [sonic-swsss] Fix the issue of field "next_hop_ip" not getting updated in state DB in ERSPAN Mirror (sonic-net#1375)
  > [intfmgr] Fix OA crash issue due to link local configurations (sonic-net#1195)
  > Fix the issue when persistent DVS is used to run pytest which has number of front-panel ports < 32 (sonic-net#1373)
  > [dvs] Refactor AsicDbValidator (sonic-net#1402)
  > [fec] Get FEC mode when port is already admin down (sonic-net#1403)
  > [fec] added logic that put port down before applying fec onfiguration (sonic-net#1399)
  > [dvs] Add performance test for adding and deleting routes (sonic-net#1392)
  > Ignore IPv6 link-local and multicast entries as Vnet routes (sonic-net#1401)
  > [vlanmgr] Support Jumbo Frame By Default (sonic-net#1393)
  > Fix log/syslog not being correct when last test fails for given module (sonic-net#1395)
  > Get initial speed from ASIC DB  (sonic-net#1390)
  > [dvs] Add options to limit CPU usage (sonic-net#1394)
  > [intfsorch] Retrieve Port object before setting NAT zone on router interfaces. (sonic-net#1372)
  > [.gitignore] Ignore gearsyncd binary (sonic-net#1381)
  > Added Max Nexthopgroup/ECMP Count supported by device into State DB. (sonic-net#1383)
  > [dvs] Upload logs even if failure occurs during startup (sonic-net#1389)
  > [rates] fix issue with rates init (sonic-net#1387)
  > [dvs] Validate that SWSS is ready to receive input before starting tests (sonic-net#1385)
  > [dvs] Convert sflow and speed tests to use dvslib (sonic-net#1382)
  > [dvs_acl] Refactor and document dvs_acl library (sonic-net#1378)
  > [dvs] Fix install instructions in README (sonic-net#1379)
  > [dvs] Update README with new flags, options, and known issues (sonic-net#1380)
  > swss: gearsyncd should return 0 on exit (sonic-net#1376)
  > Remove 00-copp.config.json from swss debian package. (sonic-net#1366)
  > fix undefined var in rates lua scripts (sonic-net#1365)
  > [fdborch] Fixed Orchagent crash in FDB flush on port disable. (sonic-net#1369)
  > [tlm_teamd]: Try to add LAG again, when teamd is not ready first time (sonic-net#1347)
  > [vs] Incorporate python3 best practices into DVSLib (sonic-net#1357)
  > [dvs] Mark unstable tests as xfail (sonic-net#1356)

* [arista/aboot]: Zero out 1st MB before repartitioning (sonic-net#5220)

The first partition starting point was changed to be 1M as part of this
commit: 6ba2f97. On systems that are misaligned before conversion
(partition start is the first sector), the relic partition that is
left in the first MB can cause problems in Aboot and result in corruption
of the filesystem on the new aligned partition.

Zeroing this old relic makes sure that there is nothing left of the old
partition lying around. There won't be any risk of having Aboot corrupt
the new filesystem because of the old relic.
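
The zeroing step can be sketched on a regular file standing in for the flash device. This is an illustration under assumptions, not the Aboot installer code: `zero_first_mib` is a hypothetical helper and the temporary file is a stand-in for the real block device.

```python
import os
import tempfile

ONE_MIB = 1024 * 1024


def zero_first_mib(path):
    """Overwrite the first MiB of `path` with zeros, leaving the rest intact."""
    with open(path, "r+b") as dev:
        dev.write(b"\x00" * ONE_MIB)
        dev.flush()
        os.fsync(dev.fileno())


# Demo on a 2 MiB scratch file filled with 0xFF bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff" * (2 * ONE_MIB))
    image_path = tmp.name

zero_first_mib(image_path)
with open(image_path, "rb") as fh:
    data = fh.read()
os.remove(image_path)

print(len(data), data[:4], data[ONE_MIB:ONE_MIB + 4])
```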

Signed-off-by: Baptiste Covolato <baptiste@arista.com>

* [sonic-py-common] Add unit test framework (sonic-net#5238)

**- Why I did it**

To install the framework for adding unit tests to the sonic-py-common package and report coverage.

**- How I did it**

- Incorporate pytest and pytest-cov into sonic-py-common package build
- Upgrade version of 'mock' installed to version 3.0.5, the last version which supports Python 2. This fixes a bug where the file object returned from `mock_open()` was not iterable (see https://bugs.python.org/issue32933)
- Add support for Python 3 setuptools and pytest in sonic-slave-buster environment
- Add tests for `device_info.get_machine_info()` and `device_info.get_platform()` functions
- Also add a .gitignore in the root of the sonic-py-common directory, move all related ignores from main .gitignore file, and add ignores for files and dirs generated by pytest-cov
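
The `mock_open()` behaviour mentioned above matters for any test that iterates over a mocked file. The sketch below uses the standard library's `unittest.mock` on Python 3.8 or later rather than the Python 2 `mock` backport from the commit; `read_lines`, the file path, and the file contents are hypothetical.

```python
from unittest import mock


def read_lines(path):
    """Read a file by iterating over the file object line by line."""
    with open(path) as fh:
        return [line.rstrip("\n") for line in fh]


# Hypothetical file contents used only for this demo.
FAKE_CONTENT = "onie_platform=x86_64-example\nonie_machine=example\n"

with mock.patch("builtins.open", mock.mock_open(read_data=FAKE_CONTENT)):
    lines = read_lines("/host/machine.conf")

print(lines)
```

With a `mock_open()` that does not support iteration, the `for line in fh` loop is the part that breaks.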

* Add switch for synchronous mode (sonic-net#5237)

Add a master switch so that the sync/async mode can be configured.
Example usage of the switch:
1.  Configure mode while building an image
    `make ENABLE_SYNCHRONOUS_MODE=y <target>`
2. Configure when the device is running 
    Change CONFIG_DB with `sonic-cfggen -a '{"DEVICE_METADATA":{"localhost": {"synchronous_mode": "enable"}}}' --write-to-db`
    Restart swss with `systemctl restart swss`
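
The CONFIG_DB change in step 2 can also be assembled programmatically. A minimal sketch: `sync_mode_command` is a hypothetical helper that only builds the argument list for the `sonic-cfggen` invocation quoted above and does not run it.

```python
import json
import shlex


def sync_mode_command(mode):
    """Build the sonic-cfggen argument list that sets synchronous_mode."""
    payload = {"DEVICE_METADATA": {"localhost": {"synchronous_mode": mode}}}
    return ["sonic-cfggen", "-a", json.dumps(payload), "--write-to-db"]


cmd = sync_mode_command("enable")
print(shlex.join(cmd))  # shell-quoted form, Python 3.8+
```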

* [enable counters] Enable port buffer drops by default and update MLNX SAI submodule (sonic-net#5059)

* Enable port buffer drops by default
* [Mellanox] Update SAI_Implementation

Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>

* Platform monitor changes in daemon_base for multi_asic (sonic-net#4932)

Adding namespace support for db connect API.

Co-authored-by: Petro Bratash <68950226+bratashX@users.noreply.github.com>
Co-authored-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Co-authored-by: SuvarnaMeenakshi <50386592+SuvarnaMeenakshi@users.noreply.github.com>
Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
Co-authored-by: Mahesh Maddikayala <10645050+smaheshm@users.noreply.github.com>
Co-authored-by: judyjoseph <53951155+judyjoseph@users.noreply.github.com>
Co-authored-by: abdosi <58047199+abdosi@users.noreply.github.com>
Co-authored-by: Sangita Maity <sangitamaity0211@gmail.com>
Co-authored-by: Kelly Chen <kelly_chen@edge-core.com>
Co-authored-by: Samuel Angebault <staphylo@arista.com>
Co-authored-by: nirenjan <nirenjan@users.noreply.github.com>
Co-authored-by: Baptiste Covolato <b.covolato@gmail.com>
Co-authored-by: shi-su <67605788+shi-su@users.noreply.github.com>
Co-authored-by: Mykola F <37578614+mykolaf@users.noreply.github.com>
noaOrMlnx added a commit to noaOrMlnx/sonic-buildimage that referenced this pull request Aug 27, 2020
* [py-swsssdk] Submodule Update (sonic-net#5249)

Change:
  c25d492 Merge pull request sonic-net#83 from tahmed-dev/taahme/add-redis-pipeline-operation
  198d143 review comments - part of [configdb] Add Ability to Query/Update Redis Using Pipelines
  994851c review comments - part of [configdb] Add Ability to Query/Update Redis Using Pipelines
  2d2b7e1 making lgtm happy - part of [configdb] Add Ability to Query/Update Redis Using Pipelines
  fa9093c [configdb] Add Ability to Query/Update Redis Using Pipelines

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

* [cfggen] Use Redis Pipeline (sonic-net#5250)

This PR enables cfggen to read from and write to Redis DB using pipelines.
Pipelines enable batched reads and writes to Redis DB.
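
To make the batching idea concrete, here is a toy sketch that counts client/server round trips with and without a pipeline. `FakeRedis` and `FakePipeline` are hypothetical in-memory stand-ins written for this example; they are not the redis-py or swsssdk implementation, and only mimic the `pipeline()` / `execute()` calling pattern. The table keys are made up.

```python
class FakePipeline:
    """Toy pipeline: queues commands and applies them in one round trip."""

    def __init__(self, client):
        self.client = client
        self.queued = []

    def hset(self, key, field, value):
        # Nothing is sent yet; the command is only queued locally.
        self.queued.append((key, field, value))
        return self

    def execute(self):
        # All queued commands go out together, counted as one round trip.
        self.client.round_trips += 1
        for key, field, value in self.queued:
            self.client.data.setdefault(key, {})[field] = value
        count = len(self.queued)
        self.queued = []
        return count


class FakeRedis:
    """Toy in-memory stand-in that counts round trips."""

    def __init__(self):
        self.data = {}
        self.round_trips = 0

    def hset(self, key, field, value):
        # Each direct call is counted as its own round trip.
        self.round_trips += 1
        self.data.setdefault(key, {})[field] = value

    def pipeline(self):
        return FakePipeline(self)


entries = [("PORT|Ethernet0", "mtu", "9100"),
           ("PORT|Ethernet4", "mtu", "9100"),
           ("PORT|Ethernet8", "mtu", "9100")]

direct = FakeRedis()
for key, field, value in entries:
    direct.hset(key, field, value)

batched = FakeRedis()
pipe = batched.pipeline()
for key, field, value in entries:
    pipe.hset(key, field, value)
pipe.execute()

print(direct.round_trips, batched.round_trips)
```

Both clients end up with the same data, but the batched one pays for a single round trip instead of one per entry.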

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>

Co-authored-by: Petro Bratash <68950226+bratashX@users.noreply.github.com>
Co-authored-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Co-authored-by: SuvarnaMeenakshi <50386592+SuvarnaMeenakshi@users.noreply.github.com>
Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
Co-authored-by: Mahesh Maddikayala <10645050+smaheshm@users.noreply.github.com>
Co-authored-by: judyjoseph <53951155+judyjoseph@users.noreply.github.com>
Co-authored-by: abdosi <58047199+abdosi@users.noreply.github.com>
Co-authored-by: Sangita Maity <sangitamaity0211@gmail.com>
Co-authored-by: Kelly Chen <kelly_chen@edge-core.com>
Co-authored-by: Samuel Angebault <staphylo@arista.com>
Co-authored-by: nirenjan <nirenjan@users.noreply.github.com>
Co-authored-by: Baptiste Covolato <b.covolato@gmail.com>
Co-authored-by: shi-su <67605788+shi-su@users.noreply.github.com>
Co-authored-by: Mykola F <37578614+mykolaf@users.noreply.github.com>
abdosi pushed a commit that referenced this pull request Sep 6, 2020
**- Why I did it**

PR #4599 introduced two bugs in the startup of the router advertiser container:

1. References to the `wait_for_intf.sh` script were changed to `wait_for_link.sh`, but the actual script was not renamed
2. The `ipv6_found` Jinja2 variable added to the supervisor config file goes out of scope before it is read.

**- How I did it**
1. Rename the `wait_for_intf.sh` script to `wait_for_link.sh`
2. Use the Jinja2 "namespace" construct to fix the scope issue

**- How to verify it**

Ensure all processes in the radv container start properly under the correct conditions (i.e., whether or not there is at least one VLAN with an IPv6 address assigned).
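
The scope issue can be reproduced outside the container. The sketch below assumes the `jinja2` package (2.10 or later, which provides `namespace`) is installed; the templates and the `prefixes` variable are simplified, hypothetical stand-ins, not the actual radv supervisor config.

```python
from jinja2 import Template

# A plain `set` inside a `for` loop only affects the loop's own scope, so
# the outer variable still holds its initial value after the loop.
BROKEN = (
    "{% set ipv6_found = false %}"
    "{% for prefix in prefixes %}"
    "{% if ':' in prefix %}{% set ipv6_found = true %}{% endif %}"
    "{% endfor %}"
    "{{ ipv6_found }}"
)

# Attributes of a namespace object can be assigned from inner scopes,
# so the value survives the loop.
FIXED = (
    "{% set ns = namespace(ipv6_found=false) %}"
    "{% for prefix in prefixes %}"
    "{% if ':' in prefix %}{% set ns.ipv6_found = true %}{% endif %}"
    "{% endfor %}"
    "{{ ns.ipv6_found }}"
)

# Made-up prefixes: one IPv4 and one IPv6.
prefixes = ["192.0.2.1/24", "2001:db8::1/64"]

broken_result = Template(BROKEN).render(prefixes=prefixes)
fixed_result = Template(FIXED).render(prefixes=prefixes)
print(broken_result, fixed_result)
```

With an IPv6 prefix present, only the namespace-based template reports that one was found.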
santhosh-kt pushed a commit to santhosh-kt/sonic-buildimage that referenced this pull request Feb 25, 2021
tahmed-dev added a commit to tahmed-dev/sonic-buildimage that referenced this pull request Apr 7, 2021
PR sonic-net#4599 changed the startup script name from wait_for_intf.sh.j2
to wait_for_link.sh.j2; however, when PR sonic-net#5178 was cherry-picked,
the script name was not changed to wait_for_link.sh.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
lguohan pushed a commit that referenced this pull request Apr 8, 2021
Development

Successfully merging this pull request may close these issues.

supervisor consumes lot of CPU during switch startup
7 participants