Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dynamic buffer calc] Support dynamic buffer calculation #973

Merged
merged 5 commits into from
Dec 15, 2020

Conversation

stephenxs
Copy link
Collaborator

@stephenxs stephenxs commented Jul 1, 2020

- What I did
Support dynamic buffer calculation

- How I did it

  1. Commands added:
    • config interface buffer priority-group lossless <add|set|remove>
    • config interface buffer priority-group lossless add [headroom-override-profile] for adding the PG for
    the first time providing option headroom-override-profile means to configure the PG as headroom override
    otherwise as dynamically calculated headroom
    • config interface buffer priority-group lossless set [headroom-override-profile] for modifying an existing
    PG, the option headroom-override-profile has the same meaning as "add"
    • config interface buffer priority-group lossless remove [PG] for removing the PG specified by option PG. If the
    option isn't provided, all lossless PGs on the port will be removed
    • config buffer-profile <add|set|remove> to add, modify or remove buffer profiles
    • show buffer <configuration|information>
  2. db_migrator:
    • migrate CONFIG_DB from old approach to the new approach
    • when system warm starts from old image to the new one, copies related tables from CONFIG_DB to APPL_DB for the
    purpose that buffermgrd can start smoothly
  3. Warm-reboot script: don't clear BUFFER_MAX_PARAM table across warm reboot
  4. CLI reference is also provided

- How to verify it

- Previous command output (if the output of a command-line utility has changed)

- New command output (if the output of a command-line utility has changed)

@lgtm-com
Copy link

lgtm-com bot commented Jul 1, 2020

This pull request introduces 7 alerts and fixes 1 when merging 45eaa6d5abf4b0d7fa82981e2ffc82c4a459835e into fd52e93 - view on LGTM.com

new alerts:

  • 4 for Except block handles 'BaseException'
  • 3 for Unused local variable

fixed alerts:

  • 1 for Unused local variable

@stephenxs stephenxs changed the title Support dynamic buffer calculation [Dynamic buffer calculation] Support dynamic buffer calculation Aug 6, 2020
@stephenxs stephenxs changed the title [Dynamic buffer calculation] Support dynamic buffer calculation [Dynamic buffer calc] Support dynamic buffer calculation Aug 6, 2020
@stephenxs stephenxs force-pushed the dynamic-buffer-calculation branch from 45eaa6d to 4632e67 Compare August 14, 2020 11:40
@lgtm-com
Copy link

lgtm-com bot commented Aug 14, 2020

This pull request introduces 8 alerts when merging 4632e67e0f2df408160071e9af219c8366d56f9d into d5fdd74 - view on LGTM.com

new alerts:

  • 4 for Except block handles 'BaseException'
  • 3 for Unused local variable
  • 1 for Unused import

@stephenxs stephenxs marked this pull request as ready for review August 14, 2020 11:57
@stephenxs stephenxs force-pushed the dynamic-buffer-calculation branch from 4632e67 to 8dad09b Compare August 15, 2020 14:07
@lgtm-com
Copy link

lgtm-com bot commented Aug 15, 2020

This pull request introduces 8 alerts when merging 8dad09bd76832e4581dd7780d1ae46c97f89b437 into 37f131e - view on LGTM.com

new alerts:

  • 4 for Except block handles 'BaseException'
  • 3 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Aug 26, 2020

This pull request introduces 8 alerts when merging 7257fbf2f12d5d8db502b4613cad6b2edad4ed67 into ca8ffe7 - view on LGTM.com

new alerts:

  • 4 for Except block handles 'BaseException'
  • 3 for Unused local variable
  • 1 for Unused import

@neethajohn
Copy link
Contributor

please add some tests for the new show and config commands

@stephenxs stephenxs force-pushed the dynamic-buffer-calculation branch from 3aa39aa to d17a143 Compare September 11, 2020 06:10
@lgtm-com
Copy link

lgtm-com bot commented Sep 12, 2020

This pull request introduces 1 alert when merging 1d822ac3fcc1a6de749385f97b12ec80d5b9b66f into 144bccb - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@stephenxs stephenxs force-pushed the dynamic-buffer-calculation branch from 1d822ac to dae7df5 Compare September 12, 2020 13:55
@stephenxs stephenxs force-pushed the dynamic-buffer-calculation branch from e8522fb to 1f81d20 Compare October 21, 2020 02:56
@stephenxs

This comment has been minimized.

@keboliu
Copy link
Collaborator

keboliu commented Nov 4, 2020

retest this please

@stephenxs stephenxs force-pushed the dynamic-buffer-calculation branch from fbe9462 to 2a2e96b Compare November 17, 2020 02:36
@lgtm-com
Copy link

lgtm-com bot commented Nov 17, 2020

This pull request introduces 2 alerts when merging 2a2e96bf08e38c2403c8c656694e8142859019f0 into 1c45ca1 - view on LGTM.com

new alerts:

  • 2 for Wrong name for an argument in a class instantiation

show/main.py Outdated Show resolved Hide resolved
@neethajohn
Copy link
Contributor

Please add tests for show commands as well. You can refer to watermarkstat_test

@stephenxs
Copy link
Collaborator Author

Please add tests for show commands as well. You can refer to watermarkstat_test

Very useful suggestions, thanks!
Done.

@lgtm-com
Copy link

lgtm-com bot commented Nov 19, 2020

This pull request introduces 1 alert when merging 82bf2a66ea416dff0c0e2b5aeeeb79de1bb3a59a into 9d55082 - view on LGTM.com

new alerts:

  • 1 for Unused import

doc/Command-Reference.md Outdated Show resolved Hide resolved
doc/Command-Reference.md Outdated Show resolved Hide resolved
doc/Command-Reference.md Outdated Show resolved Hide resolved
doc/Command-Reference.md Outdated Show resolved Hide resolved
doc/Command-Reference.md Outdated Show resolved Hide resolved
scripts/db_migrator.py Show resolved Hide resolved
scripts/db_migrator.py Show resolved Hide resolved
scripts/db_migrator.py Outdated Show resolved Hide resolved
tests/buffer_test.py Outdated Show resolved Hide resolved
tests/buffer_test.py Outdated Show resolved Hide resolved
@lgtm-com
Copy link

lgtm-com bot commented Nov 20, 2020

This pull request introduces 1 alert when merging 79440b85ab4fa4f6d7e203ab021c14206992c0e0 into 05c8e33 - view on LGTM.com

new alerts:

  • 1 for Unused import

neethajohn
neethajohn previously approved these changes Nov 20, 2020
- Example:

```
admin@sonic:~$ buffershow -l
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example command should be 'show buffer information' and not the script executed by it.

@liat-grozovik
Copy link
Collaborator

@stephenxs can you please resolve conflicts and double check the version for the db migration?

1. commands added(see below) and testcases for new commands
2. db_migrator:
   - migrate CONFIG_DB from old approach to the new approach
   - when system warm starts from old image to the new one,
     copies related tables from CONFIG_DB to APPL_DB for the purpose
     that buffermgrd can start smoothly
3. warm-reboot script: don't clear BUFFER_MAX_PARAM table across warm reboot

CLI list
- config interface buffer priority-group lossless <add|set|remove>
  - config interface buffer priority-group lossless add <port> <PG> [headroom-override-profile]
    for adding the PG for the first time
    providing option headroom-override-profile means to configure the PG as headroom override
    otherwise as dynamically calculated headroom
  - config interface buffer priority-group lossless set <port> <PG> [headroom-override-profile]
    for modifying an existing PG, the option headroom-override-profile has the same meaning as "add"
  - config interface buffer priority-group lossless remove <port> [PG]
    for removing the PG specified by option PG.
    if the option isn't provided, all lossless PGs on the port will be removed
- config buffer-profile <add|set|remove>
  To add, modify or remove buffer profiles
- show buffer <configuration|information>

Testcase covered for global config commands.
show command unconvered due to subprocess call isn't supported by test infra

Signed-off-by: Stephen Sun <stephens@nvidia.com>
1. If all the buffer configuration aligns the default, use dynamic
buffer calculation mode. Otherwise, use the traditional mode
2. Dynamic mode is adopted in switches newly installed from scratch
and traditional mode in switches installed from minigraph
This is done by:
- introducing the option --no-dynamic-buffer in "config qos reload"
  to designate whether the dynamic or traditional mode is used
- introducing a new filed "buffer_model" in DEVICE_METADATA|localhost
  to store which buffer model currently is used
- updating db_migrator accordingly

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Review comments:
- Add testcases for show and config command
- Address review comments in db_migrator
- Use "size" to represent the size of the PG and "headroom" for xoff
- Fix typo

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Dec 4, 2020

This pull request introduces 1 alert when merging c237619 into 9a17108 - view on LGTM.com

new alerts:

  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Dec 4, 2020

This pull request introduces 1 alert when merging a63b850 into 9a17108 - view on LGTM.com

new alerts:

  • 1 for Unused import

@stephenxs
Copy link
Collaborator Author

retest this please

tests/mock_tables/config_db.json Outdated Show resolved Hide resolved
tests/mock_tables/state_db.json Outdated Show resolved Hide resolved
config/main.py Outdated Show resolved Hide resolved
#
# 'buffer' subgroup ('config interface buffer ...')
#
@interface.group(cls=clicommon.AbbreviationGroup)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit testing missing for interface group. Please add

Copy link
Collaborator Author

@stephenxs stephenxs Dec 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @neethajohn ,
this needs an infrastructure level update.
For unit testing to work for a command, the fixture @clicommon.pass_db is required. However, the db objects are generated in interface subcommand instead of passed by @clicommon.pass_db.
So we are unable to add unit testing here until it is updated. There isn't any unit test for any of config interface commands now for the same cause.
If we would like to enable unit testing for all config interface commands, I suggest updating the interface command first and the updating unit test cases. It's better to do it in another PR.
How do you think?

All other comments have been fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lguohan, how do we want to proceed on this? I don't see any unit tests for any of the 'config interface' commands currently

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @neethajohn @lguohan ,
Can we merge it and add unit test cases for config interface buffer when the infrastructure is ready?
Thanks.
Stephen

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue #1301 opened for tracking this.

- Fix alignment error
- Update DEVICE_METADATA|localhost.buffer_model only if it is changed

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Dec 8, 2020

This pull request introduces 1 alert when merging aee20c1 into 326e534 - view on LGTM.com

new alerts:

  • 1 for Unused import

@stephenxs
Copy link
Collaborator Author

This pull request introduces 1 alert when merging aee20c1 into 326e534 - view on LGTM.com

new alerts:

  • 1 for Unused import

I don't quite understand why LGTM reports the following import is unused. Probably because it is only imported in the test code which is skipped by LGTM?
Is that a false alarm?

    if os.environ["UTILITIES_UNIT_TESTING"] == "2":
        modules_path = os.path.join(os.path.dirname(__file__), "..")
NEWImport of 'mock_tables' is not used.

@stephenxs
Copy link
Collaborator Author

retest this please.

@stephenxs
Copy link
Collaborator Author

retest this, please

qiluo-msft pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Dec 13, 2020
**- Why I did it**
To support dynamic buffer calculation.
This PR also depends on the following PRs for sub modules
- [sonic-swss: [buffermgr/bufferorch] Support dynamic buffer calculation #1338](sonic-net/sonic-swss#1338)
- [sonic-swss-common: Dynamic buffer calculation #361](sonic-net/sonic-swss-common#361)
- [sonic-utilities: Support dynamic buffer calculation #973](sonic-net/sonic-utilities#973)

**- How I did it**
1. Introduce field `buffer_model` in `DEVICE_METADATA|localhost` to represent which buffer model is running in the system currently:
    - `dynamic` for the dynamic buffer calculation model
    - `traditional` for the traditional model in which the `pg_profile_lookup.ini` is used
2. Add the tables required for the feature:
   - ASIC_TABLE in platform/\<vendor\>/asic_table.j2
   - PERIPHERAL_TABLE in platform/\<vendor\>/peripheral_table.j2
   - PORT_PERIPHERAL_TABLE on a per-platform basis in device/\<vendor\>/\<platform\>/port_peripheral_config.j2 for each platform with gearbox installed.
   - DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2
   - Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2
3. Copy the newly introduced j2 files into the image and rendering them when the system starts
4. Update the CLI options for buffermgrd so that it can start with dynamic mode
5. Fetches the ASIC vendor name in orchagent:
   - fetch the vendor name when creates the docker and pass it as a docker environment variable
   - `buffermgrd` can use this passed-in variable
6. Clear buffer related tables from STATE_DB when swss docker starts
7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2
8. Remove buffer pool sizes for ingress pools and egress_lossy_pool
   Update the buffer settings for dynamic buffer calculation
liat-grozovik pushed a commit to sonic-net/sonic-swss that referenced this pull request Dec 15, 2020
**What I did**

***Support dynamic buffer calculation***

1. Extend the CLI options for buffermgrd:
   - -a: asic_table provided,
   - -p: peripheral_table provided

   The `buffermgrd` will start the dynamic headroom calculation mode with -a provided.
   Otherwise, it will start the legacy mode (pg_headroom_profile looking up)
2. A new class is provided for dynamic buffer calculation while the old one remains.
   The daemon will instantiate the corresponding class according to the CLI option when it starts.
3. In both modes, the `buffermgrd` will copy BUFFER_XXX tables from CONFIG_DB to APPL_DB and the `bufferorch` will consume BUFFER_XXX tables from APPL_DB

***Backward compatibility***
For legacy mode, the backward compatibility is provided. As mentioned above, `buffermgrd` will check whether the json file representing the `ASIC_TABLE` exists when it starts.
- If yes it will start the dynamic buffer calculating mode
- Otherwise, it will start the compatible mode which is the old looking up mode in the new code committed in this PR.
This logic is in `cfgmgr/buffermgrd.cpp`.

The logic of buffer handling in `buffermgrd` isn't changed in the legacy mode. The differences are:
- in legacy mode which is the old code, there isn't any buffer related table in `APPL_DB`. All tables are in `CONFIG_DB`.
  - `buffermgrd` listens to `PORT` and `CABLE_LENGTH` tables in `CONFIG_DB` and inserts the buffer profiles into `BUFFER_PROFILE` table.
  - `bufferorch` listens to buffer related tables in `CONFIG_DB` and call SAI API correspondingly.
- In the compatible mode, `buffermgrd` listens to tables in `CONFIG_DB` and copies them into `APPL_DB` 
  - `buffermgrd`
    - listens to `PORT` and `CABLE_LENGTH` tables in `CONFIG_DB` and inserts the buffer profiles into `BUFFER_PROFILE` table in `CONFIG_DB` (not changed)
    - listens to buffer related tables in `CONFIG_DB` and copies them into `APPL_DB`
  - `bufferorch` listens to `APPL_DB` and call SAI API correspondingly. (the difference is the db it listens to).
  - `db_migrator` is responsible to copy the buffer related tables from `CONFIG_DB` to `APPL_DB` when system is warmbooted from the old image to the new image for the first time.

The compatible code is in `cfgmgr/buffermgr.cpp`, `orchagent/bufferorch.cpp` and `db_migrator` (in the [sonic-utilities PR](sonic-net/sonic-utilities#973)).

**Why I did it**

**How I verified it**

1. vs test
2. regression test [PR: [Dynamic buffer calc] Test cases for dynamic buffer calculation](sonic-net/sonic-mgmt#1971)

**Dynamic buffer details**

1. In the dynamic buffer calculation mode, there are 3 lua plugins are provided for vendor-specific operations:
   - buffer_headroom_<vendor>.lua, for calculating headroom size.
   - buffer_pool_<vendor>.lua, for calculating buffer pool size.
   - buffer_check_headroom_<vendor>.lua, for checking whether headroom exceeds the limit
2. During initialization, The daemon will:
   - load asic_table and peripheral_table from the given json file, parse them and push them into STATE_DB.ASIC_TABLE and STATE_DB.PERIPHERAL_TABLE respectively
   - load all plugins
   - try to load the STATE_DB.BUFFER_MAX_PARAM.mmu_size which is used for updating buffer pool size
   - a timer will be started for periodic buffer pool size audit
3. The daemon will listen to and handle the following tables from CONFIG_DB
   The tables will be cached internally in the daemon for the purpose of saving access time
   - BUFFER_POOL:
     - if the size is provided: insert the entry to APPL_DB
     - otherwise: cache them and push to APPL_DB after the size is calculated by lua plugin
   - BUFFER_PROFILE and BUFFER_PG:
     - items for ingress lossless headroom need to be cached and handled (according to the design)
     - other items will be inserted to the APPL_DB directly
   - PORT_TABLE, for ports' speed and MTU update
   - CABLE_LENGTH, for ports' cable length
4. Other tables will be copied to APPL_DB directly:
   - BUFFER_QUEUE
   - BUFFER_PORT_INGRESS_PROFILE_LIST
   - BUFFER_PORT_EGRESS_PROFILE_LIST
5. BufferOrch modified accordingly:
   - Consume buffer relevant tables from APPL_DB instead of CONFIG_DB
   - For BUFFER_POOL, don't set ingress/egress and static/dynamic to sai if the pool has already existed because they are create-only
   - For BUFFER_PROFILE, don't set pool for the same reason
6. Warm reboot:
   - db_migrator is responsible for copying the data from CONFIG_DB to APPL_DB if the switch is warm-rebooted from an old image to the new image for the first time
   - no specific handling in the daemon side
7. Provide vstest script
@liat-grozovik liat-grozovik merged commit 394b202 into sonic-net:master Dec 15, 2020
@stephenxs stephenxs deleted the dynamic-buffer-calculation branch December 17, 2020 03:43
anand-kumar-subramanian pushed a commit to anand-kumar-subramanian/sonic-utilities that referenced this pull request Mar 2, 2021
**- What I did**
Support dynamic buffer calculation

**- How I did it**
1. Commands added:
    •	config interface buffer priority-group lossless <add|set|remove>
    •	config interface buffer priority-group lossless add <port> <PG> [headroom-override-profile] for adding the PG for
        the first time providing option headroom-override-profile means to configure the PG as headroom override 
        otherwise as dynamically calculated headroom
    •	config interface buffer priority-group lossless set <port> <PG> [headroom-override-profile] for modifying an existing 
        PG, the option headroom-override-profile has the same meaning as "add"
    •	config interface buffer priority-group lossless remove <port> [PG] for removing the PG specified by option PG. If the 
        option isn't provided, all lossless PGs on the port will be removed
    •	config buffer-profile <add|set|remove> to add, modify or remove buffer profiles
    •	show buffer <configuration|information>
2. db_migrator:
    •	migrate CONFIG_DB from old approach to the new approach 
    •	when system warm starts from old image to the new one, copies related tables from CONFIG_DB to APPL_DB  for the 
        purpose that buffermgrd can start smoothly
3. Warm-reboot script: don't clear BUFFER_MAX_PARAM table across warm reboot
4. CLI reference is also provided

**- How to verify it**

**- Previous command output (if the output of a command-line utility has changed)**

**- New command output (if the output of a command-line utility has changed)**
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants