
[QoS] Support dynamic headroom calculation for Barefoot platforms #2323

Status: Open. Wants to merge 1 commit into base: master.
Conversation


@mmysle mmysle commented Jun 10, 2022

Signed-off-by: Michał Mysłek <michal.myslek@intel.com>

What I did
Added dynamic headroom calculation support for Barefoot platforms.

Why I did it
To enable dynamic buffer mode on Barefoot platforms.

How I verified it
The community tests are adjusted and pass.

Details if related

@stephenxs
Collaborator

FYI. Comments in #2317 are also applicable here.

@mmysle
Author

mmysle commented Jul 6, 2022

FYI.
BUFFER_MAX_PARAM_TABLE||max_headroom_size is the maximum accumulative headroom that can be applied to a port. It is always exposed to STATE_DB during orchagent initialization, as long as it is supported by the vendor SAI, and it does not change after initialization. It is not related to whether the shared headroom pool is enabled.
So, in case the shared headroom pool is a must on Barefoot platforms, I would suggest enabling it in buffers_config.j2 in DEFAULT_LOSSLESS_BUFFER_PARAMETER and always returning true here.

@stephenxs, I did it to be consistent with the design you created. The condition is always true due to the attribute setting - it returns 0.
If I understand you correctly, your suggestion with regard to handling max_headroom_size is to:
1) remove the condition in buffer_check_headroom_barefoot.lua script
2) add in the device/barefoot/x86_64-accton_as9516_32d-r0/newport/buffers_dynamic.json.j2 file something like:
{%- set shp = 'true' %}
3) add in the files/build_templates/buffers_config.j2 file something like:

	    "DEFAULT_LOSSLESS_BUFFER_PARAMETER": {
	        "AZURE": {
	            "default_dynamic_th": "0"
	            {%- if shp is defined -%}
	            ,
	            "max_headroom_size" : "0"
	            {%- endif -%}
	        }
	    },

If I understand your proposal incorrectly, could you set me straight on it?
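For clarity, assuming shp is defined, the template fragment above would render to roughly the following (a sketch; the exact whitespace depends on the Jinja trim settings in the surrounding template):

```json
"DEFAULT_LOSSLESS_BUFFER_PARAMETER": {
    "AZURE": {
        "default_dynamic_th": "0",
        "max_headroom_size": "0"
    }
},
```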

@mmysle
Author

mmysle commented Jul 6, 2022

In case the shared buffer pool sizes are always fixed, I would suggest reading them from CONFIG_DB and exposing them for ingress_lossless_pool only.
That way, the Lua script does not need to change the next time a new platform with different buffer pool sizes is introduced.

@stephenxs, the buffer pool script was simplified to these fixed numbers as a first attempt, but buffer reconfiguration will need to be supported in the future. You are right: for now, it would be better to read the values from CONFIG_DB in this Lua script to improve extensibility.
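As a rough sketch, reading the pool size from CONFIG_DB inside the Lua script could look like the following. The CONFIG_DB index (4) and the key layout (BUFFER_POOL|ingress_lossless_pool, field "size") are assumptions based on common SONiC conventions, not taken from this PR:

```lua
-- Sketch only: fetch the fixed pool size from CONFIG_DB instead of
-- hard-coding it in the script. DB index 4 (CONFIG_DB) and the key
-- layout below are assumed, not verified against this PR.
local config_db = "4"
redis.call('SELECT', config_db)
local pool_size = redis.call('HGET', 'BUFFER_POOL|ingress_lossless_pool', 'size')
if pool_size then
    pool_size = tonumber(pool_size)
end
```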

@mmysle
Author

mmysle commented Jul 6, 2022

maybe you can use DEFAULT_LOSSLESS_BUFFER_PARAMETER.over_subscribe_ratio here?
local shp_size = ports_num * 2 * ppg_headroom / over_subscribe_ratio
A problem here is that the over_subscribe_ratio is an integer. Not sure whether it satisfies your requirement.

@stephenxs, in my opinion it is not satisfactory in this case. Still, I believe it would be worth adding the over-subscribe ratio parameter for Barefoot platforms too, as an improvement in another PR. As far as I understand, the parameter relates to the number of ports and is set by a network operator based on packet flow analysis. Is that correct? Could you help me understand what it means in the shared headroom pool model? I have been wondering whether it could be represented as the number of lossless PGs in the system. What do you think?

@stephenxs
Collaborator

@stephenxs, I did it to be consistent with the design you created. The condition is always true due to the attribute setting - it returns 0. If I understand you correctly, your suggestion with regard to handling max_headroom_size is to:

  1. remove the condition in buffer_check_headroom_barefoot.lua script

[SS]
Yes. You can always return true if the max_headroom_size is 0

  2. add in the device/barefoot/x86_64-accton_as9516_32d-r0/newport/buffers_dynamic.json.j2 file something like: {%- set shp = 'true' %}
  3. add in the files/build_templates/buffers_config.j2 file something like:
	    "DEFAULT_LOSSLESS_BUFFER_PARAMETER": {
	        "AZURE": {
	            "default_dynamic_th": "0"
	            {%- if shp is defined -%}
	            ,
	            "max_headroom_size" : "0"

[SS]
"max_headroom_size" : "0" => "over_subscribe_ratio" : "2" <== any non-zero integer is OK.

          {%- endif -%}
      }
  },

If I understand your proposal incorrectly, could you set me straight on it?

[SS]
As long as the shared headroom pool is a must on Barefoot platforms, I would suggest enabling it from CONFIG_DB.
The way to enable it from CONFIG_DB:

  1. Define over_subscribe_ratio as above
  2. Define xoff in ingress_lossless_pool in buffer templates like device/barefoot/x86_64-accton_as9516_32d-r0/newport/buffers_defaults_t1.json.j2

I see you enabled it by always returning a non-zero xoff in buffer_pool_barefoot.lua. I'm not sure whether that works correctly, but if it does, I'm also OK with that approach.
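For reference, defining xoff in the pool template (step 2 above) could look roughly like the following in a buffers_defaults_t1.json.j2 file. The size and xoff values below are illustrative placeholders only, not the platform's real numbers:

```json
"BUFFER_POOL": {
    "ingress_lossless_pool": {
        "size": "18432000",
        "xoff": "2048000",
        "type": "ingress",
        "mode": "dynamic"
    }
}
```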

@stephenxs
Collaborator

maybe you can use DEFAULT_LOSSLESS_BUFFER_PARAMETER.over_subscribe_ratio here?
local shp_size = ports_num * 2 * ppg_headroom / over_subscribe_ratio
A problem here is that the over_subscribe_ratio is an integer. Not sure whether it satisfies your requirement.

@stephenxs, in my opinion it is not satisfactory in this case. Still, I believe it would be worth adding the over-subscribe ratio parameter for Barefoot platforms too, as an improvement in another PR. As far as I understand, the parameter relates to the number of ports and is set by a network operator based on packet flow analysis. Is that correct? Could you help me understand what it means in the shared headroom pool model? I have been wondering whether it could be represented as the number of lossless PGs in the system. What do you think?

Assume the accumulative headroom size of all PGs on all ports is H; then the shared headroom pool size is H/over_subscribe_ratio.
For example, there are:

  • 10 ports
  • 2 PGs on each port
  • 20kB or 20480 bytes headroom for each PG

The accumulative headroom is 20480*2*10 = 409600
If the over_subscribe_ratio is 2, the shared headroom pool size is 409600 / 2 = 204800.

The problem here is that in your code the size is calculated by multiplying by 0.7, which cannot be represented by dividing by an integer. I'm OK with extending it to a float if necessary.
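To make the arithmetic above concrete, here is a small sketch in Lua (the language already used for the buffer scripts); the variable names are illustrative, not the script's actual ones:

```lua
-- Worked example from the discussion above (all sizes in bytes)
local ports_num = 10
local pgs_per_port = 2
local ppg_headroom = 20480  -- 20 KB headroom per PG

-- Accumulative headroom H over all PGs on all ports
local H = ppg_headroom * pgs_per_port * ports_num   -- 409600

-- Shared headroom pool size with an integer over_subscribe_ratio
local over_subscribe_ratio = 2
local shp_size = H / over_subscribe_ratio           -- 204800

-- A 0.7 multiplier corresponds to dividing by 1/0.7 (about 1.43),
-- which no integer over_subscribe_ratio can represent exactly.
local shp_size_float = H * 0.7
```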
