Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat(eos_designs): Support EVPN multihoming features link-tracking and lacp port-id range #1441

Merged
merged 3 commits into from
Feb 8, 2022

Conversation

tgodaA
Copy link
Contributor

@tgodaA tgodaA commented Jan 18, 2022

Change Summary

The PR #1285 and #1301 were related to the eos_cli_config_gen role and the same features were supported in eos_designs now. These features are related to the EVPN multihoming, 1) Link-Tracking Group 2) LACP Port-id

Design Guide Page 16 and 17

Related Issue(s)

Fixes #1043 #1074

Component(s) name

arista.avd.eos_designs

Proposed changes

Need to support: 1) Link tracking groups 2) Different Lacp Port-id in different vtep groups

A) Fabric variables:
To create Link tracking groups under any type of node as well as adds all the switch uplink interfaces as upstream interfaces in that group:

<node_type_keys>:
  default:
    link_tracking:
      enabled: < true | false >
      # Optional default group name -> "LT_GROUP1", recovery_delay -> 500
      groups:
        - name: < tracking_group_name >
          recovery_delay: < 0-3600 | default -> 500 >
          links_minimum: < 1-100000 >

Lacp Port-id range must be set in Vteps only (l3leaf) to make sure the end-point can differentiate the different switches(l3leafs), it is connected to in the ES-LAG group.
To support different lacp port-id in different Vtep groups:

<node_type_keys>:
  default:
    lacp_port_id_range:
      enabled: < true | false | default -> false >
      # Recommended size > = number of ports in the switch.
      size: < 1-65535 | default -> 128 >
      # Offset is used to avoid overlapping port-id ranges of different switches | Optional
      # Useful when a "connected-endpoint" is connected to switches in different "node_groups".
      offset: < offset_for_lacp_port_id_range | default -> 0 >

B) On connected-endpoints we set up the link tracking group downstream interfaces:

link_tracking:
  enabled: < true | false >
  # The default group name is taken from fabric variable of the switch, link_tracking.groups[0].name with default value being "LT_GROUP1".
  # Optional if default link_tracking settings are configured on the node.
  name: < tracking_group_name >

How to test

Molecule

Checklist

User Checklist

  • N/A

Repository Checklist

  • My code has been rebased from devel before I start
  • I have read the CONTRIBUTING document.
  • My change requires a change to the documentation and documentation have been updated accordingly.
  • I have updated molecule CI testing accordingly. (check the box if not applicable)

@tgodaA tgodaA force-pushed the evpn_mh branch 2 times, most recently from ee73807 to 772267a Compare January 18, 2022 05:08
@tgodaA tgodaA marked this pull request as draft January 18, 2022 05:11
@github-actions github-actions bot added the state: CI Updated CI scenario have been updated in the PR label Jan 21, 2022
@github-actions github-actions bot requested a review from titom73 January 21, 2022 03:03
@tgodaA tgodaA marked this pull request as ready for review January 21, 2022 03:03
@tgodaA
Copy link
Contributor Author

tgodaA commented Jan 21, 2022

I have added new molecule scenario. Need some help with the review. If everything is good, let me know how to add it to the workflow. Thank you!

@tgodaA tgodaA added this to the v3.3.0 milestone Jan 21, 2022
@tgodaA tgodaA changed the title Feat(eos_designs): Support EVPN multihoming link-tracking and different lacp port-id Feat(eos_designs): Support EVPN multihoming features link-tracking and lacp port-id range Jan 21, 2022
@tgodaA tgodaA requested review from ClausHolbechArista and removed request for ClausHolbechArista January 24, 2022 13:59
@ClausHolbechArista
Copy link
Contributor

I have added new molecule scenario. Need some help with the review. If everything is good, let me know how to add it to the workflow. Thank you!

Please do not add extra molecule scenarios. We are trying to get fewer, since the CI runtime is getting very long (and expensive). Best would be to just add the data you need in a hostvars file and add the device(s) under the unit_test scenario. Avoid using any existing groups. Just add a new UNIT_TESTS group under all and supply all the required vars in hostvars. Ping me if we should work this out together.

@tgodaA
Copy link
Contributor Author

tgodaA commented Jan 25, 2022

@ClausHolbechArista , I saw an example of UNIT_TESTS in eos_designs_unit_tests. But for the multihoming example, I need the whole spine-leaf-end_point architecture to test link_tracking_groups and just a l3leaf to test lacp_port_id_range. But I am working to move this to eos_designs_unit_tests.

@tgodaA
Copy link
Contributor Author

tgodaA commented Jan 26, 2022

@ClausHolbechArista let me know if these artifacts looks good to you. I will try to remove few more knobs in the example to make it a unit test only for those new knobs.

@tgodaA tgodaA removed this from the v3.3.0 milestone Jan 27, 2022
@tgodaA tgodaA added this to the v3.3.0 milestone Jan 31, 2022
{% for lt_group in switch_data.combined.link_tracking.groups %}
{% if lt_group.name is arista.avd.defined %}
- name: {{ lt_group.name }}
recovery_delay: {{ lt_group.recovery_delay | arista.avd.default(500) }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should default to 0 / no delay. If this is seconds, it will have severe impacts on the network.

Copy link
Contributor Author

@tgodaA tgodaA Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, recovery_delay is required as far as I understand. 500 was the value used in document. We are allowing the BGP to come up and then LAGs to work to avoid the traffic arrival before the network is actually up. I understand you are presenting the generic config, but I think mostly this knob is going to be used for multihoming as far as I guess.

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even for a regular use case, I think there should be a delay to allow for route convergence unless if static routing is used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For l3 devices (check switch.underlay_router) I think the default should be 300 to match the default mlag reload delay. For pure l2 devices where we just have a port-channel uplink, IMO the delay should be minimum. Maybe 10 seconds. Just for spanning-tree to settle.

@tgodaA
Copy link
Contributor Author

tgodaA commented Feb 3, 2022

Tested link_tracking on a l2leaf. It works now. Couldn't test on SPINE though!

@tgodaA tgodaA force-pushed the evpn_mh branch 2 times, most recently from 1e74df3 to a277010 Compare February 3, 2022 05:05
{% for lt_group in switch_data.combined.link_tracking.groups %}
{% if lt_group.name is arista.avd.defined %}
- name: {{ lt_group.name }}
recovery_delay: {{ lt_group.recovery_delay | arista.avd.default(500) }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For l3 devices (check switch.underlay_router) I think the default should be 300 to match the default mlag reload delay. For pure l2 devices where we just have a port-channel uplink, IMO the delay should be minimum. Maybe 10 seconds. Just for spanning-tree to settle.

@tgodaA
Copy link
Contributor Author

tgodaA commented Feb 3, 2022

Mlag reload delay:

Here is the info I found on https://www.arista.com/en/um-eos/eos-multi-chassis-link-aggregation?searchword=mlag%20timer

image


image


image

Copy link
Contributor

@ClausHolbechArista ClausHolbechArista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you for this great contribution.

@ClausHolbechArista ClausHolbechArista requested a review from a team February 4, 2022 09:12
@tgodaA tgodaA removed the request for review from titom73 February 6, 2022 04:30
Copy link
Member

@carlbuchmann carlbuchmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@carlbuchmann carlbuchmann merged commit 76d3480 into aristanetworks:devel Feb 8, 2022
@tgodaA tgodaA deleted the evpn_mh branch February 10, 2022 00:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
role: eos_designs issue related to eos_designs role state: CI Updated CI scenario have been updated in the PR state: Documentation role Updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for link tracking in EVPN environments.
4 participants