Decrease link probing interval after switchover to better determine the overhead of a toggle #43

Merged

@zjswhhh zjswhhh commented Mar 16, 2022

Description of PR

Summary:
Fixes # (issue)

This PR captures a more accurate timestamp of when a toggle completes on the mux.

It decreases the link probing interval to 10 ms once a switchover is triggered, and writes the timestamp of each link prober state change to the LINK_PROBE_STATS table in state db.

When the switchover completes, the probing interval is reverted. If the switchover does not complete within 400 ms, the interval is reverted as well.
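
The boost-and-revert behavior described above can be sketched roughly as follows. This is a minimal illustration and not the linkmgrd implementation: `ProbeIntervalController` and its method names are hypothetical, time is passed in explicitly instead of using real timers, and the normal interval is whatever the caller supplies.

```cpp
#include <cassert>
#include <chrono>

using Ms = std::chrono::milliseconds;

// Hypothetical helper: tracks which probing interval should be in effect
// around a switchover. Not taken from the linkmgrd sources.
class ProbeIntervalController {
public:
    ProbeIntervalController(Ms normalInterval, Ms fastInterval, Ms revertTimeout)
        : mNormal(normalInterval), mFast(fastInterval), mTimeout(revertTimeout)
    {
        assert(fastInterval <= normalInterval);
    }

    // Switchover triggered at time `now`: start probing at the fast interval.
    void onSwitchoverTriggered(Ms now)
    {
        mBoosted = true;
        mBoostStart = now;
    }

    // Switchover finished: revert to the normal interval.
    void onSwitchoverComplete() { mBoosted = false; }

    // Called periodically; reverts if the switchover has not completed
    // within the timeout.
    void tick(Ms now)
    {
        if (mBoosted && now - mBoostStart >= mTimeout) {
            mBoosted = false;
        }
    }

    Ms currentInterval() const { return mBoosted ? mFast : mNormal; }

private:
    Ms mNormal;
    Ms mFast;
    Ms mTimeout;
    bool mBoosted = false;  // true while the fast interval is active
    Ms mBoostStart{0};      // time at which the fast interval was enabled
};
```

With `ProbeIntervalController c(Ms(100), Ms(10), Ms(400))` (the 100 ms normal interval is an assumption), calling `c.onSwitchoverTriggered(Ms(0))` switches `currentInterval()` to 10 ms, and either `c.onSwitchoverComplete()` or `c.tick(Ms(400))` switches it back.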

sign-off: Jing Zhang zhangjing@microsoft.com

Type of change

  • Bug fix
  • New feature
  • Doc/Design
  • Unit test

Approach

What is the motivation for this PR?

To better determine the overhead of a toggle.

How did you do it?

Decrease link probing interval after switchover is triggered.

How did you verify/test it?

Tested the following cases on a dual testbed:

  1. Switchover succeeds, icmp_responder is on.
  2. Switchover completes but icmp_responder is off.

In both cases, link prober events are posted to state db as expected. Link probing interval is decreased and reverted as expected.

Any platform specific information?

Documentation

//
// get link prober interval
//
uint32_t LinkProber::getProbingInterval()
Contributor

Would it be better to make this function inline?

Contributor Author

Updated accordingly.
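
For context on the suggestion, a member function defined inside the class body is implicitly inline. The sketch below uses a heavily simplified stand-in class with a hypothetical member name; the actual change made in the PR may look different.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the real class; only the getter is shown.
class LinkProber {
public:
    explicit LinkProber(uint32_t probingIntervalMsec)
        : mProbingIntervalMsec(probingIntervalMsec)
    {
        assert(probingIntervalMsec > 0);
    }

    // Defined inside the class body, so it is implicitly inline.
    uint32_t getProbingInterval() const { return mProbingIntervalMsec; }

private:
    uint32_t mProbingIntervalMsec;  // hypothetical member name
};
```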

Contributor

@lolyu lolyu left a comment

LGTM

@zjswhhh zjswhhh merged commit c43cf7a into sonic-net:master Mar 22, 2022
@zjswhhh zjswhhh deleted the decreaseLinkProberIntervalAfterSwitchover branch March 22, 2022 23:22
zjswhhh added a commit that referenced this pull request Mar 22, 2022
…he overhead of a toggle (#43)

### Description of PR
Summary:
Fixes # (issue)

This PR captures a more accurate timestamp of when a toggle completes on the mux.

It decreases the link probing interval to 10 ms once a switchover is triggered, and writes the timestamp of each link prober state change to the `LINK_PROBE_STATS` table in state db.

When the switchover completes, the probing interval is reverted. If the switchover does not complete within 400 ms, the interval is reverted as well.

### Type of change
- [x] New feature

### Approach
#### What is the motivation for this PR?
To better determine the overhead of a toggle. 

#### How did you do it?
Decrease link probing interval after switchover is triggered. 

#### How did you verify/test it?
Tested the following cases on a dual testbed:
1. Switchover succeeds, icmp_responder is on.
2. Switchover completes but icmp_responder is off.

In both cases, link prober events are posted to state db as expected. Link probing interval is decreased and reverted as expected.
zjswhhh added a commit to zjswhhh/sonic-linkmgrd that referenced this pull request Mar 23, 2022
…he overhead of a toggle (sonic-net#43)

zjswhhh added a commit to zjswhhh/sonic-linkmgrd that referenced this pull request Mar 23, 2022
…he overhead of a toggle (sonic-net#43)

zjswhhh added a commit that referenced this pull request Mar 23, 2022
…he overhead of a toggle #43 (#48)

### Description of PR
Original commit & PR in master branch: 

c43cf7a Jing Zhang      Tue Mar 22 16:22:00 2022 -0700  Decrease link probing interval after switchover to better determine the overhead of a toggle (#43)

zjswhhh added a commit that referenced this pull request Apr 1, 2022
### Description of PR
Summary:
Fixes # (issue)

Disable part of the feature introduced in #43. 

The link probing interval will NOT be decreased by default. Link prober state change events will still be posted in `LINK_PROBE_STATS|PORTNAME` in state db. 
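
As a rough illustration of what posting such an event could look like, the sketch below builds the `LINK_PROBE_STATS|PORTNAME` key and stores a timestamp under it. Everything except the key format is an assumption: `FakeStateDb` is an in-memory stand-in for state db, and the helper names and the field naming scheme are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// In-memory stand-in for state db: key -> (field -> value).
using FakeStateDb = std::map<std::string, std::map<std::string, std::string>>;

// Builds the per-port key, e.g. "LINK_PROBE_STATS|Ethernet0".
std::string linkProbeStatsKey(const std::string &portName)
{
    assert(!portName.empty());
    return "LINK_PROBE_STATS|" + portName;
}

// Records when the link prober entered `state` for `portName`.
// The field naming scheme ("link_prober_<state>_usec") is hypothetical.
void postLinkProberEvent(FakeStateDb &db, const std::string &portName,
                         const std::string &state, uint64_t timestampUsec)
{
    db[linkProbeStatsKey(portName)]["link_prober_" + state + "_usec"] =
        std::to_string(timestampUsec);
}
```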

sign-off: Jing Zhang zhangjing@microsoft.com

### Type of change
- [x] New feature

### Approach
#### What is the motivation for this PR?
We need to reconsider the design of this feature. 

To be more specific, decreasing the probing interval here is a special case that exists for measurement purposes only. We still want to trigger the toggle in 300 ms when packet loss happens, so the negative count should be 30 instead of 3 when the interval is decreased to 10 ms.
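
The arithmetic behind these numbers can be sketched as below. `negativeCountFor` is a hypothetical helper, and the 100 ms default interval used in the usage note is inferred from 300 ms / 3 and is not stated explicitly here.

```cpp
#include <cassert>
#include <cstdint>

// Number of consecutive missed probes needed to keep the detection window
// constant when the probing interval changes. Rounds up so the window is
// never shorter than requested.
uint32_t negativeCountFor(uint32_t detectionWindowMsec, uint32_t probingIntervalMsec)
{
    assert(probingIntervalMsec > 0);
    return (detectionWindowMsec + probingIntervalMsec - 1) / probingIntervalMsec;
}
```

For a 300 ms window, `negativeCountFor(300, 100)` gives 3 and `negativeCountFor(300, 10)` gives 30.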
zjswhhh added a commit that referenced this pull request Apr 1, 2022
…switch overhead #49 (#54)

### Description of PR
Can't cleanly cherry pick the commit from master branch: 
34a68d1 disable switchover measuring based on link prober (#49)

zjswhhh added a commit to zjswhhh/sonic-linkmgrd that referenced this pull request Apr 15, 2022
…switch overhead sonic-net#49 (sonic-net#54)
