-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
portsorch: initial support for link-training #2359
Conversation
DVS tarball: Test result
|
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
|
||
void PortsOrch::initPortCapLinkTraining(Port &port) | ||
{ | ||
// TODO: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we avoid TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about the SAI commit associated with sonic-sairedis, hence there is not much we can do right now, this will need to be revisited once the SAI commit is updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for a record, this port attribute is already merged into the master branch of SAI repo. a few months ago.
@ds952811 please resolve merge conflict |
Done, please check it out |
orchagent/portsorch.cpp
Outdated
// autoneg is not supported, don't retry | ||
it = consumer.m_toSync.erase(it); | ||
continue; | ||
} | ||
if (autoneg_mode_map.find(an_str) == autoneg_mode_map.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add your changes after this check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
initPortCapAutoNeg(p); | ||
m_portList[alias] = p; | ||
} | ||
if (p.m_cap_an < 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this check better? because -1 means not initialized?
if (p.m_cap_an == 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it's okay to use (p.m_cap_an == 0) right here, but please note we don't want to process the AN request if it's neither initialized nor supported. Hence I prefer keeping 'if (p.m_cap_an < 1)' right here.
orchagent/portsorch.cpp
Outdated
} | ||
if (p.m_cap_an < 1) | ||
{ | ||
SWSS_LOG_ERROR("%s: autoneg is not supported", p.m_alias.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the p.m_cap_an value in the log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -266,6 +268,7 @@ class PortsOrch : public Orch, public Subject | |||
void doLagMemberTask(Consumer &consumer); | |||
|
|||
void doTask(NotificationConsumer &consumer); | |||
void doTask(swss::SelectableTimer &timer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there will be too many doTask, for redability, could rename to doRefreshPortAnLtStateTask()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, this will become an infra. change to the orchagent, and is outside the scope of this PR, it's better to revisit the timer support in the orchagent if this is a concern
} | ||
|
||
/* Activate/De-activate a specific port state poller task */ | ||
void PortsOrch::updatePortStatePoll(const Port &port, port_state_poll_t type, bool active) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_port_state_poll[] is getting updated in two different contexts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's a selectable timer built on top of file descriptors, and it's handled in the same context of other tasks.
i.e.
https://github.com/Azure/sonic-swss/blob/eb79ca4a524899f57995abd5ae5ec5f773ec85d0/orchagent/orchdaemon.cpp#L620
orchagent/portsorch.cpp
Outdated
|
||
for (auto it = m_port_state_poll.begin(); it != m_port_state_poll.end(); ) | ||
{ | ||
if ((it->second == 0) || !getPort(it->first, port)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use PORT_STATE_POLL_NONE instead of 0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
if (!getPortLinkTrainingRxStatus(port, rx_status)) | ||
{ | ||
status = "on"; // LT is enabled, while the rx status is unavailable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why assume LT is enabled? it should be "unknown"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rx status is either 'trained' or 'not trained', and we already have a checker in line 7337 to only process this timer task when LT is enabled, and the status is supposed to be 'on'
i.e. 'on' means LT is enabled, while none of further failure/status is available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.e.
If both SAI_PORT_ATTR_LINK_TRAINING_RX_STATUS and SAI_PORT_ATTR_LINK_TRAINING_FAILURE_STATUS are not supported in the SAI, the possible LT status is as follows:
- off
- on
If both SAI_PORT_ATTR_LINK_TRAINING_RX_STATUS is supported while SAI_PORT_ATTR_LINK_TRAINING_FAILURE_STATUS is not, the possible LT status is as follows:
- off
- on (When the poller is not yet executed)
- not_trained
- trained
If both SAI_PORT_ATTR_LINK_TRAINING_RX_STATUS and SAI_PORT_ATTR_LINK_TRAINING_FAILURE_STATUS are supported, the possible LT status is as follows:
- off
- on (When the poller is not yet executed)
- trained
- frame_lock (i.e. not_trained)
- snr_low (i.e. not_trained)
- timeout (i.e. not_trained)
@ds952811 could you please rebase with latest master? |
Done, please check it out, thanks |
/azp run |
Commenter does not have sufficient privileges for PR 2359 in repo Azure/sonic-swss |
c70b1f9
to
1254268
Compare
The sanity check failed as follows, which has nothing to do with this PR
|
commit 18632a3 Author: Dante Su <dante.su@broadcom.com> Date: Mon May 23 12:22:49 2022 +0000 optimize port state refresh logic Signed-off-by: Dante Su <dante.su@broadcom.com> commit 081d491 Author: ds952811 <ds952811@broadcom.com> Date: Mon May 23 02:33:56 2022 +0000 address review comments Signed-off-by: ds952811 <ds952811@broadcom.com> commit 84bdde4 Author: Dante Su <dante.su@broadcom.com> Date: Fri May 20 02:15:59 2022 +0000 update the default LT capability upon get failures Signed-off-by: Dante Su <dante.su@broadcom.com> commit 0f73666 Author: Dante Su <dante.su@broadcom.com> Date: Thu May 19 11:28:38 2022 +0000 Rename updatePortStatesXX as refreshPortStatesXX Signed-off-by: Dante Su <dante.su@broadcom.com> commit ddd57fe Author: Dante Su <dante.su@broadcom.com> Date: Thu May 19 04:03:13 2022 +0000 Have AN cap defaults to 1, and use AN attr for LT cap query Signed-off-by: Dante Su <dante.su@broadcom.com> commit 876e605 Author: Dante Su <dante.su@broadcom.com> Date: Fri May 13 11:15:12 2022 +0000 drop LT capability query Signed-off-by: Dante Su <dante.su@broadcom.com> commit 55ced7d Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 29 13:53:17 2022 +0000 incorporate autoneg support from PR#2215 Signed-off-by: Dante Su <dante.su@broadcom.com> commit a04594e Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 28 16:33:14 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit e9eeb9a Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 28 15:00:04 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit 4ff604d Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 22 03:51:56 2022 +0000 Stop the port state poll by default Signed-off-by: Dante Su <dante.su@broadcom.com> commit bdfb8d8 Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 22 03:48:07 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit 1c6bda8 Author: Dante Su <dante.su@broadcom.com> Date: Mon Apr 18 08:46:21 2022 +0000 Restore pre-emphasis when LT is transitioned from ON to OFF Signed-off-by: Dante Su <dante.su@broadcom.com> commit 09a9b33 Author: Dante Su <dante.su@broadcom.com> Date: Mon Apr 18 02:33:11 2022 +0000 fix build failure due to SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE Signed-off-by: Dante Su <dante.su@broadcom.com> commit b0bee3e Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 14 07:54:14 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit c4345ef Author: Dante Su <dante.su@broadcom.com> Date: Fri Mar 25 02:26:05 2022 +0000 portsorch: initial support for link-training - What I did Add Link-Training support to portsorch, while Gearbox is not in the scope - Why I did it In the case of DAC, static pre-calibrated pre-emphasis is rarely available on SONIC, as most of the ODM are expecting this to be done dynamically at runtime via link-training, hence we'll need this feature to improve the link quality - How I verified it Manual test Ran the Unit-tests to the corresponding changes Signed-off-by: Dante Su <dante.su@broadcom.com>
Signed-off-by: Dante Su <dante.su@broadcom.com>
This reverts commit 8e2f7a4.
Signed-off-by: Dante Su <dante.su@broadcom.com>
Signed-off-by: Dante Su <dante.su@broadcom.com>
Signed-off-by: Dante Su <dante.su@broadcom.com>
Signed-off-by: Dante Su <dante.su@broadcom.com>
Signed-off-by: Dante Su <dante.su@broadcom.com>
/azpw run |
/AzurePipelines run |
Azure Pipelines successfully started running 1 pipeline(s). |
It failed as follows, which has nothing to do with this PR
|
@prsunny please help merge |
@ds952811 is it possible to monitor ilnk-training status in case setting LT on/off is not supported? |
It's done by dropping the checker in the status poller, but why do we need
this?
If the LT on/off is not supported (i.e. standalone link-training), it's
most unlikely the LT status get operations are implemented.
Best Wishes,
Dante (Kuo-Jung) Su
…On Fri, Aug 5, 2022 at 1:56 AM Volodymyr Boiko ***@***.***> wrote:
@ds952811 <https://github.com/ds952811> is it possible to monitor
ilnk-training status in case setting LT on/off is not supported?
—
Reply to this email directly, view it on GitHub
<#2359 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AMLISKNUAJMIA7B675PZEL3VXP74TANCNFSM52BMTXOQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.
|
Even if standalone is not supported we might want to provide oper status after running AN - if AN is completed we might want to show the link is trained. |
Yes, this is a valid use case, we could activate the LT status poller when either AN or LT is enabled. |
@ds952811 do you mean PR for documentation? I think yes, better to have the idea reviewed in community and documented first before chaning the behavior. |
While the requested feature is working on the BRCM chips, I do not know what will be the case for the non-BRCM chips, and the poller will generate a series of error logs if the LT status retrieval is not supported by the chip. (Either PHY or MAC). Note: In the case of BRCM private image, we do have LT status poller enabled for both AN and LT, and it's for debugging the link failure for AN. e.g. LT is likely to fail on the optics, hence having AN enabled on a optic is most likely fail at this case. |
@ds952811 the question why the poller should generate the error logs if status "not supported" is returned from SAI call? |
Yep, you're absolutely right about it, we could suppress the error logs if NOT_SUPPORTED is returned from the SAI library. Let's raise a new PR for this enhancement. |
* Squashed commit of the following: commit 18632a3 Author: Dante Su <dante.su@broadcom.com> Date: Mon May 23 12:22:49 2022 +0000 optimize port state refresh logic Signed-off-by: Dante Su <dante.su@broadcom.com> commit 081d491 Author: ds952811 <ds952811@broadcom.com> Date: Mon May 23 02:33:56 2022 +0000 address review comments Signed-off-by: ds952811 <ds952811@broadcom.com> commit 84bdde4 Author: Dante Su <dante.su@broadcom.com> Date: Fri May 20 02:15:59 2022 +0000 update the default LT capability upon get failures Signed-off-by: Dante Su <dante.su@broadcom.com> commit 0f73666 Author: Dante Su <dante.su@broadcom.com> Date: Thu May 19 11:28:38 2022 +0000 Rename updatePortStatesXX as refreshPortStatesXX Signed-off-by: Dante Su <dante.su@broadcom.com> commit ddd57fe Author: Dante Su <dante.su@broadcom.com> Date: Thu May 19 04:03:13 2022 +0000 Have AN cap defaults to 1, and use AN attr for LT cap query Signed-off-by: Dante Su <dante.su@broadcom.com> commit 876e605 Author: Dante Su <dante.su@broadcom.com> Date: Fri May 13 11:15:12 2022 +0000 drop LT capability query Signed-off-by: Dante Su <dante.su@broadcom.com> commit 55ced7d Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 29 13:53:17 2022 +0000 incorporate autoneg support from PR#2215 Signed-off-by: Dante Su <dante.su@broadcom.com> commit a04594e Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 28 16:33:14 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit e9eeb9a Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 28 15:00:04 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit 4ff604d Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 22 03:51:56 2022 +0000 Stop the port state poll by default Signed-off-by: Dante Su <dante.su@broadcom.com> commit bdfb8d8 Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 22 03:48:07 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit 1c6bda8 Author: Dante Su <dante.su@broadcom.com> Date: Mon Apr 18 08:46:21 2022 +0000 Restore pre-emphasis when LT is transitioned from ON to OFF Signed-off-by: Dante Su <dante.su@broadcom.com> commit 09a9b33 Author: Dante Su <dante.su@broadcom.com> Date: Mon Apr 18 02:33:11 2022 +0000 fix build failure due to SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE Signed-off-by: Dante Su <dante.su@broadcom.com> commit b0bee3e Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 14 07:54:14 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit c4345ef Author: Dante Su <dante.su@broadcom.com> Date: Fri Mar 25 02:26:05 2022 +0000 portsorch: initial support for link-training - What I did Add Link-Training support to portsorch, while Gearbox is not in the scope - Why I did it In the case of DAC, static pre-calibrated pre-emphasis is rarely available on SONIC, as most of the ODM are expecting this to be done dynamically at runtime via link-training, hence we'll need this feature to improve the link quality - How I verified it Manual test Ran the Unit-tests to the corresponding changes Signed-off-by: Dante Su <dante.su@broadcom.com> * Add support for selected multiple tests Signed-off-by: Dante Su <dante.su@broadcom.com> * Revert "Add support for selected multiple tests" This reverts commit 8e2f7a4. * fix the comment for 'autoneg is not supported' Signed-off-by: Dante Su <dante.su@broadcom.com> * address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> * validate AN cap only when there is an update to AN config Signed-off-by: Dante Su <dante.su@broadcom.com> * drop the changes to tests/conftest.py Signed-off-by: Dante Su <dante.su@broadcom.com> * fix link failure in p4orch_tests-fake_portorch.o Signed-off-by: Dante Su <dante.su@broadcom.com>
* Squashed commit of the following: commit 18632a3 Author: Dante Su <dante.su@broadcom.com> Date: Mon May 23 12:22:49 2022 +0000 optimize port state refresh logic Signed-off-by: Dante Su <dante.su@broadcom.com> commit 081d491 Author: ds952811 <ds952811@broadcom.com> Date: Mon May 23 02:33:56 2022 +0000 address review comments Signed-off-by: ds952811 <ds952811@broadcom.com> commit 84bdde4 Author: Dante Su <dante.su@broadcom.com> Date: Fri May 20 02:15:59 2022 +0000 update the default LT capability upon get failures Signed-off-by: Dante Su <dante.su@broadcom.com> commit 0f73666 Author: Dante Su <dante.su@broadcom.com> Date: Thu May 19 11:28:38 2022 +0000 Rename updatePortStatesXX as refreshPortStatesXX Signed-off-by: Dante Su <dante.su@broadcom.com> commit ddd57fe Author: Dante Su <dante.su@broadcom.com> Date: Thu May 19 04:03:13 2022 +0000 Have AN cap defaults to 1, and use AN attr for LT cap query Signed-off-by: Dante Su <dante.su@broadcom.com> commit 876e605 Author: Dante Su <dante.su@broadcom.com> Date: Fri May 13 11:15:12 2022 +0000 drop LT capability query Signed-off-by: Dante Su <dante.su@broadcom.com> commit 55ced7d Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 29 13:53:17 2022 +0000 incorporate autoneg support from PR#2215 Signed-off-by: Dante Su <dante.su@broadcom.com> commit a04594e Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 28 16:33:14 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit e9eeb9a Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 28 15:00:04 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit 4ff604d Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 22 03:51:56 2022 +0000 Stop the port state poll by default Signed-off-by: Dante Su <dante.su@broadcom.com> commit bdfb8d8 Author: Dante Su <dante.su@broadcom.com> Date: Fri Apr 22 03:48:07 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit 1c6bda8 Author: Dante Su <dante.su@broadcom.com> Date: Mon Apr 18 08:46:21 2022 +0000 Restore pre-emphasis when LT is transitioned from ON to OFF Signed-off-by: Dante Su <dante.su@broadcom.com> commit 09a9b33 Author: Dante Su <dante.su@broadcom.com> Date: Mon Apr 18 02:33:11 2022 +0000 fix build failure due to SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE Signed-off-by: Dante Su <dante.su@broadcom.com> commit b0bee3e Author: Dante Su <dante.su@broadcom.com> Date: Thu Apr 14 07:54:14 2022 +0000 address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> commit c4345ef Author: Dante Su <dante.su@broadcom.com> Date: Fri Mar 25 02:26:05 2022 +0000 portsorch: initial support for link-training - What I did Add Link-Training support to portsorch, while Gearbox is not in the scope - Why I did it In the case of DAC, static pre-calibrated pre-emphasis is rarely available on SONIC, as most of the ODM are expecting this to be done dynamically at runtime via link-training, hence we'll need this feature to improve the link quality - How I verified it Manual test Ran the Unit-tests to the corresponding changes Signed-off-by: Dante Su <dante.su@broadcom.com> * Add support for selected multiple tests Signed-off-by: Dante Su <dante.su@broadcom.com> * Revert "Add support for selected multiple tests" This reverts commit 8e2f7a4. * fix the comment for 'autoneg is not supported' Signed-off-by: Dante Su <dante.su@broadcom.com> * address review comments Signed-off-by: Dante Su <dante.su@broadcom.com> * validate AN cap only when there is an update to AN config Signed-off-by: Dante Su <dante.su@broadcom.com> * drop the changes to tests/conftest.py Signed-off-by: Dante Su <dante.su@broadcom.com> * fix link failure in p4orch_tests-fake_portorch.o Signed-off-by: Dante Su <dante.su@broadcom.com>
Signed-off-by: Dante Su dante.su@broadcom.com
What I did
Why I did it
How I verified it
Details if related
This is a subsequent PR of #2149
HLD: sonic-net/SONiC#924, sonic-net/SONiC#925