[xcvrd] Implement to read sfp info from eeprom several times while de… #230

Open
wants to merge 4 commits into
base: master

dennis0113

Description

Some SFP modules are known to need more time to power up their data
path, which sometimes causes the application to display incorrect or
uninitialized data.

Originally, xcvrd read the EEPROM as soon as the SFP was inserted. The
latest modification, in PR (#201), adds a 2-second delay after the SFP
is detected.

Our modification keeps the original behavior of reading the EEPROM as
soon as the SFP is detected, on the expectation that modules with a
shorter warm-up time can provide their EEPROM data immediately. For
modules with a longer init time, xcvrd makes a second attempt to read
the EEPROM 2 seconds after insertion and a third attempt 5 seconds
after insertion. A syslog entry is recorded whenever the checksum of
the SFP info read from the EEPROM differs from the previous attempt.
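
The read schedule described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name `read_sfp_info_with_retries`, the `read_eeprom` and `log` callables, and the use of CRC32 as the checksum are all assumptions made for the example.

```python
import time
import zlib

# Seconds after insertion at which a read is attempted (from the description:
# immediately, then at 2 s, then at 5 s).
READ_OFFSETS_SEC = (0, 2, 5)


def read_sfp_info_with_retries(read_eeprom, log,
                               offsets=READ_OFFSETS_SEC, sleep=time.sleep):
    """Read the EEPROM at each offset and report checksum changes.

    read_eeprom: hypothetical callable returning bytes, or None on failure.
    log:         hypothetical callable taking a message (e.g. a syslog wrapper).
    Returns the data from the last successful read, or None if all reads failed.
    """
    last_checksum = None
    last_data = None
    elapsed = 0
    for offset in offsets:
        # Wait only for the remaining time until this offset is reached.
        sleep(offset - elapsed)
        elapsed = offset

        data = read_eeprom()
        if data is None:
            continue

        # CRC32 is an assumption; the PR does not say which checksum is used.
        checksum = zlib.crc32(data)
        if last_checksum is not None and checksum != last_checksum:
            log("sfp info checksum changed on read at +%ds" % offset)
        last_checksum, last_data = checksum, data
    return last_data
```

With this shape, a module that is ready immediately yields usable data on the first read, while a slower module is picked up by a later read, and the log entry shows that the earlier data was not yet stable.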

Motivation and Context

How Has This Been Tested?

1. Build the sonic_xcvrd Python wheel (.whl).
2. Install the Python wheel on the switch and insert an SFP transceiver.

Additional Information (Optional)

@prgeor
Collaborator

prgeor commented Dec 10, 2021

Please note that datapath init is different from management init. Management init can take at most 2 seconds for any QSFP28, QSFP+, or QSFP-DD module.

  1. What is the motivation of this PR?
  2. What issue are you facing? Which spec does your transceiver follow?

@dennis0113
Author

dennis0113 commented Dec 14, 2021

1. What is the motivation of this PR?

For QSFPs with a faster warm-up time, we can read the EEPROM data
as soon as xcvrd detects the module as present.
For QSFPs with a slower warm-up time,
we can make more attempts to read the EEPROM data.

2. What issue are you facing? Which spec does your transceiver follow?

We found that xcvrd could not correctly update some QSFPs' EEPROM data to redis-db.
When this occurred, we compared it against a hexdump of the sys node from which xcvrd extracts the EEPROM data
and found that the data in the sys node was correct, but xcvrd had not updated the DB with it.
We suspect that this kind of QSFP needs a long warm-up time to prepare the data in its EEPROM.
This change therefore lets xcvrd read the EEPROM data several more times after the transceiver is detected.

@prgeor
Collaborator

prgeor commented Dec 14, 2021

1. What is the motivation of this PR?

For QSFPs with a faster warm-up time, we can read the EEPROM data
as soon as xcvrd detects the module as present.

Which spec does this kind of QSFP follow? Who is the vendor of such QSFPs?

For QSFPs with a slower warm-up time,
we can make more attempts to read the EEPROM data.

Which spec does this kind of QSFP follow? Who is the vendor of such QSFPs?

We found that xcvrd could not correctly update some QSFPs' EEPROM data to redis-db.
When this occurred, we compared it against a hexdump of the sys node from which xcvrd extracts the EEPROM data
and found that the data in the sys node was correct, but xcvrd had not updated the DB with it.
We suspect that this kind of QSFP needs a long warm-up time to prepare the data in its EEPROM.
This change therefore lets xcvrd read the EEPROM data several more times after the transceiver is detected.

Per the SFF-8679 QSFP28 hardware spec, the initialization time for i2c to be ready is 2000 ms max. The QSFP module in your case does not appear to be compliant with the spec; if so, please check with the QSFP vendor. Even if the module's i2c is ready before the 2000 ms timeout, xcvrd waits at most 2000 ms before it performs the first i2c transaction. Why do you think a 2-second wait before reading and updating the DB is a concern? Which platform are you testing on?

@prgeor
Collaborator

prgeor commented Feb 8, 2022

@dennis0113 are you still pursuing this PR?

@dennis0113
Author

Sorry... I will keep working on this PR and will reply to the questions you raised.

vdahiya12 added a commit to vdahiya12/sonic-platform-daemons that referenced this pull request Apr 4, 2022
sonic-net#230)

This release is in sync with the following firmware version of the Broadcom Y cable, which is consistent with release 7:
{
"version_nic_active": "D207.1.D103.1",
"version_nic_inactive": "D207.1.D103.1",
"version_nic_next": "D207.1.D103.1",
"version_peer_active": "D307.1",
"version_peer_inactive": "D307.1",
"version_peer_next": "D307.1",
"version_self_active": "D307.1",
"version_self_inactive": "D307.1",
"version_self_next": "D307.1"
}

Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com

Description
A vendor-specific implementation of the abstract YCableBase class.
The detailed design discussion can be found at https://github.com/Azure/SONiC/pull/757/files

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>