-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sonic_y_cable] add abstract class YCableBase required for Y-cable API support for multiple vendors #186
Conversation
…I support for multiple vendors Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py
Outdated
TOR A. This means if the Y cable is actively routing, the "check_active_linked_tor_side" | ||
API will now return Tor A. It also implies that if the link is actively routing on this port, Y cable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe change 'routing' to 'sending', since 'routing' tends to imply some decision-making capability which is not the case for the cable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it still says 'routing'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sonic_y_cable/y_cable_base.py
Outdated
TOR B. This means if the Y cable is actively routing, the "check_active_linked_tor_side" | ||
API will now return Tor B. It also implies that if the link is actively routing on this port, Y cable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sonic_y_cable/y_cable_base.py
Outdated
|
||
def check_if_link_is_active_for_NIC(self, port): | ||
""" | ||
This API specifically checks if NIC side of the Y cable's link is active |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the definition we are using for 'active'? Does that mean the cable is plugged in, link is set to admin up, traffic is flowing through this link, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes basically link is up means cable is up and configured correctly, show int status should show "oper" up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this definition commonly known? It might be worth including this info in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm maybe I can elaborate a little. I think "oper" up on Cli show interfaces status
might confuse the vendors, but objectively this API it's just their capability to check if all the parameters for transceiver initialization happened correctly, and the link is active. Actually when the lanes come up they send some packets internally on Serdes to check if lanes are functioning properly and a bunch of other things to check if link is healthy. While this generally translates to link is up, they cant tell if the link will be up on SONiC side. So I think we can make this definition more concrete but maybe we should not put show int status
output a cross check for them. Thoughts ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as long as the vendors know what this means, it's fine as is. I'll defer to your judgement here.
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py
Outdated
""" | ||
self.port = port | ||
|
||
def toggle_mux_to_torA(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest renaming to toggle_mux_to_tor_a
to make snake case consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sonic_y_cable/y_cable_base.py
Outdated
|
||
raise NotImplementedError | ||
|
||
def enable_loopback_mode(self, target, mode=NEAR_END_LOOPBACK, lane_mask): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def enable_loopback_mode(self, target, mode=NEAR_END_LOOPBACK, lane_mask): | |
def enable_loopback_mode(self, target, lane_mask, mode=LOOPBACK_MODE_NEAR_END): |
Suggest to swap the order of lane_mask and mode to fix 'non-default argument follows default argument' syntax error and correct the variable name with 'LOOPBACK_MODE_NEAR_END'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
raise NotImplementedError | ||
|
||
def get_switch_count_tor_a(self, clear_on_read=False): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both get_switch_count_tor_a() and get_switch_count_tor_b() are not supported in current MCU firmware. The cable can only return the total auto/manual switch count by reading the upper page 0x5, byte 141 to 148. If this information is useful for diagnostics, I will update the API document and MCU FW to support these two APIs from next firmware release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, we defined a total manual switching count in upper page 0x4, byte 141-144. In order to support these two APIs
- get_switch_count_tor_a()
- get_switch_count_tor_b()
I'd like to change the definition as below
- upper page 0x4, byte 141-144: number of manual switching count initiated from TOR A
- upper page 0x4, byte 225-229: number of manual switching count initiated from TOR B
and the get_switch_count_total() will return the sum of TOR A (byte 141-144) and TOR B (byte 225 - 228).
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
|
||
############################################################################################# | ||
### Debug Functionality ### | ||
############################################################################################# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a general question of the debug functionality. We observed that those diagnostic functions (get_ber_inf(), enable_loopback_mode()...etc.) can not work well when AN/LT enabled. Both sides receiver will jump to wait for AN (auto-negotiation) state after we enabled the debug mode (ex. set prbs generation, set loopback mode...). The AN/LT can no longer be finished because the TX were constantly transmit PRBS data instead of responding the AN signal. User cannot get correct BER info due to the receiver is not ready.
To solve the deadlock between AN/LT and debug function, the alternative solution is to re-configure both sides (SONiC and y cable) to disable AN/LT first, then run the debugging functions (PRBS, BER, Loopback)
For the next new firmware, the y_cable will support a mode configuration register in upper page 0x4, byte 209. User can configure the speed (50G, 100G) and ANLT on/off via this register.
For SONiC OS, is there any simple way to enable/disable the AN/LT without reloading the Config_db?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think there is any other way for now, other than to reload the confg_db.json
sonic_y_cable/y_cable_base.py
Outdated
raise NotImplementedError | ||
|
||
def create_port(self, speed, fec_mode_tor_a=FEC_MODE_NONE, fec_mode_tor_b=FEC_MODE_NONE, | ||
fec_mode_nic=FEC_MODE_NONE, anlt_tor_a=False, anlt_tor_b= False, anlt_nic=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it make sense to change the input parameter as suggestion?
For current y_cable firmware implementation, both TOR A and TOR B must be operated in the same mode. it is not allowed to configure AN/LT, FEC on TOR A and B individually.
fec_mode_nic=FEC_MODE_NONE, anlt_tor_a=False, anlt_tor_b= False, anlt_nic=False): | |
fec_mode_nic=FEC_MODE_NONE, anlt_tor=False, anlt_nic=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, can you please check
sonic_y_cable/y_cable_base.py
Outdated
|
||
raise NotImplementedError | ||
|
||
def create_port(self, speed, fec_mode_tor_a=FEC_MODE_NONE, fec_mode_tor_b=FEC_MODE_NONE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def create_port(self, speed, fec_mode_tor_a=FEC_MODE_NONE, fec_mode_tor_b=FEC_MODE_NONE, | |
def create_port(self, speed, fec_mode_tor=FEC_MODE_NONE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, can you please check
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py
Outdated
|
||
def log_warning(self, msg): | ||
self.main_logger.log_warning("y_cable_port {} {}".format(self.port, msg)) | ||
self._logger.log_warning("y_cable_port {} {}".format(self.port, msg)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest to add something like a colon as a delimiter. E.g, y_cable_port {}: {}
. Do this for all log functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
FEC_MODE_FC = 2 | ||
|
||
# definitions of the modes to be run for loopback mode | ||
# on the port/cable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add LOOPBACK_MODE_NONE if the loopback mode is not enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
mode_value: | ||
One of the following predefined constants, the mode to be run for loopback: | ||
LOOPBACK_MODE_NEAR_END | ||
LOOPBACK_MODE_FAR_END |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we return LOOPBACK_MODE_NONE if this feature is not enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Signed-off-by: xinyu <xinyu0123@gmail.com>
|
||
raise NotImplementedError | ||
|
||
def get_pcs_stats(self, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_pcs_stats(self, target): | |
def get_pcs_stats(self, target, clear_on_read=False): |
Is it necessary to add clear_on_read to reset the pcs statistics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the feature cannot be supported, you can return an error when the caller tries to enable clear_on_read= True;
I believe other vendors might be able to support this, hence it is kept as an argument.
|
||
raise NotImplementedError | ||
|
||
def get_fec_stats(self, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_fec_stats(self, target): | |
def get_fec_stats(self, target, clear_on_read=False): |
Is it necessary to add clear_on_read to reset the fec statistics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the feature cannot be supported, you can return an error when the caller tries to enable clear_on_read= True;
I believe other vendors might be able to support this, hence it is kept as an argument.
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
raise NotImplementedError | ||
|
||
def get_target_cursor_values(self, lane, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of cursor, can we rename to get_target_equalization() as that would be more generic? pre-tap, post-tap, pre-emphasis, pre-cursor, post-cursor all fall under equalization settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me make a note for this, all vendors will also need to be notfied for this. We can put this in a future PR.
|
||
raise NotImplementedError | ||
|
||
def set_target_cursor_values(self, lane, cursor_values, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of cursor, can we rename to set_target_equalization() as that would be more generic? pre-tap, post-tap, pre-emphasis, pre-cursor, post-cursor all fall under equalization settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me make a note for this, all vendors will also need to be notfied for this. We can put this in a future PR
a boolean, False, Non-hitless upgrade: it will update the firmware regardless | ||
the current status, a link flip can be observed during the upgrade. | ||
Returns: | ||
One of the following predefined constants: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a vendor doesn't support hitless upgrade, then we should have a return value indicating the failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently all vendors need to support hitless by default. The firmware error codes are going to be in a future PR as they are under discussion by vendors. Currently the recommendation is if there is an error which is not defined, vendor can return a generic error EEPROM_ERROR, which the API is called for.
|
||
raise NotImplementedError | ||
|
||
def get_firmware_version(self, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can Y-cable function with different targets running different firmware versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, depends on vendor. Some vendors have a unique version on all MCU's whereas some could have different versions.
|
||
raise NotImplementedError | ||
|
||
def get_eye_heights(self, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NRZ signal has single eye, whereas PAM4 signal has 3 eyes. So we should return something like.
{lane#: eye1, eye2, eye3} for PAM4 where eye1 is for upper eye, eye2 is for middle eye and eye3 for lower eye heights respectively.
{lane#: eye} for NRZ signal
PAM4 is used in QSFP-DD, OSFP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently all vendors are on NRZ and that's why we have a single eye value for each lane.
|
||
raise NotImplementedError | ||
|
||
def get_switch_count_target(self, switch_count_type, target, clear_on_read=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API will fail for target=TARGET_NIC but the return type don't indicate the failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noted for a further enhancement PR
raise NotImplementedError | ||
|
||
def get_switching_mode(self): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will this api return if set_switching_mode() is not called? or in other words what is the default mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by default the mode has to be manual. There can never be a no mode value. it is either auto or manual
|
||
raise NotImplementedError | ||
|
||
def get_alive_status(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this API return true if all the API is_link_active() returns true for ALL targets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No link active is different from get_alive_status. get_alive_status just checks whether all sides of the cable are plugged in or not.
|
||
raise NotImplementedError | ||
|
||
def get_autoswitch_hysteresis_timer(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this return 0 if hystersis timer is not configured or return the default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should return the value corresponding to the eeprom offset where the definition is. All vendors could have different hysteresis_timer values but default is 5 s
|
||
raise NotImplementedError | ||
|
||
def reset(self, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please specify if this API affect the ongoing traffic on the link?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if target is NIC this will reset.
…I support for multiple vendors (#186) Description For multiple Y-Cable vendor support once we do have a mapping from vendor/part number to the appropriate Y-Cable module to load, we need to map appropriate port to a module as well. This PR adds definition for base abstract class. Motivation and Context Basically, the key idea is once we have a port identified as being of a certain vendor and it has been identified to load a certain module, it is then needed to call the correct module/API on each port each time we call the API on the port it is required to maintain this mapping in memory since xcvrd does not want to read/parse this y_cable_vendor_mapping.py file again and again each time we call the Y-Cable API Also note that the module loaded might change during xcvrd running lifetime since cable/transceiver can be changed from one vendor to another. So we need to take this into consideration as well Proposed Solution for this Each module of the Y-Cable vendor can be a class (of each transceiver type) and all we need to do is instantiate the objects of these classes as class instances and these objects will provide the interface of calling the API's for the appropriate vendor Y-Cable. This instantiation will be done inside xcvrd, when xcvrd starts These objects basically emulate Y-Cable instances and whatever action/interaction needs to be done with the Y-Cable the methods of these objects would provide that each vendor in their implementation can inherit from a base class where there will be definitions for all the supported capabilities of the Y-Cable. for vendors the recommended approach in case their subclass implementation does not support a method, is to set the method equal to None. This differentiates it from a method they forgot to implement. Then, the calling code should first check if the method is None before attempting to call it. Design document for the support is sonic-net/SONiC#757 Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Description
For multiple Y-Cable vendor support once we do have a mapping from vendor/part number to the appropriate Y-Cable module to load, we need to map appropriate port to a module as well. This PR adds definition for base abstract class.
Motivation and Context
Basically, the key idea is once we have a port identified as being of a certain vendor and it has been identified to load a certain module, it is then needed to call the correct module/API on each port each time we call the API on the port
it is required to maintain this mapping in memory since xcvrd does not want to read/parse this y_cable_vendor_mapping.py file again and again each time we call the Y-Cable API
Also note that the module loaded might change during xcvrd running lifetime since cable/transceiver can be changed from one vendor to another. So we need to take this into consideration as well
Proposed Solution for this
Each module of the Y-Cable vendor can be a class (of each transceiver type) and all we need to do is instantiate the objects of these classes as class instances and these objects will provide the interface of calling the API's for the appropriate vendor Y-Cable.
This instantiation will be done inside xcvrd, when xcvrd starts
These objects basically emulate Y-Cable instances and whatever action/interaction needs to be done with the Y-Cable the methods of these objects would provide that
each vendor in their implementation can inherit from a base class where there will be definitions for all the supported capabilities of the Y-Cable.
for vendors the recommended approach in case their subclass implementation does not support a method, is to set the method equal to None. This differentiates it from a method they forgot to implement. Then, the calling code should first check if the method is None before attempting to call it.
Design document for the support is here
Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com
How Has This Been Tested?
Not tested, since this PR just defines the abstract base class implementation.
Additional Information (Optional)