Releases: Mellanox/hw-mgmt

V.7.0040.0033

18 Jun 11:07
3eb2646

Issue Title
#3649551 SN4700 : [Independent Module] | on r-leopard-41 with IM enabled, there was a thermal overload.
#3878328 SN4700 : Switch rebooted with "Thermal Overload" because ASIC thermal is not available
#3885405 TC: [Thermal Algorithm] | Blacklist is malfunctioning
#3883147 TC: [Thermal Algorithm] | Counts errors even when paused by the blacklist
#3879220 SN3420 : Thermal control: increase PWM minimum speed (20%->25%) to work around fan state issue reported by smond
#3895891 SPC1: [systemctl is-system-running] | SPC1 stuck in starting state after config reload - System was not started – lmsensor dependency issue
#3900159 QM3400: [Kernel 6.1] thermal/module#temp_crit: Input/output error
#3900138 QM3400: [Kernel 6.1] Can't get value of subfeature temp input for front panel
#3882472 QM3000 | QM3400: Mismatched system names in TC config (qm3400 instead of q3200)
#3948113 Switch freezes after generating hw-mgmt dump a few times in a row
NA msn5400 | msn5600 | sn4280 :TC: fix asic sensor mask in sensor_parameters
NA Fix: VPD parser sanity check was done only for 'MLNX' FRU types
NA Multi ASIC system: kernel config CONFIG_HOTPLUG_PCI_PCIE (kernel 6.1) must be disabled for sw_reset on multi-ASIC systems
NA TC: missing support for correct PWM calculation on systems with amb {X} sensor count != 2
NA MSN4700 | MQM9520: Some PSU1 labels are incorrectly marked as PSU2.
NA QM3000 | QM3400: voltmon1 and voltmon4 symlinks point to curr2 sensors instead of curr3 sensors.
NA QM3000 ASIC PCIE mapping was wrong
NA Missing support for Kconfig per Kernel major version

V.7.0040.0031

14 Jun 00:21
d4d82b4
Update changelog V.7.0040.0031

V.7.0040.0031

V.7.0030.2300

28 May 11:55
f5f4975

================================================================================

  • V.7.0030.2300
  • Tue, 28 May 2024

  • Bug fixes
    Issue Title
    #3706151 [MSN4600-VS2RC] : Fans running at high speed when PN of switch is SSG7B27990
    #3649551 [SPC2|SPC3|QM1|QM2|QM3] TC - dynamic minimum table RPM values for "sensor_read_error" were too low, causing the system to heat up
    #3651819 [SN2410] in systems with customer-adjusted PN, the fan direction is not recognized; it appears wrong and causes faulty TC behaviour
    #3723906 systems occasionally report errors for FAN2 & FAN3 although only PSU FAN1 exists
    #3733632 [MSN2410|MSN2100|MSN2010] sensors.conf "Chassis Fan Drawer x fan y" labels were mistakenly defined as "fan z"
    #3726901 [MSN27002] voltmon6 mistakenly appearing in sensor_list at tc_config.json file causing ERROR of reading file voltmon6_temp
    #3748535 [SP1|SPC2] chipup timeout is too short on legacy systems, sometimes causing failure
    NA When I2C device’s 1st probe fails there is no retry performed.
    NA Hw mgmt started handling udev events before basic hw mgmt initialization is done
    NA udev events were handled randomly by hw mgmt since the udev settle command was missing in hw mgmt init
    NA [MSN5400|MSN5600|QM3400] TC – systems with a single Fan direction always considered in TC to be P2C
    NA [MSN5600|MSN5400] sensors: mistakes in labels of sensor conf.
    NA [MSN5600|MSN5400] sensors: missing sensors config rule for 2nd PSU
    NA [MSN3420] symlink mistakenly pointing to MSN3700 sensors config file

    o For the detailed patch list, see: https://github.com/Mellanox/hw-mgmt/blob/V.7.0030.2300_BR/recipes-kernel/linux/Patch_Status_Table.txt

  • Known issues and limitations:

    o Systems like sn2700 which contain delta 460 PSU may have "Error getting sensor data: dps460/#25: Can't read"
    which is a temporary inaccessibility of certain alarm attributes read from the PSU
    o Patch 0181-Revert-Fix-out-of-bounds-memory-accesses-in-thermal.patch should be applied
    for kernel >= 5.10.74 only, to avoid thermal control interface issues
    o This version disables system reset in thermal algorithm
    o Kernel patch 4.9 #60 is available upstream from kernel 4.9.207 and
    Kernel patch 4.19 #28 is available upstream from kernel 4.19.89.
    - No need to apply these patches when working with these kernel versions
    or above
    o ethtool for QSFP-DD is working only in raw mode.
    o SN4700 PSU (Murata) sensors PSU2 and PSU3 might not be available after insertion/removal.
    o PSUs inventory read via PMBus require the following packages:
    - i2c-tools_4.1-1_amd64.deb
    - libi2c0_4.1-1_amd64.deb
    o The I2C ASIC driver takes up to 5 seconds to complete initialization. When
    sending an ADD event, make sure to wait at least 5 seconds before
    resetting the ASIC.
    o Systems SN2010, SN2100, SN2410, SN2700 and SN2740 (and their
    "-B" variants) require the following flag in kernel cmdline:
    "acpi_enforce_resources=lax acpi=noirq".
    o A few bug fixes were introduced in upstream kernel 4.19; anyone using a
    v4.19 kernel older than v4.19.58 should cherry-pick the following commits:
    - Fix wrong order in probing routine initialization:
    d2d8f64012543898a0158b1fc5c07af3d41c89d8 (available in v4.19.49)
    - Fix parent device in i2c-mux-reg device registration
    c241f3fbfa1af86f572a92f2e4d708358e163806 (available in v4.19.58)
    o Kernel patch 4.9 #37 is available upstream from kernel 4.9.197 and
    Kernel patch 4.19 #9 is available upstream from kernel 4.19.79.
    - No need to apply these patches when working with these kernel versions
    or above
    o This version requires FW version 29.2000.1886 or higher for spectrum-2
    and 13.2000.1886 or higher for spectrum-1.
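
A minimal sketch of how the kernel-version notes above could be checked programmatically. The function names and the `UPSTREAM_SINCE` table are hypothetical helpers written for illustration and are not part of hw-mgmt; the version thresholds are copied from the list above.

```python
import re

# Patch name -> first upstream kernel release that already contains it,
# taken from the known-issues list above.
UPSTREAM_SINCE = {
    "4.9 #60": (4, 9, 207),
    "4.19 #28": (4, 19, 89),
    "4.9 #37": (4, 9, 197),
    "4.19 #9": (4, 19, 79),
}


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a string such as '4.19.58-amd64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def patches_still_needed(release):
    """Return the listed patches that are not yet upstream in this kernel."""
    version = parse_kernel_version(release)
    needed = []
    for name, since in UPSTREAM_SINCE.items():
        same_series = version[:2] == since[:2]
        # Only kernels of the same series that predate the upstream fix
        # still need the patch applied.
        if same_series and version < since:
            needed.append(name)
    return needed
```

On a running system the release string could be obtained with `platform.release()` from the Python standard library and passed to `patches_still_needed`.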

V.7.0030.3992

21 May 09:03
de54e85
Update changelog V.7.0030.3992

V.7.0030.4001

05 May 13:36
12314cd

Update the hw-mgmt deployment tool to include a flags parameter option for each sub-Sonic NOS (NVOS and DVS) in addition to the existing Sonic flag

V.7.0030.3985.bobcat

23 Apr 11:24
0c05aae

V.7.0030.3985.bobcat

V.7.0030.4050

11 Apr 16:22
9d7730e

V.7.0030.4050

V.7.0030.4000

05 Apr 08:49
3171acc

================================================================================

  • V.7.0030.4000
  • Mon, 1 Apr 2024

  • New features
    o Add support for QM3400 Crocodile - ES level quality
    o Add support for SN5400 Hippo - ES level quality
    o Add thermal algorithm support for a configurable minimum of N PSUs and M Fans for error reporting
    o Add support for different power supply redundancy policies
    o Update HW-MGMT package installation dependency list
    o Optimize PSU initialization time

  • Bug fixes
    Issue Title
    #3706151 [MSN4600-VS2RC] : Fans running at high speed when PN of switch is SSG7B27990
    #3649551 [SPC2|SPC3|QM1|QM2|QM3] TC - dynamic minimum table RPM values for "sensor_read_error" were too low, causing the system to heat up
    #3651819 [SN2410] in systems with customer-adjusted PN, the fan direction is not recognized; it appears wrong and causes faulty TC behavior
    #3825753 VPD parser: In case if BASE_MAC/GUID data starting with 0x0{X} and ending 0x{X}0 BASE_MAC1/GUID field decodes into 5B/7B instead of 6B/8B
    #3723906 systems occasionally report errors for FAN2 & FAN3 although only PSU FAN1 exists
    #3702878 [QM3400] Missing "alarm/status" files for voltmons and PSUs
    #3705457 [QM3000|QM3400] symlink for PSU voltage regulator included empty spaces
    #3733632 [MSN2410|MSN2100|MSN2010] sensors.conf "Chassis Fan Drawer x fan y" labels were mistakenly defined as "fan z"
    #3726901 [MSN27002] voltmon6 mistakenly appearing in sensor_list at tc_config.json file causing ERROR of reading file voltmon6_temp
    #3835080 [MSN4280] wrong fan_max_speed/fan_min_speed RPM values defined in hw-management.sh
    #3748535 [SP1|SPC2] chipup timeout is too short on legacy systems, sometimes causing failure
    #3747683 [SN5400] : sysfs attributes related to power management are missing
    #3847931 [QM3400] TC errors on the ctx_amb ConnectX thermal sensor, which does not exist in hardware
    #3847741 [SGN2410] TC errors due to the system not being properly defined as "not supported" for TC
    NA When I2C device’s 1st probe fails there is no retry performed
    NA Hw mgmt started handling udev events before basic hw mgmt initialization is done
    NA udev events were handled randomly by hw mgmt since the udev settle command was missing in hw mgmt init
    NA [MSN5400|MSN5600|QM3400] TC – systems with a single Fan direction always considered in TC to be P2C
    NA [MSN5600|MSN5400] sensors: mistakes in labels of sensor conf.
    NA [MSN5600|MSN5400] sensors: missing sensors config rule for 2nd PSU
    NA vpd parser supported only 2 GUID blocks instead of 4
    NA [MSN3420] symlink mistakenly pointing to MSN3700 sensors config file

    o For the detailed patch list, see: https://github.com/Mellanox/hw-mgmt/blob/V.7.0030.4000_BR/recipes-kernel/linux/Patch_Status_Table.txt

  • Known issues and limitations:

    o Systems like sn2700 which contain delta 460 PSU may have "Error getting sensor data: dps460/#25: Can't read"
    which is a temporary inaccessibility of certain alarm attributes read from the PSU.
    o Systems may show the message "WARNING kernel: … supply vcc not found, using dummy regulator"
    o Systems SN2010, SN2100, SN2410, SN2700 and SN2740 (and their "-B" variants) require the following flag in kernel cmdline:
    "acpi_enforce_resources=lax acpi=noirq".
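
The kernel cmdline requirement above can be verified with a short check such as the following sketch. The helper name `missing_cmdline_flags` is hypothetical and not part of hw-mgmt; only the two flags themselves come from the note above.

```python
# Flags required on SN2010, SN2100, SN2410, SN2700 and SN2740 systems,
# as stated in the known-issues list above.
REQUIRED_FLAGS = ("acpi_enforce_resources=lax", "acpi=noirq")


def missing_cmdline_flags(cmdline):
    """Return the required flags that are absent from a kernel cmdline string."""
    # Kernel parameters are whitespace-separated, so compare whole tokens
    # rather than substrings.
    present = set(cmdline.split())
    return [flag for flag in REQUIRED_FLAGS if flag not in present]
```

On Linux the current command line can be read from `/proc/cmdline` and passed to the function; an empty result means both flags are set.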

================================================================================

V.7.0030.3009

05 Apr 12:48
911f1a5
Update changelog V.7.0030.3009

V.7.0030.3009

V.7.0030.3008

04 Apr 15:59

Realign patch status.