Merge branch 'V.7.0040.1000_BR'
Yehuda Yehudai committed Jun 30, 2024
2 parents 8115a55 + fea7a27 commit 53fb87d
Showing 181 changed files with 27,774 additions and 956 deletions.
5 changes: 4 additions & 1 deletion README.md
@@ -194,13 +194,13 @@ CONFIG_EDAC_AMD64=m
CONFIG_HW_RANDOM_AMD=m
CONFIG_AMD_XGBE=m
CONFIG_AMD_XGBE_DCB=y
CONFIG_AMD_XGBE_HAVE_ECC=y
CONFIG_X86_AMD_PLATFORM_DEVICE=y
CONFIG_CPU_SUP_AMD=y
CONFIG_X86_MCE_AMD=y
CONFIG_USB_NET_DRIVERS=m
CONFIG_USB_USBNET=m
CONFIG_USB_NET_CDCETHER=m
CONFIG_HOTPLUG_PCI_PCIE=n
For arm64 architecture:
CONFIG_NET_VENDOR_MELLANOX=y
@@ -289,6 +289,7 @@ CONFIG_SENSORS_UCD9000=m
CONFIG_SENSORS_UCD9200=m
CONFIG_FUSE_FS=m
CONFIG_SENSORS_ARM_SCMI=m
CONFIG_HOTPLUG_PCI_PCIE=n
```
**Note:**
@@ -325,6 +326,8 @@ sudo apt-get install devscripts build-essential lintian
- Go into the thermal-control base folder and build the Debian package.
- Run: `debuild -us -uc -b`
- To build for ARM64 architecture, run `debuild -us -uc -b -aarm64`
- To build without the lm-sensors dependency (for SONiC-based OS), run `debuild --set-envvar=LM_DEPENDS=0 -us -uc -b`
or `export LM_DEPENDS=0 && dpkg-buildpackage -us -uc -b`
- Find the built `.deb` package in the parent folder (for example `hw-management_1.mlnx.18.12.2018_amd64.deb`).

**For converting .deb package to .rpm package:**
43 changes: 43 additions & 0 deletions debian/Release.txt
@@ -1,3 +1,46 @@
================================================================================
- V.7.0040.1000
- Sun , 30 June 2024
--------------------------------------------------------------------------------

- New features
o Add support for QM3400 Blackmamba - ES level quality
o Add support for SN4280 SmartSwitch Bobcat - ES level quality
o Add support for N5110_LD Juliet Scaleout PO + TTM - ES level quality
o Add support in VPD parser for System VPD vendor specific SSD SED PSID block

- Bug fixes
#3649551 SN4700 : [Independent Module] | on r-leopard-41 with IM enabled, there was a thermal overload.
#3878328 SN4700 : Switch rebooted with "Thermal Overload" because ASIC thermal is not available
#3885405 TC: [Thermal Algorithm] | Blacklist malfunctions
#3883147 TC: [Thermal Algorithm] | Counts errors even when paused by the blacklist
#3879220 SN3420 : Thermal control: increase PWM minimum speed (20%->25%) to work around fan state issue reported by smond
#3895891 SPC1: [systemctl is-system-running] | SPC1 stuck in starting state after config reload - System was not started – lmsensor dependency issue
#3900159 QM3400: [Kernel 6.1] thermal/module#_temp_crit: Input/output error
#3900138 QM3400: [Kernel 6.1] Can't get value of subfeature temp input for front panel
#3882472 QM3000 | QM3400: Mismatch system names in TC config (qm3400 instead q3200)
#3948113 Switch freezes after generating hw-mgmt dump a few times in a row
#3852236 ARM: Kernel oops symptoms after boot: Unable to handle kernel paging address xxx when BSP Drivers are used
NA msn5400 | msn5600 | sn4280 :TC: fix asic sensor mask in sensor_parameters
NA vpd parser: Sanity check is done only for 'MLNX' fru types
NA Multi ASIC system: kernel config CONFIG_HOTPLUG_PCI_PCIE (kernel 6.1) is required to be disabled for the sw_reset on multi asic systems
NA TC: missing support of correct PWM calculation for systems with amb_{X} sensor count != 2
NA MSN4700 | MQM9520 :Some PSU1 labels are incorrectly marked as PSU2.
NA QM3000 | QM3400 :voltmon1 and voltmon4 symlinks pointing to curr2 sensors instead of curr3 sensors.
NA QM3000 : ASIC PCIE mapping was wrong
NA Deployment tool : Missing support for Kconfig per Kernel major version
NA vpd-parser: In case the ONIE "Base MAC Address" field ends with a zero byte, vpd-parser cuts the last byte in the output.

o For detailed patch list: Please view: https://github.com/Mellanox/hw-mgmt/blob/V.7.0040.1000_BR/recipes-kernel/linux/Patch_Status_Table.txt

- Known issues and limitations:

o Systems like sn2700 which contain delta 460 PSU may have "Error getting sensor data: dps460/#25: Can't read"
which is a temporary inaccessibility of certain alarm attributes read from the PSU.
o Systems may show a message of "WARNING kernel: … supply vcc not found, using dummy regulator"
o Systems SN2010, SN2100, SN2410, SN2700 and SN2740 (and their "-B" variants) require the following flag in kernel cmdline:
"acpi_enforce_resources=lax acpi=noirq".

================================================================================
- V.7.0030.4000
- Mon , 1 Apr 2024
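One of the bug fixes listed in this release entry concerns a "Base MAC Address" field whose last byte is zero being shortened in the vpd-parser output. The parser source is not part of this diff, so the following is only a sketch of how this class of bug commonly arises and how a fixed-length parse avoids it. The function names and the sample address are hypothetical and are not taken from hw-mgmt.

```python
MAC_LEN = 6  # a MAC address is always six bytes, zero bytes included


def format_mac_buggy(raw: bytes) -> str:
    # Hypothetical buggy variant: the field is treated like a C string,
    # so trailing zero bytes are stripped before formatting.
    trimmed = raw.rstrip(b"\x00")
    return ":".join(f"{b:02X}" for b in trimmed)


def format_mac_fixed(raw: bytes) -> str:
    # Fixed-length parse: every byte is significant, including a trailing zero.
    if len(raw) != MAC_LEN:
        raise ValueError(f"expected {MAC_LEN} bytes, got {len(raw)}")
    return ":".join(f"{b:02X}" for b in raw)


# Sample address chosen only for illustration; its last byte is zero.
mac = bytes([0x02, 0x42, 0xAC, 0x11, 0x02, 0x00])
print(format_mac_buggy(mac))  # 02:42:AC:11:02 (last byte lost)
print(format_mac_fixed(mac))  # 02:42:AC:11:02:00
```

The point of the fixed variant is that binary fields with a known length should be validated by length, not trimmed like text.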
4 changes: 2 additions & 2 deletions debian/changelog
@@ -1,4 +1,4 @@
-hw-management (1.mlnx.7.0030.4000) unstable; urgency=low
+hw-management (1.mlnx.7.0040.1000) unstable; urgency=low
[ MLNX ]

- -- Yehuda Yehudai <yyehudai@nvidia.com> Fri, 05 Apr 2024 11:10:00 +0300
+ -- Yehuda Yehudai <yyehudai@nvidia.com> Sat, 29 Jun 2024 17:10:00 +0300
2 changes: 1 addition & 1 deletion debian/control
@@ -8,7 +8,7 @@ Homepage: http://www.nvidia.com

Package:hw-management
Architecture: any
-Depends: ${misc:Depends}, ${shlibs:Depends}, lsb-base (>= 3.0-6), python2.7 | python3, xxd, lm-sensors, libiio-utils, dmidecode, i2c-tools
+Depends: ${misc:Depends}, ${shlibs:Depends}, ${dist:Depends}, lsb-base (>= 3.0-6), python2.7 | python3, xxd, libiio-utils, dmidecode, i2c-tools
Description: Thermal control and chassis management for Nvidia systems
This package supports Nvidia switches family for chassis
management and thermal control.
19 changes: 19 additions & 0 deletions debian/hw-management.hw-management-sync.service
@@ -0,0 +1,19 @@
[Unit]
Description=Hw-management events sync service of Nvidia systems
After=hw-management.service
Requires=hw-management.service
PartOf=hw-management.service

StartLimitIntervalSec=1200
StartLimitBurst=5

[Service]
ExecStart=/bin/sh -c "/usr/bin/hw_management_sync.py"
ExecStop=/bin/kill $MAINPID
TimeoutStopSec=1

Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
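The new unit combines `StartLimitIntervalSec=1200` and `StartLimitBurst=5` with `Restart=on-failure` and `RestartSec=10s`. As a rough illustration of what that means for a service that fails immediately on every start, the sketch below models the limit as a sliding window. This is a simplification for intuition only: systemd's own rate-limit accounting differs in detail, and the function name `start_allowed` is hypothetical.

```python
def start_allowed(previous_starts, now, interval_sec=1200.0, burst=5):
    """Return True if another start fits within the burst limit.

    Simplified model: count the starts that happened less than
    interval_sec seconds before `now` and compare against `burst`.
    """
    recent = [t for t in previous_starts if now - t < interval_sec]
    return len(recent) < burst


# Assume the service fails instantly, so each retry comes RestartSec (10 s) later.
starts = []
t = 0.0
while start_allowed(starts, t):
    starts.append(t)
    t += 10.0

print(f"starts permitted: {len(starts)}, refused at t={t:.0f}s")
```

Under these assumptions the service gets five attempts in under a minute and is then left in a failed state until the interval has elapsed or the limit is cleared manually (for example with `systemctl reset-failed`).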
15 changes: 15 additions & 0 deletions debian/rules
@@ -4,9 +4,20 @@ pname:=hw-management

pwd=$(shell pwd)

#debuild -sa -us -uc -eLM_DEPENDS=1

ifeq ($(LM_DEPENDS),0)
DEPENDS = -Vdist:Depends=""
else
DEPENDS = -Vdist:Depends="lm-sensors"
endif

%:
dh $@

override_dh_gencontrol:
dh_gencontrol -- $(DEPENDS)

override_dh_auto_configure:

override_dh_auto_build:
@@ -23,6 +34,7 @@ override_dh_auto_install:
ifeq ($(DEB_HOST_ARCH),arm64)
mv debian/$(pname)/usr/bin/iorw.sh debian/$(pname)/usr/bin/iorw
cp usr/etc/modules-load.d/05-hw-management-modules-arm64.conf debian/$(pname)/etc/modules-load.d/05-hw-management-modules.conf
cp usr/etc/modprobe.d/hw-management-arm64.conf debian/$(pname)/etc/modprobe.d/hw-management.conf
dh_installdirs -p$(pname) lib/systemd/system-shutdown
cp usr/usr/bin/hw-management-kexec-notifier.sh debian/$(pname)/lib/systemd/system-shutdown
endif
@@ -38,14 +50,17 @@ endif
override_dh_installinit:
dh_installinit --name=hw-management
dh_installinit --name=hw-management-tc
dh_installinit --name=hw-management-sync

override_dh_systemd_enable:
dh_systemd_enable --name=hw-management
dh_systemd_enable --name=hw-management-tc
dh_systemd_enable --name=hw-management-sync

override_dh_systemd_start:
dh_systemd_start --name=hw-management
dh_systemd_start --name=hw-management-tc
dh_systemd_start --name=hw-management-sync

override_dh_strip_nondeterminism:
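The `LM_DEPENDS` switch in `debian/rules` feeds a `dist:Depends` substitution variable into the `Depends:` field of `debian/control`. The sketch below mimics that flow in Python to show the resulting dependency list in both cases. It is an illustration only, not how `dh_gencontrol` is implemented: the helper names are hypothetical, unset variables such as `misc:Depends` are treated as empty, and empty entries are assumed to be dropped from the final field.

```python
import re

# Depends template as it appears in debian/control after this commit.
CONTROL_DEPENDS = (
    "${misc:Depends}, ${shlibs:Depends}, ${dist:Depends}, "
    "lsb-base (>= 3.0-6), python2.7 | python3, xxd, "
    "libiio-utils, dmidecode, i2c-tools"
)


def dist_depends(lm_depends):
    # Mirrors `ifeq ($(LM_DEPENDS),0)`: only the literal value "0" drops
    # lm-sensors; an unset or any other value keeps it.
    return "" if lm_depends == "0" else "lm-sensors"


def render(template, substvars):
    # Expand ${name} placeholders; unknown names become empty strings.
    expanded = re.sub(
        r"\$\{([^}]+)\}", lambda m: substvars.get(m.group(1), ""), template
    )
    # Assumption: empty entries left by empty variables are removed.
    entries = [entry.strip() for entry in expanded.split(",")]
    return ", ".join(entry for entry in entries if entry)


print(render(CONTROL_DEPENDS, {"dist:Depends": dist_depends("0")}))
print(render(CONTROL_DEPENDS, {"dist:Depends": dist_depends(None)}))
```

With `LM_DEPENDS=0` the rendered list starts at `lsb-base`; in every other case `lm-sensors` is included, which matches the package's previous behavior.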
