Releases: RRZE-HPC/likwid
likwid-5.4.1
Changelog 5.4.1
- Fix linking errors due to missing bstrlib.h
- Fix for likwid-bench kernel stream_mem
- Fix builds with CUDA for versions 11.2 to 12.6
- Fix sysfeatures with ACCESSMODE=perf_event
- Add AMD Zen1 to Zen3 to sysfeatures
- Add support for apple-cpufreq and cppc driver to sysfeatures
- Fix sysfeatures on ARM architectures
likwid-5.4.1rc7
v5.4.1rc7 Fix warning in GOTCHA
likwid-5.4.1rc6
v5.4.1rc6 Update version, date and headers
likwid-5.4.0
After a year without a release, we are happy to announce a new version.
- Support for Intel Granite Rapids (core, energy, uncore)
- Support for Intel Sierra Forest (core, energy, uncore)
- Support for AMD Bergamo (core, energy, uncore)
- Support for Nvidia Grace (core, uncore)
- Fix: AMD Zen4 DataFabric units
- Fix: Multi-socket RAPL measurements on Sapphire Rapids
- Fix: Energy unit for RAPL DRAM domain for SPR, GNR and SRF
- Fix: Discovery mechanism workaround for UPI and M3UPI units of SPR
- Fix: Intel Westmere Uncore with perf_event backend
- Fix: Fujitsu A64FX (more counters, fixed topology, ...)
- Container bridge to use LIKWID inside container
- Sysfeatures interface reworked with support for various architectures and libraries
- Fix build for NVMON for CUDA 12.6+
- likwid-mpirun: SLURM pinning with cpu_mask feature
- Update of internal hwloc (2.11.2) and Lua (5.4.7) version
If you are building for RHEL9 (or its derivatives), the perl-locale package has to be installed.
likwid-5.3.0
We are happy to release version 5.3.0 of LIKWID, the tool suite for performance-oriented programmers. Thanks to all the contributors, especially HPE for the AMD ROCm backend.
Changelog for 5.3.0:
- Support for Intel SapphireRapids (Core, Uncore, RAPL)
- Support for AMD Zen4 (Core, Uncore, RAPL)
- Support for Apple M1
- Support for AMD GPUs (MarkerAPI, F90 interface)
- Support for AWS Graviton3 (ARM Neoverse V1)
- Support for HiSilicon TSV110
- Fix of F90 interface installation
- Support for extended umasks in ICX and SPR
- Units for metrics in performance groups
- Library calls to get meta information (version, supported features, etc.)
- Some fixes for direct access mode
- Some fixes for X86 RDPMC detection
- Update of internal hwloc (2.9.3) and Lua (5.4.6) version
- New experimental sysfeatures module
Note: For Intel SapphireRapids systems with HBM, LIKWID in perf_event access mode and /sys/devices/uncore_type_14_* devices, apply the attached patch. Thanks @Julius-Plehn
Note: There is a bug in the NvMarkerAPI. If you want to use LIKWID with the NvMarkerAPI, please apply the changes in likwid-marker.h shown here
Note: Energy and power measurements on SapphireRapids are broken on all sockets other than 0
likwid-5.2.2
- Fix pin string parsing in pinning library
- Make SBIN path configurable in build system
- Add PKGBUILD for ArchLinux package builds
- Remove accessDaemon double-fork in systemd environments
- Group updates for L2/L3 (mainly AMD Zen)
- Fix multi-initialization in MarkerAPI
- Add energy event scaling for Fujitsu A64FX
- Nvmon: Use CUPTI error string to get better warning/error messages
- Nvmon: Store events internally to re-use event strings in stopCounters
- AccessLayer: Catch SIGCHLD to stop sending requests to accessDaemon if it was killed
- likwid-genTopoCfg: Update writing and reading of topology file
- Add INST_RETIRED_NOP event for Intel Icelake (desktop & server)
- Removed some memory leaks
- Improved checks for RDPMC availability
- Add TOPDOWN_SLOTS for perf_event
- Fix for systems with CPU sockets without hwthreads (A64FX FX1000)
- Fix if HOME environment variable is not set (systemd)
- Reader function for perf_event_paranoid in Lua to get state early
- likwid-mpirun: Sanitize np and ppn values to avoid crashes
Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.
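The perf_event_paranoid reader listed for this release concerns the Linux kernel setting exposed at /proc/sys/kernel/perf_event_paranoid. As a rough illustration only, reading that value early could look like the Python sketch below. This is not LIKWID's Lua implementation: the function name and the choice to return None on failure are assumptions made for this example.

```python
from pathlib import Path
from typing import Optional

# Standard Linux location of the setting; not taken from the LIKWID sources.
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def read_perf_event_paranoid(path: str = PARANOID_PATH) -> Optional[int]:
    """Return the perf_event_paranoid level, or None if it cannot be read.

    Hypothetical helper for illustration; LIKWID's own reader is written in Lua.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (e.g. non-Linux system), unreadable, or not an integer.
        return None


if __name__ == "__main__":
    level = read_perf_event_paranoid()
    print("perf_event_paranoid:", "unknown" if level is None else level)
```

Checking the value before any counters are set up lets a tool report a restrictive setting up front instead of failing later when perf_event is first used.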
likwid-5.2.1
We are happy to release a new bugfix version of the LIKWID tool suite.
- Add support for Intel Rocketlake and AMD Zen3 variant (Family 19, Model 0x50)
- Fix for perf_event multiplexing (important!)
- Fix for potential deadlock in MarkerAPI (thx @jenny-cheung)
- Build and runtime fixes for Nvidia GPU backend, updates for CUDA test codes
- peakflops kernel for ARMv8
- Updates for AMD Zen1/2/3 event lists and groups
- Support spaces in MarkerAPI region tags (thx @jrmadsen)
- Use 'online' cpulist instead of 'present'
- Switch CI from Travis-CI to NHR@FAU Cx services
- Document -reset and -ureset for likwid-setFrequencies
- Reset cpuset in unpinned runs
- Remove destructor in frequency module
- Check PID if given through --perfpid
- Intel Icelake: OFFCORE_RESPONSE events
- AccessDaemon: Check PCI init state before using it
- likwid-mpirun: Set mpi type for SLURM automatically
- likwid-mpirun: Fix for skip mask for OpenMPI
- Fix for triad_sve* benchmarks
Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.
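The switch from the 'present' to the 'online' cpulist in this release refers to the Linux sysfs files /sys/devices/system/cpu/present and /sys/devices/system/cpu/online, which hold range lists such as 0-3,8-11. Below is a minimal sketch of expanding such a list, assuming that format. The function name is hypothetical and this is not LIKWID's code.

```python
from typing import List


def parse_cpulist(text: str) -> List[int]:
    """Expand a Linux cpulist string such as "0-3,8,10-11" into CPU IDs."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            # A range "a-b" includes both endpoints.
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


if __name__ == "__main__":
    print(parse_cpulist("0-3,8,10-11"))  # prints [0, 1, 2, 3, 8, 10, 11]
```

On a Linux system the same function can be applied to the contents of /sys/devices/system/cpu/online. Hardware threads that are present but offline do not appear in that list, which is presumably why the online list is the safer basis for pinning and measurement.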
likwid-5.2.0
We are happy to release a new major update of the LIKWID tool suite.
- Support for AMD Zen3 (Core + Uncore)
- Support for Intel IcelakeSP (Core + Uncore)
- New affinity code
- Fix for Ivybridge uncore code
- Bypass accessdaemon by using rdpmc instruction on x86_64
- Introduce notion of CPU die in topology module
- Use CPU dies for socket-lock for Intel CascadelakeAP
- Add environment variable LIKWID_IGNORE_CPUSET to break out of current CPUset
- Fixes for affinity module CPUlist sorting
- Build against system-installed hwloc
- Update for Intel SkylakeX/CascadelakeX L3 group
- Rename DataFabric events for all generations of AMD Zen
- Add static cache configuration for Fujitsu A64FX
- Add multiplexing checks for perf_event backend
- Fix for table width of likwid-topology after adding CPU die column
- Add Raspberry Pi 4 with 32-bit OS as ARMv7
- Add default groups for Intel Icelake desktop
- Fix for likwid-setFrequencies to not apply minFreq when setting governor
- likwid-powermeter: Fix hwthread selection when run with -p
- likwid-setFrequencies: Get measured base frequency if register is not readable
- CLOCK group for all AMD Zen
- Fixes in Nvidia GPU support in NvMarkerAPI and topology module
WARNING: This version has bugs in the perf_event backend. The multiplexing checks cause problems.
WARNING: The benchmarks triad_sve* for ARMv8 chips use only 3 instead of 4 streams.
Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.
likwid-5.1.1
Changelog for version 5.1.1:
- Support for Intel Cometlake desktop (Core + Uncore)
- Fix for topology module of Fujitsu A64FX
- Fix for Intel Skylake SP in SNC mode
- Fix for likwid-perfscope
- Fix for CLI argument parsing
- Updated group and data file checkers
- Vector sum benchmark in SVE
- FP_PIPE group for Fujitsu A64FX
- Maximal number of CLI arguments configurable in config.mk (currently 16384)
- Fix for cpulist_sort function
- Fix for Intel SkylakeSP/CascadelakeSP CBOX devices in perf_event mode
- Multiplexing-Fix for perf_event (with warning)
- Adjust CUDA function pointer names in topology_gpu to avoid name clashes
- Fix for Lua 5.1
- Fix for likwid-setFrequencies when reading CPU base frequency
Note: This version does not contain any updates for AMD Zen3 and Intel IcelakeSP.
Note: Uncore measurements on Intel Cascadelake AP systems require an update of the topology module, which will come in 5.2.0.
WARNING: The benchmarks triad_sve* for ARMv8 chips use only 3 instead of 4 streams.
likwid-5.1.0
Changelog for version 5.1.0:
- Support for Intel Icelake desktop (Core + Uncore)
- Support for Intel Icelake server (Core only)
- Support for Intel Tigerlake desktop (Core only)
- Support for Intel Cannonlake (Core only)
- Support for Nvidia GPUs with compute capability >= 7.0 (CUPTI Profiling API)
- Initial support for Fujitsu A64FX (Core) including SVE assembly benchmarks
- Support for ARM Neoverse N1 (AWS Graviton 2)
- Support for AMD Zen3 (Core + Uncore but without any events)
- Check for Intel HWP
- Fix for TID filter of Skylake SP LLC filter0 register
- Fix for Lua 5.1
- Fix for likwid-mpirun skip masks
- Fortran90 interface for NvMarkerAPI (update)
- CPU_is_online check to filter non-usable CPU cores
- Fix for freeMemory in NUMA module (with hwloc backend)
- Fix for likwid-setFrequencies
We want to thank Intel, AMD, AWS and the University of Regensburg for their support.
If you want to use this release in a publication, please cite: https://doi.org/10.5281/zenodo.4282696