This repo has a few helper scripts for a Slurm HPC cluster.
Report the total resources (CPUs, nodes, and jobs) for all users. By default, job information is shown for all users; you can optionally filter by user and/or partition.
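
The per-user totals described above can be approximated with `squeue` alone. The following is a minimal sketch, not the script shipped in this repo: the function names `squeue_lines` and `summarize` are hypothetical, and it assumes `squeue` is on the `PATH`. It relies on the standard `squeue` format specifiers `%u` (user), `%P` (partition), `%D` (node count), and `%C` (CPU count).

```python
import subprocess
from collections import defaultdict

# user | partition | node count | CPU count (standard squeue format specifiers)
SQUEUE_FORMAT = "%u|%P|%D|%C"


def squeue_lines(user=None, partition=None):
    """Run squeue without a header, optionally filtered by user and/or partition."""
    cmd = ["squeue", "-h", "-o", SQUEUE_FORMAT]
    if user:
        cmd += ["-u", user]
    if partition:
        cmd += ["-p", partition]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def summarize(lines):
    """Sum job, node, and CPU counts per user from squeue output lines."""
    totals = defaultdict(lambda: {"jobs": 0, "nodes": 0, "cpus": 0})
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        user, _partition, nodes, cpus = line.strip().split("|")
        entry = totals[user]
        entry["jobs"] += 1
        entry["nodes"] += int(nodes)
        entry["cpus"] += int(cpus)
    return dict(totals)
```

On a cluster, `summarize(squeue_lines(partition="batch"))` would return one entry per user in a partition named `batch` (a hypothetical name). Node counts are summed per job, so a node shared by two jobs is counted twice.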
Report the cluster utilization (Nodes, CPUs and Memory).
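
As a rough illustration of how node and CPU utilization can be derived, `sinfo` can print counts in the form `allocated/idle/other/total` through the `%C` (CPUs) and `%F` (nodes) format specifiers. The sketch below is an assumption about one possible approach, not the repo's implementation; `parse_aiot`, `utilization`, and `sinfo_field` are hypothetical names, and memory reporting is left out.

```python
import subprocess


def parse_aiot(field):
    """Parse an 'allocated/idle/other/total' field as printed by sinfo %C or %F."""
    allocated, idle, other, total = (int(x) for x in field.strip().split("/"))
    return {"allocated": allocated, "idle": idle, "other": other, "total": total}


def utilization(field):
    """Return the allocated share of the total as a percentage."""
    counts = parse_aiot(field)
    if counts["total"] == 0:
        return 0.0
    return 100.0 * counts["allocated"] / counts["total"]


def sinfo_field(specifier):
    """Return the first output line of sinfo for one format specifier, e.g. '%C'."""
    result = subprocess.run(
        ["sinfo", "-h", "-o", specifier],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()[0]
```

For example, `utilization(sinfo_field("%C"))` would give the CPU utilization and `utilization(sinfo_field("%F"))` the node utilization, assuming `sinfo` prints a single summary line for that specifier on your cluster.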
Report completed jobs for a given user. Slurm must be configured to use the Elasticsearch job completion plugin.
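
When the `jobcomp/elasticsearch` plugin is enabled, completed jobs are stored as documents that can be retrieved through the Elasticsearch `_search` endpoint. The sketch below shows one way to query them with the Python standard library. It makes several assumptions: the index URL is supplied by the caller, the job documents carry a `username` field, and the helper names `build_query` and `fetch_completed_jobs` are hypothetical. Check the field names against the documents in your own index.

```python
import json
import urllib.request


def build_query(user, size=20):
    """Build a search body matching jobs of one user (field name is an assumption)."""
    return {"size": size, "query": {"match": {"username": user}}}


def fetch_completed_jobs(index_url, user, size=20):
    """POST the query to <index_url>/_search and return the matching job documents."""
    request = urllib.request.Request(
        index_url.rstrip("/") + "/_search",
        data=json.dumps(build_query(user, size)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        hits = json.load(response)["hits"]["hits"]
    # Each hit keeps the indexed job record under "_source".
    return [hit["_source"] for hit in hits]
```

A call such as `fetch_completed_jobs("http://localhost:9200/slurm", "alice")` uses a placeholder URL and user name; substitute the location configured for job completion on your cluster.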
A diagnostic tool that reports all processes and threads for all jobs on a given compute node. For each job, it also reports the CPU affinity, memory usage, and current process state (R for running, D for uninterruptible sleep, S for interruptible sleep).
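
On Linux, the per-thread details mentioned above (state, CPU affinity, memory usage) are exposed under `/proc/<pid>/task/<tid>/status`. The following sketch shows how they could be collected for a single process. It is an illustration under assumptions rather than the tool's actual code: `parse_status` and `thread_report` are hypothetical names, and mapping a Slurm job to its PIDs (for instance with `scontrol listpids`) is left to the caller.

```python
from pathlib import Path

# Keys of interest in /proc/<pid>/task/<tid>/status
FIELDS = ("Name", "State", "VmRSS", "Cpus_allowed_list")


def parse_status(text):
    """Extract the selected 'Key: value' pairs from the text of a status file."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in FIELDS:
            info[key] = value.strip()
    return info


def thread_report(pid):
    """Return {tid: parsed status} for every thread of the given process."""
    report = {}
    for task in (Path("/proc") / str(pid) / "task").iterdir():
        try:
            report[int(task.name)] = parse_status((task / "status").read_text())
        except OSError:
            continue  # the thread exited between listing and reading
    return report
```

The `State` value starts with the one-letter code (R, D, S), `Cpus_allowed_list` gives the CPU affinity as a range list such as `0-3`, and `VmRSS` is the resident memory. Kernel threads have no `VmRSS` line, so that key may be absent.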
This project is in the worldwide public domain. See LICENSE for more information.