Profiling PySlurm
In order to see how PySlurm was performing, it was profiled following the [Cython Profiling](http://docs.cython.org/src/tutorial/profiling_tutorial.html) instructions with 10k+ jobs in the queue, but on an otherwise quiet server. The script used was a modified jobs_list.py without the Python display function call:
```
Mon Dec 12 17:33:25 2011    profile_tmp

         289896 function calls in 0.304 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
   109956    0.118    0.000    0.118    0.000  pyslurm.pyx:1997(__get_select_jobinfo)
        1    0.088    0.088    0.264    0.264  pyslurm.pyx:1886(__get)
    49980    0.042    0.000    0.042    0.000  slurm.pxd:41(listOrNone)
        1    0.024    0.024    0.024    0.024  pyslurm.pyx:1850(__load)
        1    0.014    0.014    0.304    0.304  <string>:1(<module>)
   119952    0.010    0.000    0.010    0.000  slurm.pxd:46(stringOrNone)
     9996    0.005    0.000    0.005    0.000  pyslurm.pyx:3601(__get_licenses)
        1    0.002    0.002    0.002    0.002  pyslurm.pyx:1778(__destroy)
        1    0.000    0.000    0.287    0.287  jobs_list.py:42(main)
        1    0.000    0.000    0.287    0.287  pyslurm.pyx:1874(get)
        1    0.000    0.000    0.287    0.287  {method 'get' of 'pyslurm.pyslurm.job' objects}
        1    0.000    0.000    0.002    0.002  pyslurm.pyx:1775(__dealloc__)
        1    0.000    0.000    0.000    0.000  {len}
        1    0.000    0.000    0.000    0.000  pyslurm.pyx:2070(__cpus_allocated_on_node)
        1    0.000    0.000    0.000    0.000  pyslurm.pyx:1766(__cinit__)
        1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
```
So a summary of the times is approximately:
- 0.024 seconds to load (__load) the slurm job data
- 0.264 seconds to retrieve (__get) the slurm data from the C data structures and convert it to Python data types
- 0.118 seconds in the __get_select_jobinfo routine called by the get method (hopefully this can be improved)
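For context, a run like the one above can be driven by a small script along the lines of the sketch below, assuming the .pyx files were compiled with the `# cython: profile=True` directive as described in the linked Cython tutorial. `jobs_list_main()` here is only an illustrative stand-in for the modified jobs_list.py main() with the display call removed:

```python
# Minimal sketch of a cProfile driver for the numbers above, assuming pyslurm
# was built with the "# cython: profile=True" directive enabled.
import cProfile
import pstats

import pyslurm

def jobs_list_main():
    # Hypothetical stand-in for the modified jobs_list.py main():
    # load the job data and convert it to Python types, but skip printing.
    jobs = pyslurm.job().get()
    return len(jobs)

cProfile.runctx("jobs_list_main()", globals(), locals(), "profile_tmp")
pstats.Stats("profile_tmp").sort_stats("tottime").print_stats()
```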
Reference: http://stackoverflow.com/questions/28301931/how-to-profile-cython-functions-line-by-line
cProfile was insufficient because it didn't give me any introspection into the calls inside get_job_info_msg(); it just reported how long that call took. I wanted to see what inside that function was slowing down execution. So I added the linetrace and binding Cython directives to job.pyx and re-ran a small script as follows:
```python
import pyslurm
import line_profiler

profile = line_profiler.LineProfiler(pyslurm.job.get_job_info_msg)
profile.runcall(pyslurm.job.get_job_info_msg)
profile.print_stats()
```
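For reference, the directives mentioned above go at the top of job.pyx; line tracing only takes effect if the extension is also compiled with the CYTHON_TRACE C macro defined. A sketch of both pieces (the setup snippet is illustrative, not PySlurm's actual build configuration):

```python
# Top of job.pyx: enable line tracing and binding so line_profiler can see
# individual Cython lines. These comment directives must appear before any code.
# cython: linetrace=True
# cython: binding=True

# Illustrative build configuration: the linetrace directive has no effect
# unless the extension is compiled with the CYTHON_TRACE macro defined.
from setuptools import Extension

ext = Extension(
    "pyslurm.job",
    sources=["pyslurm/job.pyx"],
    define_macros=[("CYTHON_TRACE", "1")],
)
```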
Now, when I run this, I get line-by-line profiling!
```
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
[...]
   463      9239         8344      0.9      0.2      this_job.user_id = record.user_id
   464      9239         8187      0.9      0.2      try:
   465      9239        19130      2.1      0.4          this_job.user_name = (
   466      9239      1426998    154.5     29.0              tounicode(getpwuid(record.user_id)[0])
   467                                                   )
   468                                               except:
   469                                                   pass
   470
   471      9239         8472      0.9      0.2      this_job.group_id = record.group_id
   472
   473      9239         8346      0.9      0.2      try:
   474      9239        18826      2.0      0.4          this_job.group_name = (
   475      9239      1847903    200.0     37.5              tounicode(getgrgid(record.group_id)[0])
   476                                                   )
   477                                               else:
   478      9239         9037      1.0      0.2          pass
```
It turns out that the majority of the time spent in the get_job_info_msg() function went to getpwuid() and getgrgid(). I was taking the uid and gid from each job record, looking up the corresponding user and group names, and adding them to the object. Taking out both lines resulted in much better execution times.
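If the names are still wanted, one common alternative to dropping the lookups entirely is to cache them, since the number of distinct uids and gids is usually far smaller than the number of job records. This is only a sketch of that idea, not what PySlurm actually does:

```python
# Sketch: memoize the uid/gid -> name lookups so getpwuid()/getgrgid() are
# called once per distinct id rather than once per job record.
from functools import lru_cache
from grp import getgrgid
from pwd import getpwuid

@lru_cache(maxsize=None)
def user_name(uid):
    try:
        return getpwuid(uid)[0]
    except KeyError:
        return None

@lru_cache(maxsize=None)
def group_name(gid):
    try:
        return getgrgid(gid)[0]
    except KeyError:
        return None
```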