Releases: eXascaleInfolab/PyExPool

Chained Termination on Constraints Violation Fixed and Optimized

04 Jul 18:07

Features & Optimizations

  • Added a threshold on exceeding the group vmem limit, which reduces the number of reschedulings performed before the number of workers is reduced and speeds up heavily loaded use cases
  • Execution on pure Python 2 / PyPy without any dependencies is allowed; warnings are shown about the disabled functionality (vmlimit without psutil)
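
The optional-dependency behaviour described above can be sketched roughly as follows. This is only an illustration of the general pattern, not PyExPool's actual code: the function name `effective_mem_limit` and the warning texts are hypothetical.

```python
import warnings

try:
    import psutil  # optional third-party dependency
except ImportError:
    psutil = None
    warnings.warn("psutil is not available: memory limits are disabled")


def effective_mem_limit(requested):
    """Return the memory limit to enforce; 0 (no limit) when psutil is missing.

    `effective_mem_limit` is a hypothetical helper used only for illustration.
    """
    if requested and psutil is None:
        warnings.warn("memory limit ignored because psutil is missing")
        return 0
    return requested
```

With this shape the rest of the scheduler can treat a zero limit as "no memory constraint" and keep running without the dependency.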

Fixes

  • Termination of jobs chained to a violating origin is fixed: such jobs are no longer restarted or postponed (even when they are already terminating with the restart flag)
  • Logging to implicit base directories is fixed (a filename given without "./" or a full path)
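
A common cause of this kind of logging failure is a bare filename whose directory part is empty. A minimal sketch of a guard against it, assuming that was the nature of the problem (the helper `open_log` is hypothetical and not part of PyExPool):

```python
import os


def open_log(path):
    """Open a log file for appending, creating its directory if one is named."""
    basedir = os.path.dirname(path)
    # os.path.dirname('job.log') is '' and os.makedirs('') raises an error,
    # so the directory is created only when the path actually contains one.
    if basedir and not os.path.isdir(basedir):
        os.makedirs(basedir)
    return open(path, 'a')
```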

Known Bugs

  • Job rescheduling with _CHAINED_CONSTRAINTS kills related jobs that have the ontimeout flag and are expected to be restarted (they should not be terminated at all)

Scheduling of the Spawning Processes

29 Jun 18:36
Pre-release

Features

  • The kind of virtual memory evaluated for a job is parameterized (origin process, heaviest spawned [sub]process, whole process tree of the origin)

By default, virtual memory is evaluated for the heaviest process in the process tree of the executing job.
This allows intermediate apps in the execution chain while keeping the memory constraints valid for the target app[s], which are assumed to be the heaviest. An example of a job with an intermediate time-measuring process (time) that is not counted in the vmem constraints of the job:

find_job = Job(args=('time', 'find', '/etc', '-name', 'sh'))
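
The difference between the three evaluation kinds can be illustrated on a toy process tree. The sketch below does not use PyExPool or psutil: `Proc`, `job_mem` and the kind names are hypothetical, and the memory figures are made up for the example.

```python
from collections import namedtuple

# Hypothetical model of a process: its own memory use (MB) and its children.
Proc = namedtuple('Proc', 'name mem children')


def walk(proc):
    """Yield the process and all of its descendants."""
    yield proc
    for child in proc.children:
        for descendant in walk(child):
            yield descendant


def job_mem(origin, kind='heaviest'):
    """Memory attributed to a job under the given evaluation kind."""
    if kind == 'origin':
        return origin.mem  # only the process started by the job
    mems = [p.mem for p in walk(origin)]
    if kind == 'heaviest':
        return max(mems)  # heaviest single process in the tree
    if kind == 'tree':
        return sum(mems)  # whole process tree of the origin
    raise ValueError('unknown kind: %r' % (kind,))


# 'time' is a light wrapper that spawns the heavier 'find'.
find = Proc('find', 48, ())
timed = Proc('time', 1, (find,))
```

Here `job_mem(timed, 'origin')` sees only the wrapper, while the `'heaviest'` kind picks up `find`, which is the process the memory constraint is meant for.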

Known Bugs

  • _LIMIT_WORKERS_RAM severely degrades rescheduling performance when the worker processes meet the specified constraint
  • Job rescheduling with _CHAINED_CONSTRAINTS does not kill jobs related to the terminated origin if they are in the terminating state with a requested restart or are rescheduled because of a group violation of the memory constraints

Load Balancing of Jobs with Chained Dependencies

27 Jun 17:59

Features

  • Parameterized virtual memory constraints for each Job, optional guarantee of the in-RAM computations of all Jobs
  • Chained rescheduling of heavier Jobs of the same category to meet RAM limitation / timeout constraints
  • Load balancing of the worker processes combined with job queue rescheduling; automatic reduction of the number of workers to compute heavier jobs within the specified memory limit / in-RAM if job rescheduling does not help
  • Unit tests integrated
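
The idea of chained rescheduling can be sketched as follows: when a job violates a constraint, queued jobs of the same category that are at least as heavy are set aside as well, since they are expected to violate it too. This is a simplified illustration under that assumption; `QJob` and `split_on_violation` are hypothetical names and do not reflect PyExPool's internals.

```python
from collections import namedtuple

# Hypothetical queued job: a name, a category and a relative size.
QJob = namedtuple('QJob', 'name category size')


def split_on_violation(queue, violator):
    """Split queued jobs into (kept, set_aside) after `violator` broke a limit.

    A job is chained to the violator when it has the same category and is
    not lighter than the violator.
    """
    kept, set_aside = [], []
    for job in queue:
        chained = (job.category == violator.category
                   and job.size >= violator.size)
        (set_aside if chained else kept).append(job)
    return kept, set_aside


queue = [QJob('a1', 'A', 1), QJob('a3', 'A', 3),
         QJob('b5', 'B', 5), QJob('a8', 'A', 8)]
kept, set_aside = split_on_violation(queue, QJob('a2', 'A', 2))
```

Lighter jobs of the same category and all jobs of other categories stay in the queue, so the pool keeps working on tasks that are still likely to fit.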

Fixes & Optimizations

  • Forced termination of a job works even when SIGTERM is ignored
  • Lots of fixes and optimizations related to the scheduling
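
A typical way to terminate a process that may ignore the termination signal is to escalate to a kill after a grace period. The following is a minimal sketch of that general pattern, assuming Python 3.3+ (for `wait(timeout=...)`); the function `stop` and the `grace` parameter are hypothetical, and this is not PyExPool's actual implementation.

```python
import subprocess
import sys


def stop(proc, grace=1.0):
    """Ask `proc` to exit, then kill it if it is still alive after `grace` s."""
    proc.terminate()  # SIGTERM on POSIX; may be ignored by the child
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()
    return proc.returncode
```

The grace period trades shutdown latency against the chance for a well-behaved job to clean up before it is killed.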

Known Bugs

  • _LIMIT_WORKERS_RAM severely degrades rescheduling performance when the worker processes meet the specified constraint
  • Job rescheduling with _CHAINED_CONSTRAINTS does not kill jobs related to the terminated origin if they are in the terminating state with a requested restart or are rescheduled because of a group violation of the memory constraints

Adjustments for NUMA, CPU cache and termination optimizations

22 Jun 11:11

Features

  • Automatic CPU affinity management (warm cache for single-threaded processes)
  • CPU cache adjustment (parallelization vs cache size)
  • NUMA architecture considered (nodes of CPUs, CPU cores, HW threads)
  • Execution Pool latency parameterized
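
The trade-off between parallelization and cache size can be sketched as a mapping from worker index to logical CPU: using only every `step`-th CPU runs fewer workers but leaves each of them more cache. The sketch assumes that sibling hardware threads have adjacent logical CPU ids, which differs between machines; `worker_cpu`, `pin` and `step` are hypothetical names, not PyExPool's API.

```python
import os


def worker_cpu(worker, cpus, step=1):
    """Logical CPU for a worker when only every `step`-th CPU is used."""
    slots = max(cpus // step, 1)  # number of CPUs actually used
    return (worker % slots) * step


def pin(pid, cpu):
    """Pin a process to a single logical CPU where the OS supports it."""
    if hasattr(os, 'sched_setaffinity'):  # available on Linux only
        os.sched_setaffinity(pid, {cpu})
```

For example, with 8 logical CPUs and `step=2`, workers 0 to 3 map to CPUs 0, 2, 4 and 6, and worker 4 wraps around to CPU 0. Keeping a single-threaded worker on one CPU is what lets it benefit from a warm cache.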

Fixes & Optimizations

  • Processing of terminating jobs sped up
  • Workers deletion fixed (zombie workers eliminated on job restart)

Known Bugs

  • Issues occur when the executing process can't be terminated gracefully (fixed in the next release)
  • Issues in the logical CPU enumeration prevent cache maximization (fixed since v2.1-MultiprocAfn)