Releases: iiasa/Condor_run_R

v2024-03-14

14 Mar 15:05
  • Do not seed to drained EPs to avoid waiting for a timeout.
  • Make EP selection clause more accurate for seeding.
  • Fix a case where a submission could end up waiting for a previously submitted run that is still ongoing.
  • Documentation updates, more detailed reporting.

v2024-01-30

30 Jan 09:59
  • Use LABEL as a batch name for a run to make condor_q output more readable.
  • Give a batch name to seed jobs to make condor_q output more readable.
  • Allow requirement expressions containing $(JOB) job number expansions. These are filtered from the seed job requirements, which should be limited to requirements applicable to all jobs.
  • When unbundling on an execute point fails, it is retried after a short delay in the hope that scratch space will have freed up.
  • On run completion, monitor size changes of log and output files for a while to ensure that all results have been transferred before proceeding with post processing.
  • Add a test of the Limpopo R stack.
  • Documentation improvements, upgraded alerts to GitHub Markdown alerts.
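The idea of monitoring file sizes until they stop changing can be illustrated with a short sketch. This is not the project's implementation: the function name `wait_until_sizes_settle` and its parameters (`interval`, `stable_polls`, `max_polls`) are hypothetical and chosen only for illustration.

```python
import os
import time


def wait_until_sizes_settle(paths, interval=1.0, stable_polls=3, max_polls=60):
    """Poll file sizes until they are unchanged for `stable_polls` polls in a row.

    Hypothetical helper: returns True once the sizes have settled and
    False when `max_polls` polls pass without the sizes settling.
    """
    def snapshot():
        sizes = []
        for path in paths:
            try:
                sizes.append(os.path.getsize(path))
            except OSError:
                # A missing file is recorded as -1, so its later
                # appearance counts as a change.
                sizes.append(-1)
        return sizes

    previous = snapshot()
    unchanged = 0
    for _ in range(max_polls):
        time.sleep(interval)
        current = snapshot()
        # Count consecutive polls without any change; reset on a change.
        unchanged = unchanged + 1 if current == previous else 0
        if unchanged >= stable_polls:
            return True
        previous = current
    return False
```

A caller would pass the log and output file paths of a run and only start post-processing once the function returns True.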

Note

The default JOB_TEMPLATE, BAT_TEMPLATE, and SEED_JOB_TEMPLATE have changed. If you override these, you probably want to merge these changes with your customization.

v2023-08-07

07 Aug 13:51

This release features documentation improvements related to requesting resources and testing jobs. In particular, see the new section on testing your workload and configuration and the updated section on path handling. Users are encouraged to read this documentation carefully to learn how to use shared compute resources responsibly without endangering the jobs of other researchers.

The only functional change is that the optional JOB_RELEASES configuration parameter now defaults to 0 (used to be 3). To enable retries to recover from transient errors, you have to explicitly configure it.

v2023-05-25

25 May 10:08

Change log:

  • Execute points are now selected based on REQUIREMENTS in a more fine-grained manner. The execute point configuration has been updated accordingly.
  • Various reporting improvements to help keep track of the submission, run, and its outcome.
  • Fixed several directory handling issues, in particular when a [GDX_|G00_]OUTPUT_DIR configuration is set to an empty string.
  • Fixed handling of multiple bare REQUIREMENTS.
  • Bundles are now seeded as owner when running jobs as owner. This ensures that the bundle can be touched on job launch to prevent premature cleanup without requiring the configuration of special access rights on the bundle cache directory on each execute point.
  • Documentation updates and improvements.

Changes specific to Condor_run.R:

  • GAMS_VERSION is no longer mandatory. When set, execute points that advertise that specific GAMS version as a capability are selected via a requirements expression.
  • The AVAILABLE_GAMS_VERSIONS optional configuration has been dropped as it has been supplanted by EP selection via requirements.

⚠️ Warning: the default values of *_TEMPLATE configuration parameters have been updated. If you override these, update your override values accordingly.

v2023-04-04

04 Apr 09:29

Change log:

  • Add reserve_limpopo6_resources scripts to temporarily reserve additional resources on limpopo6 for interactive use.
  • Postpone the as-needed creation of output directories on the submit machine until run submission. These directories, configured via the *OUTPUT_DIR* configuration parameters, store output files transferred from execute points on job completion.
  • Create OUTPUT_DIR on execute points on job startup when configured to a non-empty value.
  • Postpone creation of a log directory until actual use.
  • By default, trigger removal of idle seed jobs after 3 minutes instead of 2 minutes to accommodate slow cluster pickup.
  • Normalize paths after construction to clean up logging and avoid corner-case paths that confuse HTCondor.
  • When bundling nothing, show an intelligible warning instead of an unintelligible error.
  • Various code cleanups, documentation improvements, and test additions.

Changes specific to Condor_run.R:

  • Add support for OUTPUT_FILES.
  • Work/save/restart (G00) output files are now remapped to job-numbered filenames on retrieval after job completion. The job numbers carry leading zeros so that increasing job numbers sort in alphabetical order, as was already the case for GDX files.
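
Why leading zeros matter for sorting can be shown with a small sketch. The file name pattern and the helper `numbered_name` below are hypothetical and do not reflect the actual naming scheme used for remapped files.

```python
def numbered_name(prefix, job, max_job, ext):
    """Build a file name whose job number is zero-padded to the width of max_job.

    Hypothetical naming pattern, for illustration only.
    """
    width = len(str(max_job))
    return f"{prefix}_{job:0{width}d}.{ext}"


jobs = [0, 2, 10, 1]

# Zero-padded names sort alphabetically in job-number order.
padded = sorted(numbered_name("restart", j, max(jobs), "g00") for j in jobs)

# Without padding, job 10 sorts before job 2.
unpadded = sorted(f"restart_{j}.g00" for j in jobs)
```

With padding, an alphabetical directory listing matches the numeric job order, which keeps downstream scripts that iterate over sorted file names predictable.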

⚠️ Warning: the default values for the *_TEMPLATE configuration parameters have been updated. If you override these, update your override values accordingly.

v2023-03-02

02 Mar 17:12

This release refactors the handling of output files:

  • Added {} expansion support to OUTPUT_DIR_SUBMIT/G00_OUTPUT_DIR_SUBMIT/GDX_OUTPUT_DIR_SUBMIT. For example, this allows you to expand {LABEL} to collect output files in a subdirectory named after the label of the run.
  • Output directories are now always created when they do not exist, even when the path comprises multiple elements (the directories are created recursively).
  • Simplified and cleaned up tests accordingly.
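
As a rough illustration of the two behaviours above, the following sketch substitutes {NAME} placeholders in a directory template and then creates the resulting directory recursively. It is a simplified stand-in, not the project's code: the helpers `expand_braces` and `ensure_output_dir` are hypothetical, and the real {} expansion may support more than plain name substitution.

```python
import os
import re


def expand_braces(template, values):
    """Replace each {NAME} in template with values["NAME"] (hypothetical helper)."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value configured for {{{name}}}")
        return str(values[name])

    return re.sub(r"\{(\w+)\}", substitute, template)


def ensure_output_dir(template, values, root="."):
    """Expand the template and create the directory, including missing parents."""
    path = os.path.join(root, expand_braces(template, values))
    # exist_ok=True makes this a no-op when the directory already exists.
    os.makedirs(path, exist_ok=True)
    return path
```

For instance, `ensure_output_dir("output/{LABEL}/gdx", {"LABEL": "baseline"})` would create `output/baseline/gdx` under the current directory, including any missing parent directories.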

v2023-01-16

16 Jan 16:28

A small release that adds removal of seed jobs that are matched multiple times. This blacklists misconfigured execute points that would otherwise block seeding by continually rematching until the usual 2-minute timeout expires.

v2022-11-03

03 Nov 10:38

Major refactoring to implement the option to separately bundle and submit, or re-submit, from a bundle. This enables use cases such as:

  • Reproducible runs (given a preserved bundle).
  • Bundling on a container/machine without HTCondor installation (and later submission from a submit node).

See the updated usage documentation for details.

Other new features:

  • Add BUNDLE_DIR for configuring the directory where the bundle is stored.
  • BUNDLE_INCLUDE now accepts a vector of values. Changed the default value from "*" to c("*") to reflect this.
  • Made bundle cleanup more robust. The bundle no longer does double-duty as a lock file. Instead, bundle submission is protected by a dedicated lock file.
  • Condor_run.R: throw a clear error in the .out file when the configured GAMS_VERSION is not installed on the execute point being scheduled to.
  • Condor_run.R: show a warning when bundling multiple restart files.
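
The general technique of guarding an operation with a dedicated lock file, rather than reusing a data file as the lock, can be sketched as follows. This is a generic illustration under assumed names (`lock_file` is hypothetical), not how Condor_run_R implements its locking.

```python
import os
from contextlib import contextmanager


@contextmanager
def lock_file(path):
    """Hold an exclusive lock file for the duration of a with-block.

    Raises FileExistsError when another holder already created the lock.
    """
    # O_CREAT | O_EXCL makes creation atomic: it fails if the file exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        with os.fdopen(fd, "w") as handle:
            # Record the owning process ID to help with troubleshooting.
            handle.write(str(os.getpid()))
        yield path
    finally:
        # Release the lock by removing the file.
        os.remove(path)
```

Because the lock is a separate file, the protected file (here, the bundle) can be created, retained, or cleaned up independently of whether a submission is in progress.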

⚠️ Backwards-incompatible change:

  • The LABEL is no longer part of the remapped output and log file names, making them shorter. Instead, name uniqueness now relies only on inclusion of the "cluster number" of the run and the job numbers. This makes inferring the names of these files less fragile (use CLUSTER_NUMBER_LOG to obtain the cluster number). If you have scripting to automatically process output files, you may need to adjust it.

Removed features:

  • A reference copy of the main SCRIPT or the main GAMS file specified via GAMS_FILE_PATH is no longer preserved in the log directory of the run. To keep a reference copy of your code, instead use RETAIN_BUNDLE, BUNDLE_ONLY, or the --bundle-only command line switch.

New tests:

v2022-09-26

26 Sep 08:43
e0eba2d

New configuration options for more maintainable and targeted customization of job templates:

Added RETAIN_SEED_ARTIFACTS for troubleshooting bundle seeding.

Condor_run_basic.R:

New tests:

v2022-08-23

23 Aug 12:25

Refactored bundling:

  • BUNDLE_ADDITIONAL_FILES now supports unlimited entries, wildcards, and directory recursion.
  • New BUNDLE_ONLY optional config for easier testing of your bundling configuration.
  • To reduce the risk of unintentionally excluding files, directories are excluded from the bundle only when not a parent directory of any of the BUNDLE_INCLUDE_* paths.
  • Updated and more complete documentation of BUNDLE_* configuration parameters.
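
The exclusion rule described above, where a directory stays in the bundle when it is a parent of an explicitly included path, can be sketched as follows. The function names and example paths are hypothetical, and the sketch assumes relative POSIX-style paths; it is not the project's bundling code.

```python
from pathlib import PurePosixPath


def is_parent_of_any(directory, include_paths):
    """Return True when directory is an ancestor of at least one include path."""
    directory = PurePosixPath(directory)
    return any(directory in PurePosixPath(p).parents for p in include_paths)


def effective_excludes(exclude_dirs, include_paths):
    """Keep only the excludes that would not cut off an included path."""
    return [d for d in exclude_dirs if not is_parent_of_any(d, include_paths)]


# "data" is a parent of an included file, so only "tmp" remains excluded.
excludes = effective_excludes(["data", "tmp"], ["data/input/big.csv"])
```

The design choice here is to let explicit includes win over directory excludes, which trades a slightly larger bundle for a lower risk of silently dropping needed files.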

Condor_run_stats.R:

  • Scale gridExtra tables to fit better.

New tests: