Deploy APSIM (Agricultural Production Systems sIMulator - https://www.apsim.info/) on high performance computing clusters.
June: Notes on how the config files were generated
- Extract `SoilNames` from the CSV with `SoilNames <- as.vector(unlist(read.csv("SubsetSoilName.csv")))`
- An R script generates the config files (one config file per soil sample per weather file)
- Almost all of the SoilNames exist. Weather files will change, but that shouldn't be a problem as we catch them by the unique `.met` file extension
- Config files should contain both the soil names and weather names (that pattern will not change). In the R script, the soil name is `$var1`
- Config files should be in the current working directory (the path is hard-coded in APSIM)
- The base config file should contain the correct name of the soil library to load soils from, as well as the correct Example file containing the simulations to run with the correct soil and weather files
- Run `./updateversion-and-build.sh` and respond to the prompts. Once the image build has completed, execute `clean.sh`
- It will ask you to define the APSIM release information (the .deb version). The current release branch has the format `2024.07.7572.0`, and the prompts will request `2024.07` at `Enter Year and Month (YYYY.MM)`, followed by `Enter TAG:`, which is equivalent to `7572` in the above example.
- This will auto-update the corresponding fields in `Apptainer.def` under `%arguments`, add the tag to `curl --silent -o ${APPTAINER_ROOTFS}/ApsimSetup.deb https://builds.apsim.info/api/nextgen/download/${TAG}/Linux`, and complete the `%setup` section in the same def file.
- Then it will ask `Would you like to submit the container build Slurm job? (Yes/No):`, which we recommend answering `Yes`, as it will auto-update `export APSIM_VERSION=` and `export CACHETMPDIR=` based on the cluster of choice.
We are using a relative path, as defined in `build-container.def`:

```bash
export IMAGE_PATH="../../apsim-simulations/container/"
```
The `generate_lua.py` script does the following:
- Prompts the user to enter the name of the container image.
- Validates the image name format to ensure it follows the convention "apsim-YYYY.MM.XXXX.Y.aimg".
- Extracts the version from the image name.
- Checks if the image file exists in the `../../container/` directory.
- If everything is valid, it creates a new .lua file named after the version (e.g., `2024.08.7572.0.lua`) in the `APSIM/` directory.
- The generated .lua file includes the correct version in the `whatis` statement.
To use this script, run `./generate_lua.py` in the same directory where you want the .lua files to be created.
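The validation and version-extraction steps above can be sketched in Python; the regex and the function name `parse_image_version` are illustrative, not taken from the actual script:

```python
import re

# Assumed naming convention: "apsim-YYYY.MM.XXXX.Y.aimg", e.g. "apsim-2024.08.7572.0.aimg"
IMAGE_PATTERN = re.compile(r"^apsim-(\d{4}\.\d{2}\.\d+\.\d+)\.aimg$")

def parse_image_version(image_name):
    """Return the version embedded in the image name, or None if the name is invalid."""
    match = IMAGE_PATTERN.match(image_name)
    return match.group(1) if match else None
```

A valid name such as `apsim-2024.08.7572.0.aimg` yields the version `2024.08.7572.0`, which then becomes the module file name `2024.08.7572.0.lua`.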
This module file does the following:
- Provides basic information about the module using whatis statements.
- Adds `/usr/local/bin` and `/usr/bin` from the container to the system's `PATH`.
- Sets the `R_LIBS_USER` environment variable to `/usr/local/lib/R/site-library`.
- Creates aliases for executables within the container, so they can be run directly from the command line.
- Sets an environment variable `APSIM_IMAGE` with the path to the Apptainer image.
Adjust the list of executables in the `create_exec_alias` section as needed for your specific use case.
module use APSIM/
module load APSIM/2024.08.7572.0
- If the version is not specified, `module load APSIM` will load the latest version
This script will generate a separate config file for each combination of soil name and weather file, naming each file appropriately and placing it in the specified output directory, `ConfigFiles`.
BUG TO BE FIXED: `generate_apsim_configs.py` is a backup to the .R script, but it is buggy at the moment
- Reads soil names from the CSV file.
- Gets all `.met` files from the /Weather directory.
- Reads the base config file.
- Generates a new config file for each combination of soil name and weather file.
- Replaces the placeholders in the config with the correct soil name and weather file name.
- Saves each new config file with a name that includes both the weather file name and soil name.
- Make sure you have the SubsetSoilName.csv, Weather directory with .met files, and ExampleConfig.txt in the same directory as the script (or adjust the paths in the script).
- Create a directory named `ConfigFiles` for the output (or change the `output_dir` in the script).
- Run `./generate_apsim_configs.py`
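The generation loop described above can be sketched as follows; the placeholder names `$var1` (soil) and `$var2` (weather) and the output file naming are assumptions to be matched against the actual ExampleConfig.txt:

```python
from pathlib import Path

def generate_configs(soil_names, met_files, template, output_dir="ConfigFiles"):
    """Write one config per (soil, weather) pair by filling the template placeholders.

    NOTE: the placeholders $var1 (soil) and $var2 (weather file) are assumptions;
    adjust them to whatever the base ExampleConfig.txt actually uses.
    """
    out = Path(output_dir)
    out.mkdir(exist_ok=True)
    written = []
    for met in met_files:
        weather = Path(met).stem  # e.g. "w1" from "w1.met"
        for soil in soil_names:
            text = template.replace("$var1", soil).replace("$var2", met)
            name = f"{weather}_{soil}.txt"  # name includes both weather and soil
            (out / name).write_text(text)
            written.append(name)
    return written
```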
The default setting of this script will split the .txt files in the current working directory into four separate directories: `set-1`, `set-2`, `set-3` and `set-4`
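A minimal sketch of that split, assuming a round-robin distribution (the actual script may instead use contiguous chunks) and assuming `ExampleConfig.txt` should be left in place:

```python
import shutil
from pathlib import Path

def split_into_sets(src_dir=".", n_sets=4):
    """Distribute .txt files in src_dir round-robin across set-1 .. set-n directories."""
    src = Path(src_dir)
    # Snapshot the file list first so newly created set-* dirs don't interfere
    txt_files = sorted(p for p in src.glob("*.txt") if p.name != "ExampleConfig.txt")
    for i, path in enumerate(txt_files):
        dest = src / f"set-{i % n_sets + 1}"
        dest.mkdir(exist_ok=True)
        shutil.move(str(path), str(dest / path.name))
```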
- Make sure to check the container image version (the .aimg file) and double-check the name of the ExampleConfig file (the template has `ExampleConfig.txt`)
- The `#SBATCH --time` variable will require revision based on the number of config files; it takes ~25 seconds per file
- Then submit the Slurm script with `sbatch create_apsimx_skip_failed.sl`
- This is a serial process due to #31
- Refer to lines 16-21 in the Slurm script and adjust `max_consecutive_failures` to fit the following requirement:
- The expected number of consecutive failures per soil sample is equivalent to the number of .met weather files. Therefore, we recommend `max_consecutive_failures` = [number of weather files + 1]
- The reason for this implementation was discussed in #35
- The purpose of this script is to address #48, i.e. in an instance where the above script times out, etc., this restart script will pick up the .txt files which weren't subject to the `Models --apply` command and restart from the failed point.
- This script has to be submitted from the same working directory where the above Slurm script was submitted. It also expects `slurmlog/{timedout}.out`, the standard out from the previous timed-out/failed job, to be in the slurmlogs directory, as it uses the entries in that file to identify the successfully processed .txt files
- It uses the same SLURM configuration and Apptainer setup as your original script.
- You will be required to enter the failed/timed-out job ID in `FAILED_JOB_ID="Enter the timedout/failed jobID"`
- Extracts the list of successfully processed files from this log file.
- Then iterates through all .txt files in both the working directory and the FAILED directory, skipping "ExampleConfig.txt".
- For each file, it checks if it has already been successfully processed (by looking for its name in the extracted list from the log file).
- If successful, it resets the consecutive failure counter.
- If the successfully processed file was in the `FAILED` directory, it moves it back to the working directory.
- If processing fails, it moves the file to the `FAILED` directory and increments the consecutive failure counter.
- It maintains the same consecutive failure limit as your original script.
```bash
if [ -f "$file" ] && [ "$file" != "ExampleConfig.txt" ]; then
```

a. `[ -f "$file" ]`: checks whether the current `$file` is a regular file (not a directory or other special file).
b. `[ "$file" != "ExampleConfig.txt" ]`: checks whether the current `$file` is not named `ExampleConfig.txt`.

Both conditions must be true for the code inside the `if` block to execute.
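The skip logic can be sketched in Python; the exact wording of the success lines in the Slurm log is an assumption (here, any line containing "Successfully" plus a .txt name counts as processed):

```python
import re

def successfully_processed(log_text):
    """Collect .txt file names from log lines that report success.

    ASSUMPTION: success lines contain the word "Successfully" followed
    somewhere by the processed file's name; adjust to the real log format.
    """
    done = set()
    for line in log_text.splitlines():
        if "Successfully" in line:
            done.update(re.findall(r"\S+\.txt", line))
    return done

def needs_processing(filename, done):
    """Re-run a file only if it is a config file not already in the done set."""
    return filename != "ExampleConfig.txt" and filename not in done
```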
- Run the `count_apsimxfiles_and_array.sh` script first, which will generate the `#SBATCH --array` variable with the number of array tasks based on the number of config files (and .db placeholder files). Copy and paste that variable into the `array_create_db_files.sl` Slurm variables header section
- Then submit the array script with `sbatch array_create_db_files.sl`
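The counting step can be sketched as follows, assuming the count is taken over `.apsimx` files and the array tasks are zero-indexed (adjust if the Slurm script indexes from 1):

```python
from pathlib import Path

def sbatch_array_line(config_dir="."):
    """Build the "#SBATCH --array" header line from the number of .apsimx files."""
    n = len(list(Path(config_dir).glob("*.apsimx")))
    if n == 0:
        raise ValueError("no .apsimx files found")
    return f"#SBATCH --array=0-{n - 1}"
```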
`db-file-sort.py` does the following:

- It sets up the source directory and creates `PASSED` and `FAILED` directories if they don't exist.
- It defines the size threshold as 1 MB: `size_threshold = 1 * 1024 * 1024` (converted to bytes).
- It iterates through all files in the source directory.
- For each .db file, it checks the file size:
  - If the size is greater than 1 MB, it moves the file to the `PASSED` directory.
  - If the size is less than or equal to 1 MB, it moves the file to the `FAILED` directory.
- It prints a message for each file moved and a completion message at the end.
To use this script:
- Replace `source_dir = '.'` in line 7 with the actual path to your directory containing the .db files.
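The size test at the core of `db-file-sort.py` reduces to a single comparison against the 1 MB threshold; a minimal sketch (the function name is illustrative):

```python
def classify_db_file(size_bytes, size_threshold=1 * 1024 * 1024):
    """Return the destination directory name for a .db file of the given size.

    Files strictly larger than the threshold are assumed to hold real
    simulation output (PASSED); smaller or equal files are treated as failures.
    """
    return "PASSED" if size_bytes > size_threshold else "FAILED"
```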
- Process 10 config files.
- Use 4 CPUs and 8GB of memory per job.
- Save Slurm output files in the format %A_%a.out in the "slurmlogs" directory.
- Save output database files in the "OutputDatabases" directory.
- Create a file named "database_list.txt" in the "OutputDatabases" directory, containing the names of all generated database files.
To load the database files in Python later, we can use the "database_list.txt" file:
```python
with open('OutputDatabases/database_list.txt', 'r') as f:
    database_files = [line.strip() for line in f]
```
Default version of this script will:
- Create 5 files with names like large_1.db, large_2.db, etc., each between 21MB and 50MB in size.
- Create 5 files with names like small_1.db, small_2.db, etc., each between 1MB and 19MB in size.
- Use random data to fill the files.
- Show a progress bar for each file creation.
Please note:
- This script uses `/dev/urandom` as a source of random data, which might be slow for creating large files. For faster (but less random) file creation, you could replace `/dev/urandom` with `/dev/zero`.
- The exact sizes will vary each time you run the script due to the use of random numbers.
- The script will create files in the current directory.
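A Python equivalent of the file-generation step, using `os.urandom` in place of reading `/dev/urandom` directly; the function name and signature are illustrative:

```python
import os
import random

def make_random_file(path, min_mb, max_mb, chunk=1024 * 1024):
    """Write random bytes to path, with a size drawn uniformly between min_mb and max_mb.

    os.urandom stands in for /dev/urandom; for the faster /dev/zero
    equivalent, write b"\x00" * n instead of os.urandom(n).
    """
    size = random.randint(min_mb * chunk, max_mb * chunk)
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)  # write at most 1 MB per iteration
            f.write(os.urandom(n))
            remaining -= n
    return size
```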