
IN PROGRESS general dev contributions. Currently tracking: #17

Merged: 62 commits (Jun 24, 2024)
ef12119
Added some starter code for the preprocessing side.
mmcdermott May 31, 2024
5060c40
Merge branch 'multirun' into fixes_to_preprocessing
mmcdermott Jun 6, 2024
a8cee7d
Merge branch 'multirun' into fixes_to_preprocessing
mmcdermott Jun 8, 2024
3ab9e40
Fixed code_metadata setup.
mmcdermott Jun 8, 2024
5511f91
Enabled summarizing the entire population in the code metadata map/re…
mmcdermott Jun 8, 2024
2caa82e
Merge branch 'main' into fixes_to_preprocessing
mmcdermott Jun 11, 2024
7f0ecae
Merge branch 'main' into fixes_to_preprocessing
mmcdermott Jun 13, 2024
7b6280d
added one test and implemented the basic filter codes logic
mmcdermott Jun 13, 2024
cb0b2d3
Added the outlier filtering function.
mmcdermott Jun 13, 2024
70e3524
Added a code metadata extraction step for core MEDS extraction -- thi…
mmcdermott Jun 13, 2024
92ec6da
Made script executable
mmcdermott Jun 13, 2024
f9a10fd
Set up a mapper output directory -- not sure if this is the best poli…
mmcdermott Jun 13, 2024
401fbbe
Merge pull request #18 from mmcdermott/describe_codes_post_extraction
mmcdermott Jun 13, 2024
320c8b4
made scripts executable
mmcdermott Jun 13, 2024
9f48793
fixed a small typo (maybe?)
mmcdermott Jun 13, 2024
e28ef24
Fixed an issue with using cfg instead of stage_cfg
mmcdermott Jun 13, 2024
3821d30
Fixing some other typos
mmcdermott Jun 13, 2024
c1e7b48
Fixing anohter typo in time-derived
mmcdermott Jun 13, 2024
719e1d0
Some small corrections and documentation for getting time derived to …
mmcdermott Jun 13, 2024
a03862f
Fixing various typos -- the metadata_input_dir change was actually wr…
mmcdermott Jun 13, 2024
01f3f78
Fixed some more typos and added more working or at least running comm…
mmcdermott Jun 13, 2024
6123087
Added command for getting normalization parameters
mmcdermott Jun 13, 2024
148fdd5
Added code to get lexicographic code indices for vocabulary tokenizat…
mmcdermott Jun 14, 2024
935102e
Added (yet untested) script for fitting vocabulary indices and correc…
mmcdermott Jun 14, 2024
3f32271
documentation
mmcdermott Jun 14, 2024
e8a01b4
Pre-processing prototypes documentation improvements
mmcdermott Jun 15, 2024
d0a4a0c
Further extensions to documentation.
mmcdermott Jun 15, 2024
2bb2863
Added (untested) code metadata type shrinking
mmcdermott Jun 15, 2024
117d950
Added (yet untested) normalization script
mmcdermott Jun 15, 2024
fae4115
Fit vocabulary works
mmcdermott Jun 15, 2024
bd035b9
Normalize works after some minor modification.
mmcdermott Jun 15, 2024
b48aefd
Fixed lint issues.
mmcdermott Jun 15, 2024
f2aaa0c
Merge branch 'main' into preprocessing_steps
mmcdermott Jun 16, 2024
eea4c59
Merge branch 'main' into dev
mmcdermott Jun 16, 2024
8820ce7
Corrected typos
mmcdermott Jun 16, 2024
982b537
Added some details on tokenization and tensorization.
mmcdermott Jun 16, 2024
1e06a63
Initial files -- not yet processed, tested, or verified.
mmcdermott Jun 16, 2024
188936b
Fixed lint errors.
mmcdermott Jun 17, 2024
1286864
Moved pytorch dataset code.
mmcdermott Jun 17, 2024
a4741fe
Added first test to tokenize.
mmcdermott Jun 17, 2024
c67b50c
Added a second test.
mmcdermott Jun 17, 2024
501adba
Added the last of the tests.
mmcdermott Jun 17, 2024
833adbf
Tested tensorize as well
mmcdermott Jun 17, 2024
0509faa
Some minor corrections for the pytorch dataset.
mmcdermott Jun 17, 2024
8fac9b8
Removing pytorch dataset files as they are being moved to https://git…
mmcdermott Jun 17, 2024
9c6d7c4
Resolved test stochasticity
mmcdermott Jun 17, 2024
605c651
removing unneeded dependency.
mmcdermott Jun 17, 2024
eb9a176
removing unused config
mmcdermott Jun 17, 2024
13e99dd
Added (yet untested) tokenization and tensorization scripts
mmcdermott Jun 17, 2024
1f616b3
Had to rename to avoid an import issue with hydra.
mmcdermott Jun 17, 2024
6f70306
Change file extension for NRT files
mmcdermott Jun 17, 2024
44bd331
Merge pull request #22 from mmcdermott/tokenization_tensorization_pyt…
mmcdermott Jun 17, 2024
dd78d1f
Fixed a test that fails on numpy 2.0
mmcdermott Jun 19, 2024
42eb4a6
Update pyproject.toml
mmcdermott Jun 19, 2024
f3a9edb
Try to correct github lint issue.
mmcdermott Jun 19, 2024
38c1d78
Updated the MIMIC README and removed the troublesome portions.
mmcdermott Jun 19, 2024
f368bbe
Updated the MIMIC README and removed the troublesome portions.
mmcdermott Jun 19, 2024
b7d3c4c
Merge pull request #21 from mmcdermott/preprocessing_steps
mmcdermott Jun 19, 2024
e70d26f
Checked the other shards for #23
mmcdermott Jun 19, 2024
7150d5e
Added code metadata checking to the test.
mmcdermott Jun 20, 2024
1ec3934
Re-arranged import statements
mmcdermott Jun 21, 2024
410e6ce
Updated workflow
mmcdermott Jun 21, 2024
151 changes: 69 additions & 82 deletions MIMIC-IV_Example/README.md
@@ -45,13 +45,24 @@
that page. You will need the raw `.csv.gz` files for this example. We will use `$MIMICIV_RAW_DIR` to denote
the root directory of where the resulting _core data files_ are stored -- e.g., there should be a `hosp` and
`icu` subdirectory of `$MIMICIV_RAW_DIR`.
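
For orientation, a hypothetical sketch of the expected raw layout is below; the file names shown are only the tables this example touches, and the listing is illustrative rather than exhaustive:

```bash
# Hypothetical, non-exhaustive sketch of the raw MIMIC-IV layout this example assumes.
ls "$MIMICIV_RAW_DIR"
# hosp/  icu/
ls "$MIMICIV_RAW_DIR/hosp"
# admissions.csv.gz  diagnoses_icd.csv.gz  drgcodes.csv.gz  ...
```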

## Step 2: Run the basic MEDS ETL

This step contains several sub-steps, all of which can be run via a single script: either `joint_script.sh`,
which uses the Hydra `joblib` launcher to run the sub-steps with local parallelism (enable this feature by
including the `[local_parallelism]` option during installation), or `joint_script_slurm.sh`, which uses the
Hydra `submitit` launcher to run them through Slurm (enable this feature by including the
`[slurm_parallelism]` option during installation). Either script runs the steps described below; a rough
sketch of the two launch commands is shown just after this paragraph.
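
The calling convention sketched here is an assumption (the four positional arguments mirror the variables used throughout this example and the `shift 4` in `joint_script_slurm.sh`); check each script's header comments for the authoritative usage.

```bash
# Hedged sketch -- the positional arguments are assumptions, not documented defaults.

# Local parallelism (requires the [local_parallelism] install option):
pip install -e .[local_parallelism]
./MIMIC-IV_Example/joint_script.sh "$MIMICIV_RAW_DIR" "$MIMICIV_PREMEDS_DIR" "$MIMICIV_MEDS_DIR" "$N_PARALLEL_WORKERS"

# Slurm parallelism (requires the [slurm_parallelism] install option):
pip install -e .[slurm_parallelism]
./MIMIC-IV_Example/joint_script_slurm.sh "$MIMICIV_RAW_DIR" "$MIMICIV_PREMEDS_DIR" "$MIMICIV_MEDS_DIR" "$N_PARALLEL_WORKERS"
```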

### Step 2.1: Get the data ready for base MEDS extraction

This is a step in a few parts:

1. Join a few tables by `hadm_id` to get the right timestamps in the right rows for processing. In
particular, we need to join:
- the `hosp/diagnoses_icd` table with the `hosp/admissions` table to get the `dischtime` for each
`hadm_id`.
- the `hosp/drgcodes` table with the `hosp/admissions` table to get the `dischtime` for each `hadm_id`.
2. Convert the patient's static data to a more parseable form. This entails:
- Get the patient's DOB in a format that is usable for MEDS, rather than the integral `anchor_year` and
`anchor_offset` fields.
@@ -61,7 +72,8 @@
After these steps, modified files or symlinks to the original files will be written to a new directory, which
will be used as the input to the actual MEDS extraction ETL. We'll use `$MIMICIV_PREMEDS_DIR` to denote this
directory.
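
For reference, the environment variables used throughout the rest of this example can be set up front; the paths below are placeholders, not repository defaults:

```bash
# Placeholder paths -- substitute your own locations.
export MIMICIV_RAW_DIR=/path/to/raw/mimic-iv
export MIMICIV_PREMEDS_DIR=/path/to/pre-meds/output
export MIMICIV_MEDS_DIR=/path/to/meds/output
export MIMICIV_MEDS_PROC_DIR=/path/to/processed/output  # used in the pre-processing section below
export N_PARALLEL_WORKERS=10                            # used by the Slurm variant
```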

This step is run in the `joint_script.sh` script or the `joint_script_slurm.sh` script, but in either case the
base command that is run is as follows (assumed to be run **not** from this directory but from the
root directory of this repository):

```bash
# Hedged reconstruction of the collapsed command; compare the pre_MEDS.py invocation in
# joint_script_slurm.sh at the end of this diff. The serial form drops the --multirun/launcher arguments.
./MIMIC-IV_Example/pre_MEDS.py raw_cohort_dir="$MIMICIV_RAW_DIR" output_dir="$MIMICIV_PREMEDS_DIR"
```

In practice, on a machine with 150 GB of RAM and 10 cores, this step takes less than 5 minutes in total.

### Step 2.2: Run the MEDS extraction ETL

We will assume you want to output the final MEDS dataset into a directory we'll denote as `$MIMICIV_MEDS_DIR`.
Note this is a different directory than the pre-MEDS directory (though, of course, they can both be
subdirectories of the same root directory).
@@ -83,114 +93,91 @@
This is a step in 4 parts (plus an optional 5th):
1. Sub-shard the raw files. Run this command as many times simultaneously as you would like to have workers
performing this sub-sharding step. See below for how to automate this parallelism using hydra launchers.

This step uses the `./scripts/extraction/shard_events.py` script. See `joint_script*.sh` for the expected
format of the command; a rough serial sketch of this and the following extraction commands is also given
after this list.

In practice, on a machine with 150 GB of RAM and 10 cores, this step takes approximately 20 minutes in total.
2. Extract and form the patient splits and sub-shards. The `./scripts/extraction/split_and_shard_patients.py`
script is used for this step. See `joint_script*.sh` for the expected format of the command.

3. Extract patient sub-shards and convert to MEDS events. The
`./scripts/extraction/convert_to_sharded_events.py` script is used for this step. See `joint_script*.sh` for
the expected format of the command.

4. Merge the MEDS events into a single file per patient sub-shard. The
`./scripts/extraction/merge_to_MEDS_cohort.py` script is used for this step. See `joint_script*.sh` for the
expected format of the command.

5. (Optional) Generate preliminary code statistics and merge to external metadata. This is not performed
currently in the `joint_script*.sh` scripts.
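
For reference, the serial (single-worker) forms of the four extraction commands referenced above look roughly like the sketch below; `joint_script*.sh` wraps the same invocations with Hydra `--multirun` and launcher arguments (see the Slurm variant at the end of this diff).

```bash
# Serial sketches of the four extraction stages; each can be re-run with --multirun and a Hydra
# launcher (joblib or submitit_slurm) to parallelize, as joint_script*.sh does.
./scripts/extraction/shard_events.py \
    input_dir="$MIMICIV_PREMEDS_DIR" \
    cohort_dir="$MIMICIV_MEDS_DIR" \
    event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml

./scripts/extraction/split_and_shard_patients.py \
    input_dir="$MIMICIV_PREMEDS_DIR" \
    cohort_dir="$MIMICIV_MEDS_DIR" \
    event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml

./scripts/extraction/convert_to_sharded_events.py \
    input_dir="$MIMICIV_PREMEDS_DIR" \
    cohort_dir="$MIMICIV_MEDS_DIR" \
    event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml

./scripts/extraction/merge_to_MEDS_cohort.py \
    input_dir="$MIMICIV_PREMEDS_DIR" \
    cohort_dir="$MIMICIV_MEDS_DIR" \
    event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml
```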

## Pre-processing for a model

To run the pre-processing steps for a model, consider the sample commands provided here (a combined
driver-script sketch is also given after the list):

1. Filter patients to only those with at least 32 events (unique timepoints):

```bash
./scripts/preprocessing/filter_patients.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null stage_configs.filter_patients.min_events_per_patient=32
```

2. Add time-derived measurements (age and time-of-day):

```bash
./scripts/preprocessing/add_time_derived_measurements.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null stage_configs.add_time_derived_measurements.age.DOB_code="DOB"
```

3. Get preliminary counts for code filtering:

```bash
./scripts/preprocessing/collect_code_metadata.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null stage="preliminary_counts"
```

4. Filter codes:

```bash
./scripts/preprocessing/filter_codes.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null stage_configs.filter_codes.min_patients_per_code=128 stage_configs.filter_codes.min_occurrences_per_code=256
```

5. Get outlier detection params:

```bash
./scripts/preprocessing/collect_code_metadata.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null stage=fit_outlier_detection
```

6. Filter outliers:

```bash
./scripts/preprocessing/filter_outliers.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null
```

7. Fit normalization parameters:

```bash
./scripts/preprocessing/collect_code_metadata.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null stage=fit_normalization
```

8. Fit vocabulary:

```bash
./scripts/preprocessing/fit_vocabulary_indices.py input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null
```

9. Normalize:

```bash
./scripts/preprocessing/normalize.py --multirun worker="range(0,3)" hydra/launcher=joblib input_dir="$MIMICIV_MEDS_DIR/3workers_slurm" cohort_dir="$MIMICIV_MEDS_PROC_DIR/test" code_modifier_columns=null
```
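
If you prefer to run these nine stages back to back, a minimal driver-script sketch is below. It assumes each stage can also be run serially (i.e., without the `--multirun worker=... hydra/launcher=joblib` arguments shown above); the shared-argument pattern is a convenience for this sketch, not something the repository provides.

```bash
#!/usr/bin/env bash
# Hedged sketch: runs the pre-processing stages serially with the arguments used in this example.
set -euo pipefail

SHARED_ARGS=(
    input_dir="$MIMICIV_MEDS_DIR/3workers_slurm"
    cohort_dir="$MIMICIV_MEDS_PROC_DIR/test"
    code_modifier_columns=null
)

./scripts/preprocessing/filter_patients.py "${SHARED_ARGS[@]}" stage_configs.filter_patients.min_events_per_patient=32
./scripts/preprocessing/add_time_derived_measurements.py "${SHARED_ARGS[@]}" stage_configs.add_time_derived_measurements.age.DOB_code="DOB"
./scripts/preprocessing/collect_code_metadata.py "${SHARED_ARGS[@]}" stage="preliminary_counts"
./scripts/preprocessing/filter_codes.py "${SHARED_ARGS[@]}" stage_configs.filter_codes.min_patients_per_code=128 stage_configs.filter_codes.min_occurrences_per_code=256
./scripts/preprocessing/collect_code_metadata.py "${SHARED_ARGS[@]}" stage=fit_outlier_detection
./scripts/preprocessing/filter_outliers.py "${SHARED_ARGS[@]}"
./scripts/preprocessing/collect_code_metadata.py "${SHARED_ARGS[@]}" stage=fit_normalization
./scripts/preprocessing/fit_vocabulary_indices.py "${SHARED_ARGS[@]}"
./scripts/preprocessing/normalize.py "${SHARED_ARGS[@]}"
```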

## Limitations / TO-DOs:
98 changes: 49 additions & 49 deletions MIMIC-IV_Example/joint_script_slurm.sh
@@ -44,17 +44,17 @@
shift 4
# this doesn't fall back on running anything locally in a setting where only slurm worker nodes have
# sufficient computational resources to run the actual jobs.

# echo "Running pre-MEDS conversion on one worker."
# ./MIMIC-IV_Example/pre_MEDS.py \
# --multirun \
# worker="range(0,1)" \
# hydra/launcher=submitit_slurm \
# hydra.launcher.timeout_min=60 \
# hydra.launcher.cpus_per_task=10 \
# hydra.launcher.mem_gb=50 \
# hydra.launcher.partition="short" \
# raw_cohort_dir="$MIMICIV_RAW_DIR" \
# output_dir="$MIMICIV_PREMEDS_DIR"
echo "Running pre-MEDS conversion on one worker."
./MIMIC-IV_Example/pre_MEDS.py \
--multirun \
worker="range(0,1)" \
hydra/launcher=submitit_slurm \
hydra.launcher.timeout_min=60 \
hydra.launcher.cpus_per_task=10 \
hydra.launcher.mem_gb=50 \
hydra.launcher.partition="short" \
raw_cohort_dir="$MIMICIV_RAW_DIR" \
output_dir="$MIMICIV_PREMEDS_DIR"

echo "Trying submitit launching with $N_PARALLEL_WORKERS jobs."

@@ -72,41 +72,41 @@
# NOTE: the opening of this command is collapsed in the diff view; the lines below are a reconstruction
# based on the parallel stage invocations later in this script, so the exact echo text and argument
# order are assumptions.
./scripts/extraction/shard_events.py \
    --multirun \
    worker="range(0,$N_PARALLEL_WORKERS)" \
    hydra/launcher=submitit_slurm \
    hydra.launcher.timeout_min=60 \
    hydra.launcher.cpus_per_task=10 \
    hydra.launcher.mem_gb=50 \
    hydra.launcher.partition="short" \
    input_dir="$MIMICIV_PREMEDS_DIR" \
    cohort_dir="$MIMICIV_MEDS_DIR" \
event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml \
stage=shard_events

# echo "Splitting patients on one worker"
# ./scripts/extraction/split_and_shard_patients.py \
# --multirun \
# worker="range(0,1)" \
# hydra/launcher=submitit_slurm \
# hydra.launcher.timeout_min=60 \
# hydra.launcher.cpus_per_task=10 \
# hydra.launcher.mem_gb=50 \
# hydra.launcher.partition="short" \
# input_dir="$MIMICIV_PREMEDS_DIR" \
# cohort_dir="$MIMICIV_MEDS_DIR" \
# event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml "$@"
#
# echo "Converting to sharded events with $N_PARALLEL_WORKERS workers in parallel"
# ./scripts/extraction/convert_to_sharded_events.py \
# --multirun \
# worker="range(0,$N_PARALLEL_WORKERS)" \
# hydra/launcher=submitit_slurm \
# hydra.launcher.timeout_min=60 \
# hydra.launcher.cpus_per_task=10 \
# hydra.launcher.mem_gb=50 \
# hydra.launcher.partition="short" \
# input_dir="$MIMICIV_PREMEDS_DIR" \
# cohort_dir="$MIMICIV_MEDS_DIR" \
# event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml "$@"
#
# echo "Merging to a MEDS cohort with $N_PARALLEL_WORKERS workers in parallel"
# ./scripts/extraction/merge_to_MEDS_cohort.py \
# --multirun \
# worker="range(0,$N_PARALLEL_WORKERS)" \
# hydra/launcher=submitit_slurm \
# hydra.launcher.timeout_min=60 \
# hydra.launcher.cpus_per_task=10 \
# hydra.launcher.mem_gb=50 \
# hydra.launcher.partition="short" \
# input_dir="$MIMICIV_PREMEDS_DIR" \
# cohort_dir="$MIMICIV_MEDS_DIR" \
# event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml "$@"
echo "Splitting patients on one worker"
./scripts/extraction/split_and_shard_patients.py \
--multirun \
worker="range(0,1)" \
hydra/launcher=submitit_slurm \
hydra.launcher.timeout_min=60 \
hydra.launcher.cpus_per_task=10 \
hydra.launcher.mem_gb=50 \
hydra.launcher.partition="short" \
input_dir="$MIMICIV_PREMEDS_DIR" \
cohort_dir="$MIMICIV_MEDS_DIR" \
event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml "$@"

echo "Converting to sharded events with $N_PARALLEL_WORKERS workers in parallel"
./scripts/extraction/convert_to_sharded_events.py \
--multirun \
worker="range(0,$N_PARALLEL_WORKERS)" \
hydra/launcher=submitit_slurm \
hydra.launcher.timeout_min=60 \
hydra.launcher.cpus_per_task=10 \
hydra.launcher.mem_gb=50 \
hydra.launcher.partition="short" \
input_dir="$MIMICIV_PREMEDS_DIR" \
cohort_dir="$MIMICIV_MEDS_DIR" \
event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml "$@"

echo "Merging to a MEDS cohort with $N_PARALLEL_WORKERS workers in parallel"
./scripts/extraction/merge_to_MEDS_cohort.py \
--multirun \
worker="range(0,$N_PARALLEL_WORKERS)" \
hydra/launcher=submitit_slurm \
hydra.launcher.timeout_min=60 \
hydra.launcher.cpus_per_task=10 \
hydra.launcher.mem_gb=50 \
hydra.launcher.partition="short" \
input_dir="$MIMICIV_PREMEDS_DIR" \
cohort_dir="$MIMICIV_MEDS_DIR" \
event_conversion_config_fp=./MIMIC-IV_Example/configs/event_configs.yaml "$@"