An Empirical Framework for Domain Generalization In Clinical Settings

Paper

If you use this code in your research, please cite the following publication:

@inproceedings{zhang2021empirical,
  title={An empirical framework for domain generalization in clinical settings},
  author={Zhang, Haoran and Dullerud, Natalie and Seyyed-Kalantari, Laleh and Morris, Quaid and Joshi, Shalmali and Ghassemi, Marzyeh},
  booktitle={Proceedings of the Conference on Health, Inference, and Learning},
  pages={279--290},
  year={2021}
}

This paper can also be found on arXiv: https://arxiv.org/abs/2103.11163

Acknowledgements

Our implementation is a modified version of the excellent DomainBed framework (from commit a10458a). We also make use of some code from eICU Benchmarks.

To replicate the experiments in the paper:

Step 0: Environment and Prerequisites

Run the following commands to clone this repo and create the Conda environment:

git clone https://github.com/MLforHealth/ClinicalDG.git
cd ClinicalDG/
conda env create -f environment.yml
conda activate clinicaldg

Step 1: Obtaining the Data

See DataSources.md for detailed instructions.

Step 2: Running Experiments

Experiments can be run using the same procedure as for the DomainBed framework, with a few additional adjustable data hyperparameters, which should be passed in as a JSON-formatted dictionary.

For example, to train a single model:

python -m clinicaldg.scripts.train \
       --algorithm ERM \
       --dataset eICUSubsampleUnobs \
       --es_method val \
       --hparams '{"eicu_architecture": "GRU", "eicu_subsample_g1_mean": 0.5, "eicu_subsample_g2_mean": 0.05}' \
       --output_dir /path/to/output
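
Hand-quoting the JSON dictionary on the command line is error-prone, so it can help to generate the `--hparams` string programmatically. The following is a minimal sketch, not part of the repository: the helper name `build_train_command` is hypothetical, and the flags and values are copied from the example command above.

```python
import json
import shlex

# Data hyperparameters copied from the example command above.
hparams = {
    "eicu_architecture": "GRU",
    "eicu_subsample_g1_mean": 0.5,
    "eicu_subsample_g2_mean": 0.05,
}


def build_train_command(hparams, output_dir):
    """Hypothetical helper: assemble the training command as an argument list."""
    return [
        "python", "-m", "clinicaldg.scripts.train",
        "--algorithm", "ERM",
        "--dataset", "eICUSubsampleUnobs",
        "--es_method", "val",
        # json.dumps guarantees valid JSON (double quotes, no trailing commas).
        "--hparams", json.dumps(hparams),
        "--output_dir", output_dir,
    ]


command = build_train_command(hparams, "/path/to/output")
# shlex.join quotes each argument so the printed line can be pasted into a shell.
print(shlex.join(command))
```

The argument list can also be passed directly to `subprocess.run(command)`, which avoids shell quoting altogether.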

To sweep a range of datasets, algorithms, and hyperparameters:

python -m clinicaldg.scripts.sweep launch \
       --output_dir=/my/sweep/output/path \
       --command_launcher slurm \
       --algorithms ERMID ERM IRM VREx RVP IGA CORAL MLDG GroupDRO \
       --datasets CXR CXRBinary \
       --n_hparams 10 \
       --n_trials 5 \
       --es_method train \
       --hparams '{"cxr_augment": 1}'
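
To get a rough sense of the size of such a sweep before submitting it, the grid implied by the arguments can be enumerated. This sketch assumes one run per (algorithm, dataset, hyperparameter draw, trial) combination, which is an assumption rather than something confirmed by the sweep script; the actual number of launched jobs may differ, for example if the script also iterates over held-out environments.

```python
from itertools import product

# Values copied from the example sweep command above.
algorithms = ["ERMID", "ERM", "IRM", "VREx", "RVP", "IGA", "CORAL", "MLDG", "GroupDRO"]
datasets = ["CXR", "CXRBinary"]
n_hparams = 10  # hyperparameter draws
n_trials = 5    # trials

# ASSUMPTION: one run per combination of the four sweep dimensions.
grid = list(product(algorithms, datasets, range(n_hparams), range(n_trials)))
print(len(grid))  # 9 * 2 * 10 * 5 = 900
```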

A detailed list of the hparams available for each dataset can be found in hparams.md.

We provide the bash scripts used for our main experiments in the bash_scripts directory. You will likely need to customize them, along with the launcher, to your compute environment.

Step 3: Aggregating Results

We provide sample code for creating aggregate results for an experiment in notebooks/AggResults.ipynb.

License

This source code is released under the MIT license, included in the LICENSE file.
