Structured Exploration with Achievements (SEA)

This is the official repository for our paper "Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward". The code is based on the torchbeast IMPALA baseline implementation from the NetHack 2021 NeurIPS Challenge.

Setup

Our code is tested on Python 3.7.16 and PyTorch 1.13.1. Install the other dependencies with:

pip install -r requirements.txt

libtorchbeast

We use a custom version of libtorchbeast. You can clone this fork locally and use the following steps ("Installing PolyBeast" section in their README) to install.

cd torchbeast  # Your local torchbeast repo dir
pip install -r requirements.txt
git submodule update --init --recursive
pip install nest/
python setup.py install

Crafter

We use a custom version of Crafter. You can clone this fork locally and follow the instructions there to install it.

Settings

Replace the content of the LOGDIR file with your designated log directory.
If you want to use wandb, fill in your wandb config in train_all.sh.
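
As a minimal sketch of the first step, the snippet below writes a log directory into LOGDIR. It assumes that LOGDIR is a plain-text file in the repository root holding a single path, which this README does not state explicitly; the `logs` directory name is a placeholder, so check how train.sh reads the file before relying on this.

```shell
# Placeholder log location; replace with your own directory.
log_dir="$PWD/logs"
mkdir -p "$log_dir"

# Assumption: LOGDIR contains exactly one line, the log directory path.
printf '%s\n' "$log_dir" > LOGDIR

# Show what was written.
cat LOGDIR
```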

Quickstart

To replicate the Crafter experiments from our paper, run the following command:

./train_all.sh $expr_name

Running each stage

Data collection policy

To train the initial data collection policy with IMPALA, run the following command:

./train.sh crafter run $expr_name \
    total_steps=2e8

Achievement representation learning

To collect data with the data collection policy and learn achievement representations, run the following command:

# actor_load_dir: replace this with the data collection policy checkpoint
# contrast_step_limit: limits the number of environment interaction steps
./train.sh crafter pred $expr_name \
    actor_load_dir="$run_savedir/checkpoint.tar" \
    total_steps=0.5e8 \
    contrast_step_limit=1e6

Achievement clustering

To cluster achievements automatically using the learned achievement representations, run the following command:

# actor_load_dir: replace this with the data collection policy checkpoint
# pred_model_load_dir: replace this with the representation model checkpoint
./train.sh crafter clustering $expr_name \
    actor_load_dir="$run_savedir/checkpoint.tar" \
    pred_model_load_dir="$pred_savedir/checkpoint.tar"

Sub-policy learning

To train sub-policies that reach each known achievement and explore for new achievements, run the following command:

# num_objectives: the number of clusters reported by the previous (clustering) step
# include_new_tasks: set this to False if you don't want to train the exploration policy
./train.sh crafter mo $expr_name \
    num_objectives=$num_clusters \
    cluster_load_dir="$cluster_savedir/cluster.data" \
    cluster_pred_model_load_dir="$pred_savedir" \
    causal_graph_load_path="$cluster_savedir/graph.data" \
    include_new_tasks=True \
    total_steps=3e8
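
Because this stage depends on files produced by earlier stages, it can help to check that they exist before launching a long run. The sketch below is one way to do that; `require_path` is a hypothetical helper and not part of this repository, and the temporary directory only stands in for a real `$cluster_savedir` so that the snippet runs anywhere.

```shell
# Pre-flight check: report whether a stage input exists.
# require_path is a hypothetical helper, not part of the repository.
require_path() {
    if [ -e "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Demo with a throwaway directory; in practice, point this at the
# save directory of your own clustering run.
cluster_savedir="$(mktemp -d)"
touch "$cluster_savedir/cluster.data"

require_path "$cluster_savedir/cluster.data"
require_path "$cluster_savedir/graph.data" || echo "run the clustering stage first"
```

In real use, drop the `mktemp` and `touch` lines and call `require_path` on each path you pass to train.sh.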

Citation

@inproceedings{
    zhou2023learning,
    title={Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward},
    author={Zihan Zhou and Animesh Garg},
    booktitle={The Eleventh International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=NDWl9qcUpvy}
}
