forked from typoverflow/WiseRL

PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms

LAMDA-RL/WiseRL

WiseRL provides benchmarked PyTorch implementations of Offline Preference-Based RL algorithms, including:

  • Oracle-IQL & Oracle-AWAC
  • Supervised Finetuning
  • Bradley-Terry Model + IQL/AWAC
  • Contrastive Preference Learning
  • Inverse Preference Learning + IQL/AWAC
  • Preference Transformer + IQL/AWAC
  • Hindsight Preference Learning + IQL/AWAC

Usage

# for reward-model-free algorithms
python3 scripts/main.py --config /path/to/config.yaml

# for reward-model-based algorithms
python3 scripts/rmb_main.py --config /path/to/config.yaml
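
The choice between the two entry points can be sketched as a small helper. This is only an illustration: `build_command` and its `reward_model_based` flag are hypothetical names that are not part of WiseRL, and the sketch uses the current interpreter (`sys.executable`) where the commands above use `python3`. It assumes you run it from the repository root.

```python
import sys
from typing import List


def build_command(config_path: str, reward_model_based: bool) -> List[str]:
    """Return the argv list for the entry point that matches the algorithm type.

    Hypothetical helper: only the script paths and the --config flag come
    from the usage commands above.
    """
    script = "scripts/rmb_main.py" if reward_model_based else "scripts/main.py"
    return [sys.executable, script, "--config", config_path]


if __name__ == "__main__":
    # Placeholder config path, as in the usage commands above.
    cmd = build_command("/path/to/config.yaml", reward_model_based=True)
    print(" ".join(cmd))
    # To actually launch training, pass the list to subprocess.run(cmd, check=True).
```

Returning the command as a list lets you hand it to `subprocess.run` without shell quoting issues.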

Installation

  • clone this repo and install the dependencies
    git clone git@github.com:typoverflow/WiseRL
    cd WiseRL && pip install -e .
  • install environment or dataset dependencies
    • for D4RL experiments:
      git clone https://github.com/Farama-Foundation/d4rl.git
      cd d4rl
      pip install -e .
    • for metaworld experiments:
      git clone git@github.com:Farama-Foundation/Metaworld
      cd Metaworld && git checkout 04be337a
      pip install -e .
    • for robosuite experiments (we follow the instructions from IPL):
      • Clone the robosuite repository, check out the offline_study branch, and install.
        git clone https://github.com/ARISE-Initiative/robosuite
        cd robosuite && git checkout offline_study
        pip install -e . --no-dependencies
        Note that if you are using Python 3.10 or higher, you need to change from collections import Iterable to from collections.abc import Iterable in the file robosuite/models/arenas/multi_table_arena.py.
      • Run import robosuite repeatedly, installing any missing package that an error reports, until the import completes.
      • Clone the robomimic repository and check out v0.2.0.
        git clone git@github.com:ARISE-Initiative/robomimic.git
        cd robomimic && git checkout v0.2.0
      • Download the robomimic dataset
        mkdir -p ~/.robomimic/datasets/
        cd robomimic/scripts/
        python download_datasets.py --tasks sim --dataset_types ph --hdf5_types low_dim --download_dir ~/.robomimic/datasets/
        python download_datasets.py --tasks sim --dataset_types mh --hdf5_types low_dim --download_dir ~/.robomimic/datasets/
      • Switch back to the master branch and install. Note that you must first check out v0.2.0 to download the dataset, then return to master to install the latest version of the code.
        git checkout master
        pip install -e . --no-dependencies
      • Run import robomimic repeatedly, installing any missing package that an error reports, until the import completes.
      • Note that the download commands above place the datasets in ~/.robomimic/datasets. If you would like to use a different location, please make sure to change the macro in Robomimic Dataset accordingly.
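
Two of the robosuite steps above can be sketched in Python: the `Iterable` import change needed on Python 3.10 or higher, and the "import until it completes" loop. Both helper names (`patch_iterable_import`, `missing_dependency`) are hypothetical and not part of robosuite, robomimic, or WiseRL. The sketch works on source text and module names only, so applying it to robosuite/models/arenas/multi_table_arena.py is left to you.

```python
import importlib
from typing import Optional

OLD_IMPORT = "from collections import Iterable"
NEW_IMPORT = "from collections.abc import Iterable"


def patch_iterable_import(source: str) -> str:
    """Rewrite the import that was removed from `collections` in Python 3.10."""
    return source.replace(OLD_IMPORT, NEW_IMPORT)


def missing_dependency(module_name: str) -> Optional[str]:
    """Try one import and return the name of the missing module, if any.

    Returns None when the import succeeds. Import errors other than a
    missing module are deliberately left to propagate.
    """
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError as err:
        return err.name
    return None


if __name__ == "__main__":
    for name in ("robosuite", "robomimic"):
        missing = missing_dependency(name)
        print(f"{name}: {'ok' if missing is None else 'missing module ' + missing}")
```

The reported module name is not always the name of the package to install, so you may still need to look up the matching pip package by hand before repeating the check.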

Acknowledgement

Citation

@software{wiserl,
  title = {{WiseRL: Benchmarked Implementations of Offline Preference-based RL Algorithms}},
  author = {Gao, Chen-Xiao and Fang, Shengjun},
  month = feb,
  url = {https://github.com/typoverflow/WiseRL},
  year = {2024}
}

@article{gao2024hindsight,
  title={Hindsight Preference Learning for Offline Preference-based Reinforcement Learning},
  author={Chen-Xiao Gao and Shengjun Fang and Chenjun Xiao and Yang Yu and Zongzhang Zhang},
  journal={arXiv preprint arXiv:2407.04451},
  year={2024},
}
