Status: alpha, pre-release.
benchmark-environments is a suite of benchmarks for imitation-learning and reward-learning algorithms. It is a work in progress, but we intend it to eventually contain diagnostic tasks for reward learning, wrappers around common RL benchmark environments that help avoid common benchmarking pitfalls (e.g. by masking visible score counters in Gym Atari tasks), and new challenge tasks for imitation and reward learning. This benchmark suite complements our https://github.com/humancompatibleai/imitation/ package of baseline algorithms for imitation learning.
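To make the score-masking idea concrete, here is a minimal sketch. It is not this package's implementation: `mask_region`, `MaskedScoreEnv`, and the `score_box` coordinates are hypothetical names, and frames are assumed to be nested lists of pixel values so the example needs no dependencies. Real Gym Atari observations are NumPy arrays, and a real wrapper would more likely subclass `gym.ObservationWrapper`.

```python
from typing import List, Tuple

# Hypothetical stand-in for an image: rows of integer pixel values.
Frame = List[List[int]]


def mask_region(frame: Frame, box: Tuple[int, int, int, int], fill: int = 0) -> Frame:
    """Returns a copy of `frame` with the rectangle `box` overwritten by `fill`.

    `box` is (top, left, bottom, right); bottom and right are exclusive.
    """
    top, left, bottom, right = box
    masked = [row[:] for row in frame]  # copy so the caller's frame is untouched
    for r in range(top, min(bottom, len(masked))):
        for c in range(left, min(right, len(masked[r]))):
            masked[r][c] = fill
    return masked


class MaskedScoreEnv:
    """Wraps any object exposing reset() and step(action), masking each observation.

    Assumes step() returns an (observation, reward, done, info) tuple.
    """

    def __init__(self, env, score_box: Tuple[int, int, int, int], fill: int = 0):
        self.env = env
        self.score_box = score_box  # assumed location of the on-screen score
        self.fill = fill

    def reset(self) -> Frame:
        return mask_region(self.env.reset(), self.score_box, self.fill)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return mask_region(obs, self.score_box, self.fill), reward, done, info
```

The point of the wrapper is that the learner never sees pixels that encode the true score, so a learned reward cannot simply read the counter off the screen.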
To install the latest release from PyPI, run:
pip install benchmark-environments
To install from Git master:
pip install git+https://github.com/HumanCompatibleAI/benchmark-environments.git
For development, clone the source code and create a virtual environment for this project:
git clone git@github.com:HumanCompatibleAI/benchmark-environments.git
cd benchmark-environments
./ci/build_venv.sh
pip install -e ".[dev]"  # install extra tools useful for development (quoted so shells such as zsh do not glob the brackets)
We follow the PEP8 code style and generally the Google Code Style Guide,
deferring to PEP8 where they conflict. We use the black
autoformatter to avoid arguing over formatting.
Docstrings follow the Google docstring convention,
with an extensive example in the Sphinx docs.
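As a quick illustration of the Google docstring convention, a function might be documented as follows. The function `clip_reward` is a hypothetical example and not part of this package.

```python
def clip_reward(reward: float, low: float = -1.0, high: float = 1.0) -> float:
    """Clips a scalar reward to a closed interval.

    Args:
        reward: The raw reward to clip.
        low: Smallest value the result may take.
        high: Largest value the result may take.

    Returns:
        The value in `[low, high]` closest to `reward`.

    Raises:
        ValueError: If `low` is greater than `high`.
    """
    if low > high:
        raise ValueError(f"low ({low}) must not exceed high ({high})")
    return max(low, min(high, reward))
```

The summary line, `Args`, `Returns`, and `Raises` sections are the parts of the convention most often needed.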
All PRs must pass linting via the ci/code_checks.sh
script. It is convenient to install this as a commit hook:
ln -s ../../ci/code_checks.sh .git/hooks/pre-commit
We use pytest for unit tests and Codecov for code coverage. We also use pytype for type checking.
Trivial changes (e.g. typo fixes) may be made directly by maintainers. Any non-trivial change must be proposed in a PR and approved by at least one maintainer. PRs must pass the continuous integration tests (CircleCI linting, type checking, unit tests and Codecov) to be merged.
It is often helpful to open an issue before proposing a PR, to allow for discussion of the design before coding commences.