
BlackBox MPC (Model Predictive Control)


Description

This package provides a framework of derivative-free optimizers (powered by TensorFlow 2.0) that can be used in conjunction with an MPC (model predictive controller) and an analytical or learned dynamics model to control an agent in a gym environment.
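
All of these optimizers drive the same sampling-based MPC loop: sample candidate action sequences, roll them out through the dynamics model, score them with the reward function, execute the first action of the best sequence, and re-plan at the next step. The sketch below illustrates the simplest variant (random shooting) in plain NumPy; `dynamics_fn` and `reward_fn` are hypothetical batched stand-ins for a dynamics model and reward function, not this package's API:

import numpy as np

def random_shooting_plan(current_obs, dynamics_fn, reward_fn,
                         action_dim, horizon=20, num_samples=500):
    # Sample candidate action sequences uniformly in [-1, 1].
    actions = np.random.uniform(-1.0, 1.0,
                                size=(num_samples, horizon, action_dim))
    returns = np.zeros(num_samples)
    # Roll every candidate out through the (batched) dynamics model.
    obs = np.repeat(np.asarray(current_obs)[None], num_samples, axis=0)
    for t in range(horizon):
        returns += reward_fn(obs, actions[:, t])
        obs = dynamics_fn(obs, actions[:, t])
    # MPC executes only the first action of the best sequence,
    # then re-plans at the next time step.
    return actions[np.argmax(returns), 0]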

The following derivative-free optimizers are supported (a minimal CEM sketch follows the list):

  • Cross-Entropy Method (CEM)
  • Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
  • Path Integral Method (PI2)
  • Particle Swarm Optimizer (PSO)
  • Random Search (RandomSearch)
  • Simultaneous Perturbation Stochastic Approximation (SPSA)
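
The more sophisticated optimizers replace uniform sampling with an iteratively refined search distribution. As a minimal illustration (not this package's implementation), here is how CEM re-fits a Gaussian over action sequences to the best-scoring samples; `rollout_returns` is a hypothetical helper that scores a batch of sequences as in the sketch above:

import numpy as np

def cem_plan(rollout_returns, horizon, action_dim,
             num_samples=500, num_elites=50, num_iterations=5):
    # Start from a broad Gaussian over action sequences.
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(num_iterations):
        # Sample candidate sequences from the current distribution.
        samples = mean + std * np.random.randn(num_samples, horizon, action_dim)
        returns = rollout_returns(samples)  # shape: (num_samples,)
        # Re-fit the Gaussian to the highest-return ('elite') samples.
        elites = samples[np.argsort(returns)[-num_elites:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0)
    # The refined mean is the planned sequence; MPC executes mean[0].
    return mean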

The package provides additional functionality to aid model-based reinforcement learning (RL) research, such as:

  • A parallel implementation of the different optimizers using TensorFlow 2.0.
  • Loading and saving system dynamics models.
  • Monitoring training progress with TensorBoard.
  • Learning dynamics functions.
  • Recording videos.
  • A modular and flexible interface design that enables research on different trajectory evaluation methods, optimizers, cost functions, system dynamics network architectures, and even training algorithms.

Optimizer references:

Iterative MPC

Installation

Install as a pip package from the latest release:

pip install blackbox_mpc

Install from source

git clone https://github.com/ossamaAhmed/blackbox_mpc.git
cd blackbox_mpc
pip install -e .

To use a GPU (recommended for faster inference):

pip install tensorflow_gpu==2.0.0
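
To confirm that TensorFlow can see the GPU after installation, the TF 2.0-era device-listing call can be used:

import tensorflow as tf

# Should print a non-empty list if a GPU is visible to TensorFlow.
print(tf.config.experimental.list_physical_devices('GPU'))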

Usage

The easiest way to get familiar with the framework is to run through the tutorials provided. An example is shown below:

from blackbox_mpc.policies.mpc_policy import \
    MPCPolicy
from blackbox_mpc.utils.pendulum import PendulumTrueModel, \
    pendulum_reward_function
import gym

env = gym.make("Pendulum-v0")
mpc_policy = MPCPolicy(reward_function=pendulum_reward_function,
                       env_action_space=env.action_space,
                       env_observation_space=env.observation_space,
                       true_model=True,
                       dynamics_function=PendulumTrueModel(),
                       optimizer_name='RandomSearch',
                       num_agents=1)

current_obs = env.reset()
for t in range(200):
    action_to_execute, expected_obs, expected_reward = mpc_policy.act(
        current_obs, t)
    current_obs, reward, _, info = env.step(action_to_execute)
    env.render()
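
Switching to another optimizer should only require changing the `optimizer_name` argument. The variant below plans with CEM instead of random search; note that only the 'RandomSearch' key is confirmed by the example above, and 'CEM' is assumed from the parenthesized names in the optimizer list:

mpc_policy = MPCPolicy(reward_function=pendulum_reward_function,
                       env_action_space=env.action_space,
                       env_observation_space=env.observation_space,
                       true_model=True,
                       dynamics_function=PendulumTrueModel(),
                       optimizer_name='CEM',  # assumed key, see the optimizer list
                       num_agents=1)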

Documentation

An API specification and explanation of the code components can be found in the project documentation.

Visualize Training

Training progress can be monitored with TensorBoard (see the features above).

Authors

blackbox_mpc is the work of Ossama Ahmed (ETH Zürich), Jonas Rothfuss (ETH Zürich), and Prof. Andreas Krause (ETH Zürich).

This package was developed at the Learning and Adaptive Systems Lab at ETH Zürich.

If you use the package, please cite blackbox_mpc:

@misc{blackbox_mpc,
   author = {Ahmed, Ossama and Rothfuss, Jonas and Krause, Andreas},
   year = {2020},
   publisher = {GitHub},
   journal = {GitHub repository},
   howpublished = {\url{https://github.com/ossamaAhmed/blackbox_mpc}},
}

License

The code is licensed under the MIT license and is free to use by anyone without any restrictions.

TODO

  • Add support for Bayesian neural networks (BNNs) and graph neural networks (GNNs).
  • Add support for different trajectory evaluators that propagate uncertainties.