-
Notifications
You must be signed in to change notification settings - Fork 228
Model Predictive Path Integral Controller (MPPI)
This tutorial assumes you have CUDA and CNPY installed according to the README instructions, compiled the AutoRally software on your machine including, and are comfortable running the state estimator in simulation or on a physical robot.
The model predictive path integral (MPPI) controller is a novel approach for autonomous vehicle control based on stochastic sampling of trajectories. The method is derivative free, and can handle complex non-linear dynamics and cost functions, which makes it ideal for performing aggressive maneuvers with the AutoRally vehicle. For a high level overview of the algorithm, and our latest results see the YouTube videos:
For a detailed overview see the related papers. If you use this code in your research please cite one of the related papers.
Information Theoretic MPC for Model-Based Reinforcement Learning. Williams, G., Wagener, N., Goldfain, B., Drews, P., Rehg, J. M., Boots, B., & Theodorou, E. A. IEEE International Conference on Robotics and Automation (2017)
Aggressive Driving with Model Predictive Path Integral Control. Williams, G., Drews, P., Goldfain, B., Rehg, J. M., & Theodorou, E. A. IEEE International Conference on Robotics and Automation (2016)
You can run the controller in stand alone mode with the command:
roslaunch autorally_control path_integral_nn.launch
The _nn suffix denotes that we're launching the controller using a neural network dynamics model to predict and propagate the dynamics. You should see a window pop up with an overhead view of a track. The vehicle's position is the center of the image, and this should follow along the centerline of the track.
If you receive a invalid device function error, then you may need to edit the path integral CMakeLists.txt. Find the following line in the CMakeLists:
-arch=sm_35;
and replace the 35 with your GPU's compute capability. You can find the compute capability of your GPU here: https://developer.nvidia.com/cuda-gpus. For example if your compute capability is 5.0, replace the 35 with 50.
Running the controller with Gazebo is just like running it in stand-alone mode, except now we'll receive updates from the state estimator instead of just propagating the dynamics forward in time. First, launch the Gazebo simulation of the track:
roslaunch autorally_gazebo autorallyTrackGazeboSim.launch
Next launch the state estimator before moving the simulated robot.
roslaunch autorally_core stateEstimator.launch
Included in the repository is a map defined for the simulation track where the origin is the spawn point of the vehicle in the simulation. Therefore, it's important that you do not move the vehicle before starting the state estimator. Once the state estimator is running, move the vehicle around manually and check that that state estimator is converged. After the state estimator is converged, launch the controller:
roslaunch autorally_control path_integral_nn.launch
Be sure to check that the controller is running fast enough, run:
rostopic hz /mppi_controller/chassisCommand
and make sure that the controller is running at (or extremely close to) 40 hz. Running the controller in the real world is exactly the same, except that you'll need to make a track cost map to fit your specific track.
The path integral launch file has the following parameters:
pose_estimate
: [string] pose estimate topic to subscribe to
path_topic
: [string] topic name for the planned path by mppi
debug_mode
: [bool] whether to run in debug mode
hz
: [int] control publishing frequency per second
num_timesteps
: [int] number of timesteps to look ahead
x_pos
: [double] initial x position
y_pos
: [double] initial y position
heading
: [double] initial heading
model_path
: [string] absolute path to dynamics model parameters
gamma
: [double] mppi temperature parameter
num_iters
: [int] number of optimization iterations per timestep
init_steering
: [double] steering value to initialize new steering controls to
init_throttle
: [double] throttle value to initialize new throttle controls to
steering_std
: [double] standard deviation in the steering control channel
throttle_std
: [double] standard deviation in the throttle control channel
max_throttle
: [double] maximum throttle command
min_throttle
: [double] minimum throttle command
max_steering
: [double] maximum steering command
min_steering
: [double] minimum steering comand
desired_speed
: [double] target speed for the controller
speed_coefficient
: [double] weight on speed cost
track_coefficient
: [double] weight on track cost
max_slip_angle
: [double] maximum allowable slip angle before killing trajectory
slip_penalty
: [double] weight on square of the slip angle
track_slop
: [double] threshold below which the track cost gets set to zero
crash_coeff
: [double] penalty for crashing
steering_coeff
: [double] control cost for steering
throttle_coeff
: [double] control cost for throttle
map_path
: [string] Absolute path of the map file
Some of these variables can safely be changed on the fly using dynamic reconfigure. These are:
max_throttle
, desired_speed
, speed_coefficient
, track_coefficient
, max_slip_angle
, slip_penalty
,
crash_coefficient
, track_slop
, steering_coeff
, throttle_coeff
.
Once you have the MPPI controller running, try launching rqt_reconfigure and changing some of the parameters to get different behavior. For instance, increasing the target speed will make the vehicle go faster. The dynamics of the simulator once the vehicle starts sliding are significantly less forgiving than the vehicle dynamics in the real world, but the vehicle should still be relatively stable for a target speed of 9 m/s. (On the real platform it is stable up to an 11 m/s target).
The cost map is what defines the boundaries of the track for the MPPI controller. A value of 0 defines the center of the track, and a value of 1 indicates off the track. If the track cost is larger than 1 then the MPPI controller receives an impulse cost (the crash coefficient term), this is designed to enforce a type of hard boundary constraint so that the vehicle won't violate the track boundaries. The cost map is a simple text file with the following format:
first entry: minimum x value second entry: maximum x value third entry: minimum y value fourth entry: maximum y value fifth entry: resolution of the grid. The total width of the grid is (maximum x - minimum x)*resolution and similarly for the height. all remaining entries: the grid values in row-major order
A pre-computed map is included in the repository for the Gazebo simulation track.
We use a neural network to represent the dynamics. The neural network that we use is trained on the real-world vehicle data, and then fine tuned to the Gazebo simulation data. The neural network parameters are saved in a numpy archive (.npz) file with the following format:
dynamics_W1
: input to first hidden layer matrix
dynamics_b1
: first hidden layer bias
...
...
dynamics_Wn
: hidden to hidden matrix for the nth layer
dynamics_bn
: nth hidden layer bias
...
The size of the neural network is a template parameter in the C++/CUDA code. You can try training your own neural network using one of many popular neural network libraries (we use Lasagne: https://github.com/Lasagne/Lasagne), and then saving the parameters in a numpy archive with the above format. You just need to change the template parameters to match your network, and point the model path in the path integral launch file to the correct location. Note that the model parameters need to be saved as double precision floating point numbers in order for them to be read in correctly.