Our training examples run using rl_games, a highly optimized third-party RL library. This also demonstrates how our framework can be used with other RL libraries.
RL Games will be installed automatically along with `isaacgymenvs`.
Otherwise, to install rl_games manually, run:

```bash
pip install rl-games
```

Or, to use the latest unreleased version:

```bash
git clone https://github.com/Denys88/rl_games.git
cd rl_games
pip install -e .
```
For all the sample tasks provided, we include training configurations for rl_games, denoted with the suffix `*PPO.yaml`. These files are located in `isaacgymenvs/config/train`. The appropriate config file will be selected automatically based on the task being executed and the script that it is being launched from. To launch a task using rl-games, run:

```bash
python train.py
```
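For example, `python train.py task=Cartpole` launches training for the bundled Cartpole task; the `task` name must match one of the task configs described below.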
For a list of the sample tasks we provide, refer to the RL List of Examples.
The base class for Isaac Gym's RL framework is `VecTask` in `vec_task.py`. The `VecTask` class is designed to act as a parent class for all RL tasks using Isaac Gym's RL framework. It provides an interface for interaction with RL algorithms and includes functionalities that are required for all RL tasks.
The `VecTask` constructor takes a configuration dictionary containing numerous required parameters:

- `device_type` - the type of device used for simulation: `cuda` or `cpu`.
- `device_id` - ID of the device used for simulation, e.g. `0` for a single-GPU workstation.
- `rl_device` - full `name:id` string of the device that the RL framework is using.
- `headless` - `True`/`False` depending on whether you want the simulation to run with a viewer.
- `physics_engine` - which physics engine to use. Must be `"physx"` or `"flex"`.
- `env` - a dictionary with environment-specific parameters. You can include anything you want in here, depending on your specific environment, but the keys you must provide are:
  - `numEnvs` - number of environments being simulated in parallel.
  - `numObservations` - size of the observation vector used for each environment.
  - `numActions` - size of the action vector.

Other optional `env` parameters are:

- `numAgents` - number of agents in multi-agent environments. Defaults to `1`.
- `numStates` - size of the state vector for training with asymmetric actor-critic.
- `controlFrequencyInv` - control decimation, i.e. how many simulator steps between RL actions. Defaults to `1`.
- `clipObservations` - range to clip observations to. Defaults to `inf` (±infinity).
- `clipActions` - range to clip actions to. Defaults to `1` (±1).
- `enableCameraSensors` - set to `True` if camera sensors are used in the environment.
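As a rough sketch, a config dictionary with the keys described above might look like the following (all values are illustrative, not prescriptive):

```python
config_dict = {
    "device_type": "cuda",       # simulation device type: "cuda" or "cpu"
    "device_id": 0,              # simulation device ID
    "rl_device": "cuda:0",       # device used by the RL framework
    "headless": True,            # run without a viewer
    "physics_engine": "physx",   # "physx" or "flex"
    "env": {
        "numEnvs": 512,            # required
        "numObservations": 48,     # required
        "numActions": 12,          # required
        "controlFrequencyInv": 2,  # optional: simulator steps per RL action
        "clipObservations": 5.0,   # optional
        "clipActions": 1.0,        # optional
    },
}
```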
The `__init__` function of `VecTask` triggers a call to `create_sim()`, which must be implemented by the extended classes. It will then initialize the buffers required for RL on the specified device. These include the observation buffer, reward buffer, reset buffer, progress buffer, randomization buffer, and an optional extras array for passing any additional information to the RL algorithm.
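For reference, a simplified sketch of this buffer allocation is shown below. The buffer names match the description above, but the exact shapes and dtypes are illustrative and best checked against `vec_task.py`:

```python
import torch

def allocate_buffers(self):
    # Buffers live on the RL device; shapes follow numEnvs/numObservations.
    self.obs_buf = torch.zeros((self.num_envs, self.num_obs), device=self.device, dtype=torch.float)
    self.rew_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.float)
    self.reset_buf = torch.ones(self.num_envs, device=self.device, dtype=torch.long)
    self.progress_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.long)
    self.randomize_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.long)
    self.extras = {}  # optional extras passed to the RL algorithm
```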
A call to `prepare_sim()` will also be made to initialize the internal data structures for simulation. `set_viewer()` is called as well; if running with a viewer, this function initializes the viewer and creates keyboard shortcuts for quitting the application (ESC) and disabling/enabling rendering (V).
The `step` function is designed to guide the workflow of each RL iteration. This function can be viewed in three parts: `pre_physics_step`, `simulate`, and `post_physics_step`. `pre_physics_step` should be implemented to perform any computations required before stepping the physics simulation. As an example, applying actions from the policy should happen in `pre_physics_step`. `simulate` is then called to step the physics simulation. `post_physics_step` should implement computations performed after stepping the physics simulation, e.g. computing rewards and observations.
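A simplified sketch of how these three parts fit together follows; the real `step` implementation in `vec_task.py` handles additional details such as action and observation clipping:

```python
def step(self, actions):
    # 1. pre-physics: e.g. apply the policy's actions to the simulation
    self.pre_physics_step(actions)

    # 2. step the physics simulation (decimation set by controlFrequencyInv)
    for _ in range(self.control_freq_inv):
        self.render()
        self.gym.simulate(self.sim)

    # 3. post-physics: e.g. compute observations, rewards, and resets
    self.post_physics_step()

    return self.obs_buf, self.rew_buf, self.reset_buf, self.extras
```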
`VecTask` also provides an implementation of `render` to step graphics if a viewer is initialized.
Additionally, `VecTask` provides an interface to perform Domain Randomization via the `apply_randomizations` method. For more details, please see Domain Randomization.
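In the bundled tasks, the typical usage pattern looks roughly like this (assuming the task has read a `randomize` flag and a `randomization_params` section from its config in `__init__`):

```python
# Typically called from the task's reset logic.
if self.randomize:
    self.apply_randomizations(self.randomization_params)
```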
Creating a new task is straightforward using Isaac Gym's RL framework. The first step is to create a new script file in `isaacgymenvs/tasks`. To use Isaac Gym's APIs, we need the following imports:

```python
from isaacgym import gymtorch
from isaacgym import gymapi
from .base.vec_task import VecTask
```
Then, we need to create a Task class that extends from `VecTask`:

```python
class MyNewTask(VecTask):
```
The `__init__` method should take 3 arguments: a config dict conforming to the specifications described above (this will be generated from the hydra config), `sim_device`, the device string representing where the simulation will be run, and `headless`, which specifies whether or not to run in headless mode.
In the `__init__` method of `MyNewTask`, make sure to make a call to `VecTask`'s `__init__` to initialize the simulation, providing the config dictionary with members as described above:

```python
super().__init__(
    cfg=config_dict
)
```
Then, we can initialize the state tensors that we may need for our task. For example, we can initialize the DOF state tensor:

```python
dof_state_tensor = self.gym.acquire_dof_state_tensor(self.sim)
self.dof_state = gymtorch.wrap_tensor(dof_state_tensor)
```
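A common follow-up pattern in the bundled tasks is to create position and velocity views into this tensor. The sketch below assumes `self.num_dof` has already been set (e.g. from the asset's DOF count):

```python
# DOF state is laid out as (num_envs * num_dof, 2): position and velocity.
self.dof_pos = self.dof_state.view(self.num_envs, self.num_dof, 2)[..., 0]
self.dof_vel = self.dof_state.view(self.num_envs, self.num_dof, 2)[..., 1]
```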
There are a few methods that must be implemented by a child class of `VecTask`: `create_sim`, `pre_physics_step`, `post_physics_step`.

```python
def create_sim(self):
    # implement sim set up and environment creation here
    # - set up-axis
    # - call super().create_sim with device args (see docstring)
    # - create ground plane
    # - set up environments
    pass

def pre_physics_step(self, actions):
    # implement pre-physics simulation code here
    # - e.g. apply actions
    pass

def post_physics_step(self):
    # implement post-physics simulation code here
    # - e.g. compute reward, compute observations
    pass
```
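As an illustration, a filled-in `create_sim` might look like the following sketch. The Isaac Gym API calls are real, but the `_create_envs` helper and the specific values are assumptions for this example:

```python
def create_sim(self):
    # Use a z-up axis convention.
    self.sim_params.up_axis = gymapi.UP_AXIS_Z
    self.sim_params.gravity = gymapi.Vec3(0.0, 0.0, -9.81)
    self.sim = super().create_sim(self.device_id, self.graphics_device_id,
                                  self.physics_engine, self.sim_params)

    # Create the ground plane.
    plane_params = gymapi.PlaneParams()
    plane_params.normal = gymapi.Vec3(0.0, 0.0, 1.0)
    self.gym.add_ground(self.sim, plane_params)

    # Create the parallel environments (helper assumed to be defined by the task).
    self._create_envs(self.num_envs, spacing=2.0,
                      num_per_row=int(self.num_envs ** 0.5))
```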
To launch the new task from `train.py`, add your new task to the imports and the `isaacgym_task_map` dict in the tasks `__init__.py` file.
```python
from isaacgymenvs.tasks.my_new_task import MyNewTask
...
isaacgym_task_map = {
    'Anymal': Anymal,
    # ...
    'MyNewTask': MyNewTask,
}
```
You will also need to create config files for the task and training, which will be passed in dictionary form to the first `config` argument of your task. The task config, which goes in the corresponding config folder, must have a `name` in the root matching the task name you put in the `isaacgym_task_map` above. You should name your task config the same as in the Isaac Gym task map, e.g. `Anymal` becomes `Anymal.yaml`.
You also need a train config specifying RL Games arguments. This should go in the corresponding config folder. The file should have the postfix `PPO`, i.e. `Anymal` becomes `AnymalPPO.yaml`.
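As a rough illustration, a minimal task config might look like the sketch below. The field names follow the pattern of the bundled configs, but the resolver syntax and values are assumptions to verify against an existing example such as `Anymal.yaml`:

```yaml
# MyNewTask.yaml (minimal, illustrative sketch)
name: MyNewTask

physics_engine: ${..physics_engine}

env:
  numEnvs: 512
  numObservations: 48
  numActions: 12
```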
Then, you can run your task with `python train.py task=MyNewTask`.
If you have existing environments set up with the Isaac Gym Preview 2 release or earlier, it is simple to convert your tasks to the new RL framework in IsaacGymEnvs. Here are a few pointers to help you get started.
- The `torch_jit_utils` script has been moved to IsaacGymEnvs. Tasks that are importing from `rlgpu.utils.torch_jit_utils` should now import from `utils.torch_jit_utils`.
- The original `BaseTask` class has been converted to `VecTask` in IsaacGymEnvs. All tasks inheriting from the previous `BaseTask` should modify `from rlgpu.tasks.base.base_task import BaseTask` to `from .base.vec_task import VecTask`.
- Your task class should now inherit from `VecTask` instead of the previous `BaseTask`.
- Arguments required for class initialization have been simplified. The task `__init__()` method now only requires `cfg`, `sim_device`, and `headless` as arguments.
- It is no longer required to set `self.sim_params` and `self.physics_engine` in the `__init__()` method of your task definition.
- Making a call to `VecTask`'s `__init__()` method requires 3 more arguments: `rl_device`, `sim_device`, and `headless`. As an example, modify the line of code to `super().__init__(config=self.cfg, rl_device=rl_device, sim_device=sim_device, headless=headless)`.
- `VecTask` now defines a `reset_idx()` function that should be implemented in an environment class. It resets environments with the provided indices; see the sketch after this list.
- Note that `VecTask` now defines a `reset()` method that does not accept environment indices as arguments. To avoid naming conflicts, consider renaming the `reset()` method inside your task definition.
- Assets have been moved to IsaacGymEnvs (with some still remaining in IsaacGym for use in examples). Please make sure the paths to your assets remain valid in the new IsaacGymEnvs setup.
- Assets are now located under `assets/`.
- Some config parameters are now updated to work with resolvers and Hydra. Please refer to an example config in `cfg/` for details.
- For task configs, the following are modified: `physics_engine`, `numEnvs`, `use_gpu_pipeline`, `num_threads`, `solver_type`, `use_gpu`, `num_subscenes`.
- For train configs, the following are modified: `seed`, `load_checkpoint`, `load_path`, `name`, `full_experiment_name`, `num_actors`, `max_epochs`.
- Also note a few naming changes required for the latest version of rl_games: `lr_threshold` --> `kl_threshold`, `steps_num` --> `horizon_length`.
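For reference, a hypothetical `reset_idx()` might look like the sketch below. The Isaac Gym calls are real, but the DOF views and the zeroed reset values assume the patterns shown earlier in this document:

```python
import torch
from isaacgym import gymtorch

def reset_idx(self, env_ids):
    # Reset DOF positions and velocities for the selected environments
    # (self.dof_pos / self.dof_vel are views into self.dof_state; values illustrative).
    self.dof_pos[env_ids] = 0.0
    self.dof_vel[env_ids] = 0.0

    # Push the modified DOF states back to the simulator for just these envs.
    env_ids_int32 = env_ids.to(dtype=torch.int32)
    self.gym.set_dof_state_tensor_indexed(
        self.sim,
        gymtorch.unwrap_tensor(self.dof_state),
        gymtorch.unwrap_tensor(env_ids_int32),
        len(env_ids_int32),
    )

    # Clear per-env bookkeeping so the episode starts fresh.
    self.progress_buf[env_ids] = 0
    self.reset_buf[env_ids] = 0
```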
When using the viewer, various actions can be executed with specific reserved keys:
- 'V' - Toggles rendering on and off. This is useful for speeding up training and observing the results.
- 'R' - Initiates video recording, saving the rendered frames to a designated folder.
- 'Tab' - Toggles the left panel, allowing you to remove and bring it back as necessary.
- 'ESC' - Stops the simulation and rendering processes, effectively quitting the program.