MultiDiscrete Gym Environments #176
Comments
I agree it makes sense to add support for MultiDiscrete action spaces.
|
Both options look good to me. I'm concerned about how to map the multiple sets of discrete actions to the neural network outputs. If this get supported cleanly, I don't mind manually discretizing the continuous action spaces. |
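One common way to map a MultiDiscrete space onto network outputs is a factorized policy: a shared trunk feeds one independent softmax head per action dimension, and the joint log-probability is the sum of the per-dimension log-probabilities. This is a minimal numpy sketch of that idea, not Coach's implementation; the bin counts and hidden size are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
nvec = [3, 2]                 # MultiDiscrete([3, 2]): e.g. steering bins, throttle bins
hidden = rng.normal(size=8)   # stand-in for shared trunk features

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# One independent linear "head" + softmax per action dimension
heads = [rng.normal(size=(n, hidden.size)) for n in nvec]
dists = [softmax(W @ hidden) for W in heads]

# Sample each dimension independently
action = [int(rng.choice(len(p), p=p)) for p in dists]
assert all(0 <= a < n for a, n in zip(action, nvec))

# Joint log-prob factorizes across dimensions (independence assumption)
logp = sum(np.log(p[a]) for p, a in zip(dists, action))
```

The independence assumption keeps the output layer small (sum of bin counts, not their product), which is why it is the usual choice over enumerating the full joint action set.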
I agree that there is another change that this points to. The observation space can already be defined as a dictionary of multiple spaces. It would be nice to have something similar for action spaces. As it stands, gym environments only support a single action space.
Yes, completely agree.
@bbalaji-ucsd check out the BoxDiscretization Action Filter. There's also a sample CARLA preset using it. Is this good enough for your purpose? |
@galnov very cool, I didn't know about this, and it does address the case I raised where an environment has a continuous action space that you want discretized. However, the original issue still stands, which is that gym environments can define MultiDiscrete action spaces directly.
@galnov In gym, this is the MultiDiscrete action space. We could support this in Coach with a matching action space type.
@galnov Is the acceptance criterion for this issue an agent that implements multi-dimensional discrete RL?
Gym API supports MultiDiscrete action spaces:
https://github.com/openai/gym/blob/master/gym/spaces/multi_discrete.py
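For reference, a MultiDiscrete space with nvec = [3, 2] contains integer vectors [a0, a1] with 0 <= a0 < 3 and 0 <= a1 < 2, so the joint action set is the Cartesian product of the per-dimension choices. A tiny pure-Python sketch of those semantics (using example bin counts, without importing gym):

```python
import itertools

# MultiDiscrete([3, 2]) semantics: each dimension is an independent
# Discrete(n), so there are 3 * 2 = 6 joint actions in total.
nvec = [3, 2]
all_actions = list(itertools.product(*(range(n) for n in nvec)))

assert len(all_actions) == 6
assert (2, 1) in all_actions      # the "largest" action vector
assert (3, 0) not in all_actions  # indices are 0-based and exclusive of n
```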
This is useful when you want to discretize a continuous control problem, a technique common in the literature: https://arxiv.org/abs/1808.00177
But MultiDiscrete action spaces are ignored in Coach:
https://github.com/NervanaSystems/coach/blob/master/rl_coach/environments/gym_environment.py#L367
Can you please add support for them?
As a concrete use case, I would like to (independently) discretize the steering and throttle actions in DeepRacer:
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/reinforcement_learning/rl_deepracer_robomaker_coach_gazebo/src/robomaker/environments/deepracer_env.py#L320
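To make the use case concrete, independently discretizing steering and throttle amounts to choosing bin edges per dimension and decoding a MultiDiscrete action vector into continuous controls. This sketch uses hypothetical bin counts and ranges, not DeepRacer's actual action values:

```python
import numpy as np

# Hypothetical bin edges -- illustrative, not DeepRacer's real settings
steering_bins = np.linspace(-30.0, 30.0, 5)   # degrees: 5 steering choices
throttle_bins = np.linspace(0.1, 1.0, 4)      # normalized: 4 throttle choices

def decode(action):
    """Map a MultiDiscrete([5, 4]) action vector to (steering, throttle)."""
    s_idx, t_idx = action
    return steering_bins[s_idx], throttle_bins[t_idx]

# Action [2, 3] -> straight ahead at full throttle
steer, throttle = decode([2, 3])
```

Because the two dimensions are discretized independently, the environment exposes 5 + 4 head outputs instead of a flat Discrete(20) space, which is exactly the structure MultiDiscrete is meant to express.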