
Action buffer #4612

Merged: 89 commits merged into develop-hybrid-action-staging from develop-action-buffer on Nov 16, 2020
Conversation

andrewcoh
Contributor

Proposed change(s)

Describe the changes made in this PR.

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

ml-agents-envs/mlagents_envs/base_env.py
respectively.
"""

def __init__(self, continuous: np.ndarray, discrete: np.ndarray):
Contributor

We need some constructor that will take only continuous or only discrete so the user does not have to create an empty array when using only discrete or only continuous.

Contributor

Why can't the default be None, with the constructor assigning an empty array when None is passed? This is a common pattern for mutable default parameters.
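A minimal sketch of the pattern being suggested, assuming an ActionTuple-like container (names and dtypes are illustrative, not the final API):

```python
from typing import Optional

import numpy as np


class ActionTuple:
    """Container for the continuous and discrete actions of a batch of agents."""

    def __init__(
        self,
        continuous: Optional[np.ndarray] = None,
        discrete: Optional[np.ndarray] = None,
    ):
        # Substitute an empty (n_agents, 0) array for a missing component so
        # purely-discrete or purely-continuous callers never have to build
        # the empty array themselves.
        if continuous is None:
            n_agents = 0 if discrete is None else discrete.shape[0]
            continuous = np.zeros((n_agents, 0), dtype=np.float32)
        if discrete is None:
            discrete = np.zeros((continuous.shape[0], 0), dtype=np.int32)
        self.continuous = continuous
        self.discrete = discrete
```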

ml-agents-envs/mlagents_envs/environment.py
@@ -143,3 +146,15 @@ def _process_step_infos(self, step_infos: List[EnvironmentStep]) -> int:
step_info.environment_stats, step_info.worker_id
)
return len(step_infos)

@staticmethod
Contributor

bump

ml-agents/mlagents/trainers/demo_loader.py
Comment on lines 100 to 104
decision_steps, terminal_steps = env.get_steps("RealFakeBrain")
n_agents = len(decision_steps)
env.set_actions("RealFakeBrain", spec.action_spec.empty_action(n_agents) - 1)
env.step()

Contributor

bump

Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
+ self.behavior_spec.action_spec.discrete_size
)
self.previous_action_dict: Dict[str, np.array] = {}
self.previous_action_dict: Dict[str, Dict[str, np.ndarray]] = {}
Contributor

Why not Dict[str, ActionTuple]? a dictionary with two hardcoded keys (in make_empty_previous_action()) doesn't make much sense.

Contributor Author

Good call out. I was doing this because it made writing to the buffer easier (e.g. using the string), but in reality we only ever use the previous action if it's discrete. I've updated the PR to only save the discrete actions.
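A rough sketch of what keeping only the discrete component could look like (the class and method names here are assumptions for illustration, not the exact PR code):

```python
from typing import Dict

import numpy as np


class PreviousActionStore:
    """Tracks the last discrete action taken by each agent."""

    def __init__(self, discrete_size: int):
        self.discrete_size = discrete_size
        # agent_id -> last discrete action, one entry per discrete branch
        self.previous_action_dict: Dict[str, np.ndarray] = {}

    def make_empty_previous_action(self, agent_id: str) -> None:
        # A new agent starts with an all-zeros discrete action.
        self.previous_action_dict[agent_id] = np.zeros(
            self.discrete_size, dtype=np.int32
        )
```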

@@ -270,6 +271,14 @@ def get_action(
)

self.save_memories(global_agent_ids, run_out.get("memory_out"))
# For Compatibility with buffer changes for hybrid action support
if "log_probs" in run_out:
Contributor

I know it's not new to this PR, but should we move these strings to constants/enums? "pre_action" and "actions_pre" have me scared, and "continuous" is generally pretty typo-prone.
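For example, a simple constants class could centralize the buffer keys so a typo fails loudly instead of silently creating a new buffer field (the key names below are drawn from strings mentioned in this thread, not from the final implementation):

```python
class BufferKey:
    """Single source of truth for buffer field names."""

    ACTIONS = "actions"
    PRE_ACTION = "pre_action"
    LOG_PROBS = "log_probs"
    CONTINUOUS_ACTION = "continuous_action"
    DISCRETE_ACTION = "discrete_action"


# Usage sketch: buffer[BufferKey.PRE_ACTION] instead of buffer["pre_action"].
```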


def __init__(
self,
continuous: Optional[np.ndarray] = None,
Contributor

I think I would prefer having this not be Optional. When the action is discrete only, the continuous action becomes of shape (n_agents, 0). I think this is slightly more general than having None when there is no action, because this way we can eventually consider all actions to be hybrid; some of them just happen to have 0 discrete or continuous actions. @chriselion what do you think? I might be missing something.

Contributor

I think for initializer arguments, keeping them Optional is nice. If you want to ensure that the fields themselves are non-None, you could "allocate" the np array on construction if they weren't provided (but not sure that's meaningful since there would be 0 elements).

As a side note, I think we're assuming that continuous.shape[0] == discrete.shape[0] - might be good to check that too.

Contributor Author

_validate_actions will check this.
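For context, a hedged sketch of the kind of check being discussed (the actual validation method in the PR may be named and structured differently):

```python
import numpy as np


def _check_action_batch_sizes(continuous: np.ndarray, discrete: np.ndarray) -> None:
    # Both components describe the same batch of agents, so their leading
    # dimensions must agree even when one component has zero columns.
    if continuous.shape[0] != discrete.shape[0]:
        raise ValueError(
            f"Mismatched number of agents: continuous has {continuous.shape[0]} "
            f"rows but discrete has {discrete.shape[0]}."
        )
```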

) -> UnityInputProto:
rl_in = UnityRLInputProto()
for b in vector_action:
n_agents = len(self._env_state[b][0])
if n_agents == 0:
continue
for i in range(n_agents):
action = AgentActionProto(vector_actions=vector_action[b][i])
# TODO: This check will be removed when the oroto supports hybrid actions
Contributor

Suggested change
# TODO: This check will be removed when the oroto supports hybrid actions
# TODO: This check will be removed when the proto supports hybrid actions

Contributor

@ervteng ervteng left a comment

I think the remaining issues will end up being fixed by #4651; this part LGTM.

_expected_shape = (n_agents, _size)
if actions.shape != _expected_shape:
_expected_shape = (n_agents, self.continuous_size)
if self.continuous_size > 0 and actions.continuous.shape != _expected_shape:
Contributor

Why check if self.continuous_size > 0? Same question for line 36 with discrete.

Contributor Author

This is removed in the ActionModel because the defaults make it unnecessary.

) -> UnityInputProto:
rl_in = UnityRLInputProto()
for b in vector_action:
n_agents = len(self._env_state[b][0])
if n_agents == 0:
continue
for i in range(n_agents):
action = AgentActionProto(vector_actions=vector_action[b][i])
# TODO: This check will be removed when the oroto supports hybrid actions
if vector_action[b].continuous.shape[1] > 0:
Contributor

Why check shape[1] rather than action_specs.continuous_size?

Contributor Author

ActionSpec is not available here, so checking the shape is the only way I can think of to determine whether the action space is continuous or discrete in this function. Critically, this is temporary support for the old proto: it just keeps the communication protocol working until the new protos are in place.
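A hedged illustration of the idea, written as a standalone helper rather than the PR's actual loop (the function name is made up for this example):

```python
import numpy as np


def _select_legacy_action(
    continuous: np.ndarray, discrete: np.ndarray, agent_index: int
) -> np.ndarray:
    """Pick the component to send through the old, non-hybrid proto.

    With no ActionSpec available, the column count of the continuous array
    is the only signal: shape (n_agents, 0) means the behavior is discrete.
    """
    if continuous.shape[1] > 0:
        return continuous[agent_index]
    return discrete[agent_index]
```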

@andrewcoh andrewcoh merged commit 0a94d76 into develop-hybrid-action-staging Nov 16, 2020
@delete-merged-branch delete-merged-branch bot deleted the develop-action-buffer branch November 16, 2020 20:46
@ykeuter ykeuter mentioned this pull request Nov 29, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 17, 2021