Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy output actiontuple #4651

Merged

Conversation

andrewcoh
Copy link
Contributor

Proposed change(s)

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

self._discrete = discrete
def __init__(self):
self._continuous = None
self._discrete = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about keeping the continuous and discrete arguments, and adding e.g.

if continuous is not None:
  self.add_continuous(continuous)
# TODO same for discrete

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@@ -5,6 +5,38 @@
from mlagents.trainers.torch.utils import ModelUtils


class LogProbsTuple:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is identical to ActionTuple in functionality, but you want to make sure that they aren't accidentally used interchangeably, you could make them both derive from e.g. _ActionTupleBase

Copy link
Contributor

@chriselion chriselion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot better than the dictionary, although the class still feels clunky. I think you can keep the initializer args which should shrink the code a bit.

continuous = np.random.uniform(
low=-1.0, high=1.0, size=(n_agents, self.continuous_size)
)
discrete = np.zeros((n_agents, self.discrete_size), dtype=np.int32)
action_tuple.add_continuous(continuous)
Copy link
Contributor

@ervteng ervteng Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add back the continuous and discrete parameters to ActionTuple, we can do something like:

Suggested change
action_tuple.add_continuous(continuous)
_continuous = continuous
_discrete = np.column_stack() if self.discrete_size > 0 else None
action_tuple = ActionTuple(_continuous, _discrete)

@andrewcoh andrewcoh mentioned this pull request Nov 13, 2020
10 tasks
@andrewcoh andrewcoh merged commit e49f68b into develop-hybrid-actions-singleton Nov 16, 2020
@delete-merged-branch delete-merged-branch bot deleted the develop-output-actiontuple branch November 16, 2020 19:55
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants