Separate continuous/discrete actions in AgentActionProto #4698

Merged

Conversation

dongruoping
Contributor

Proposed change(s)

  • Add continuous/discrete action entries in AgentActionProto. Mark the old vector_actions as deprecated but still keep it for backward compatibility.
  • Change storedVectorActions in AgentInfo from arrays to ActionBuffers so that it can store hybrid actions.
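
As a rough illustration of the first bullet, the Python sketch below models an action message that carries separate continuous and discrete entries next to a deprecated flat vector. The classes `AgentActionSketch` and `ActionBuffersSketch` and the helper `to_action_buffers` are hypothetical stand-ins, not the generated proto class or the real `ActionBuffers` type, and the field names are assumed from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentActionSketch:
    """Hypothetical stand-in for the generated AgentActionProto message."""
    # Deprecated flat vector, kept only for backward compatibility.
    vector_actions_deprecated: List[float] = field(default_factory=list)
    # New, separate entries for each action type.
    continuous_actions: List[float] = field(default_factory=list)
    discrete_actions: List[int] = field(default_factory=list)


@dataclass
class ActionBuffersSketch:
    """Hypothetical stand-in for ActionBuffers: one buffer per action type."""
    continuous: List[float]
    discrete: List[int]


def to_action_buffers(action: AgentActionSketch) -> ActionBuffersSketch:
    """Copy the separate continuous/discrete entries into one buffer object."""
    return ActionBuffersSketch(
        continuous=list(action.continuous_actions),
        discrete=list(action.discrete_actions),
    )


# A hybrid action: two continuous values and one discrete branch choice.
hybrid = AgentActionSketch(continuous_actions=[0.5, -1.0], discrete_actions=[2])
buffers = to_action_buffers(hybrid)
```

Keeping the conversion in a free function leaves the message class untouched, which matters when that class is auto-generated.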

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

JIRA: MLA-1593

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

Contributor

@surfnerd surfnerd left a comment

The changes look good to me except for the python syntax errors around referencing VectorActionsDeprecated before they are assigned.

foreach (var ap in proto.Value)
{
-    agentActions.Add(ap.VectorActions.ToArray());
+    agentActions.Add(new ActionBuffers(ap.ContinuousActions.ToArray(), ap.DiscreteActions.ToArray()));
Contributor

Not quite sure what type ap is here, but can we add a ToActionBuffers() extension method to it?

Contributor Author

It is the auto-generated AgentActionProto class, so maybe not.

Contributor

extension method == "this" modifier in argument, so something like

public static ActionBuffers ToActionBuffers(this UnityRLInputProto.Types.AgentActionProto proto) 
{
    return new ActionBuffers(proto.ContinuousActions.ToArray(), proto.DiscreteActions.ToArray());
}

...
agentActions.Add(ap.ToActionBuffers());
...

Contributor

not a big deal but it's just a bit cleaner

}

public void CopyActions(ActionBuffers actionBuffers)
{
actionBuffers.PackActions(storedVectorActions);
Contributor

Do we use PackActions anywhere else? If not, can we remove it?

Contributor Author

removed

@dongruoping
Contributor Author

> The changes look good to me except for the python syntax errors around referencing VectorActionsDeprecated before they are assigned.

Can you point out where the syntax errors are? I didn't see python syntax issues around VectorActionsDeprecated

@surfnerd
Contributor

surfnerd commented Dec 3, 2020

> > The changes look good to me except for the python syntax errors around referencing VectorActionsDeprecated before they are assigned.
>
> Can you point out where the syntax errors are? I didn't see python syntax issues around VectorActionsDeprecated

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/bin/mlagents-learn", line 33, in <module>
    sys.exit(load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')())
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/learn.py", line 280, in main
    run_cli(parse_command_line())
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/learn.py", line 276, in run_cli
    run_training(run_seed, options)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/learn.py", line 153, in run_training
    tc.start_learning(env_manager)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 201, in start_learning
    self._save_models()
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 84, in _save_models
    self.trainers[brain_name].save_model()
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer/rl_trainer.py", line 215, in save_model
    model_checkpoint = self._checkpoint()
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer/rl_trainer.py", line 189, in _checkpoint
    checkpoint_path = self.model_saver.save_checkpoint(self.brain_name, self.step)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/model_saver/torch_model_saver.py", line 57, in save_checkpoint
    self.export(checkpoint_path, behavior_name)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/model_saver/torch_model_saver.py", line 62, in export
    self.exporter.export_policy_model(output_filepath)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/torch/model_serialization.py", line 111, in export_policy_model
    dynamic_axes=self.dynamic_axes,
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/__init__.py", line 230, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 91, in export
    use_external_data_format=use_external_data_format)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 639, in _export
    dynamic_axes=dynamic_axes)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 411, in _model_to_graph
    use_new_jit_passes)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 379, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 342, in _trace_and_get_graph_from_model
    torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/jit/_trace.py", line 1148, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/jit/_trace.py", line 130, in forward
    self._force_outplace,
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/jit/_trace.py", line 116, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 725, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 709, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/torch/networks.py", line 294, in forward
    encoding, masks
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/torch/action_model.py", line 173, in get_action_out
    return continuous_out, discrete_out, action_out_deprecated
UnboundLocalError: local variable 'action_out_deprecated' referenced before assignment

from here: https://yamato.cds.internal.unity3d.com/jobs/497-ml-agents/tree/develop-hybrid-actionproto/.yamato%252Ftraining-int-tests.yml%2523test_mac_training_int_2018.4/4450760/job/(log:Execution)

around line 586 in the execution log.
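
For readers unfamiliar with this failure mode: `UnboundLocalError` is raised at runtime, not at parse time, when a function returns or reads a local name that was only assigned on some code paths. The sketch below reproduces the pattern under assumed conditions. The function names and the `export_deprecated` flag are hypothetical, and the real `action_model.py` code is not shown in this thread.

```python
def get_action_out_buggy(export_deprecated: bool):
    continuous_out = [0.1, 0.2]
    discrete_out = [1]
    if export_deprecated:
        # The deprecated output is only bound inside this branch.
        action_out_deprecated = continuous_out + discrete_out
    # Bug: if the branch was skipped, this name was never assigned.
    return continuous_out, discrete_out, action_out_deprecated


def get_action_out_fixed(export_deprecated: bool):
    continuous_out = [0.1, 0.2]
    discrete_out = [1]
    # Bind the name on every path before any conditional assignment.
    action_out_deprecated = None
    if export_deprecated:
        action_out_deprecated = continuous_out + discrete_out
    return continuous_out, discrete_out, action_out_deprecated


# The buggy version only fails when the skipped branch is exercised.
try:
    get_action_out_buggy(export_deprecated=False)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__
```

Because the error depends on which branch runs, it can pass a quick read of the diff and only surface in an integration run such as the training test above.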

@dongruoping dongruoping merged commit 6ec6635 into develop-hybrid-actions-csharp Dec 3, 2020
@delete-merged-branch delete-merged-branch bot deleted the develop-hybrid-actionproto branch December 3, 2020 21:35
dongruoping added a commit that referenced this pull request Dec 4, 2020
* separate entries for continuous/discrete in action proto

* store actions in AgentInfo as ActionBuffers instead of arrays
dongruoping added a commit that referenced this pull request Dec 4, 2020
* Add hybrid action capability flag (#4576)

* Change BrainParametersProto to support ActionSpec (#4579)

* Assign new BrainParametersProto fields based on capabilities (#4581)

* ActionBuffer with hybrid actions for RemotePolicy (#4592)

* Barracuda inference for hybrid actions (#4611)

* Refactor BarracudaModel loader checks (#4629)

* Export separate nodes for continuous/discrete actions (#4655)

* Separate continuous/discrete actions in AgentActionProto (#4698)

* Force different nodes for new and deprecated action output (#4705)
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 4, 2021