Separate continuous/discrete actions in AgentActionProto #4698

dongruoping · 2020-12-02T18:55:57Z

Proposed change(s)

Add separate continuous and discrete action entries to AgentActionProto. Mark the old vector_actions field as deprecated, but keep it for backward compatibility.
Change storedVectorActions in AgentInfo from arrays to ActionBuffers so that it can store hybrid actions.
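
For readers following the proto change, here is a minimal Python sketch of the message shape described above. It is not the generated protobuf code: the `AgentAction` dataclass and `make_agent_action` helper are hypothetical stand-ins, and the rule used to fill the deprecated field (mirror the continuous values if there are any, otherwise the discrete values cast to float) is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class AgentAction:
    """Hypothetical plain-Python stand-in for the generated AgentActionProto."""

    continuous_actions: List[float] = field(default_factory=list)
    discrete_actions: List[int] = field(default_factory=list)
    # Old combined field, kept so that older readers still find their data.
    vector_actions_deprecated: List[float] = field(default_factory=list)


def make_agent_action(continuous: Iterable[float], discrete: Iterable[int]) -> AgentAction:
    """Fill the new fields and mirror them into the deprecated one."""
    continuous_list = [float(c) for c in continuous]
    discrete_list = [int(d) for d in discrete]
    # Assumption for this sketch only: the legacy field carries the continuous
    # values when present, otherwise the discrete values as floats.
    legacy = continuous_list if continuous_list else [float(d) for d in discrete_list]
    return AgentAction(
        continuous_actions=continuous_list,
        discrete_actions=discrete_list,
        vector_actions_deprecated=list(legacy),
    )


if __name__ == "__main__":
    print(make_agent_action([0.1, 0.2], [3]))
```

Keeping the two action types in separate typed fields means a receiver no longer has to guess how to split one flat float array, while the deprecated field lets an older receiver keep working during the transition.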

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

JIRA: MLA-1593

Types of change(s)

Checklist

Added tests that prove my fix is effective or that my feature works
Updated the changelog (if applicable)
Updated the documentation (if applicable)
Updated the migration guide (if applicable)

Other comments

surfnerd

The changes look good to me except for the python syntax errors around referencing VectorActionsDeprecated before they are assigned.

chriselion · 2020-12-02T21:39:45Z

com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs

            foreach (var ap in proto.Value)
            {
-                agentActions.Add(ap.VectorActions.ToArray());
+                agentActions.Add(new ActionBuffers(ap.ContinuousActions.ToArray(), ap.DiscreteActions.ToArray()));


Not quite sure what type ap is here, but can we add a ToActionBuffers() extension method to it?

It is the auto-generated AgentActionProto class, so maybe not.

extension method == "this" modifier in argument, so something like

    public static ActionBuffers ToActionBuffers(this UnityRLInputProto.Types.AgentActionProto proto)
    {
        return new ActionBuffers(proto.ContinuousActions.ToArray(), proto.DiscreteActions.ToArray());
    }
    ...
    agentActions.Add(ap.ToActionBuffers());
    ...

not a big deal but it's just a bit cleaner

chriselion · 2020-12-02T21:43:24Z

com.unity.ml-agents/Runtime/Agent.cs

        }

        public void CopyActions(ActionBuffers actionBuffers)
        {
-            actionBuffers.PackActions(storedVectorActions);


Do we use PackActions anywhere else? If not, can we remove it?

dongruoping · 2020-12-02T23:23:14Z

> The changes look good to me except for the python syntax errors around referencing VectorActionsDeprecated before they are assigned.

Can you point out where the syntax errors are? I didn't see python syntax issues around VectorActionsDeprecated

surfnerd · 2020-12-03T00:01:46Z

> > The changes look good to me except for the python syntax errors around referencing VectorActionsDeprecated before they are assigned.
>
> Can you point out where the syntax errors are? I didn't see python syntax issues around VectorActionsDeprecated

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/bin/mlagents-learn", line 33, in <module>
    sys.exit(load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')())
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/learn.py", line 280, in main
    run_cli(parse_command_line())
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/learn.py", line 276, in run_cli
    run_training(run_seed, options)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/learn.py", line 153, in run_training
    tc.start_learning(env_manager)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 201, in start_learning
    self._save_models()
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 84, in _save_models
    self.trainers[brain_name].save_model()
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer/rl_trainer.py", line 215, in save_model
    model_checkpoint = self._checkpoint()
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/trainer/rl_trainer.py", line 189, in _checkpoint
    checkpoint_path = self.model_saver.save_checkpoint(self.brain_name, self.step)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/model_saver/torch_model_saver.py", line 57, in save_checkpoint
    self.export(checkpoint_path, behavior_name)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/model_saver/torch_model_saver.py", line 62, in export
    self.exporter.export_policy_model(output_filepath)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/torch/model_serialization.py", line 111, in export_policy_model
    dynamic_axes=self.dynamic_axes,
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/__init__.py", line 230, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 91, in export
    use_external_data_format=use_external_data_format)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 639, in _export
    dynamic_axes=dynamic_axes)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 411, in _model_to_graph
    use_new_jit_passes)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 379, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/onnx/utils.py", line 342, in _trace_and_get_graph_from_model
    torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/jit/_trace.py", line 1148, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/jit/_trace.py", line 130, in forward
    self._force_outplace,
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/jit/_trace.py", line 116, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 725, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 709, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/torch/networks.py", line 294, in forward
    encoding, masks
  File "/Users/bokken/build/output/Unity-Technologies/ml-agents/ml-agents/mlagents/trainers/torch/action_model.py", line 173, in get_action_out
    return continuous_out, discrete_out, action_out_deprecated
UnboundLocalError: local variable 'action_out_deprecated' referenced before assignment

from here: https://yamato.cds.internal.unity3d.com/jobs/497-ml-agents/tree/develop-hybrid-actionproto/.yamato%252Ftraining-int-tests.yml%2523test_mac_training_int_2018.4/4450760/job/(log:Execution)

around line 586 in the execution log.
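
The failure in the traceback is a runtime UnboundLocalError rather than a syntax error: a name that is assigned on only some code paths is returned on a path where it was never bound. The sketch below reproduces that pattern in isolation. It is not the code from action_model.py; the function names, the `export_deprecated` flag, and the branch condition are all hypothetical, and only the variable name `action_out_deprecated` and the shape of the return statement come from the log above.

```python
from typing import List, Optional, Tuple

ActionOut = Tuple[Optional[List[float]], Optional[List[int]], Optional[List[float]]]


def get_action_out_buggy(continuous, discrete, export_deprecated: bool) -> ActionOut:
    # The deprecated output is assigned on one branch only (hypothetical condition).
    if export_deprecated:
        action_out_deprecated = [float(x) for x in continuous]
    # When the branch is skipped, the name below was never bound.
    return continuous, discrete, action_out_deprecated


def get_action_out_fixed(continuous, discrete, export_deprecated: bool) -> ActionOut:
    # Bind every returned name up front so each code path has a value.
    action_out_deprecated = None
    if export_deprecated:
        action_out_deprecated = [float(x) for x in continuous]
    return continuous, discrete, action_out_deprecated


if __name__ == "__main__":
    try:
        get_action_out_buggy([0.5], [1], export_deprecated=False)
    except UnboundLocalError as err:
        print("buggy:", type(err).__name__)
    print("fixed:", get_action_out_fixed([0.5], [1], export_deprecated=False))
```

Initializing the name before any branch is the same kind of fix that the later commit title "initialize action_out_deprecated" suggests.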

* Add hybrid action capability flag (#4576)
* Change BrainParametersProto to support ActionSpec (#4579)
* Assign new BrainParametersProto fields based on capabilities (#4581)
* ActionBuffer with hybrid actions for RemotePolicy (#4592)
* Barracuda inference for hybrid actions (#4611)
* Refactor BarracudaModel loader checks (#4629)
* Export separate nodes for continuous/discrete actions (#4655)
* Separate continuous/discrete actions in AgentActionProto (#4698)
* Force different nodes for new and deprecated action output (#4705)

dongruoping added 6 commits December 1, 2020 15:16

separate entries for continuous/discrete in action proto

d0e4c96

use int for discrete action in proto

cf40bc2

use int for discrete action in proto

9b6e20f

store actions in AgentInfo as ActionBuffers instead of arrays

b44b817

fix constructing action proto

fe569bb

fix bug

75e586b

dongruoping requested review from surfnerd, chriselion and vincentpierre December 2, 2020 18:55

surfnerd reviewed Dec 2, 2020

chriselion reviewed Dec 2, 2020

chriselion approved these changes Dec 2, 2020

surfnerd approved these changes Dec 2, 2020

remove PackActions()

da7b4f7

dongruoping added 2 commits December 2, 2020 16:19

initialize action_out_deprecated

b774cc9

extension method for action proto to action buffer

913e82d

dongruoping merged commit 6ec6635 into develop-hybrid-actions-csharp Dec 3, 2020

delete-merged-branch bot deleted the develop-hybrid-actionproto branch December 3, 2020 21:35

dongruoping added a commit that referenced this pull request Dec 4, 2020

Separate continuous/discrete actions in AgentActionProto (#4698)

e490ec5

* separate entries for continuous/discrete in action proto
* store actions in AgentInfo as ActionBuffers instead of arrays

github-actions bot locked as resolved and limited conversation to collaborators Dec 4, 2021
