Develop remove past action communication #2913

vincentpierre · 2019-11-15T01:34:47Z

No description provided.

UnitySDK/Assets/ML-Agents/Scripts/Grpc/GrpcExtensions.cs

UnitySDK/Assets/ML-Agents/Scripts/InferenceBrain/GeneratorImpl.cs

docs/Python-API.md

chriselion · 2019-11-19T01:27:10Z

ml-agents/mlagents/trainers/tf_policy.py

@@ -172,7 +178,7 @@ def make_empty_memory(self, num_agents):
        :param num_agents: Number of agents.
        :return: Numpy array of zeros.
        """
-        return np.zeros((num_agents, self.m_size))
+        return np.zeros((num_agents, self.m_size), dtype=np.float)


Out of scope for this PR, but should we try to make sure we never create float64 nparrays?
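For context on the dtype question: `np.float` was only an alias for Python's built-in `float`, so the `dtype=np.float` added in the diff still produces a float64 array (the alias was deprecated in NumPy 1.20 and removed in 1.24). The sketch below shows the difference, assuming float32 is the dtype actually wanted. The standalone `make_empty_memory` function is a stand-in for the method in the diff, not the ML-Agents implementation.

```python
import numpy as np


def make_empty_memory(num_agents, m_size, dtype=np.float32):
    """Return a (num_agents, m_size) array of zeros with an explicit dtype.

    Stand-in for the method in the diff; float32 is an assumed default.
    """
    return np.zeros((num_agents, m_size), dtype=dtype)


default_zeros = np.zeros((2, 4))               # NumPy defaults to float64
builtin_float = np.zeros((2, 4), dtype=float)  # Python float maps to float64 too
explicit_f32 = make_empty_memory(2, 4)         # float32 only when requested

print(default_zeros.dtype, builtin_float.dtype, explicit_f32.dtype)
```

Passing the dtype explicitly at every allocation site is one way to keep float64 arrays from appearing by accident.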

chriselion · 2019-11-19T01:31:36Z

ml-agents/mlagents/trainers/tf_policy.py

@@ -194,6 +200,34 @@ def remove_memories(self, agent_ids):
            if agent_id in self.memory_dict:
                self.memory_dict.pop(agent_id)

+    def make_empty_previous_action(self, num_agents):


Is there any difference between how we handle memories and previous actions besides the dtype? Worth combining into a small utility class?

I think there is going to be a bigger refactor of this part coming later. I think it is fine for now. What do you think @ervteng ?

Might be worth combining - but then again, there will only be two things we store between steps. Also, the "when" they're added is different - memories can be added after the get_action, previous_actions must be added before.

I think there is no way around it, this logic will still need to live in the Policy for Python inference to work. The only other place I can think of is the env_manager, but that would be messy.
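The combination suggested in this thread could look roughly like the sketch below. It is hypothetical: `AgentArrayStore` and its method names are invented for illustration and are not part of ML-Agents, and int32 is an assumed dtype for previous actions. It only assumes, as the diff shows for `memory_dict`, that values live in a dict keyed by agent id and default to zeros. The timing difference raised above stays with the caller, which decides whether to call `save` before or after computing an action.

```python
import numpy as np


class AgentArrayStore:
    """Hypothetical per-agent store for fixed-width rows (memories or previous actions)."""

    def __init__(self, width, dtype):
        self.width = width
        self.dtype = dtype
        self._rows = {}  # agent_id -> 1-D array of length `width`

    def make_empty(self, num_agents):
        """Return a (num_agents, width) array of zeros."""
        return np.zeros((num_agents, self.width), dtype=self.dtype)

    def save(self, agent_ids, matrix):
        """Store one row of `matrix` per agent id."""
        for agent_id, row in zip(agent_ids, matrix):
            self._rows[agent_id] = np.asarray(row, dtype=self.dtype)

    def retrieve(self, agent_ids):
        """Return stored rows in the given order; unknown agents get zeros."""
        matrix = self.make_empty(len(agent_ids))
        for index, agent_id in enumerate(agent_ids):
            if agent_id in self._rows:
                matrix[index, :] = self._rows[agent_id]
        return matrix

    def remove(self, agent_ids):
        """Forget agents whose episodes have ended."""
        for agent_id in agent_ids:
            self._rows.pop(agent_id, None)


# The two stores differ only in width and dtype (both values are assumptions).
memories = AgentArrayStore(width=3, dtype=np.float32)
previous_actions = AgentArrayStore(width=2, dtype=np.int32)

previous_actions.save(["agent-0"], np.array([[1, 4]]))
restored = previous_actions.retrieve(["agent-0", "agent-1"])
previous_actions.remove(["agent-0"])
after_removal = previous_actions.retrieve(["agent-0"])
```

With only two kinds of state to carry between steps, the saving is small, which matches the hesitation voiced in the replies.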

chriselion · 2019-11-19T01:36:14Z

Just to clarify, existing models will keep working as before? The only thing that would break when upgrading is existing demo files?

chriselion

Looks good; DRYing up the Python memory/action storage is optional.

vincentpierre · 2019-11-19T01:48:03Z

Models will still work as before but the demo files will not.

ervteng · 2019-11-19T01:58:20Z

> Models will still work as before but the demo files will not.

Will we need to re-record all of them or just the Hallway one?

vincentpierre · 2019-11-19T01:59:29Z

> Will we need to re-record all of them or just the Hallway one?

I reconverted them already

ervteng

vincentpierre added 4 commits November 13, 2019 16:04

Modifying the .proto files

de4bbf6

attempt 1 at refactoring Python

14e8fbd

works for ppo hallway

89b4e33

changing the documentation

9a1c156

vincentpierre self-assigned this Nov 15, 2019

vincentpierre requested a review from ervteng November 15, 2019 18:58

vincentpierre added 11 commits November 15, 2019 22:00

now works with both sac and ppo both training and inference

6f72a05

Need to fix the tests

d113e9d

TODOs :

534f669

- Fix the demonstration recorder
- Fix the demonstration loader
- verify the intrinsic reward signals work
- Fix the tests on Python
- Fix the C# tests

Regenerating the protos

10060ca

fix proto typo

ae497c6

protos and modifying the C# demo recorder

25babf5

modified the demo loader

e1be237

Demos are loading

dd1ffc3

IMPORTANT : THESE ARE THE FILES USED FOR CONVERSION FROM OLD TO NEW F…

b58d9ca

…ORMAT

Modified all the demo files

5db0efe

Fixing all the tests

cb0ad0b

vincentpierre requested a review from chriselion November 19, 2019 00:48

vincentpierre marked this pull request as ready for review November 19, 2019 00:48

vincentpierre added 2 commits November 18, 2019 17:02

Merge branch 'develop' into develop-remove-past-action-communication

3744179

fixing ci

a2cb0b4

chriselion reviewed Nov 19, 2019

UnitySDK/Assets/ML-Agents/Scripts/Grpc/GrpcExtensions.cs (outdated, resolved)

chriselion reviewed Nov 19, 2019

UnitySDK/Assets/ML-Agents/Scripts/InferenceBrain/GeneratorImpl.cs (outdated, resolved)

chriselion reviewed Nov 19, 2019

docs/Python-API.md (resolved)

vincentpierre added 2 commits November 18, 2019 17:18

addressing comments

b0e605b

removing reference to memories in the ll-api

5a96fe9

chriselion reviewed Nov 19, 2019

chriselion approved these changes Nov 19, 2019

ervteng approved these changes Nov 19, 2019

vincentpierre merged commit 4f58e10 into develop Nov 19, 2019

vincentpierre deleted the develop-remove-past-action-communication branch November 19, 2019 02:05

chriselion mentioned this pull request Nov 27, 2019

handle None action outputs #2988

Merged

github-actions bot locked as resolved and limited conversation to collaborators May 17, 2021
