Develop remove past action communication #2913
UnitySDK/Assets/ML-Agents/Scripts/InferenceBrain/GeneratorImpl.cs
@@ -172,7 +178,7 @@ def make_empty_memory(self, num_agents):
         :param num_agents: Number of agents.
         :return: Numpy array of zeros.
         """
-        return np.zeros((num_agents, self.m_size))
+        return np.zeros((num_agents, self.m_size), dtype=np.float)
Out of scope for this PR, but should we try to make sure we never create float64 nparrays?
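To illustrate the float64 concern, here is a minimal sketch. It assumes float32 is the desired storage type, which the thread does not state. The standalone `make_empty_memory` function and its `dtype` parameter are hypothetical, not the method in this PR. Note that `np.float` was only an alias for the built-in `float`, so `dtype=np.float` still produces float64; the alias was deprecated and later removed from NumPy.

```python
import numpy as np


def make_empty_memory(num_agents, m_size, dtype=np.float32):
    """Hypothetical standalone variant that pins the dtype explicitly."""
    return np.zeros((num_agents, m_size), dtype=dtype)


# With no dtype argument, NumPy falls back to float64.
default_array = np.zeros((2, 4))
# The built-in float (what np.float used to alias) also maps to float64.
builtin_float = np.zeros((2, 4), dtype=float)
# Pinning the dtype is the only way to get a narrower array here.
pinned = make_empty_memory(2, 4)

print(default_array.dtype, builtin_float.dtype, pinned.dtype)
```

If float32 is what the model expects, passing the dtype at every allocation site avoids a silent upcast later on.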
@@ -194,6 +200,34 @@ def remove_memories(self, agent_ids):
             if agent_id in self.memory_dict:
                 self.memory_dict.pop(agent_id)
 
+    def make_empty_previous_action(self, num_agents):
Is there any difference between how we handle memories and previous actions besides the dtype? Worth combining into a small utility class?
I think a bigger refactor of this part is coming later, so it is fine for now. What do you think, @ervteng?
Might be worth combining, but then again, there will only be two things we store between steps. Also, when they are added differs: memories can be added after get_action, but previous_actions must be added before.
I think there is no way around it, this logic will still need to live in the Policy for Python inference to work. The only other place I can think of is the env_manager, but that would be messy.
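For reference, the utility class floated above could look roughly like the sketch below. Everything in it is hypothetical: the `PerAgentStore` name, its methods, the row sizes, and the int32 dtype for previous actions are assumptions for illustration, not code from this PR. The ordering difference (previous actions saved before get_action, memories after) stays with the caller.

```python
import numpy as np


class PerAgentStore:
    """Hypothetical helper that keeps one fixed-size row per agent id."""

    def __init__(self, row_size, dtype):
        self.row_size = row_size
        self.dtype = dtype
        self._rows = {}

    def make_empty(self, num_agents):
        # One zero-filled row per agent, in the store's dtype.
        return np.zeros((num_agents, self.row_size), dtype=self.dtype)

    def save(self, agent_ids, matrix):
        if matrix is None:
            return
        for index, agent_id in enumerate(agent_ids):
            # Copy so later changes to the input do not leak into the store.
            self._rows[agent_id] = matrix[index, :].copy()

    def retrieve(self, agent_ids):
        # Agents without a stored row get zeros.
        matrix = self.make_empty(len(agent_ids))
        for index, agent_id in enumerate(agent_ids):
            if agent_id in self._rows:
                matrix[index, :] = self._rows[agent_id]
        return matrix

    def remove(self, agent_ids):
        for agent_id in agent_ids:
            self._rows.pop(agent_id, None)


# Assumed sizes and dtypes, chosen only for the example.
memories = PerAgentStore(row_size=8, dtype=np.float32)
previous_actions = PerAgentStore(row_size=2, dtype=np.int32)

agent_ids = ["agent-0", "agent-1"]
previous_actions.save(agent_ids, np.array([[1, 0], [0, 2]], dtype=np.int32))
previous_actions.remove(["agent-1"])
restored = previous_actions.retrieve(agent_ids)
print(restored)
```

With only two stores, the saving is modest, which matches the hesitation in the thread; the main benefit would be having the dtype fixed in one place.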
Just to clarify, existing models will keep working as before? The only thing that would break when upgrading is existing demo files?
Looks good. Optional: DRY up the Python memory/action storage.
Models will still work as before but the demo files will not.
Will we need to re-record all of them or just the Hallway one?
I reconverted them already