Allow NaNs in observations #4728

RedTachyon · 2020-12-09T17:51:25Z

Is your feature request related to a problem? Please describe.

I'm implementing a custom sensor that can have a variable number of observations. As mentioned in #4686, this might be supported somewhere down the line, but at the moment I'm handling it by setting a maximum observation count and padding the array with NaNs. That way I'm sure to catch the padding on the Python side: if I forget to handle it, I'll instantly get NaNs everywhere, which will be a pretty good alert.

Unfortunately, that's not possible: the UnityEnvironment crashes the moment it sees any NaN (or inf).

Describe the solution you'd like

While Unity handles it well by itself, in mlagents_envs/rpc_utils.py:219 there's an explicit check for NaNs/infs in observations.

Since for most situations it's probably a good thing to avoid NaNs, I think the ideal solution would be adding a setting somewhere that can turn it into a warning. Perhaps it could be set with a global context, similar to functions like np.random.seed()?

But threading that setting from the environment down to the checking function might be too cumbersome, so simply changing the error to a warning might still be alright: it still lets you know something happened, but allows the Python side to handle it by itself.
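
To make the request concrete, here is a minimal sketch of a finiteness check that can either raise or warn. It is not the actual code in mlagents_envs/rpc_utils.py; the function name `check_observation` and the `strict` parameter are hypothetical and only illustrate the idea.

```python
import warnings

import numpy as np


def check_observation(obs: np.ndarray, strict: bool = True) -> bool:
    """Return True if every value in obs is finite.

    Hypothetical helper, not part of ML-Agents. With strict=True a NaN or
    inf raises ValueError; with strict=False it only emits a warning and
    returns False so the caller can handle the values itself.
    """
    if bool(np.isfinite(obs).all()):
        return True
    message = "Observation contains NaN or infinite values"
    if strict:
        raise ValueError(message)
    warnings.warn(message, RuntimeWarning)
    return False
```

With `strict=False`, the padded observation would reach the user's own code, which could then strip or replace the NaN entries.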

Describe alternatives you've considered
Since my observation space is bounded, I'll probably try to use some very high value or float.MaxValue and treat that as a NaN. Not ideal, but should work.
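
A rough sketch of that workaround, assuming float32 observations whose real values never reach the largest float32. In practice the padding would happen in the C# sensor; it is mirrored in Python here only to keep the example in one language, and `SENTINEL`, `pad_with_sentinel`, and `strip_sentinel` are hypothetical names.

```python
import numpy as np

# Hypothetical padding value: the largest finite float32, which passes a
# NaN/inf check but lies far outside a bounded observation space.
SENTINEL = np.finfo(np.float32).max


def pad_with_sentinel(values, max_count: int) -> np.ndarray:
    """Pad a variable-length list of observations up to max_count entries."""
    if len(values) > max_count:
        raise ValueError("more observations than max_count")
    padded = np.full(max_count, SENTINEL, dtype=np.float32)
    padded[: len(values)] = values
    return padded


def strip_sentinel(obs: np.ndarray) -> np.ndarray:
    """Recover the real observations by dropping the padding entries."""
    return obs[obs != SENTINEL]
```

The drawback is the one already noted: unlike NaN, a forgotten sentinel does not propagate loudly, so it can leak into training unnoticed.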

Additional context


awjuliani · 2020-12-09T18:21:23Z

Hi @RedTachyon

Thanks for the request. This is an interesting use-case that you bring up. As you mention, we currently check for NaN and Inf because these are typically useless values that will ruin a training run. While I understand the thinking behind the requested change, there are likely better ways to flag observations as being ignorable than sending NaNs. Having a pre-set value to correspond to ignorable values is one. Another would be to send a separate observation vector of masking values in addition to the observations themselves.
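
The masking alternative could look roughly like the sketch below. It assumes the mask is sent as a second fixed-size vector alongside the padded observations; `pad_with_mask` and `masked_mean` are hypothetical helpers, not ML-Agents functions.

```python
import numpy as np


def pad_with_mask(values, max_count: int):
    """Return (observations, mask), both of length max_count.

    mask is 1.0 where the observation is real and 0.0 where it is padding.
    """
    if len(values) > max_count:
        raise ValueError("more observations than max_count")
    obs = np.zeros(max_count, dtype=np.float32)
    mask = np.zeros(max_count, dtype=np.float32)
    obs[: len(values)] = values
    mask[: len(values)] = 1.0
    return obs, mask


def masked_mean(obs: np.ndarray, mask: np.ndarray) -> float:
    """Average only the real entries, ignoring the padding."""
    count = mask.sum()
    if count == 0:
        return 0.0
    return float((obs * mask).sum() / count)
```

Because the padding is a finite zero, this passes the existing NaN/Inf check, and the mask tells the Python side exactly which entries to ignore.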

robinerd · 2020-12-13T15:30:34Z

Hi! Out of curiosity, how do you even handle your variable observation count on the neural network side of things? The only way I'm aware of is some kind of recurrent sequence encoder. I suppose you have implemented a custom neural network layout in Python?

chriselion · 2021-04-19T17:45:51Z

@RedTachyon The variable length observation feature was added in the previous release.

@robinerd There is some information about the implementation in
https://github.com/Unity-Technologies/ml-agents/blob/release_16_docs/docs/ML-Agents-Overview.md#learning-from-variable-length-observations-using-attention
and
https://github.com/Unity-Technologies/ml-agents/blob/release_16_docs/docs/Learning-Environment-Design-Agents.md#variable-length-observations

github-actions · 2021-05-19T20:02:21Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

RedTachyon added the request Issue contains a feature request. label Dec 9, 2020

awjuliani self-assigned this Dec 9, 2020

dongruoping assigned chriselion and unassigned awjuliani Apr 15, 2021

chriselion closed this as completed Apr 19, 2021

github-actions bot locked as resolved and limited conversation to collaborators May 19, 2021
