Logged average_q is NaN for CategoricalDQN.get_statistics() #31

Closed
keisuke-nakata opened this issue Jul 30, 2020 · 0 comments · Fixed by #32

Comments

@keisuke-nakata
Member

For CategoricalDQN, logged average_q values in statistics are always nan.

The average_q statistic is the mean of self.q_record.
https://github.com/pfnet/pfrl/blob/d420891573/pfrl/agents/dqn.py#L695

However, CategoricalDQN overrides the _compute_loss function, which is responsible for appending Q values to self.q_record:
https://github.com/pfnet/pfrl/blob/d420891573/pfrl/agents/categorical_dqn.py#L172-L198
The override does not append anything, so self.q_record stays empty and its mean is always nan.
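
The mechanism can be reproduced in isolation. The following is a minimal sketch, not PFRL code: `BaseAgent`, `OverridingAgent`, and `mean_or_nan` are hypothetical stand-ins, and the loss formulas are placeholders. It only assumes that the statistic is the mean of a record that the base `_compute_loss` fills as a side effect.

```python
import collections
import math


def mean_or_nan(values):
    """Return the mean of values, or NaN when there are none (hypothetical helper)."""
    return sum(values) / len(values) if values else math.nan


class BaseAgent:
    """Hypothetical stand-in for an agent that logs Q values while computing the loss."""

    def __init__(self):
        self.q_record = collections.deque(maxlen=1000)

    def _compute_loss(self, q_values):
        # The base implementation records Q values as a side effect.
        self.q_record.extend(q_values)
        return sum(q * q for q in q_values)  # placeholder loss

    def get_statistics(self):
        return [("average_q", mean_or_nan(self.q_record))]


class OverridingAgent(BaseAgent):
    """Hypothetical subclass whose override does not record Q values."""

    def _compute_loss(self, q_values):
        # No self.q_record.extend(...) here, so the record stays empty.
        return sum(abs(q) for q in q_values)  # placeholder loss


base = BaseAgent()
base._compute_loss([1.0, 3.0])
broken = OverridingAgent()
broken._compute_loss([1.0, 3.0])

base_avg = dict(base.get_statistics())["average_q"]
broken_avg = dict(broken.get_statistics())["average_q"]
```

In this sketch `base_avg` is the mean of the recorded values, while `broken_avg` is NaN because the overriding method never touches `q_record`.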

How to obtain scalar Q values in categorical algorithms is not obvious, since they compute a distribution of Q values instead of a scalar. I think taking the expectation of that distribution is the natural choice, as greedy_action does.
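
To illustrate the proposal, here is a minimal sketch of reducing a categorical distribution to a scalar Q value by taking its expectation. It uses plain Python rather than PFRL or PyTorch; `expected_q_values`, `atoms`, and `action_probs` are hypothetical names, and the numbers are made up for the example.

```python
def expected_q_values(action_probs, atoms):
    """Compute Q(a) = sum_i p_i(a) * z_i for each action a.

    action_probs: one probability vector per action, over the atoms.
    atoms: the fixed support values z_i of the distribution.
    """
    return [sum(p * z for p, z in zip(probs, atoms)) for probs in action_probs]


# Three atoms and two actions (illustrative values only).
atoms = [-1.0, 0.0, 1.0]
action_probs = [
    [0.5, 0.25, 0.25],  # action 0 puts more mass on the low atom
    [0.25, 0.25, 0.5],  # action 1 puts more mass on the high atom
]

q_values = expected_q_values(action_probs, atoms)

# The greedy action maximizes the expected value.
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Scalars obtained this way for the actions taken in a batch could then be appended to the Q record, so that the logged average is on the same scale as the values used for greedy action selection.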
