For `CategoricalDQN`, the `average_q` values logged in statistics are always `nan`.

The `average_q` stat is the mean of `self.q_record`:
https://github.com/pfnet/pfrl/blob/d420891573/pfrl/agents/dqn.py#L695

However, `CategoricalDQN` overrides `_compute_loss`, the function responsible for appending Q values to `self.q_record`:
https://github.com/pfnet/pfrl/blob/d420891573/pfrl/agents/categorical_dqn.py#L172-L198

As a result, nothing is appended to the record and the mean is always `nan`.

How to obtain scalar Q values in categorical algorithms is not obvious, since they compute a distribution over Q values rather than a scalar, but I think taking the expected value of the distribution is the natural choice (as `greedy_action` does).
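
To make the proposal concrete, here is a minimal sketch in plain Python (not PFRL code) of collapsing a categorical return distribution into scalar Q values by taking its expectation, then recording one of them. The names `expected_q`, `mean_or_nan`, `z_values`, `atom_probs`, and `taken_action` are hypothetical and chosen for illustration, and recording the Q value of the taken action is an assumption about what the stat should track.

```python
from collections import deque
from statistics import fmean


def expected_q(atom_probs, z_values):
    """Collapse per-action atom probabilities into scalar Q values.

    atom_probs: one list of probabilities per action, each summing to 1.
    z_values: the fixed support (atom values) shared by all actions.
    """
    return [sum(p * z for p, z in zip(probs, z_values)) for probs in atom_probs]


def mean_or_nan(record):
    """Mean of the record, or nan when nothing has been appended."""
    return fmean(record) if record else float("nan")


# Hypothetical stand-in for the agent's q_record.
q_record = deque(maxlen=1000)

z_values = [-1.0, 0.0, 1.0]
atom_probs = [
    [0.5, 0.25, 0.25],  # action 0
    [0.25, 0.25, 0.5],  # action 1
]

# With an empty record the stat is nan, which is the reported symptom.
print(mean_or_nan(q_record))

# Assumption: record the expected Q value of the action that was taken.
taken_action = 1
q_values = expected_q(atom_probs, z_values)
q_record.append(q_values[taken_action])
print(mean_or_nan(q_record))
```

Once a scalar is appended on each loss computation, the mean is well defined. Which scalar to record (taken action, maximum, or something else) is a design choice this sketch does not settle.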