Logged average_q is NaN for CategoricalDQN.get_statistics() #31

keisuke-nakata · 2020-07-30T07:19:25Z

For CategoricalDQN, the average_q value logged in the statistics is always NaN.

The average_q statistic is the mean of self.q_record:
https://github.com/pfnet/pfrl/blob/d420891573/pfrl/agents/dqn.py#L695

However, CategoricalDQN overrides the _compute_loss function, which is responsible for appending Q values to self.q_record:
https://github.com/pfnet/pfrl/blob/d420891573/pfrl/agents/categorical_dqn.py#L172-L198
The override never appends anything to self.q_record, so the record stays empty and the mean is always NaN.
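The failure mode can be reproduced in isolation. The following is a minimal sketch using toy classes, not pfrl's actual code: `BaseAgent`, `OverridingAgent`, and the loss formulas are hypothetical stand-ins, and only the names `q_record`, `_compute_loss`, and `get_statistics` are taken from the issue. It assumes the statistic falls back to NaN when the record is empty.

```python
import collections
import math


class BaseAgent:
    """Toy stand-in for a DQN-style agent (hypothetical, not pfrl's code)."""

    def __init__(self):
        self.q_record = collections.deque(maxlen=1000)

    def _compute_loss(self, q_values):
        # The base implementation records Q values as a side effect.
        self.q_record.extend(q_values)
        return sum(q * q for q in q_values)

    def get_statistics(self):
        # An empty record has no mean, so report NaN (assumed behavior).
        if not self.q_record:
            return [("average_q", math.nan)]
        return [("average_q", sum(self.q_record) / len(self.q_record))]


class OverridingAgent(BaseAgent):
    def _compute_loss(self, q_values):
        # The override computes its own loss but never touches q_record.
        return sum(abs(q) for q in q_values)


base, overriding = BaseAgent(), OverridingAgent()
base._compute_loss([1.0, 3.0])
overriding._compute_loss([1.0, 3.0])

base_avg = dict(base.get_statistics())["average_q"]
overriding_avg = dict(overriding.get_statistics())["average_q"]
print(base_avg, overriding_avg)
```

In this sketch the base agent reports a finite average while the overriding agent reports NaN, because the side effect lives only in the method that was replaced.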

How to obtain scalar Q values in categorical algorithms is not obvious, since they compute a distribution of Q values instead of a scalar. I think taking the expectation of the distribution is the natural choice, as greedy_action does.
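As an illustration of taking the expectation, here is a small sketch in plain Python. The function `expected_q_values`, the atom values `z_values`, and the probabilities are made up for the example and are not pfrl's API. It assumes each action has a categorical distribution over a fixed set of return atoms.

```python
def expected_q_values(probs, z_values):
    """Return one scalar Q value per action.

    probs[a][i] is the probability of atom i for action a;
    z_values[i] is the return value that atom i represents.
    """
    return [
        sum(p * z for p, z in zip(action_probs, z_values))
        for action_probs in probs
    ]


# Three atoms and two actions (illustrative numbers only).
z_values = [-1.0, 0.0, 1.0]
probs = [
    [0.5, 0.5, 0.0],  # action 0: mass on the lower atoms
    [0.0, 0.5, 0.5],  # action 1: mass on the higher atoms
]

q_values = expected_q_values(probs, z_values)
# The greedy action is the one with the largest expected value.
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)
print(q_values, greedy_action)
```

Scalars computed this way could then be appended to a record such as self.q_record so that the logged mean is well defined.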


keisuke-nakata mentioned this issue Jul 30, 2020

calculate scalar q_values to log average_q statistics in categorical dqn algorithms #32

Merged

muupan closed this as completed in #32 Sep 9, 2020
