
About the input of policy #9

Open
Liuxueyi opened this issue May 6, 2024 · 2 comments

Comments

Liuxueyi commented May 6, 2024

The paper states:

> leverage the output state policy in nonmemory environments and the full state policy or hidden state policy within memory environments.

But in configs.yaml, actor.inputs is set to [stoch, hidden] only for the POPGym task. Why? Looking forward to your reply.
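For context, a minimal sketch of how such per-task overrides typically look in a DreamerV3-style configs.yaml. Only the actor.inputs values come from the thread; all other key names and the override structure are assumptions about this repo's config layout:

```yaml
# Hypothetical configs.yaml fragment (key layout assumed, values from the thread)
defaults:
  actor:
    inputs: [deter, stoch, hidden]   # full state policy (default)

popgym:
  actor:
    inputs: [stoch, hidden]          # hidden state policy for memory environments
```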

artemZholus (Collaborator) commented

Hi @Liuxueyi .

The sentence you quoted is a general recommendation for working with new environments. Another part of the paper states that for POPGym we use the hidden state policy, so this config reflects that.

Liuxueyi (Author) commented May 7, 2024

Thanks for your reply!
By the way, for the atari100k environment and the Pong task, what should the policy input be (the default is [deter, stoch, hidden])? Also, regarding issue #8: when the parameter run.script: train is set, there is no error. Is this feasible, and will it affect the final performance of the algorithm?
