The cause appears to be TD3's target policy smoothing: the action noise wrapper it relies on does not support the hybrid action space. As a workaround, set noise=False in the policy configuration, matching the reference DDPG config. This disables target policy smoothing.
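The workaround could look roughly like the sketch below. The dict is a hypothetical stand-in for a TD3 policy config, not the actual config from this issue: placing `noise` under `learn` and the `action_space='hybrid'` key are assumptions, and `disable_target_smoothing` is an illustrative helper rather than a library function. Check your own config for where `noise` actually lives.

```python
import copy

# Hypothetical stand-in for a TD3 policy config. The nesting of `noise`
# under `learn` and the `action_space` key are assumptions.
td3_policy_cfg = dict(
    action_space='hybrid',
    learn=dict(
        noise=True,  # target policy smoothing enabled
    ),
)


def disable_target_smoothing(policy_cfg):
    """Return a copy of the config with target policy smoothing turned off."""
    cfg = copy.deepcopy(policy_cfg)  # leave the caller's config untouched
    cfg.setdefault('learn', {})['noise'] = False
    return cfg


patched_cfg = disable_target_smoothing(td3_policy_cfg)
print(patched_cfg['learn']['noise'])
```

Note that with noise=False the target actions are no longer smoothed, so training uses two of TD3's three usual components (clipped double-Q learning and delayed policy updates) rather than all three.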
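When training on a hybrid action space environment with TD3, the run fails with assert isinstance(action, torch.Tensor). Looking at the source code, I found that the return value of HybridArgmaxSampleWrapper's forward could indeed trigger this error. How should I resolve this?
The code is as follows: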