Feature/wrapper native sb3 policy for On-Policy #143
… resetting the environment twice, since it will be reset in the Training class
This PR covers On-Policy only. Off-Policy will be handled in a separate PR; issue #144 has already been created for it and is still a work in progress. A unit test is also included that checks the policy loss: if there is no difference in policy loss, the result is considered OK.
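The policy-loss check described above could be sketched roughly as follows. This is not the unit test from this PR: the helper name `policy_losses_match`, the tolerances, and the loss values are all hypothetical placeholders, and how the loss values are collected from the native and wrapped runs is left out.

```python
import math
from typing import Sequence


def policy_losses_match(
    native: Sequence[float],
    wrapped: Sequence[float],
    rel_tol: float = 1e-6,  # assumed tolerance, not taken from the PR
    abs_tol: float = 1e-8,  # assumed tolerance, not taken from the PR
) -> bool:
    """Return True if both runs logged the same policy loss at every update."""
    # A different number of logged updates already counts as a mismatch.
    if len(native) != len(wrapped):
        return False
    # Compare update by update, allowing for floating-point noise.
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(native, wrapped)
    )


# Placeholder values for illustration only; these are not measured results.
native_losses = [0.0123, -0.0045, 0.0007]
wrapped_losses = [0.0123, -0.0045, 0.0007]
assert policy_losses_match(native_losses, wrapped_losses)
```

For such a comparison to be meaningful, both runs would presumably need the same seed, environment, and hyperparameters, so that any remaining difference in policy loss points at the wrapper itself.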
Very cool stuff! Thank you for your commitment!
Fix the mismatch between the Native and Wrapper SB3 Policy.
Create a how-to for doing the comparison.