[RLlib; Offline RL] CQL: Support multi-GPU/CPU setup and different learning rates for actor, critic, and alpha. #47402
Commits on Aug 22, 2024
- 8923dc2: Fixed a bug in SAC/CQL due to twin-Q parameters in backward pass. Started to rewrite CQL loss. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Commits on Aug 28, 2024
- 75d91ce: Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Commits on Aug 29, 2024
- 331f0ef: Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
- fe79e3d
- ecbd588: Moved all forward passes from learner to module and added learning rates for actor, critic, and alpha. Multi-learner setups work. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
- 2be80fd
- cb9ecbe: Removed all 'deterministic_loss' uses. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Commits on Aug 30, 2024
- 54b5d2d: Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
- e6b3769
- 62d6dfb: Removed another forward pass through the Q-network and used the already sampled log-probabilities. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
- edbe263: Multiple smaller modifications to CQL algorithm following the original one published by Kumar et al. (2020). Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
- 100aabe: Adapted hyperparameters after tuning the example. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>