feature(zjow): add wandb logger features; fix relative bugs for wandb online logger #579
Codecov Report
@@            Coverage Diff             @@
##             main     #579      +/-   ##
==========================================
- Coverage   83.34%   82.96%   -0.39%
==========================================
  Files         569      570       +1
  Lines       46819    47013     +194
==========================================
- Hits        39022    39004      -18
- Misses       7797     8009     +212
... and 8 files with indirect coverage changes
@@ -1,6 +1,7 @@
 from typing import TYPE_CHECKING, Callable, List, Tuple, Any
 from easydict import EasyDict
 from functools import reduce
+import numpy as np
remove this
fixed
@@ -257,7 +257,7 @@ def _evaluate(ctx: Union["OnlineRLContext", "OfflineRLContext"]):
 eval_monitor.update_video(env.ready_imgs)
 eval_monitor.update_output(inference_output)
 output = [v for v in inference_output.values()]
-action = [to_ndarray(v['action']) for v in output] # TBD
+action = np.array([to_ndarray(v['action']) for v in output]) # TBD
the same problem
fixed
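For context, the hunk in this thread proposed wrapping the per-environment actions in `np.array`. The sketch below only illustrates what that wrapping does to the data layout; it is not the merged code. The `inference_output` dictionary is hypothetical, and `np.asarray` is used as a stand-in for the project's `to_ndarray` helper, whose exact behavior is assumed here.

```python
import numpy as np

# Hypothetical per-environment policy output, keyed by environment id.
inference_output = {
    0: {"action": [0.1, -0.2]},
    1: {"action": [0.3, 0.4]},
    2: {"action": [-0.5, 0.6]},
}

output = list(inference_output.values())

# Original line: a Python list holding one array per environment.
# np.asarray is an assumed stand-in for to_ndarray.
action_list = [np.asarray(v["action"]) for v in output]

# Proposed line: one stacked array whose first axis is the environment index.
action_batch = np.array(action_list)

print(type(action_list).__name__, len(action_list))  # list 3
print(action_batch.shape)                            # (3, 2)
```

Stacking only yields a regular array when every environment returns an action of the same shape, which is one reason such a change may be better handled elsewhere.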
ding/policy/td3.py
Outdated
def monitor_vars(self) -> List[str]:
    variables = ["q_value", "target q_value", "loss", "lr", "entropy", "target_q_value", "td_error"]
    return variables
directly return
fixed
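A minimal sketch of what the "directly return" suggestion presumably amounts to: returning the list literal without the intermediate `variables` name. `ExamplePolicy` is a hypothetical stand-in for the policy class in `ding/policy/td3.py`, and the list is copied as-is from the hunk.

```python
from typing import List


class ExamplePolicy:
    """Hypothetical stand-in for the policy class; only monitor_vars is shown."""

    def monitor_vars(self) -> List[str]:
        # Return the list literal directly instead of binding it to a local name first.
        return ["q_value", "target q_value", "loss", "lr", "entropy", "target_q_value", "td_error"]


print(len(ExamplePolicy().monitor_vars()))  # 7
```

Note that the list as quoted contains both `target q_value` and `target_q_value`; whether both entries are intended is not stated in the thread.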
ding/bonus/td3.py
Outdated
wandb_url: str


class TD3:
rename to OffPolicyAgent
fixed
Description
Add and fix wandb logger features for rendering videos and logging information during training; tested with the TD3, DDPG, and SAC algorithms.
Fix related bugs in the wandb online logger.
Copy the changes to the wandb offline logger.