Modifies loss storage in DDPG, TD3, and Soft Actor Critic #195
Conversation
Friendly reminder on this :)
The changes will probably not break anything, but can you elaborate on why you suggest them? If it is just for consistency, I think it is better to force recorded losses to be floats, not scalar ndarrays, for simplicity.
I think I need more data to fully justify it; I made the change mainly for consistency. That said, it may also reduce memory usage in some cases: some informal experiments showed slight improvements, though I can't be sure without a more thorough experiment.
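A rough, illustrative way to check the memory point (not part of this PR) is to compare the per-entry footprint of a Python float against a 0-d scalar ndarray, which is what the two storage styles keep around:

```python
# Rough comparison (not from the PR): per-entry footprint of a Python float
# vs. a 0-d numpy scalar array.
import sys
import numpy as np

as_float = 1.2345                 # what float(loss) / loss.item() would store
as_ndarray = np.asarray(1.2345)   # what loss.detach().cpu().numpy() would store

print(sys.getsizeof(as_float))    # size of the plain Python float object
print(sys.getsizeof(as_ndarray))  # size of the 0-d ndarray wrapper alone
```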
To be clear, we do indeed force them to be floats, right? We simply wrap the recorded value in float(...). Do you remember why we went through .detach().cpu().numpy() before casting?
Ah, sorry, I missed this.
IIRC I was just unaware that it is possible to directly cast a torch.Tensor to float. After I learned it is safe to cast (pytorch/pytorch#1129), I switched. I think it is better to choose the simpler way unless it has a memory leak issue. There also seems to be yet another way of doing this: torch.Tensor.item().
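For illustration (not from the PR itself), a minimal sketch of the three conversions, assuming a scalar loss tensor:

```python
import torch

loss = torch.tensor(0.5, requires_grad=True) * 2  # stand-in for a scalar loss

a = float(loss.detach().cpu().numpy())  # older, roundabout conversion
b = float(loss)                         # direct cast; safe for scalar tensors
c = loss.item()                         # equivalent single-call conversion

assert a == b == c and isinstance(c, float)
```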
Oh, thanks! Yes, this tells me that .item() implicitly detaches from the computation graph (which is apparently what we already do when we call detach()). Perhaps I can replace those instances with .item().
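As a quick sanity check (illustrative only), .item() returns a plain Python number and leaves the graph intact:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
loss = (x * x).sum()

v = loss.item()        # plain Python float, carries no grad_fn / graph reference
print(type(v), v)      # <class 'float'> 9.0
loss.backward()        # the graph is still intact after calling .item()
print(x.grad)          # tensor(6.)
```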
Sounds good. I think it is OK to write it that way.
/test
Successfully created a job for commit 41c0e92: |
LGTM. Sorry for the delay.
As per this discussion: #195 (comment)

I have switched many of the variable.detach().cpu().numpy() calls to variable.item() in our DDPG/TD3/SAC family of agents. I have run the training scripts for these agents for a bit and checked scores.txt; the values are indeed being written to it without issues.