
Modifies loss storage in DDPG, TD3, and Soft Actor Critic #195

Merged: 11 commits into pfnet:master on Aug 4, 2024

Conversation

@prabhatnagarajan (Contributor) commented Apr 13, 2024

As per this discussion:

#195 (comment)

I have replaced many of the variable.detach().cpu().numpy() calls with variable.item() in our DDPG, TD3, and SAC family of agents.

I ran the training scripts for these agents briefly and checked scores.txt; the values are indeed being written without issues.
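In code, the change looks roughly like this (a minimal sketch; loss_record and the loss value are illustrative stand-ins, not the actual agent code):

```python
import torch

loss_record = []
critic_loss = torch.tensor(0.25)  # stand-in for a loss computed during an update

# Before: detach from the graph, move to CPU, convert to a scalar ndarray
loss_record.append(critic_loss.detach().cpu().numpy())

# After: .item() returns a plain Python float directly
loss_record.append(critic_loss.item())
```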

@github-actions bot requested a review from muupan on April 13, 2024
@prabhatnagarajan (Contributor, Author)

Friendly reminder on this :)

@muupan (Member) commented Apr 29, 2024

The changes will probably not break anything, but can you elaborate on why you suggest them? If it is just for consistency, I think it is better to force recorded losses to be floats, not scalar ndarrays, for simplicity.

@prabhatnagarajan (Contributor, Author) commented Apr 29, 2024

To fully justify it, I would need more data; I did it mainly for consistency. That said, it may require less memory in some instances (some informal experiments showed slight improvements in memory usage, though I can't be sure without a more thorough experiment).

> If it is just for consistency, I think it is better to force recorded losses to be floats, not scalar ndarrays, for simplicity.

To be clear, we do indeed force them to be floats, right? We simply do loss.detach().cpu().numpy() before casting them to floats.

Do you remember why we did detach().cpu().numpy() originally?

@muupan (Member) commented Apr 29, 2024

> To be clear, we do indeed force them to be floats, right? We simply do loss.detach().cpu().numpy() before casting them to floats.

Ah, sorry, I missed the float() around it. Right, both eventually cast losses to floats, so they should function in the same way. Directly casting to float could be more efficient since it could skip creating a numpy.ndarray, but I don't know the implementation details of torch.Tensor.__float__.
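All three routes should produce the same Python float; a quick sketch to illustrate the comparison:

```python
import torch

loss = torch.tensor(1.5, requires_grad=True)

a = float(loss.detach().cpu().numpy())  # the original route, via a numpy scalar
b = float(loss)                         # direct cast via Tensor.__float__
c = loss.item()                         # extract the value as a Python number
assert a == b == c and isinstance(c, float)
```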

> Do you remember why we did detach().cpu().numpy() originally?

IIRC I was just unaware that it is possible to directly cast a torch.Tensor to float. After I learned it is safe to do so (pytorch/pytorch#1129), I switched.

I think it is better to choose the simpler way unless it has a memory leak issue. There seems to be yet another way of doing this, Tensor.item(), which is perhaps now recommended by PyTorch:
https://pytorch.org/docs/stable/tensors.html#initializing-and-basic-operations

> Use torch.Tensor.item() to get a Python number from a tensor containing a single value.
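For a single-element tensor, that looks like:

```python
>>> import torch
>>> x = torch.tensor([[1.0]])
>>> x.item()
1.0
```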

@prabhatnagarajan (Contributor, Author) commented Apr 29, 2024

Oh, thanks!

Yes, tensor.item() seems like exactly what we want. The documentation also states: "This operation is not differentiable."

This tells me that it implicitly detaches from the computation graph (which is apparently what we do when we call detach()).

Perhaps I can replace instances of detach().cpu().numpy() with tensor.item(), then check the tests and maybe some runs?
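For instance, item() works even on a tensor that is part of the autograd graph, so no explicit detach() is needed (a minimal sketch):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
loss = (w * 3) ** 2        # scalar tensor attached to the autograd graph
value = loss.item()        # no detach() required; returns a plain Python float
print(value, type(value))  # 36.0 <class 'float'>
```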

@muupan (Member) commented Apr 29, 2024

> Perhaps I can replace instances of detach().cpu().numpy() with tensor.item(), then check the tests and maybe some runs?

Sounds good. I think it is OK to write something like self.q_func1_loss_record.append(loss1.item()), since torch.zeros(1, dtype=torch.float32).item() is already a float.
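Applied to an agent, the resulting pattern would look something like this (a sketch; the deque setup and the loss value are illustrative stand-ins for the agents' actual code):

```python
from collections import deque

import torch

q_func1_loss_record = deque(maxlen=100)

loss1 = torch.zeros(1, dtype=torch.float32)  # stand-in for a computed critic loss
q_func1_loss_record.append(loss1.item())     # stores a plain Python float
assert isinstance(q_func1_loss_record[-1], float)
```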

@prabhatnagarajan changed the title from "Modifies loss storage in TD3 and Soft Actor Critic" to "Modifies loss storage in DDPG, TD3, and Soft Actor Critic" on Jul 8, 2024
@muupan (Member) commented Aug 4, 2024

/test

@pfn-ci-bot

Successfully created a job for commit 41c0e92.

@muupan (Member) left a review

LGTM. Sorry for the delay.

@muupan merged commit c8cb332 into pfnet:master on Aug 4, 2024 (6 of 7 checks passed)
@prabhatnagarajan deleted the loss_storage branch on August 4, 2024