Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorporate SerialEnv and introduct multistep policy logic #26

Merged
merged 26 commits into from
Mar 20, 2024

Conversation

alexander-soare
Copy link
Collaborator

@alexander-soare alexander-soare commented Mar 14, 2024

This PR has two unrelated features but it was more sensible to make sure they work together.

  1. Incorporate SerialEnv. For rollout, this runs the policy in batch mode, but the environment still runs in a sequential batch. For PushT this still provides a 10x speedup on a machine with an RTX 3090 and 16 CPU threads. The goal is to swap this out with ParallelEnv.

  2. Creates a base class for policies thereby enabling a base forward method that handles multi-step policies. This is for correct handling of the step outputs (reward, terminated, truncated etc). For example, see: https://github.com/users/Cadene/projects/1?pane=issue&itemId=56225488

@Cadene
Copy link
Collaborator

Cadene commented Mar 14, 2024

Thanks! Very useful. Dont hesitate to ping me on discord if you need insights to pass the unit tests ;)
Best

@Cadene
Copy link
Collaborator

Cadene commented Mar 15, 2024

In the description could you add a minimal code to run the tests so that we can breakpoint easily to better understand the code
Also, the main issue that we are trying to address here is probably the miscalculation of the reward, no? Maybe we should add this in the description !
Thanks for your contribution ;)

lerobot/scripts/eval.py Outdated Show resolved Hide resolved
lerobot/scripts/eval.py Outdated Show resolved Hide resolved
lerobot/scripts/eval.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@Cadene Cadene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's iterate one more them and let's merge this very cool feature!
Could you add the time that you saved by adding SerialEnv?
Not sure we have time for this, but ideally we should make sure by training a diffusion policy on Pusht that we can reproduce results obtained before this PR and post them in the PR description.
Thanks!

@alexander-soare
Copy link
Collaborator Author

In the description could you add a minimal code to run the tests so that we can breakpoint easily to better understand the code Also, the main issue that we are trying to address here is probably the miscalculation of the reward, no? Maybe we should add this in the description ! Thanks for your contribution ;)

You mean the PR description? I'm confused, as the way to run the tests is the same as for all other tests.

I'll link the relevant issue.

@alexander-soare alexander-soare changed the title [WIP]: Incorporate SerialEnv and introduct multistep policy logic Incorporate SerialEnv and introduct multistep policy logic Mar 19, 2024
Copy link
Collaborator

@Cadene Cadene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments ;) Almost ready to merge

lerobot/common/envs/aloha/env.py Outdated Show resolved Hide resolved
lerobot/common/envs/pusht/env.py Show resolved Hide resolved
lerobot/common/policies/abstract.py Show resolved Hide resolved
lerobot/common/policies/abstract.py Show resolved Hide resolved
lerobot/common/policies/abstract.py Show resolved Hide resolved
lerobot/scripts/eval.py Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
tests/test_policies.py Show resolved Hide resolved
@alexander-soare
Copy link
Collaborator Author

@Cadene bty. Can you please check the previously unresolved comments as well?

@alexander-soare
Copy link
Collaborator Author

@Cadene 🏓. I removed abstractmethods as you suggested. Also, please take a look at what I did re. online training. There's definitely a way to do it properly but I'm short-circuiting for now to get this PR through.

@Cadene Cadene merged commit ec536ef into main Mar 20, 2024
1 check passed
@aliberts aliberts deleted the user/alexander-soare/multistep_policy_and_serial_env branch April 27, 2024 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants