[Bug Report] Possible resolution for random seed setting and non-deterministic training #904

Closed
1 of 4 tasks
hojae-io opened this issue Aug 30, 2024 · 4 comments · Fixed by #940
Labels
bug Something isn't working

Comments

@hojae-io

hojae-io commented Aug 30, 2024

Describe the bug

Training results, such as the reward curve, differ from run to run even when a random seed is set manually (for instance, seed=42).

Several similar issues have been submitted:
#489
#275

Steps to reproduce

Run a training script such as IsaacLab/source/standalone/workflows/rsl_rl/train.py several times with the same seed,
or see the issues linked above for other ways to reproduce the problem.

System Info

Describe the characteristic of your environment:

  • Commit: [e.g. 8f3b9ca]
  • Isaac Sim Version: 2024.4.1
  • OS: Ubuntu 22.04
  • GPU: GeForce RTX 3090
  • CUDA: 11.2
  • GPU Driver: 550

Resolution

Here is my resolution. With it, I no longer see any non-deterministic / stochastic behavior, and the reward curves overlap exactly when I train the same code multiple times.

The problem comes from setting the seed after the environment is created:
image

You can see that the seed is set at line 118, which is after the env is created at line 90.

If you instead set the seed before the env is created (lines 92-102 in the image below), the behavior becomes deterministic.
I don't know why the point at which the seed is set matters, or how it can cause non-deterministic behavior.

It would be nice if you could provide an explanation and include a fix for this bug in your next pull request.

image
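
Since the screenshots are not reproduced here, the ordering can be sketched roughly as follows. This is a minimal sketch, not the actual train.py: `seed_everything` and `make_env` are hypothetical names, and the helper only seeds the RNGs it can import (Python's `random`, plus NumPy and PyTorch if they are installed).

```python
import random


def seed_everything(seed: int) -> int:
    """Hypothetical helper: seed every RNG this process is likely to use."""
    random.seed(seed)
    try:
        import numpy as np  # optional, seeded only if installed
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional, seeded only if installed
        torch.manual_seed(seed)
    except ImportError:
        pass
    return seed


def main(make_env, seed: int = 42):
    """`make_env` is a placeholder for whatever constructs the environment."""
    seed_everything(seed)  # 1) seed first ...
    env = make_env()       # 2) ... then create the env
    return env
```

The only point of the sketch is the order of the two calls in `main`: anything sampled while the env is being constructed then draws from an already seeded RNG.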

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have checked that the issue is not in running Isaac Sim itself and is related to the repo

Acceptance Criteria

  • The determinism issue is resolved (potentially with the above fix)
  • There are tests that ensure determinism
@pascal-roth
Collaborator

Hi,
Thanks for looking into this issue. I agree; it is strange that the moment at which the seed is set makes a difference. Could you make a PR with the necessary changes and, ideally, a test to prove that the fix is effective?

@Mayankm96
Contributor

Thanks @hojae-io for digging deeper into this issue. This is a really great find and I may have an explanation.

I suspect that some random events happen at initialization, for instance terrain generation (a lot of random sampling there) and possibly the initialization of the PhysX solver and its internal buffers (not sure). It would make sense that setting the seed beforehand ensures that the randomness from these sources is also controlled by the fixed seed.
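
To illustrate the suspected mechanism, here is a toy sketch using only Python's `random` module. `ToyEnv` is a hypothetical stand-in, not Isaac Lab code; it samples a "terrain" in its constructor, the way terrain generation is assumed to sample at initialization.

```python
import random


class ToyEnv:
    """Hypothetical env that draws random numbers while being constructed."""

    def __init__(self, num_tiles: int = 4):
        # Construction-time sampling, standing in for terrain generation.
        self.terrain = [random.random() for _ in range(num_tiles)]

    def step(self) -> float:
        # The reward depends on the terrain plus fresh per-step noise.
        return sum(self.terrain) + random.random()


def run_seed_after(seed: int):
    random.seed()      # emulate a fresh process with an arbitrary RNG state
    env = ToyEnv()     # terrain is sampled before the seed is applied
    random.seed(seed)  # too late: only the per-step noise is reproducible
    return env.terrain, [env.step() for _ in range(3)]


def run_seed_before(seed: int):
    random.seed(seed)  # seed first, so construction is covered as well
    env = ToyEnv()
    return env.terrain, [env.step() for _ in range(3)]
```

Two calls to `run_seed_before(42)` return identical terrain and rewards, while two calls to `run_seed_after(42)` start from different terrain, so the rewards diverge even though the seed is "set".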

It definitely makes sense to fix this issue. Instead of modifying the play/train scripts, we should set the seed as the first operation when the environment class is constructed. We can add the seed to the configuration of the environment.
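
A rough sketch of that design, with hypothetical names (`ToyEnvCfg`, `SeededEnv`) rather than the actual Isaac Lab classes: the config carries an optional `seed`, and the constructor applies it before anything else can sample.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyEnvCfg:
    """Hypothetical config; only the idea of a `seed` field is from the proposal."""

    seed: Optional[int] = None  # None means "leave the RNG state untouched"
    num_tiles: int = 4


class SeededEnv:
    """Hypothetical env that seeds as the first operation of construction."""

    def __init__(self, cfg: ToyEnvCfg):
        self.cfg = cfg
        # First operation: apply the seed before any sampling happens.
        if cfg.seed is not None:
            random.seed(cfg.seed)
        # Everything below (standing in for terrain generation) is now seeded.
        self.terrain = [random.random() for _ in range(cfg.num_tiles)]
```

With this layout the train/play scripts do not need to care about ordering; they only pass the seed along in the config, e.g. `SeededEnv(ToyEnvCfg(seed=42))`.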

@Mayankm96 Mayankm96 added the bug Something isn't working label Aug 30, 2024
@Mayankm96 Mayankm96 changed the title [Bug Report] Major bug report and resolution for random seed setting and non-deterministic behavior during training [Bug Report] Possible resolution for random seed setting and non-deterministic training Aug 30, 2024
@amrmousa144
Contributor

I'm having the same issue and looking forward to a contribution to fix it ASAP!

@Mayankm96
Contributor

Mayankm96 commented Sep 6, 2024

Based on the suggestion here, I have made the fixes in #940. In the unit test, which runs a fixed number of env steps, the obtained observations and rewards are the same. Note, however, that the test only checks determinism within a single process.
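
For the cross-process case, a check could look roughly like the sketch below. It is not the test from #940: the rollout is a toy written against Python's `random` module, and `ROLLOUT` and `rollout_digest` are hypothetical names. Each call launches a fresh interpreter, seeds it, runs a fixed number of steps, and prints a digest of the rewards.

```python
import subprocess
import sys

# Toy rollout executed in a separate interpreter; a stand-in for real env steps.
ROLLOUT = """
import hashlib
import random

random.seed({seed})  # seed before anything samples
terrain = [random.random() for _ in range(4)]
rewards = [sum(terrain) + random.random() for _ in range(10)]
print(hashlib.sha256(repr(rewards).encode()).hexdigest())
"""


def rollout_digest(seed: int) -> str:
    """Run the toy rollout in a new Python process and return its digest."""
    result = subprocess.run(
        [sys.executable, "-c", ROLLOUT.format(seed=seed)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Comparing `rollout_digest(42)` from two separate invocations then checks determinism across processes rather than within one.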

I ran the training for ANYmal locomotion and it looks promising:

./isaaclab.sh -p source/standalone/workflows/rsl_rl/train.py --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --run_name seed_fix
Results over three runs (top: current main, bottom: after the fix)
before
after
