Noise in the environment 🐛 #385
Comments
The easiest way that I've found to achieve the reset is the following one.
Then, in
Probably there are cleaner ways, but I find the current seed mechanism too complicated to analyse, so I'm leaving this fix here for anyone interested in a shortcut for this problem. P.S. It has not been tested with Real Data mode.
Thank you @riccardopoiani! This is a really good comment and suggestion. We'll find a way to handle this problem.
Hi @riccardopoiani, on the latest master branch, we added a new parameter The default value of Thanks for your contribution! Please have a try.
Thanks @lihuoran! It seems to be working for me. However, personally I would also add something like this to
In this way, it is also possible to have an environment that starts with a different trajectory at each run.
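One way to get a different trajectory on each run, while still being able to reproduce a run afterwards, is to draw a fresh seed whenever none is supplied and log it. The sketch below is only an illustration of that idea: `resolve_seed` is a hypothetical helper, not part of the project's API, and a 32-bit seed is an arbitrary choice.

```python
import random
import secrets
from typing import Optional


def resolve_seed(seed: Optional[int] = None) -> int:
    """Return the given seed, or draw a fresh one when none is supplied.

    Hypothetical helper for illustration; not part of any existing API.
    """
    if seed is None:
        # Assumption: 32 bits of OS-provided randomness is enough here.
        seed = secrets.randbits(32)
    return seed


seed = resolve_seed()            # no seed given: differs from run to run
print(f"using seed {seed}")      # log it so the run can be replayed later
rng = random.Random(seed)        # dedicated generator for the environment noise
```

Passing the logged value back in, e.g. `resolve_seed(seed)`, replays the same noise.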
We will discuss your idea in the future. For now, you could customize the code on your local branch according to your needs. Thanks!
Description
Currently, the data generation process works in the following way.
Whenever the environment configuration includes noise and a seed is fixed, the data generation process is fully determined, so it always generates the same "environment trajectory" (i.e., same order distribution, same vessel speeds, same vessel parking noise).
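The behaviour can be reproduced with a toy example. This is a minimal sketch, assuming the generator is re-created from the same seed on every reset; `ToyNoisyEnv` and its methods are hypothetical names and do not mirror the simulator's actual code.

```python
import random
from typing import List


class ToyNoisyEnv:
    """Hypothetical stand-in for a simulator with seeded noise."""

    def __init__(self, seed: int, horizon: int = 5):
        self._seed = seed
        self._horizon = horizon
        self._rng = random.Random(seed)

    def reset(self) -> None:
        # Re-seeding with the same value restarts the same noise sequence.
        self._rng = random.Random(self._seed)

    def rollout(self) -> List[float]:
        # One "environment trajectory": a base quantity plus Gaussian noise.
        return [100 + self._rng.gauss(0, 10) for _ in range(self._horizon)]


env = ToyNoisyEnv(seed=42)
first = env.rollout()
env.reset()
second = env.rollout()
print(first == second)  # True: the "noise" is identical after every reset
```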
Expected Behavior
I would expect each "environment trajectory" to be different from the previous one (i.e., after a reset), in the sense that a different noise should be applied each time the environment is reset.
This is also crucial for reasons mentioned in both of your papers: otherwise, the environment is fully deterministic (and one of the main reasons to apply RL-based methods is their ability to handle uncertainty, as happens in real scenarios).
If this is not done, the performance that any RL-based method achieves is flawed. In this case, the method is clearly overfitting the "noise" in that specific configuration (at this point, it is even misleading to call it noise, since each trajectory generates exactly the same data).
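A possible way to obtain this behaviour while keeping experiments reproducible is to derive a new seed for every episode from the base seed and an episode counter. The following is a sketch under that assumption; `ReseedingEnv`, `_episode_seed`, and the mixing constant are hypothetical and are not taken from the simulator.

```python
import random
from typing import List


class ReseedingEnv:
    """Hypothetical toy environment whose noise changes on every reset."""

    def __init__(self, seed: int, horizon: int = 5):
        self._base_seed = seed
        self._episode = 0
        self._horizon = horizon
        self._rng = random.Random(self._episode_seed())

    def _episode_seed(self) -> int:
        # Assumed scheme: mix the base seed with the episode index so that
        # each episode gets its own, but reproducible, noise stream.
        return self._base_seed * 1_000_003 + self._episode

    def reset(self) -> None:
        self._episode += 1
        self._rng = random.Random(self._episode_seed())

    def rollout(self) -> List[float]:
        return [100 + self._rng.gauss(0, 10) for _ in range(self._horizon)]


def run_episodes(seed: int, n: int) -> List[List[float]]:
    """Collect n trajectories, resetting the environment between them."""
    env = ReseedingEnv(seed)
    trajectories = []
    for _ in range(n):
        trajectories.append(env.rollout())
        env.reset()
    return trajectories
```

With this scheme, `run_episodes(42, 3)` yields three different trajectories, and calling it again with the same base seed replays the same three, so an agent sees varied noise during training while results remain comparable across runs.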
Environment
Scenario (CIM, Citi Bike): CIM
Component (Simulation, RL, Distributed Training): Simulation
Platform (GraSS on Azure, AKS on Azure):
Installation (pip, source): source
OS (Linux, Windows, macOS): Linux
Python version (3.6, 3.7): 3.7