Model improvements6 #206
Conversation
Hi Rizky and William, the classes Training and RLTraining have been reworked and enhanced as described in #161 and are demonstrated in the new Howto 17. Please take a look. Thx!
@detlefarend could you please check? There are still some errors.
Hi @rizkydiprasetya, I guess the problem is caused by this code: rew is not a scalar but a NumPy array, while the method Reward.set_overall_reward() expects a scalar. In addition, a parameter of class RLTraining changed its name. I already fixed and pushed it:
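The pushed fix itself is not quoted in the thread; the snippet below is only a minimal sketch of the conversion being described, assuming the environment delivers the reward as a single-element NumPy array (raw_reward and the commented reward instance are illustrative names, not code from the PR).

```python
import numpy as np

# Illustrative reward as a single-element NumPy array, as produced by the env.
raw_reward = np.array([-0.42])

# Reward.set_overall_reward() expects a scalar, so collapse the array first;
# .item() raises a ValueError if the array holds more than one element.
overall_reward = float(raw_reward.item())

# reward.set_overall_reward(overall_reward)   # call on an MLPro Reward instance
```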
Why should it be a scalar? It used to work with an np array.
A reward is a scalar value, not an array. That's why both methods Reward.set_overall_reward() and Reward.add_agent_reward() expect scalars. FYI: I edited my last comment (parameter of class RLTraining).
OK, and why was it working before the stagnation detection, evaluation and other stuff that you added?
It is a simple thing: you use the method beyond its specification, and if you do so the behavior is undefined. The new evaluation functionality expects scalar values but gets arrays. We can surely improve both methods by adding try/except logic, but it's not a bug. Please adjust your code or add the logic to Reward yourself.
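The try/except idea mentioned here could look roughly like the sketch below. This is a simplified stand-in for MLPro's Reward class, not its actual implementation; the attribute name and the boolean return convention are assumptions.

```python
import numpy as np

class Reward:
    """Simplified stand-in for MLPro's Reward class, reduced to the input
    guard discussed above; names and return values are assumptions."""

    def __init__(self):
        self._overall_reward = 0.0

    def set_overall_reward(self, p_reward) -> bool:
        # Accept plain scalars as well as single-element NumPy arrays;
        # reject anything that cannot be reduced to one number.
        try:
            self._overall_reward = float(np.asarray(p_reward).item())
            return True
        except (TypeError, ValueError):
            return False

    def get_overall_reward(self) -> float:
        return self._overall_reward
```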
OK, then we need to change all of the environments, since all of them are based on NumPy. You can check them.
It's not that far beyond the specification. It's just about either using NumPy or not.
I have no problem solving this in Reward, except for the time. Feel free to do it if you prefer this solution.
Done.
@detlefarend why?
What is the problem? Give me a chance to understand.
@detlefarend why? Why are you using try/except when there is no error that you want to catch?
Because you can use an env as one (or all) of the afcts in an envmodel. But an env is not adaptive and raises an exception if you try to switch its adaptivity. See EnvBase.switch_adaptivity().
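A sketch of the guard being described, using a hypothetical wrapper function; the exception type is kept generic because the thread does not state which one EnvBase.switch_adaptivity() raises.

```python
# Hypothetical wrapper illustrating the guard described above: an adapted
# function (afct) inside an envmodel may actually be an environment, which
# is not adaptive and raises when its adaptivity is switched.
def switch_afct_adaptivity(p_afct, p_ada: bool):
    try:
        p_afct.switch_adaptivity(p_ada)
    except Exception:
        # Non-adaptive objects (e.g. an env used as afct) end up here;
        # their adaptivity state simply stays unchanged.
        pass
```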
I see, OK. It is fine like that, but I would prefer to check the inheritance of the object for adaptivity. Oh, I just checked again: EnvBase also inherits the Model class. That is why you put the raise in the function.
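The type-check alternative raised here could look like the sketch below; EnvBase is assumed to be importable from the MLPro RL package (the exact module path may differ), and the wrapper function is hypothetical.

```python
from mlpro.rl.models import EnvBase   # assumed import path, may differ

# Alternative to the try/except guard: skip the switch for environments.
# Because EnvBase also inherits from Model, the check must target EnvBase
# itself rather than testing for Model.
def switch_afct_adaptivity(p_afct, p_ada: bool):
    if isinstance(p_afct, EnvBase):
        return                        # environments are not adaptive
    p_afct.switch_adaptivity(p_ada)
```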
What is your problem? Does something behave differently from your expectation?
Description
Background
Types of changes
New Feature