Reward handler issues #275
Comments
So far I've been totally unable to reproduce this, using the attached script. The double reward firing hasn't happened once, in around 15000 missions. Is it possible the bug lies in the rl-framework code? (NB: I'm testing on an updated code-base which contains the fix for slow xml reward parsing - #261 - perhaps this has an impact?)
Clicks for Dave for discovering this unused parameter bug: https://github.com/Microsoft/malmo/blob/master/Malmo/src/TimestampedReward.cpp#L155
Fixed in 8e50eb2
Is it possible there's been a regression? I'm currently seeing reward doubling on 0.36.0.
Scenario 1:
This gives me rewards of only -1000, as it should. No doubling.
Scenario 2:
This is scenario 1 with a second type of reward added, the found_goal reward. In this case I almost exclusively see -2000 and 2000 rewards. This doesn't seem expected to me; I was aiming for just rewards of -1000 and 1000.
I am having the same issue with the findthegoal mission. I changed the movement type to discrete, and I am seeing many moves with 0 reward and then, some steps later, multiple rewards at once. One workaround I found was to use a time.sleep of around 150 ms; with that, the reward values come out correctly.
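For anyone wanting to try the sleep workaround described above, here is a minimal sketch of what it might look like. It assumes `agent_host` is a Malmo `AgentHost` (or anything exposing `sendCommand` and `getWorldState`) and that each entry in `world_state.rewards` has `getValue()`. The helper name `step_and_collect` and the `settle_seconds` parameter are hypothetical, and the 0.15 s default is only the figure reported in this thread, not a guaranteed value.

```python
import time


def step_and_collect(agent_host, command, settle_seconds=0.15):
    """Send one command, wait briefly, then sum all rewards that have arrived.

    Hypothetical helper: the pause gives the reward message time to reach
    the agent before the world state is read, so the reward is attributed
    to this step instead of showing up (possibly bundled) on a later one.
    """
    agent_host.sendCommand(command)
    time.sleep(settle_seconds)  # assumed workaround delay, tune as needed
    world_state = agent_host.getWorldState()
    # Several rewards may have queued up; add them all together.
    return sum(reward.getValue() for reward in world_state.rewards)
```

Called once per action (for example `step_and_collect(agent_host, "move 1")`), this should return 0 only when no reward arrived for that step. It does not address the underlying cause; it just makes late-arriving rewards less likely to be counted against the wrong move.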
The reward handler might be firing twice, or firing late, or firing the wrong value?
For example, I can get a reward of -200 for turn -1, though the only reward handlers in my mission are: