A3C-FF seems not to work well? #28
Comments
Hi @miyosuda, sorry, I forgot to mention that I did try A3C-FF with LOCAL_T_MAX = 5, since that is the A3C paper's setting. I also tried 20 and got the same low scores. Previously I had a private implementation of A3C in Torch 7, and I found that A3C-FF works as well as in the paper on breakout and space_invaders, whether LOCAL_T_MAX is 5 or 20.
But I can confirm that your A3C-LSTM achieves reasonably high scores on breakout and space_invaders, which is why this seems strange to me...
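For readers unfamiliar with what LOCAL_T_MAX controls: in A3C each worker collects a short rollout (at most that many steps) and computes discounted returns backwards from a bootstrap value estimate. A minimal sketch of that computation follows. The function name `n_step_returns` and the default `gamma = 0.99` are assumptions for illustration, not taken from this repository's code.

```python
from typing import List


def n_step_returns(rewards: List[float],
                   bootstrap_value: float,
                   gamma: float = 0.99) -> List[float]:
    """Discounted returns for one rollout of up to LOCAL_T_MAX steps.

    `bootstrap_value` is the critic's estimate V(s) for the state after the
    last step (0.0 if the episode terminated). `gamma` is an assumed default.
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards: R_t = r_t + gamma * R_{t+1}
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# A rollout of 3 steps (LOCAL_T_MAX would cap this length at 5 or 20).
print(n_step_returns([1.0, 0.0, 1.0], bootstrap_value=0.0, gamma=0.5))
```

A longer rollout relies less on the bootstrap estimate and more on observed rewards, which is the trade-off between the 5 and 20 settings discussed above.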
If you're running on Gym, try the Deterministic-v0 environments. The regular Atari games in Gym use a random frame skip/action repeat between 1-5, which could affect the result. The Deterministic variants fix the frame skip to 4 (3 for SpaceInvaders).
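As a rough sketch of the suggestion above, the helpers below build the Deterministic-v0 environment ID for a game and report the fixed frame skip described in the comment. The helper names are hypothetical, and the commented-out `gym.make` call assumes an older Gym release with the Atari environments installed.

```python
def deterministic_env_id(game: str) -> str:
    """Return the Gym ID of the fixed-frame-skip variant, e.g. 'BreakoutDeterministic-v0'."""
    return f"{game}Deterministic-v0"


def fixed_frame_skip(game: str) -> int:
    """Frame skip of the Deterministic variant: 4, except 3 for SpaceInvaders."""
    return 3 if game == "SpaceInvaders" else 4


for game in ("Breakout", "SpaceInvaders"):
    env_id = deterministic_env_id(game)
    print(env_id, "frame skip =", fixed_frame_skip(game))
    # With Gym and its Atari extras installed (assumption), one would then do:
    # import gym
    # env = gym.make(env_id)
```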
@pengsun would you mind open-sourcing your Torch implementation? I'm also using Torch but can't find a good replication.
Hi @duyunshu, thanks for your interest. Yes, it can be found here: https://github.com/pengsun/torch-rl-async-v2 But I don't know whether I'll have the time to add a README.md on how to run the code...
@pengsun thank you very much! I'll see if I can make it work on my machine.
Hi @miyosuda, thanks for providing the code! I ran A3C-FF with all your settings, but the score is very bad: it always fluctuates between -19 and -20 (at 30.00M). When I run A3C-LSTM, the results are normal.
Hi @miyosuda, thanks for providing the code! When I experimented with games other than pong (modifying only the ROM name and ACTION_SIZE), I found that A3C-FF does not seem to work very well. For example, after 50M iterations, the training score for breakout is ~30 and that for space_invaders is ~600, both lower than what is reported in the A3C paper.
Also, I found videos for breakout and space_invaders in @Itsukara's fork. @Itsukara, could you share your training details on these two games with this code?