A3C-FF seems not work well? #28

Open
pengsun opened this issue Feb 15, 2017 · 9 comments
Comments

@pengsun

pengsun commented Feb 15, 2017

Hi @miyosuda, thanks for providing the code! When I tried it on games other than Pong (changing only the ROM name and ACTION_SIZE), A3C-FF did not seem to work very well. For example, after 50M iterations the training score is ~30 for breakout and ~600 for space_invaders, both lower than the scores reported in the A3C paper.

Also, I found videos for breakout and space_invaders in @Itsukara's fork. @Itsukara, could you share your training details for these two games with this code?

@miyosuda
Owner

miyosuda commented Feb 15, 2017

@pengsun
Recently I changed the LOCAL_T_MAX value from 5 to 20.

d67e7dc

I checked this on the Pong environment with LSTM and confirmed that the score became much better, but I didn't check the FF version when I changed this parameter.

Could you try the old LOCAL_T_MAX=5 setting?
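For readers unfamiliar with the parameter: in A3C, each worker collects a rollout of at most LOCAL_T_MAX steps and then computes discounted n-step returns backwards from a bootstrap value. The following is a minimal sketch of that calculation, not the code from this repository; the function name `n_step_returns` and the example numbers are made up for illustration.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for one rollout (hypothetical helper).

    rewards:          rewards collected in the rollout, oldest first
    bootstrap_value:  value estimate of the state after the last step,
                      or 0.0 if the rollout ended in a terminal state
    """
    returns = []
    R = bootstrap_value
    # Walk backwards so each step accumulates the discounted future return.
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    returns.reverse()
    return returns


LOCAL_T_MAX = 5  # the rollout length discussed in this thread (5 or 20)

# Example rollout, truncated to LOCAL_T_MAX steps (illustrative numbers).
rewards = [0.0, 0.0, 1.0, 0.0, 0.0][:LOCAL_T_MAX]
print(n_step_returns(rewards, bootstrap_value=0.5, gamma=0.9))
```

A larger LOCAL_T_MAX lets real rewards propagate further back before the bootstrap estimate takes over, which may be why the setting affects the FF and LSTM variants differently.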

@pengsun
Author

pengsun commented Feb 15, 2017

Hi @miyosuda, sorry, I forgot to mention that I did try A3C-FF with LOCAL_T_MAX = 5, because that is the setting in the A3C paper. I also tried 20 and got the same low scores.

Previously I had a private implementation of A3C in Torch 7, and with it A3C-FF works as well as in the paper on breakout and space_invaders, whether LOCAL_T_MAX is 5 or 20.

@pengsun
Author

pengsun commented Feb 15, 2017

But I can confirm that your A3C-LSTM achieves reasonably high scores on breakout and space_invaders, which is why this seems strange to me...

@babaktr

babaktr commented Feb 15, 2017

If you're running on Gym, try the Deterministic-v0 environments. The regular Atari environments in Gym use a random frame skip/action repeat (sampled between 2 and 4 on each step), which could affect the result. The Deterministic variants fix the frame skip to 4 (3 for SpaceInvaders).
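To illustrate the difference between a fixed and a random frame skip, here is a small self-contained sketch. It does not use Gym or ALE: `CountingEnv` is a toy stand-in for an emulator, and `step_with_frame_skip`, `sample_skip`, and the default sampling range are assumptions made for this example only.

```python
import random


class CountingEnv:
    """Toy stand-in for an emulator: every frame yields a reward of 1."""

    def __init__(self):
        self.frames = 0

    def act(self, action):
        self.frames += 1
        return 1.0


def step_with_frame_skip(env, action, skip):
    """Repeat `action` for `skip` emulator frames and sum the rewards."""
    return sum(env.act(action) for _ in range(skip))


def sample_skip(deterministic, rng, fixed_skip=4, low=2, high=4):
    """Return a fixed skip, or one drawn uniformly from [low, high].

    The default range is an assumption; adjust it to your environment.
    """
    return fixed_skip if deterministic else rng.randint(low, high)


rng = random.Random(0)
env = CountingEnv()
for _ in range(3):
    skip = sample_skip(deterministic=False, rng=rng)
    reward = step_with_frame_skip(env, action=0, skip=skip)
    print(skip, reward)
```

With a random skip, the same action sequence covers a varying number of emulator frames, so scores are not directly comparable with runs that use a fixed skip of 4.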

@pengsun
Author

pengsun commented Feb 15, 2017

Hi @babaktr, no, I didn't run it on Gym. I just followed @miyosuda and installed the ALE Python wrapper. The frame skip is 4 and the repeat action probability is 0, which are the defaults in game_state.py.

@duyunshu

@pengsun would you mind open-sourcing your Torch implementation? I'm also using Torch but can't find a good replication.

@pengsun
Author

pengsun commented Apr 21, 2017

Hi @duyunshu, thanks for your interest. Yes, it can be found here: https://github.com/pengsun/torch-rl-async-v2

But I don't know if I will have the time to add a README.md explaining how to run the code...

@duyunshu

@pengsun thank you very much! I'll see if I can make it work on my machine.

@zhoudoudou

Hi @miyosuda, thanks for providing the code! I ran A3C-FF with all of your settings, but the score is very bad: it always fluctuates between -19 and -20 (at 30.00M). When I run A3C-LSTM, the results are normal.
