Added simple_ga.py algo file #5

Merged
merged 3 commits into from
Feb 18, 2022

Conversation

MaximilienLC

No description provided.


@best_params.setter
def best_params(self, params):
    self._best_params = params
Contributor

By setting best_params, we expect the algorithm to be able to continue training from that point, but self._best_params does not seem to be used that way.

Author

Should work as intended now.

Author

Actually I'm a little confused about the format of best_params. Is it the parameters of the elite agent or the whole batch of parameters from the entire population?

Contributor

It should be the top elite agent's parameters.
Users call NEAlgorithm.best_params to save/test the model.

Author

Got it and fixed the setter!
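
For readers following the thread, here is a minimal sketch of the behavior the reviewer describes: the getter returns the top elite agent's parameters, and the setter reseeds the population so that training can continue from the supplied parameters. This is an illustration in plain Python, not the code merged in this PR. The class name `SimpleGASketch`, the `ask`/`tell` method names, the reseeding strategy, and every default value are assumptions made for the example.

```python
import random


class SimpleGASketch:
    """Toy truncation-selection GA (hypothetical; not the PR's simple_ga.py)."""

    def __init__(self, param_size, pop_size=8, truncation_divisor=2,
                 sigma=0.1, seed=0):
        self.pop_size = pop_size
        self.sigma = sigma
        # Assumption: the top pop_size // truncation_divisor agents survive.
        self.num_elites = max(1, pop_size // truncation_divisor)
        self._rng = random.Random(seed)
        self._population = [[0.0] * param_size for _ in range(pop_size)]
        self._best_params = list(self._population[0])

    def _mutate(self, params):
        # Gaussian perturbation of every parameter.
        return [x + self._rng.gauss(0.0, self.sigma) for x in params]

    def ask(self):
        # Return a copy of the current population for evaluation.
        return [list(p) for p in self._population]

    def tell(self, fitness):
        # Rank agents by fitness, highest first.
        order = sorted(range(self.pop_size),
                       key=lambda i: fitness[i], reverse=True)
        elites = [self._population[i] for i in order[:self.num_elites]]
        # best_params always tracks the top elite agent only.
        self._best_params = list(elites[0])
        # Keep the elites unchanged and fill the rest with mutated elites.
        num_children = self.pop_size - self.num_elites
        children = [self._mutate(elites[i % self.num_elites])
                    for i in range(num_children)]
        self._population = [list(e) for e in elites] + children

    @property
    def best_params(self):
        return list(self._best_params)

    @best_params.setter
    def best_params(self, params):
        # Store the parameters AND rebuild the population around them,
        # so the next ask() continues training from this point.
        self._best_params = list(params)
        self._population = [list(params)] + [
            self._mutate(params) for _ in range(self.pop_size - 1)
        ]
```

With this shape, a user could save `best_params` after training, later assign it to a fresh instance, and resume the `ask`/`tell` loop. A setter that only stores `self._best_params` would leave the population untouched, which is the gap the review comment points out.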

@lerrytang lerrytang self-assigned this Feb 17, 2022
@lerrytang
Contributor

Hi, thanks for the PR, I'm testing its performance on the tasks.
At the same time, can you take a look at my review comment and make changes accordingly?
Please see our implementation as an example.

@lerrytang
Contributor

Test results

| Task | Benchmarks | Parameters | Results |
| --- | --- | --- | --- |
| MNIST | 90.0 (max_iter=5000) | sigma=0.001 | 90.8 |
| CartPole (easy) | 900 (max_iter=2000) | default | 925 |
| CartPole (hard) | 600 (max_iter=2000) | default | 616 |
| Waterworld | 6 (max_iter=2000) | default | 6.39 |
| Waterworld (MA) | 2 (max_iter=5000) | default | 1.19 |
| Brax Ant | 3000 (max_iter=1000) | truncation_divisor=4 | 2336 |

Notes

  1. This table shows the implementation's performance so that users can choose the algorithms for their experiments.
  2. We are aware that some algorithms have limitations (e.g., they are unable to train a large policy network), so the benchmarks are not hard requirements. However, we will not merge a PR if some scores are significantly lower.
  3. After we release the test scripts, the PR submitter will be responsible for producing this table. We can help fill in some entries if the submitter cannot run the experiments due to hardware limitations.

@lerrytang lerrytang merged commit a4fcf82 into google:main Feb 18, 2022