[Enhancement] Support copying optuna params dict for all hyperparameters #121
Note that some params that are searched also cannot be copied. |
[related question] Transfer hyperparameters from optuna. For learning purposes I am tuning a number of algorithms for the environment 'MountainCar-v0'. At the moment I am interested in PPO. I intend to share working tuned hyperparameters by putting them in your repo. I am trying to understand, hands-on and in some depth, how a variety of algorithms work; SB3 and the zoo are great tools for that. I execute as indicated: Output:
Then one nice result is:
The environment is considered solved at -110 reward, following the literature. When I pass these hyperparameters to the algorithm it does not work (the reward stays at -200), and I do not understand exactly why.

import torch as th
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

envm = make_vec_env("MountainCar-v0", n_envs=16)
policy_kwargs = dict(activation_fn=th.nn.ReLU, net_arch=[dict(pi=[254, 254], vf=[254, 254])])
model = PPO("MlpPolicy", envm, verbose=1, batch_size=256, n_steps=2048, gamma=0.9999,
            learning_rate=0.00043216809397908225, ent_coef=5.844122887301502e-07,
            clip_range=0.2, n_epochs=10, gae_lambda=0.92, max_grad_norm=2,
            vf_coef=0.035882158772375855, policy_kwargs=policy_kwargs)
model.learn(total_timesteps=1000000)
model.save("ppo_mountaincar")

As I read the docs, I would say it is supposed to work like that; am I wrong? Should I take something else into account? |
You are missing the normalization wrapper. Note that results may also depend on the random seed (cf. doc and issue #151). |
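The normalization wrapper the reply refers to is SB3's `VecNormalize`, which keeps running statistics of the observations and normalizes them on the fly; tuned hyperparameters found with normalization enabled generally do not transfer to an unnormalized environment. A minimal pure-Python sketch of that running-statistics normalization (the class name and constants are illustrative, not the actual SB3 internals):

```python
# Sketch of running observation normalization, in the spirit of SB3's
# VecNormalize wrapper. Illustrative only; not the real implementation.
class RunningNormalizer:
    def __init__(self, eps: float = 1e-8, clip: float = 10.0):
        self.mean = 0.0
        self.var = 1.0
        self.count = eps  # tiny offset avoids division by zero early on
        self.eps = eps
        self.clip = clip

    def update(self, x: float) -> None:
        # Welford-style running mean/variance update for one sample.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.var += (delta * (x - self.mean) - self.var) / self.count

    def normalize(self, x: float) -> float:
        # Standardize, then clip extreme values as VecNormalize does.
        z = (x - self.mean) / (self.var + self.eps) ** 0.5
        return max(-self.clip, min(self.clip, z))

norm = RunningNormalizer()
for obs in [0.1, -0.4, 0.3, 0.2, -0.1]:
    norm.update(obs)
print(norm.mean)
```

The key point is that the statistics are learned during training and must be saved and reloaded along with the model, which is why pasting the raw hyperparameters alone is not enough.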
Thank you! |
Right now, only hyperparameters that are searched by default can have their params dict copied and reused, due to naming issues. This should be extended to hyperparameters that are not searched by default, per the discussion in issue #115.
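One reason a sampled params dict cannot always be copied verbatim is that the names used in the search space do not necessarily match the constructor keyword names, and some sampled values (e.g. a categorical network-size choice) need conversion rather than renaming. A minimal sketch of the kind of remapping step this enhancement implies (the mapping and presets below are hypothetical, not the zoo's actual ones):

```python
# Hypothetical remapping from sampled Optuna parameter names to SB3
# constructor kwargs; the zoo's real mapping may differ.
TRIAL_TO_KWARG = {
    "lr": "learning_rate",
}

# Categorical architecture choices that must be expanded, not renamed.
NET_ARCH_PRESETS = {
    "small": [64, 64],
    "medium": [256, 256],
}

def trial_params_to_kwargs(params: dict) -> dict:
    """Convert a sampled params dict into PPO-style keyword arguments."""
    kwargs = {}
    for name, value in params.items():
        if name == "net_arch":
            # Convert the categorical label into an actual architecture.
            kwargs["policy_kwargs"] = {"net_arch": NET_ARCH_PRESETS[value]}
        else:
            # Rename where needed, pass through otherwise.
            kwargs[TRIAL_TO_KWARG.get(name, name)] = value
    return kwargs

sampled = {"lr": 4.3e-4, "gamma": 0.9999, "net_arch": "medium"}
print(trial_params_to_kwargs(sampled))
```

Extending copy support to non-default hyperparameters would mean keeping such a mapping complete for every parameter a user might add to the search space.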