[Enhancement] Support copying optuna params dict for all hyperparameters #121

Open
jkterry1 opened this issue Jun 21, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@jkterry1
Contributor

Right now, only hyperparameters that are searched by default can have their params dict copied and reused, due to naming issues. This should be extended to hyperparameters that are not searched by default, per the discussion in issue #115.
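
For illustration, a rough sketch of the translation step this implies. The to_ppo_kwargs helper and the net_arch size mapping below are hypothetical, not the zoo's actual code; study.best_trial.params is Optuna's real API:

import torch.nn as nn

def to_ppo_kwargs(params: dict) -> dict:
    # Rough sketch only: a raw copy of trial.params cannot be fed straight to PPO,
    # because keys such as net_arch and activation_fn are sampler-level names that
    # first have to be translated into constructor arguments.
    kwargs = dict(params)
    policy_kwargs = {}
    net_arch = kwargs.pop("net_arch", None)
    if net_arch is not None:
        sizes = {"small": [64, 64], "medium": [256, 256]}[net_arch]  # assumed mapping
        policy_kwargs["net_arch"] = [dict(pi=sizes, vf=sizes)]
    activation_fn = kwargs.pop("activation_fn", None)
    if activation_fn is not None:
        policy_kwargs["activation_fn"] = {"tanh": nn.Tanh, "relu": nn.ReLU}[activation_fn]
    if policy_kwargs:
        kwargs["policy_kwargs"] = policy_kwargs
    return kwargs

# Hypothetical usage: model = PPO("MlpPolicy", env, **to_ppo_kwargs(study.best_trial.params))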

@araffin araffin added the enhancement New feature or request label Jun 23, 2021
@araffin
Member

araffin commented Jun 23, 2021

only hyperparameters that are searched by default can have their params dict copied and reused due to naming issues

Well, some params that are searched cannot be copied either.

@IlonaAT

IlonaAT commented Sep 15, 2021

[related question] Transfer hyperparameters from optuna

For learning purposes I am tuning a number of algorithms for the environment 'MountainCar-v0'; at the moment I am interested in PPO. I intend to share the tuned hyperparameters by putting them on your repo, and I am trying to understand, hands-on and in some depth, how a variety of algorithms work. SB3 and the zoo are great tools for getting hands-on.
So I used Optuna from the zoo to find the right parameters for PPO, and judging by the results it produced, I would say the hyperparameters should work:

I execute as indicated:
python train.py --algo ppo --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median

Output:

  ========== MountainCar-v0 ==========
  Seed: 2520733740
  Default hyperparameters for environment (ones being tuned will be overridden):
  OrderedDict([('ent_coef', 0.0),
  ('gae_lambda', 0.98),
  ('gamma', 0.99),
  ('n_envs', 16),
  ('n_epochs', 4),
  ('n_steps', 16),
  ('n_timesteps', 1000000.0),
  ('normalize', True),
  ('policy', 'MlpPolicy')])
  Using 16 environments
  Overwriting n_timesteps with n=50000
  Normalization activated: {'gamma': 0.99}
  Optimizing hyperparameters
  Sampler: tpe - Pruner: median

Then one nice result is:

  Trial 151 finished with value: -95.4 and parameters: {'batch_size': 256, 'n_steps': 32, 'gamma': 0.999, 'learning_rate': 0.00043216809397908225, 'ent_coef': 5.844122887301502e-07, 'clip_range': 0.2, 'n_epochs': 10, 'gae_lambda': 0.92, 'max_grad_norm': 2, 'vf_coef': 0.035882158772375855, 'net_arch': 'medium', 'activation_fn': 'relu'}. Best is trial 151 with value: -95.4.
  Normalization activated: {'gamma': 0.99}
  Normalization activated: {'gamma': 0.99, 'norm_reward': False}

The environment is considered solved at -110 reward, according to the literature.

When I pass these hyperparameters to the algorithm, it does not work (the reward remains at -200). I do not exactly understand why.

import torch as th
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

envm = make_vec_env("MountainCar-v0", n_envs=16)
policy_kwargs = dict(activation_fn=th.nn.ReLU, net_arch=[dict(pi=[254, 254], vf=[254, 254])])
model = PPO("MlpPolicy", envm, verbose=1, batch_size=256, n_steps=2048, gamma=0.9999, learning_rate=0.00043216809397908225, ent_coef=5.844122887301502e-07, clip_range=0.2, n_epochs=10, gae_lambda=0.92, max_grad_norm=2, vf_coef=0.035882158772375855, policy_kwargs=policy_kwargs)

model.learn(total_timesteps=1000000)
model.save("ppo_mountaincar")

From my reading of the docs, I would say it is supposed to work like that; am I wrong? Should I take something else into account?

@araffin
Member

araffin commented Sep 15, 2021

When I pass these hyperparameters to the algorithm, it does not work (the reward remains at -200). I do not exactly understand why.

You are missing the normalization wrapper: envm = VecNormalize(envm, gamma=0.9999)

Note that results may also depend on the random seed (cf. the doc and issue #151).
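
A minimal sketch of the full setup with that wrapper, assuming the hyperparameters from the snippet above (VecNormalize is imported from stable_baselines3.common.vec_env):

from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import VecNormalize

envm = make_vec_env("MountainCar-v0", n_envs=16)
# Wrap with VecNormalize before passing the env to PPO (matches the zoo's `normalize: True`)
envm = VecNormalize(envm, gamma=0.9999)
model = PPO("MlpPolicy", envm, verbose=1)  # plus the tuned kwargs from the snippet above
model.learn(total_timesteps=1000000)
model.save("ppo_mountaincar")
envm.save("vecnormalize.pkl")  # keep the normalization statistics alongside the model
# At evaluation time, reload the statistics and freeze them:
# envm.training = False; envm.norm_reward = False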

@IlonaAT

IlonaAT commented Sep 15, 2021

Thank you!
