[question] All checkpoints use the same VecNormalize statistics #278

cameron-chen · 2022-08-16T06:40:07Z

Hi,

I found that the parameters save_freq and eval_freq let us save checkpoints during training:

save_freq: saves checkpoints as rl_model_{num_timesteps}_steps.zip
eval_freq: saves the best checkpoint as best_model.zip

However, the normalization statistics are not saved at the moment these checkpoints are written. Only one vecnormalize.pkl is saved, by the function save_trained_model() after training, at the point where the checkpoint {env_id}.zip is saved (refer to this link).

When we evaluate any of the checkpoints (rl_model_{num_timesteps}_steps.zip, best_model.zip, {env_id}.zip), we use the same normalization statistics (refer to this link). Does this affect the evaluation? Why not save the normalization statistics for every checkpoint?

Thank you!


araffin · 2022-08-16T07:49:38Z

Hello,
Good point. I did that mainly to save space, but you are right, we should make it possible to save each checkpoint's stats too.

The other thing is that the normalization should converge at some point, so using the last stats to evaluate earlier checkpoints should not affect the results much.
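To make the convergence argument concrete, here is a minimal sketch in plain Python, not the library's code. It assumes the statistics are a running mean and variance updated batch by batch, and that the observations come from a fixed distribution (a Gaussian chosen only for illustration). The class RunningMeanStd below is a scalar stand-in written for this example, and snapshot_gaps and the probe value are hypothetical names and choices. The function reports how far a probe observation normalized with an early snapshot lands from the same observation normalized with the final statistics.

```python
import math
import random


class RunningMeanStd:
    """Scalar running mean and variance, updated one batch at a time."""

    def __init__(self, epsilon=1e-4):
        self.mean = 0.0
        self.var = 1.0
        self.count = epsilon  # tiny prior count avoids division by zero

    def update(self, batch):
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        delta = batch_mean - self.mean
        total = self.count + n
        # Merge the stored moments with the batch moments (parallel variance formula).
        m2 = self.var * self.count + batch_var * n + delta**2 * self.count * n / total
        self.mean += delta * n / total
        self.var = m2 / total
        self.count = total


def normalize(x, mean, var, eps=1e-8):
    """Standardize x with the given statistics."""
    return (x - mean) / math.sqrt(var + eps)


def snapshot_gaps(checkpoints=(1, 10, 100), n_batches=200, batch_size=64, seed=0):
    """Return |normalized(probe, snapshot) - normalized(probe, final)| per checkpoint."""
    rng = random.Random(seed)
    stats = RunningMeanStd()
    snapshots = {}
    for step in range(1, n_batches + 1):
        # Assumption: stationary observations drawn from N(5, 2^2).
        stats.update([rng.gauss(5.0, 2.0) for _ in range(batch_size)])
        if step in checkpoints:
            snapshots[step] = (stats.mean, stats.var)
    probe = 7.0  # arbitrary observation used to compare the two normalizations
    final = normalize(probe, stats.mean, stats.var)
    return {step: abs(normalize(probe, m, v) - final) for step, (m, v) in snapshots.items()}


if __name__ == "__main__":
    for step, gap in snapshot_gaps().items():
        print(f"checkpoint after {step:3d} batches: gap = {gap:.4f}")
```

Under these assumptions the gap for late checkpoints is small. If the observation distribution keeps shifting as the policy changes, early checkpoints can see a larger mismatch, which is the case the question is concerned about.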

cameron-chen · 2022-08-17T12:33:38Z

That makes a lot of sense to me. Thank you for the reply.

Saving the stats for every checkpoint may cost space and is, in general, not necessary. It is probably reasonable and practical to save the stats only for best_model.zip and {env_id}.zip.

P.S. The code does save the checkpoint stats for best_model.zip (refer to this link), but in the end they are overwritten by the checkpoint stats for {env_id}.zip.
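The overwrite described above can be avoided by giving each checkpoint its own stats file. The following is a minimal sketch of that idea using only the standard library, not the zoo's implementation. The helper names (stats_path_for, save_checkpoint_stats, load_stats_for) and the `<checkpoint>_vecnormalize.pkl` naming scheme are hypothetical, and a plain dict stands in for the VecNormalize object. In a real setup the pickling would presumably be replaced by VecNormalize.save(path) and VecNormalize.load(path, venv) from Stable-Baselines3.

```python
import pickle
import tempfile
from pathlib import Path


def stats_path_for(checkpoint: Path) -> Path:
    """Hypothetical naming scheme: best_model.zip -> best_model_vecnormalize.pkl."""
    return checkpoint.with_name(f"{checkpoint.stem}_vecnormalize.pkl")


def save_checkpoint_stats(checkpoint: Path, stats: dict) -> Path:
    """Write the stats next to the checkpoint, under a checkpoint-specific name."""
    path = stats_path_for(checkpoint)
    with path.open("wb") as f:
        pickle.dump(stats, f)
    return path


def load_stats_for(checkpoint: Path, fallback_name: str = "vecnormalize.pkl") -> dict:
    """Prefer the checkpoint's own stats, else fall back to the final stats file."""
    path = stats_path_for(checkpoint)
    if not path.exists():
        path = checkpoint.with_name(fallback_name)
    with path.open("rb") as f:
        return pickle.load(f)


def demo():
    """Save final and best-model stats, then look up stats for two checkpoints."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        # Stats saved once at the end of training (stand-in values).
        with (folder / "vecnormalize.pkl").open("wb") as f:
            pickle.dump({"mean": 5.0, "var": 4.0}, f)
        # Stats saved when best_model.zip was written; they are not overwritten later.
        save_checkpoint_stats(folder / "best_model.zip", {"mean": 4.8, "var": 3.9})
        best = load_stats_for(folder / "best_model.zip")
        older = load_stats_for(folder / "rl_model_10000_steps.zip")
    return best, older
```

In this sketch best_model.zip is evaluated with its own stats, while intermediate checkpoints without a dedicated file fall back to the final vecnormalize.pkl, which matches the space-saving compromise proposed in this comment.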

araffin · 2022-09-21T09:28:06Z

> Saving each checkpoint stats may cost space and, in general, is not necessary. Probably, It is reasonable and practical to save the stats of best_model.zip and {env_id}.zip.

I would welcome a PR that does this ;)

cameron-chen · 2022-11-04T08:20:38Z

Hi, I actually implemented this for personal use. Let me clean it up and open a PR.

araffin added the bug and question labels Aug 16, 2022

araffin mentioned this issue Aug 18, 2022

[Feature Request] CheckpointCallback should also save replay buffer DLR-RM/stable-baselines3#1016

Closed

1 task

araffin added the enhancement label Sep 21, 2022

araffin mentioned this issue Sep 21, 2022

[question] cannot recover the optimal reward from saved best_model.zip as the tensorboard reported #282

Closed

araffin mentioned this issue Oct 13, 2022

Roadmap RL Zoo #299

Open

10 tasks
