Optimize checkpointing #505

dacorvo · 2024-03-05T15:58:06Z

What does this PR do?

This modifies the export code for decoder models to reduce disk usage when creating checkpoints:

- use torch_dtype="auto" when loading the model to avoid casting weights to float32 (the default),
- do not use snapshot_download before exporting, to avoid downloading both pytorch and safetensors weights.
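
To give a rough sense of what the first change saves, here is a minimal sketch. It is not the code from this PR: it assumes the plain transformers API (AutoModelForCausalLM.from_pretrained with torch_dtype="auto") rather than the decoder export path touched here, and the helper names and the 7-billion-parameter figure are hypothetical, chosen only for illustration.

```python
def checkpoint_size_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights, ignoring file-format overhead."""
    return num_params * bytes_per_param / 2**30


def load_in_checkpoint_dtype(model_id: str):
    """Load a model without upcasting its weights (needs transformers and torch)."""
    # Imported lazily so the size estimate below runs without these packages.
    from transformers import AutoModelForCausalLM

    # torch_dtype="auto" derives the dtype from the checkpoint (its config or
    # its stored weights) instead of falling back to the float32 default.
    return AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")


# Hypothetical model: 7 billion parameters, published as float16.
NUM_PARAMS = 7_000_000_000
size_fp32 = checkpoint_size_gib(NUM_PARAMS, 4)  # weights cast to float32
size_fp16 = checkpoint_size_gib(NUM_PARAMS, 2)  # weights kept as float16
print(f"float32: {size_fp32:.1f} GiB, float16: {size_fp16:.1f} GiB")
```

Under these assumptions, a checkpoint saved after a float32 cast takes about twice the space of one saved in the original 16-bit dtype. Recent transformers releases may prefer a differently named argument for the dtype, so check the installed version before relying on this call.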

HuggingFaceDocBuilderDev · 2024-03-05T16:01:49Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


dacorvo marked this pull request as ready for review March 5, 2024 15:58

dacorvo requested review from JingyaHuang, michaelbenayoun and philschmid March 5, 2024 15:58

dacorvo added 3 commits March 5, 2024 16:06

fix(decoder): use model dtype when creating checkpoint

78a07ca

This prevents checkpoint weights from being stored as float32.

fix(tgi): export model in one step

44fc4c7

When the model needs to be exported, calling snapshot_download before the export is inefficient, as it fetches both pytorch and safetensors weights.

fix(tools): styling

87bf726

dacorvo force-pushed the optimize_checkpointing branch from 2e0e4c7 to 87bf726 March 5, 2024 16:12

michaelbenayoun approved these changes Mar 6, 2024

dacorvo merged commit 249d0b6 into main Mar 6, 2024
11 checks passed

dacorvo deleted the optimize_checkpointing branch March 6, 2024 09:28
