Safetensors #1255

Merged: 3 commits into pytorch:main from gabe-l-hart:Safetensors-1249 on Oct 4, 2024
Conversation

@gabe-l-hart (Contributor)

Description

Closes #1249

This PR implements support for downloading and converting model checkpoints from Hugging Face that use the safetensors format rather than the .pth binary (pickle) format.

Changes

  • Allow the tensor map file to be found under different *.index.json names
  • Allow loading model files with safetensors.torch.load when needed (see the sketch after this list)
  • Allow downloading safetensors files when they exist without pth files, or when the model explicitly prefers them
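For illustration, here is a minimal sketch of what format-aware loading can look like; it is not the PR's actual code, and the helper name load_checkpoint_file is hypothetical:

```python
from pathlib import Path

import torch
from safetensors.torch import load_file


def load_checkpoint_file(path: Path) -> dict:
    """Load one checkpoint file, dispatching on the file extension."""
    if path.suffix == ".safetensors":
        # safetensors files contain raw tensors, no pickled code
        return load_file(str(path))
    # .pth / .bin checkpoints are pickle-based, so torch.load is required
    return torch.load(str(path), map_location="cpu", weights_only=True)
```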

Testing

  • I have tested that the download and load for llama3.1 are unchanged with these changes (no safetensors are downloaded)
  • I have verified (on my WIP branch for Granite Code) that models with only safetensors are not ignored and can be cleanly converted


pytorch-bot bot commented Oct 2, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1255

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9db974d with merge base d8c0aaf:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot

Hi @gabe-l-hart!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@gabe-l-hart (Contributor Author)

NOTE: The CLA is in process since I'm contributing through the IBM corporate CLA.

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Oct 2, 2024
@Jack-Khuu self-requested a review on October 2, 2024 23:53
@Jack-Khuu added the enhancement (New feature or request) label on Oct 2, 2024
@byjlw (Contributor)

byjlw commented Oct 3, 2024

Thanks @gabe-l-hart, I appreciate you contributing. Code looks good; please rebase when you're ready and I'll approve and merge.
Also, since it looks like you're already adding another model to the list and have tested it, feel free to bring in the definition so others can benefit by downloading it via the CLI.

@gabe-l-hart (Contributor Author)

Thanks @byjlw! I'll get the rebase done shortly. As you noted, this is part of my work to add Granite Code support. I have a single branch pointer for all of that work, but I have also tried to organize the commits so that they can be merged as bite-sized features rather than one big overhaul. Right now, the addition of the configs for Granite Code is at the end of the chain, since the model won't work at all until the other features are present (it also currently only works in the Python runtime). I'm totally open to whatever convention the torchchat team prefers in terms of contribution chunking, so just let me know what you prefer!

@byjlw (Contributor)

byjlw commented Oct 3, 2024

Chunks are definitely better. I'd love to learn more about the overall goals for the Granite Code support.

@gabe-l-hart (Contributor Author)

> I'd love to learn more about the overall goals for the Granite Code support.

Good point, I didn't ever spell this out! Let me open a top-level issue about that support that I can use to track it all.

@gabe-l-hart (Contributor Author)

Top-level Granite Code support issue: #1262

@byjlw (Contributor)

byjlw commented Oct 3, 2024

Actually though, when I tested it by changing the model.json to use safetensors for the 11B base model, the download errored out. It looks like this code works as long as it's not using definitions that use the torchtune format. Do you mind testing this case and resolving the issue?

"meta-llama/Llama-3.2-11B-Vision": {
        "aliases": ["llama3.2-11B-base", "Llama-3.2-11B-Vision-base"],
        "distribution_channel": "HuggingFaceSnapshot",
        "distribution_path": "meta-llama/Llama-3.2-11B-Vision",
        "prefer_safetensors": true
    },
Fetching 19 files: 100%|█████████████████████████████████████████████████████████████████████| 19/19 [1:06:30<00:00, 210.04s/it]
Converting meta-llama/Llama-3.2-11B-Vision to torchtune format...
Traceback (most recent call last):
  File "/Users/byjlw/Documents/source/working/torchchat/torchchat.py", line 85, in <module>
    check_args(args, "generate")
  File "/Users/byjlw/Documents/source/working/torchchat/torchchat/cli/cli.py", line 52, in check_args
    download_and_convert(args.model, args.model_directory, args.hf_token)
  File "/Users/byjlw/Documents/source/working/torchchat/torchchat/cli/download.py", line 123, in download_and_convert
    _download_hf_snapshot(model_config, temp_dir, hf_token)
  File "/Users/byjlw/Documents/source/working/torchchat/torchchat/cli/download.py", line 82, in _download_hf_snapshot
    convert_hf_checkpoint_to_tune( model_dir=artifact_dir, model_name=model_config.name)
  File "/Users/byjlw/Documents/source/working/torchchat/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/byjlw/Documents/source/working/torchchat/torchchat/cli/convert_hf_checkpoint.py", line 184, in convert_hf_checkpoint_to_tune
    raise RuntimeError(f"Could not find {consolidated_pth}")
RuntimeError: Could not find /Users/byjlw/.torchchat/model-cache/downloads/meta-llama/Llama-3.2-11B-Vision/original/consolidated.pth

@gabe-l-hart (Contributor Author)

Ah, good catch! I guess N == 2 is a small sample size. I'll dig into your error and see how far I can get.

@byjlw (Contributor)

byjlw commented Oct 3, 2024

@ebsmothers can also help

@gabe-l-hart (Contributor Author)

Great. I'll report progress or blockers as they come up. @ebsmothers let me know if you dig in and get anywhere!

@ebsmothers

@byjlw, I'm not super familiar with how torchchat handles checkpoint conversion, but if you're switching from the .pth format to the .safetensors format, you will no longer be able to just do the simple move that's happening here. The safetensors format for Llama 3.2 11B Vision distributes the model weights across multiple files (see here), so they will need to be loaded and merged into a single state dict, as we do in torchtune here.
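For context, a rough sketch of that load-and-merge step, assuming all shards sit as *.safetensors files in one directory; the function name is illustrative, and the real torchtune code handles more cases:

```python
from pathlib import Path

from safetensors.torch import load_file


def merge_safetensors_shards(model_dir: Path) -> dict:
    """Merge every *.safetensors shard in model_dir into one state dict."""
    merged: dict = {}
    for shard in sorted(model_dir.glob("*.safetensors")):
        # each shard holds a disjoint subset of the model's tensors
        merged.update(load_file(str(shard)))
    return merged
```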

@gabe-l-hart (Contributor Author)

That makes sense. A similar approach is being taken in convert_hf_checkpoint to load the sharded weights and convert them into the .pth format needed by torchchat. If I'm understanding the logic in convert_hf_checkpoint_to_tune correctly, it looks like it's really just a simpler version of the logic in convert_hf_checkpoint that doesn't require the tensor renaming or permuting. If that's the case, I think the fix should be to hoist out the shard-loading logic into a helper and then only do the post-processing in convert_hf_checkpoint before resaving as model.pth.

I finally have the safetensors downloaded locally, so I'll see how far I can get with this approach.
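Sketched concretely, the proposed refactor could look something like the following; the signatures are simplified stand-ins for illustration, not torchchat's actual API:

```python
from pathlib import Path

import torch
from safetensors.torch import load_file


def load_sharded_state_dict(model_dir: Path) -> dict:
    """Hoisted helper: merge all weight shards, whichever format is present."""
    state_dict: dict = {}
    shards = sorted(model_dir.glob("*.safetensors")) or sorted(model_dir.glob("*.bin"))
    for shard in shards:
        if shard.suffix == ".safetensors":
            state_dict.update(load_file(str(shard)))
        else:
            state_dict.update(torch.load(str(shard), map_location="cpu", weights_only=True))
    return state_dict


def convert_hf_checkpoint(model_dir: Path) -> None:
    state_dict = load_sharded_state_dict(model_dir)
    # torchchat path: rename and permute tensors here before resaving
    torch.save(state_dict, model_dir / "model.pth")


def convert_hf_checkpoint_to_tune(model_dir: Path) -> None:
    # torchtune path: no renaming or permuting needed
    state_dict = load_sharded_state_dict(model_dir)
    torch.save(state_dict, model_dir / "original" / "consolidated.pth")
```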

@gabe-l-hart (Contributor Author)

I have the mechanics of this working for the safetensors weights with Llama-3.2-11B-Vision-Instruct, but it appears that there is a naming difference between the tensors in original/consolidated.pth and the safetensors weights. This is likely similar to the name remapping needed in the standard conversion logic.

Given that, and given that these models already have checkpoints matching the target naming scheme, I think it makes sense to leave the PR as-is for now and not switch prefer_safetensors to true for these models. I could also see an argument for using safetensors all the way through to avoid the known pickle vulnerabilities with .pth, but this PR doesn't address that anyway, since the models are converted to .pth during the conversion process.
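As an aside, the kind of key remapping the standard conversion path performs looks roughly like this; the mapping shown is a single illustrative entry, not the full table in convert_hf_checkpoint:

```python
def remap_keys(state_dict: dict, key_map: dict) -> dict:
    """Rename tensors from one naming scheme to another, leaving unmapped keys alone."""
    return {key_map.get(name, name): tensor for name, tensor in state_dict.items()}


# a single illustrative entry; the real table covers every layer's tensors
EXAMPLE_KEY_MAP = {"model.embed_tokens.weight": "tok_embeddings.weight"}
```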

…r names

Branch: GraniteCodeSupport

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Branch: GraniteCodeSupport

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
…h or safetensors

The logic here will prefer pth over safetensors unless the model's config
explicitly states a preference for safetensors over pth. If only one of the
two is found, the download will use whichever is present.

Branch: GraniteCodeSupport

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
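As a small sketch, the preference rule this commit message describes could be expressed as follows; the prefer_safetensors flag matches the model.json snippet earlier in the thread, while the function name is illustrative:

```python
def choose_download_format(has_pth: bool, has_safetensors: bool, prefer_safetensors: bool) -> str:
    """Pick the checkpoint format to download, per the rule in the commit message."""
    if has_pth and has_safetensors:
        return "safetensors" if prefer_safetensors else "pth"
    if has_safetensors:
        return "safetensors"
    if has_pth:
        return "pth"
    raise ValueError("no checkpoint files found in either format")
```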
@byjlw byjlw merged commit 766bee9 into pytorch:main Oct 4, 2024
52 checks passed
@gabe-l-hart gabe-l-hart deleted the Safetensors-1249 branch October 4, 2024 20:02
@@ -41,7 +42,12 @@ def convert_hf_checkpoint(
     print(f"Model config {config.__dict__}")
 
     # Load the json file containing weight mapping
-    model_map_json = model_dir / "pytorch_model.bin.index.json"
+    model_map_json_matches = [Path(m) for m in glob.glob(str(model_dir / "*.index.json"))]
+    assert len(model_map_json_matches) <= 1, "Found multiple weight mapping files"
Contributor
Why is this an error? Thanks!

@gabe-l-hart (Contributor Author) commented Nov 6, 2024
Good catch. See my response over on your PR: #1346 (comment)

@gabe-l-hart mentioned this pull request on Nov 12, 2024
Labels

CLA Signed · enhancement

Development

Successfully merging this pull request may close these issues:

Support Huggingface models from safetensors (#1249)

6 participants