
Add script to convert T5X T5 (v1.0 and v1.1) checkpoints to PyTorch #20801

Merged: 4 commits merged into huggingface:main from t5x_to_pytorch on Dec 23, 2022

Conversation

@bastings (Contributor) commented Dec 16, 2022

What does this PR do?

Adds a script that can convert Google T5X (Flax) T5 and T5-v1.1 checkpoints into PyTorch checkpoints.
This allows users to convert non-standard checkpoints that have been trained with T5X and use them with the Transformers library in PyTorch.

Usage:
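The exact usage command was not captured in this excerpt. As a rough sketch only (the flag names below are assumptions for illustration, not taken verbatim from the merged script), conversion and subsequent loading might look like:

    # Hypothetical invocation of the conversion script; the flag names are assumed:
    #
    #   python convert_t5x_checkpoint_to_pytorch.py \
    #       --t5x_checkpoint_path /path/to/t5x/checkpoint_1000000 \
    #       --config_file /path/to/config.json \
    #       --pytorch_dump_path /path/to/pytorch_dump_dir

    # Loading the converted checkpoint afterwards uses the standard Transformers API:
    from transformers import T5ForConditionalGeneration

    model = T5ForConditionalGeneration.from_pretrained("/path/to/pytorch_dump_dir")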

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case. Discussed with @thomwolf.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests? The code is tested but not part of this PR, since the test requires manually downloading the T5X checkpoints from a cloud bucket.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten
@sanchit-gandhi
@ArthurZucker
@younesbelkada

@HuggingFaceDocBuilderDev commented Dec 16, 2022

The documentation is not available anymore as the PR was closed or merged.

@bastings (Contributor, Author) commented:

I could use some clarification on the following: I'm missing a configuration option for the original (1.0) T5 checkpoints so that the lm_head shares parameters with the token embeddings.

Currently there is T5Model (which returns hidden states) and T5ForConditionalGeneration (which returns logits, used for T5 v1.1 models among others). The latter assumes there is an lm_head layer, but the 1.0 checkpoints have no such layer; they reuse the embedding matrix to map back to the vocabulary space.
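For context, a minimal PyTorch sketch of the two output heads being described (this is not the Transformers implementation, just an illustration of the parameter sharing; the shapes and sizes are arbitrary):

    import torch

    d_model, vocab_size = 512, 32128
    shared_embedding = torch.nn.Embedding(vocab_size, d_model)  # token embeddings
    hidden_states = torch.randn(2, 7, d_model)                  # (batch, seq, d_model)

    # T5 v1.1 style: a separate, untied lm_head projects to the vocabulary.
    lm_head = torch.nn.Linear(d_model, vocab_size, bias=False)
    logits_v1_1 = lm_head(hidden_states)

    # Original T5 (v1.0) style: no separate lm_head; the embedding matrix is reused
    # (transposed) to map hidden states back to the vocabulary. The Transformers T5
    # implementation additionally rescales by d_model ** -0.5 when embeddings are tied.
    logits_v1_0 = (hidden_states * d_model ** -0.5) @ shared_embedding.weight.T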

@patrickvonplaten (Contributor) left a comment:

Thanks a lot for adding this @bastings

cc @ArthurZucker

@ArthurZucker (Collaborator) commented:

Hey @bastings, when there is no lm_head you have to set tie_word_embeddings to True.
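A minimal sketch of what that looks like in practice (the default T5Config values here are purely for illustration; tie_word_embeddings itself is the standard config option):

    from transformers import T5Config, T5ForConditionalGeneration

    # Original (v1.0) T5 checkpoints ship no separate lm_head, so the output
    # projection is tied to the input embedding matrix via the config flag.
    config = T5Config(tie_word_embeddings=True)
    model = T5ForConditionalGeneration(config)

    # With the flag set, lm_head shares its weight tensor with the shared embeddings.
    assert model.lm_head.weight.data_ptr() == model.shared.weight.data_ptr()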

@ArthurZucker (Collaborator) left a comment:

It's very clean, thanks a lot for the addition.

@sanchit-gandhi (Contributor) left a comment:

Very cool PR @bastings! Thanks for the addition! Do you have a set of example args I could use just to try the script out once for myself? Thanks! 🙌

@bastings force-pushed the t5x_to_pytorch branch 2 times, most recently from ea37c40 to c529472 on December 21, 2022 at 12:28
@bastings (Contributor, Author) commented:

I added the instructions to the top docstring. Maybe it's ready? :-)

@ArthurZucker (Collaborator) commented:

A last nit and we can merge! Thanks a lot for bearing with me 😄

@bastings (Contributor, Author) commented:

Thanks! Committed your suggestion :)

"""
Convert T5X checkpoint to PyTorch

Steps:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

@sanchit-gandhi (Contributor) commented:

Once the quality tests are green (requires make fixup) we can merge!

@bastings (Contributor, Author) commented:

Oh looks like the suggestion made it fail ;)

@ArthurZucker (Collaborator) commented:

Ah, sorry then haha, I guess make style will correct that 😅

@bastings (Contributor, Author) commented:

Ah, sorry then haha, I guess make style will correct that 😅

Fixed! :)

@ArthurZucker merged commit efed8a2 into huggingface:main Dec 23, 2022
MKhalusova pushed a commit to MKhalusova/transformers that referenced this pull request Dec 28, 2022
Add script to convert T5X T5 (v1.0 and v1.1) checkpoints to PyTorch (huggingface#20801)

* Add script to convert T5X T5 (v1.0 and v1.1) checkpoints to PyTorch

* Remove unnecessary check and update docstring

* Format docstring

* Fix whitespace in docstring
amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Jan 4, 2023
silverriver pushed a commit to silverriver/transformers that referenced this pull request Jan 6, 2023
venkat-natchi pushed a commit to venkat-natchi/transformers that referenced this pull request Jan 22, 2023
miyu386 pushed a commit to miyu386/transformers that referenced this pull request Feb 9, 2023