
Update transformers requirement from <4.11,>=4.1 to >=4.1,<4.12 #5422

Merged

epwalsh merged 2 commits into main from dependabot/pip/transformers-gte-4.1-and-lt-4.12 on Sep 30, 2021

Conversation

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Sep 28, 2021

Updates the requirements on transformers to permit the latest version.

Release notes

Sourced from transformers' releases.

v4.11.0: GPT-J, Speech2Text2, FNet, Pipeline GPU utilization, dynamic model code loading

GPT-J

Three new models are released as part of the GPT-J implementation: GPTJModel, GPTJForCausalLM, GPTJForSequenceClassification, in PyTorch.

The GPT-J model was released in the kingoflolz/mesh-transformer-jax repository by Ben Wang and Aran Komatsuzaki. It is a GPT-2-like causal language model trained on the Pile dataset.

It was contributed by @StellaAthena, @kurumuz, @EricHallahan, and @leogao2.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=gptj

SpeechEncoderDecoder & Speech2Text2

One new model is released as part of the Speech2Text2 implementation: Speech2Text2ForCausalLM, in PyTorch.

The Speech2Text2 model is used together with Wav2Vec2 for Speech Translation models proposed in Large-Scale Self- and Semi-Supervised Learning for Speech Translation by Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau.

Speech2Text2 is a decoder-only transformer model that can be paired with any speech encoder-only model, such as Wav2Vec2 or HuBERT, for speech-to-text tasks. Please refer to the SpeechEncoderDecoder class for how to combine Speech2Text2 with any speech encoder-only model; a minimal sketch appears below.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?other=speech2text2

FNet

Eight new models are released as part of the FNet implementation: FNetModel, FNetForPreTraining, FNetForMaskedLM, FNetForNextSentencePrediction, FNetForSequenceClassification, FNetForMultipleChoice, FNetForTokenClassification, FNetForQuestionAnswering, in PyTorch.

The FNet model was proposed in FNet: Mixing Tokens with Fourier Transforms by James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, and Santiago Ontanon. The model replaces the self-attention layer in a BERT model with a Fourier transform, keeping only the real parts of the transform. FNet is significantly faster than BERT because it has fewer parameters and is more memory efficient, and it achieves roughly 92-97% of the accuracy of its BERT counterparts on the GLUE benchmark while training much faster.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?other=fnet

TensorFlow improvements

Several bug fixes and UX improvements for TensorFlow:

  • Users should notice far fewer unnecessary warnings and less 'console spam' in general while using Transformers with TensorFlow.
  • TensorFlow models should be less picky about the specific integer dtypes (int32/int64) that are passed as input

Changes to compile() and train_step()

  • You can now compile our TensorFlow models without passing a loss argument. When you do, the model computes its loss internally during the forward pass and fit() trains on that value. This makes it much easier to get the right loss, since many models have task-specific losses that are easy to overlook and annoying to reimplement. When using this feature, remember to pass your labels under the "labels" key of your input dict so that they are accessible to the model during the forward pass. Behavior is unchanged if you do pass a loss argument, so all old code should remain unaffected by this change. A minimal sketch follows this list.

Associated PRs:

... (truncated)

Commits
  • dc193c9 Release: v4.11.0
  • 1c96500 Fix gather for SageMaker model parallel
  • 4e0410e Fix in gather for SM distributed
  • 367c2ef Modified TF train_step (#13678)
  • e00bc7c Silence warning in gradient checkpointing when it's False (#13734)
  • 3ffd18a Fix loss computation in Trainer (#13760)
  • 3ccc270 Fix type annotations for distributed_concat() (#13746)
  • e0d31a8 [Tests] Cast Hubert test models to fp16 (#13755)
  • 400c5a1 [megatron gpt checkpoint conversion] causal mask requires pos_embed dimension...
  • 91df455 [Trainer] Make sure shown loss in distributed training is correctly averaged ...
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added the dependencies label ("Pull requests that update a dependency file") on Sep 28, 2021
Updates the requirements on [transformers](https://github.com/huggingface/transformers) to permit the latest version.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.1.0...v4.11.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/pip/transformers-gte-4.1-and-lt-4.12 branch from a724a71 to 97e6b18 on September 28, 2021 16:34
@epwalsh epwalsh enabled auto-merge (squash) September 30, 2021 00:33
@epwalsh epwalsh merged commit a63e28c into main Sep 30, 2021
@epwalsh epwalsh deleted the dependabot/pip/transformers-gte-4.1-and-lt-4.12 branch September 30, 2021 00:56