Add gradient checkpointing to Whisper Flax #22954

versae · 2023-04-24T10:43:50Z

It uses flax.linen.remat and follows on from PRs #13657 and #17994.

What does this PR do?

Adds gradient_checkpointing to Flax Whisper models.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@sanchit-gandhi @peregilk

HuggingFaceDocBuilderDev · 2023-04-24T10:57:48Z

The documentation is not available anymore as the PR was closed or merged.

sanchit-gandhi

Very nice @versae! Just a few minor nits, but otherwise this PR is looking good!

src/transformers/models/whisper/modeling_flax_whisper.py

sanchit-gandhi · 2023-04-25T11:19:25Z

-                    init_cache=init_cache,
-                    output_attentions=output_attentions,
-                    deterministic=deterministic,
+                    attention_mask,


Note to reviewer: remat does not support keyword arguments, hence the need to switch to positional arguments.

versae · 2023-04-26T08:57:43Z

Thanks for the review, @sanchit-gandhi! Should be all good now 😃.

sanchit-gandhi · 2023-04-26T16:17:55Z

Amazing @versae! Requesting final review before we can get this merged 🤗

sgugger

Thanks for your contribution!

versae · 2023-04-27T08:14:00Z

Thank you! I learnt a lot 🤓

* Add gradient checkpointing to Whisper Flax
* self.gradient_checkpointing only needed in nn.Module, removing unnecessary comments

Add gradient checkpointing to Whisper Flax

c68582a

versae mentioned this pull request Apr 24, 2023

Flax whisper gradient checkpointing #22897

Closed

5 tasks

versae marked this pull request as ready for review April 24, 2023 10:59

sanchit-gandhi approved these changes Apr 25, 2023

self.gradient_checkpointing only needed in nn.Module, removing unnecessary comments

d3ecfdf

sanchit-gandhi requested a review from sgugger April 26, 2023 16:18

sgugger approved these changes Apr 26, 2023

sgugger merged commit ba0dc54 into huggingface:main Apr 26, 2023

versae deleted the add-gradient-checkpointing-whisper-flax branch April 27, 2023 08:13

sanchit-gandhi mentioned this pull request May 5, 2023

Add FlaxWhisperForAudioClassification model #23173

Merged
