
[Wav2Vec2FeatureExtractor] Fix extractor.pad() dtype backwards compatibility #13693

Merged (3 commits) on Sep 22, 2021

Conversation

anton-l (Member) commented on Sep 22, 2021

Resolves #13689

This fixes an issue introduced by #13650 where speech feature extractors' tensors were returned as torch.float64 when .pad() is called directly:

from transformers import Wav2Vec2FeatureExtractor
import numpy as np

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

rand_input = np.ones((100,), dtype=np.float64)
out = extractor.pad([{"input_values": rand_input}], return_tensors="pt")

print(out["input_values"].dtype) # <- this should be `torch.float32`

This is due to how pytorch converts float numpy arrays (new padding logic) vs. python lists (old padding logic):

  • torch.tensor uses torch.float32 for python lists by default: torch.tensor([1.2, 2.3]).dtype # torch.float32
  • numpy defaults to float64 for python floats: np.array([1.2, 2.3]).dtype # np.float64
  • torch.tensor keeps the source dtype for numpy arrays: torch.tensor(np.array([1.2, 2.3])).dtype # torch.float64
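
Putting the three conversions together in one runnable snippet, along with the explicit float32 cast this PR introduces (see the diff below):

import numpy as np
import torch

# Python lists: torch falls back to its default dtype, float32
print(torch.tensor([1.2, 2.3]).dtype)              # torch.float32

# NumPy infers float64 for Python float literals
arr = np.array([1.2, 2.3])
print(arr.dtype)                                   # float64

# NumPy arrays: torch preserves the source dtype
print(torch.tensor(arr).dtype)                     # torch.float64

# The fix applied in this PR: downcast float64 arrays before conversion
print(torch.tensor(arr.astype(np.float32)).dtype)  # torch.float32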

@@ -187,23 +187,6 @@ def pad(
padding_strategy = self._get_padding_strategies(padding=padding, max_length=max_length)

required_input = processed_features[self.model_input_names[0]]
if required_input and not isinstance(required_input[0], np.ndarray):
anton-l (Member, Author) commented:
Leftover logic from the old implementation; required_input[0] is always an np.ndarray now.

A Contributor replied:

good catch!

Comment on lines +226 to +227
if value.dtype is np.dtype(np.float64):
value = value.astype(np.float32)
anton-l (Member, Author) commented:

Instead of adding dtype customization to .pad(), we just always use np.float32 until we really need float64 support for some new models.
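
As a quick sanity check of the intended behavior after this change (a sketch reusing the example from the PR description):

import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

# float64 inputs are now cast to float32 before the tensor conversion
out = extractor.pad([{"input_values": np.ones((100,), dtype=np.float64)}], return_tensors="pt")
assert out["input_values"].dtype == torch.float32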

patrickvonplaten merged commit 75f6641 into huggingface:master on Sep 22, 2021
anton-l deleted the fix-w2v2-fp64-pad branch on September 22, 2021 at 09:15
Commits referencing this pull request:

  • Narsil pushed a commit to Narsil/transformers on Sep 25, 2021
  • stas00 pushed a commit to stas00/transformers on Oct 12, 2021
  • Albertobegue pushed a commit to Albertobegue/transformers on Jan 13, 2022 and again on Jan 27, 2022

Each carries the same commit message:

…atibility (huggingface#13693)

* Force dtype, add tests
* Local torch imports
* Remove unused logic (always ndarray)
Merging this pull request closed the following issue: New Wav2Vec2 padding has slightly backward breaking changes (#13689)