Add task template for automatic speech recognition #2533

lewtun · 2021-06-22T12:45:02Z

This PR adds a task template for automatic speech recognition. In this task, the input is a path to an audio file which the model consumes to produce a transcription.

Usage:

```python
from datasets import load_dataset
from datasets.tasks import AutomaticSpeechRecognition

ds = load_dataset("timit_asr", split="train[:10]")
# Dataset({
#     features: ['file', 'text', 'phonetic_detail', 'word_detail', 'dialect_region', 'sentence_type', 'speaker_id', 'id'],
#     num_rows: 10
# })

task = AutomaticSpeechRecognition(audio_file_column="file", transcription_column="text")
ds.prepare_for_task(task)
# Dataset({
#     features: ['audio_file', 'transcription'],
#     num_rows: 10
# })
```
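
For readers unfamiliar with task templates, here is a rough, library-free sketch of the idea behind the usage above: the template records which source columns hold the audio path and the transcription, and preparing the data keeps only those columns under standardized names. `SpeechTemplateSketch` and `prepare_rows` are hypothetical names made up for this illustration; this is not the `datasets` implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SpeechTemplateSketch:
    """Hypothetical stand-in for the template; not the datasets class."""

    audio_file_column: str = "audio_file"
    transcription_column: str = "transcription"

    @property
    def column_mapping(self) -> Dict[str, str]:
        # Source column name -> standardized column name.
        return {
            self.audio_file_column: "audio_file",
            self.transcription_column: "transcription",
        }


def prepare_rows(rows: List[dict], template: SpeechTemplateSketch) -> List[dict]:
    """Keep only the mapped columns and rename them to the standard names."""
    mapping = template.column_mapping
    missing = [col for col in mapping if rows and col not in rows[0]]
    if missing:
        raise ValueError(f"Columns not found: {missing}")
    return [{new: row[old] for old, new in mapping.items()} for row in rows]


rows = [{"file": "a.wav", "text": "hello", "speaker_id": "s1"}]
template = SpeechTemplateSketch(audio_file_column="file", transcription_column="text")
prepared = prepare_rows(rows, template)
```

Here `prepared` contains only the `audio_file` and `transcription` keys, mirroring the feature list shown in the usage output.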

lhoestq · 2021-06-22T14:40:59Z

src/datasets/tasks/automatic_speech_recognition.py

+@dataclass(frozen=True)
+class AutomaticSpeechRecognition(TaskTemplate):
+    task: str = "automatic-speech-recognition"
+    input_schema: ClassVar[Features] = Features({"audio_file": Value("string")})


Thanks for adding this template :)

Note that in the future we'll have an Audio feature type that will probably have additional parameters (like ClassLabel) such as the sampling rate or the audio format.

good to know!

lhoestq · 2021-06-22T14:41:57Z

tests/test_arrow_dataset.py

@@ -2144,6 +2144,39 @@ def test_task_question_answering(self, in_memory):
                )
                self.assertDictEqual(features_after_cast, dset.features)

+    def test_task_automatic_speech_recognition(self, in_memory):


don't hesitate to move it outside of the BaseDatasetTest class, as mentioned in #2529

yep, will do this here as well!

SBrandeis

Thank you @lewtun !

SBrandeis · 2021-06-22T14:41:25Z

src/datasets/tasks/automatic_speech_recognition.py

+@dataclass(frozen=True)
+class AutomaticSpeechRecognition(TaskTemplate):
+    task: str = "automatic-speech-recognition"
+    input_schema: ClassVar[Features] = Features({"audio_file": Value("string")})


Maybe audio_file_path would be more explicit about what the column represents?

Also, paths are not portable between machines a priori. This is probably good enough for now, but at some point, we'll need to replace the Value("string") with an Audio or Signal feature!
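
To make the portability concern concrete, here is a small sketch (not part of this PR, and not how `datasets` handles it): an absolute path recorded on one machine is meaningless on another, whereas a path stored relative to a dataset root can be re-resolved locally. The helpers `to_relative` and `resolve` and the example directories are hypothetical.

```python
from pathlib import PurePosixPath


def to_relative(path: str, root: str) -> str:
    """Strip the machine-specific root so only the portable part is stored."""
    return str(PurePosixPath(path).relative_to(PurePosixPath(root)))


def resolve(relative: str, root: str) -> str:
    """Rebuild a usable path against the root on the current machine."""
    return str(PurePosixPath(root) / relative)


# Path as recorded on the machine that built the dataset (made-up example).
stored = to_relative("/home/alice/timit/train/sa1.wav", "/home/alice/timit")

# The same file on a different machine with a different root.
local = resolve(stored, "/data/timit")
```

This only works around the issue for plain string columns; a dedicated audio feature, as suggested above, would carry the signal or its metadata itself.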

lewtun · 2021-06-23T14:59:36Z

@SBrandeis @lhoestq I've integrated your suggestions, so this is ready for another review :)

lhoestq · 2021-06-23T15:55:36Z

Merging if it's good for you @lewtun :)

lewtun added 2 commits June 22, 2021 14:39

Add ASR template

423ebd7

Add unit tests

5a83332

lewtun requested review from SBrandeis and lhoestq June 22, 2021 12:56

lhoestq approved these changes Jun 22, 2021

SBrandeis reviewed Jun 22, 2021

lewtun mentioned this pull request Jun 22, 2021

Use Audio features for AutomaticSpeechRecognition task template #2536

Closed

lewtun added 3 commits June 22, 2021 17:20

Refactor

3cb504a

Merge branch 'master' into add-asr-template

535867a

Refactor

26e5d7f

Move test to better group

da8d53b

lhoestq merged commit 0764fcd into huggingface:master Jun 23, 2021

lewtun deleted the add-asr-template branch June 23, 2021 16:14
