Add task template for automatic speech recognition #2533
@dataclass(frozen=True)
class AutomaticSpeechRecognition(TaskTemplate):
    task: str = "automatic-speech-recognition"
    input_schema: ClassVar[Features] = Features({"audio_file": Value("string")})
Thanks for adding this template :)
Note that in the future we'll have an Audio feature type that will probably have additional parameters (like ClassLabel) such as the sampling rate or the audio format.
good to know!
tests/test_arrow_dataset.py
Outdated
@@ -2144,6 +2144,39 @@ def test_task_question_answering(self, in_memory):
        )
        self.assertDictEqual(features_after_cast, dset.features)

    def test_task_automatic_speech_recognition(self, in_memory):
don't hesitate to move it outside of the BaseDatasetTest class, as mentioned in #2529
yep, will do this here as well!
Thank you @lewtun !
@dataclass(frozen=True)
class AutomaticSpeechRecognition(TaskTemplate):
    task: str = "automatic-speech-recognition"
    input_schema: ClassVar[Features] = Features({"audio_file": Value("string")})
Maybe audio_file_path would be more explicit about what the column represents?
Also, paths are not portable between machines a priori. This is probably good enough for now, but at some point, we'll need to replace the Value("string") with an Audio or Signal feature!
@SBrandeis @lhoestq I've integrated your suggestions, so this is ready for another review :)
Merging if it's good for you @lewtun :)
This PR adds a task template for automatic speech recognition. In this task, the input is a path to an audio file which the model consumes to produce a transcription.
Usage:
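The original usage snippet is not reproduced on this page. As a rough illustration only, the sketch below models the idea described above: a frozen dataclass that maps dataset-specific column names onto standard names for the task. It does not use the `datasets` library. The class name `AutomaticSpeechRecognitionSketch`, the helper `prepare_for_task`, the `transcription` column name, and the sample rows are all hypothetical stand-ins, not the PR's actual API.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical, simplified stand-in for the template discussed in this PR.
# It only models the column mapping and does not depend on `datasets`.
@dataclass(frozen=True)
class AutomaticSpeechRecognitionSketch:
    audio_file_path_column: str = "audio_file_path"
    transcription_column: str = "transcription"  # assumed name, not from the PR text
    task: str = "automatic-speech-recognition"

    @property
    def column_mapping(self) -> Dict[str, str]:
        # Map the dataset's own column names to the template's standard names.
        return {
            self.audio_file_path_column: "audio_file_path",
            self.transcription_column: "transcription",
        }


def prepare_for_task(rows: List[dict], template: AutomaticSpeechRecognitionSketch) -> List[dict]:
    """Keep only the mapped columns and rename them to the standard names."""
    mapping = template.column_mapping
    return [{new: row[old] for old, new in mapping.items()} for row in rows]


# Made-up sample data: one row with a path, a transcript, and an unrelated column.
rows = [{"file": "/data/sample_0.wav", "text": "hello world", "speaker_id": "spk1"}]
template = AutomaticSpeechRecognitionSketch(
    audio_file_path_column="file", transcription_column="text"
)
prepared = prepare_for_task(rows, template)
```

In this sketch the audio column is still a plain string path, so it shares the portability limitation raised in the review; a dedicated audio feature type would replace that string.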