Add EMFORMER_RNNT_BASE_MUSTC bundle #2222

Closed
wants to merge 2 commits into from

@nateanl nateanl commented Feb 14, 2022

  • Add EMFORMER_RNNT_BASE_MUSTC bundle to torchaudio.prototype.pipelines
  • Add unit tests for training recipes of TED-LIUM and MuST-C
  • Refactor training and evaluation scripts

@@ -1,38 +1,51 @@
#!/usr/bin/env python3
"""Train the SentencePiece model by using the transcripts of MuST-C release v2.0 training set.

This description does not seem to be correct.

@mthrok mthrok left a comment

This PR does a lot of things. Can you split it into smaller ones?

I see things not related to the bundle, which can be done independently.

  1. Change the argument from --model_type to --model-type (and similarly for checkpoint_path)
  2. Add shebang lines to existing scripts
  3. Update eval.py

Then there are smaller changes that are required by the MUSTC model addition.

  4. Refactor pipeline_demo.py (the part without the MUSTC model)
  5. Add the tedlium3 model test (and refactor MockSentencePiece)

Finally, add the MUST-C model.

  6. Add the MUSTC bundle.
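
For reference, item 1 can be sketched roughly as below. This is a minimal illustration and not the PR's actual code: the `build_parser` helper, the `choices` values, and the help strings are assumptions made up for the example. The relevant point is that argparse maps hyphenated flags to underscore attribute names, so code reading `args.model_type` does not need to change after the rename.

```python
import argparse
import pathlib


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI parser; flag names follow the rename suggested in item 1."""
    parser = argparse.ArgumentParser(description="Illustrative script entry point")
    # Hyphenated flag; argparse stores the value as `args.model_type`.
    parser.add_argument(
        "--model-type",
        choices=["librispeech", "tedlium3", "mustc"],  # assumed values, for illustration only
        required=True,
        help="Which model variant to use.",
    )
    # Same convention for the checkpoint path; stored as `args.checkpoint_path`.
    parser.add_argument(
        "--checkpoint-path",
        type=pathlib.Path,
        required=True,
        help="Path to the checkpoint file.",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["--model-type", "mustc", "--checkpoint-path", "model.ckpt"]
)
print(args.model_type, args.checkpoint_path)
```

Because only the command-line spelling changes, a rename like this can land in its own small PR without touching the code that consumes the parsed arguments.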

@@ -0,0 +1,32 @@
class MockSentencePieceProcessor:

File names for utility modules cannot include a test prefix / suffix. IIRC, it causes some issues with pytest.
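
One likely reason, assuming the default pytest configuration: pytest collects files matching `test_*.py` and `*_test.py` as test modules, so a helper file with either pattern in its name would be imported as a test module. The sketch below checks a file name against those two default patterns. The `would_be_collected` helper and the example file names are hypothetical, and a project that overrides `python_files` would behave differently.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# pytest's default `python_files` patterns (a project config may override them).
PYTEST_DEFAULT_PATTERNS = ("test_*.py", "*_test.py")


def would_be_collected(path: str) -> bool:
    """Return True if the file name matches a default pytest test-file pattern."""
    name = PurePath(path).name
    return any(fnmatchcase(name, pattern) for pattern in PYTEST_DEFAULT_PATTERNS)


# Hypothetical file names, for illustration only.
for candidate in ("test_utils.py", "utils_test.py", "mock_sentencepiece.py"):
    print(candidate, would_be_collected(candidate))
```

Under that assumption, a helper such as the mock processor is safer in a module whose name matches neither pattern.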

@nateanl nateanl changed the title from "Add EMFORMER_RNNT_BASE_MUSTC bundel" to "Add EMFORMER_RNNT_BASE_MUSTC bundle" Feb 15, 2022
@nateanl nateanl closed this Feb 17, 2022