Add unit tests for PyTorch Lightning modules of emformer_rnnt recipes #2240

Closed
nateanl wants to merge 4 commits

nateanl
Member

@nateanl nateanl commented Feb 15, 2022

  • Refactor the current LibriSpeechRNNTModule's unit test.
  • Add unit tests for TEDLIUM3RNNTModule and MuSTCRNNTModule
  • Replace the lambda with partial in TEDLIUM3RNNTModule to pass the lightning unit test.
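
A minimal sketch of why swapping a lambda for `functools.partial` can matter in tests like these. The PR does not state the exact failure, so the pickling explanation is an assumption; `scale` and `is_picklable` are hypothetical names used for illustration, not code from the recipes.

```python
import functools
import pickle


def scale(x, factor):
    """Hypothetical module-level helper standing in for a transform."""
    return x * factor


def is_picklable(obj) -> bool:
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickle cannot serialize it by reference.
as_lambda = lambda x: scale(x, factor=2)

# A partial stores a reference to `scale` plus the bound keyword argument.
# It can be pickled as long as `scale` is importable from its module.
as_partial = functools.partial(scale, factor=2)

# Both callables behave the same when invoked directly.
same_result = as_lambda(3) == as_partial(3)
lambda_ok = is_picklable(as_lambda)
```

Both callables compute the same value; the difference only shows up when something (for example a worker process or a checkpointing step) tries to serialize the callable.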

@hwangjeff
Contributor

Can you also try mocking the dataloader to bypass multiprocessing, to see whether that mitigates the test hang on macOS with Python 3.7?

@nateanl
Member Author

nateanl commented Feb 16, 2022

@hwangjeff The hanging issue seems to be resolved by mocking the dataloader.
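
For readers unfamiliar with the approach, here is a rough sketch of mocking a loader so that no worker processes are started. It is stdlib-only and does not reproduce the actual test code in this PR: `data`, `MultiprocessLoader`, `InProcessLoader`, and `train_batches` are all hypothetical stand-ins (the `data` namespace plays the role of the module that owns the real loader class).

```python
import types
from unittest import mock

# Hypothetical namespace standing in for the module that provides the loader.
data = types.SimpleNamespace()


class MultiprocessLoader:
    """Pretend loader that would start worker processes when constructed."""

    def __init__(self, dataset, num_workers=4):
        raise RuntimeError("worker processes must not start in unit tests")


data.DataLoader = MultiprocessLoader


class InProcessLoader:
    """Mock loader: iterates the dataset in the calling process only."""

    def __init__(self, dataset, *args, **kwargs):
        self.dataset = dataset

    def __iter__(self):
        return iter(self.dataset)


def train_batches(dataset):
    # Code under test looks the loader up at call time, so a patch takes effect.
    return list(data.DataLoader(dataset, num_workers=4))


def run_with_mocked_loader(dataset):
    # Swap the loader for the duration of the block; it is restored on exit.
    with mock.patch.object(data, "DataLoader", InProcessLoader):
        return train_batches(dataset)
```

The patch is scoped to the `with` block, so other tests that rely on the original loader are unaffected.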

@hwangjeff
Contributor

> @hwangjeff The hanging issue seems to be resolved by mocking the dataloader.

great, thanks!

@@ -388,7 +388,7 @@ def get_token_processor(self) -> TokenProcessor:

 The underlying model is constructed by :py:func:`torchaudio.models.emformer_rnnt_base`
 and utilizes weights trained on LibriSpeech using training script ``train.py``
-`here <https://github.com/pytorch/audio/tree/main/examples/asr/librispeech_emformer_rnnt>`__ with default arguments.
+`here <https://github.com/pytorch/audio/tree/main/examples/asr/emformer_rnnt>`__ with default arguments.
Contributor

thanks for fixing these

@facebook-github-bot
Contributor

@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Collaborator

@mthrok mthrok left a comment

What I originally meant was that the test code was identical across the librispeech/tedlium cases, but it seems parameterizing within the same dataset is also a good option.
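
As an illustration of sharing one test body across datasets, here is a small sketch using the standard library's `unittest.TestCase.subTest`. It is not the test code from this PR: `FakeRNNTModule` is a hypothetical stand-in for the recipe modules, and the dataset names are only labels.

```python
import unittest


class FakeRNNTModule:
    """Hypothetical stand-in for a recipe's LightningModule."""

    def __init__(self, dataset_name):
        self.dataset_name = dataset_name

    def training_step(self, batch):
        # Placeholder "loss"; a real module would run the RNN-T model here.
        return float(len(batch))


# One label per recipe mentioned in the PR description.
DATASETS = ("librispeech", "tedlium3", "mustc")


class RecipeModuleTest(unittest.TestCase):
    def test_training_step(self):
        for name in DATASETS:
            # subTest reports each dataset separately if one of them fails.
            with self.subTest(dataset=name):
                module = FakeRNNTModule(name)
                loss = module.training_step(batch=[0, 1, 2])
                self.assertEqual(loss, 3.0)
```

A third-party decorator-based approach would generate one test method per case instead; `subTest` is used here only to keep the sketch dependency-free.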

@mthrok
Collaborator

mthrok commented Feb 16, 2022

> > @hwangjeff The hanging issue seems to be resolved by mocking the dataloader.
>
> great, thanks!

I think this is worth reporting somewhere in the core.
@VitalyFedyunin @ejguan @NivekT We had an issue where a mock dataset hangs inside the dataloader. Was there a dataloader update recently? The issue was observed sporadically, so we are not sure about the trigger yet.

@nateanl nateanl deleted the rnnt_mustc_pipeline_4 branch March 1, 2022 20:53
@VitalyFedyunin

No known DataLoader updates

xiaohui-zhang pushed a commit to xiaohui-zhang/audio that referenced this pull request May 4, 2022
…pytorch#2240)

Summary:
- Refactor the current `LibriSpeechRNNTModule`'s unit test.
- Add unit tests for `TEDLIUM3RNNTModule` and `MuSTCRNNTModule`
- Replace the lambda with partial in `TEDLIUM3RNNTModule` to pass the lightning unit test.

Pull Request resolved: pytorch#2240

Reviewed By: mthrok

Differential Revision: D34285195

Pulled By: nateanl

fbshipit-source-id: 4f20749c85ddd25cbb0eafc1733c64212542338f

5 participants