Support Canary parallel inference #9517
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: karpnv <karpnv@users.noreply.github.com>
@@ -109,7 +112,9 @@ class ParallelTranscriptionConfig:
     # att_context_size can be set for cache-aware streaming models with multiple look-aheads
     att_context_size: Optional[list] = None

-    trainer: TrainerConfig = TrainerConfig(devices=-1, accelerator="gpu", strategy="ddp")
+    trainer: TrainerConfig = TrainerConfig(
+        devices=-1, accelerator="gpu", strategy="ddp", use_distributed_sampler=False
Be careful with the distributed sampler setting here: non-Lhotse datasets likely still require use_distributed_sampler=True. It might be better to override this only for the EncDec model?
ok, rm use_distributed_sampler=False
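The reviewer's suggestion, keeping the default sampler behaviour and switching it off only where needed, could be sketched as follows. `TrainerSettings` and `settings_for` are hypothetical stand-ins invented for illustration, not NeMo's `TrainerConfig` or any real API. The sketch also assumes that a Lhotse-backed dataloader handles per-rank data partitioning itself, which is why the flag would be disabled only in that case.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainerSettings:
    """Hypothetical stand-in for a trainer config (not NeMo's TrainerConfig)."""

    devices: int = -1
    accelerator: str = "gpu"
    strategy: str = "ddp"
    # Default stays True so non-Lhotse datasets keep the distributed sampler.
    use_distributed_sampler: bool = True


def settings_for(uses_lhotse: bool, base: TrainerSettings = TrainerSettings()) -> TrainerSettings:
    """Return trainer settings, disabling the distributed sampler only for Lhotse data."""
    if uses_lhotse:
        # Assumption: the Lhotse sampler already splits data across ranks.
        return replace(base, use_distributed_sampler=False)
    return base


print(settings_for(uses_lhotse=True))
print(settings_for(uses_lhotse=False))
```

Scoping the override to the one code path that needs it leaves the shared default untouched for every other dataset type.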
@@ -72,7 +72,7 @@ def __getitem__(self, cuts: CutSet) -> tuple[torch.Tensor, torch.Tensor, torch.T
             prompts = None
             prompts_lens = None

-        return audio, audio_lens, prompts_with_answers, prompts_with_answers_lens, prompts, prompts_lens
+        return audio, audio_lens, prompts_with_answers, prompts_with_answers_lens, prompts, prompts_lens, cuts
This is not ideal: returning cuts here will transfer the data held in memory from the dataloading worker subprocesses to the main training/inference loop process. We should return cuts.drop_recordings() instead.
done
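To illustrate why dropping recordings matters, here is a minimal sketch using a toy class rather than Lhotse itself. `ToyCut`, `payload_bytes`, and the module-level `drop_recordings` helper are hypothetical names made up for this example; only the idea (return metadata, not in-memory audio, from the worker) comes from the review comment.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class ToyCut:
    """Hypothetical stand-in for a cut: metadata plus optional in-memory audio."""

    id: str
    duration: float
    audio_bytes: Optional[bytes] = None

    def drop_recording(self) -> "ToyCut":
        # Keep the metadata, discard the heavy audio payload.
        return replace(self, audio_bytes=None)


def drop_recordings(cuts: List[ToyCut]) -> List[ToyCut]:
    """Return copies of the cuts without their audio payloads."""
    return [cut.drop_recording() for cut in cuts]


def payload_bytes(cuts: List[ToyCut]) -> int:
    """Audio bytes that would travel with the cuts between processes."""
    return sum(len(cut.audio_bytes or b"") for cut in cuts)


# Four one-second utterances of 16 kHz, 16-bit silence held in memory.
cuts = [ToyCut(id=f"utt{i}", duration=1.0, audio_bytes=bytes(16000 * 2)) for i in range(4)]
light_cuts = drop_recordings(cuts)

print(payload_bytes(cuts), payload_bytes(light_cuts))
```

The metadata (IDs, durations) survives, so the main process can still match predictions to utterances without receiving the audio a second time.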
trainer.global_rank is set a priori by PTL in a Slurm environment. It does not require the model to be built or its functions to be called. Is this a case where PTL cannot detect the global rank a priori?
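For context on the question above, launchers typically expose the global rank through environment variables before any model is built. The helper below is a rough sketch under that assumption: `detect_global_rank` is a hypothetical function, not part of PTL, and it only reads `SLURM_PROCID` (set by Slurm for each task) and `RANK` (set by launchers such as torchrun).

```python
import os
from typing import Mapping, Optional


def detect_global_rank(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Best-effort global rank from launcher-provided environment variables.

    Hypothetical helper for illustration; returns None when no known
    variable is set, i.e. when the rank cannot be known up front.
    """
    for name in ("SLURM_PROCID", "RANK"):
        value = env.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return None


print(detect_global_rank({"SLURM_PROCID": "3"}))
print(detect_global_rank({}))
```

When neither variable is present, for example when worker processes are spawned later by the trainer itself, the rank is simply not available at configuration time.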
…karpnv/canary_parallel
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
[🤖]: Hi @karpnv 👋, I just wanted to let you know that a CICD pipeline for this PR just finished successfully ✨ So it might be time to merge this PR or get some approvals 🚀 But I'm just a 🤖 so I'll leave it to you what to do next. Have a great day! //cc @ko3n1g
* add Canary cats
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: karpnv <karpnv@users.noreply.github.com>
* rm use_distributed_sampler=False
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: karpnv <karpnv@users.noreply.github.com>
* rm use_distributed_sampler
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
* update lhotse
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: karpnv <karpnv@users.noreply.github.com>
* fix global_rank
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
* OmegaConf.set_struct
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
* review fix
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: karpnv <karpnv@users.noreply.github.com>
* Apply isort and black reformatting
  Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
* predict_step return
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
---------
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: karpnv <karpnv@users.noreply.github.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: karpnv <karpnv@users.noreply.github.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
What does this PR do?
Adds Canary support to the transcribe_speech_parallel.py script.
Collection: ASR
#python3 ./examples/asr/transcribe_speech_parallel.py model=./canary-1b.nemo predict_ds.manifest_filepath=./manifest.json output_path=/tmp trainer.devices=-1
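The command above reads its input from `manifest.json`. As a rough sketch, such a manifest can be produced as JSON Lines, one utterance per line. The field names `audio_filepath`, `duration`, and `text` follow the usual NeMo ASR manifest layout; the audio paths and the helper name `to_manifest_text` are hypothetical, and a Canary model may expect additional prompt-related fields that are not shown here.

```python
import json
from typing import Dict, List

# Hypothetical utterances; replace paths and durations with real data.
entries: List[Dict[str, object]] = [
    {"audio_filepath": "/data/audio/utt1.wav", "duration": 3.2, "text": ""},
    {"audio_filepath": "/data/audio/utt2.wav", "duration": 5.7, "text": ""},
]


def to_manifest_text(items: List[Dict[str, object]]) -> str:
    """Serialize entries as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(item, ensure_ascii=False) + "\n" for item in items)


if __name__ == "__main__":
    # Write the manifest next to the script; adjust the location as needed.
    with open("manifest.json", "w", encoding="utf-8") as f:
        f.write(to_manifest_text(entries))
```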
PR Type:
Who can review?
@pzelasko