[src] Partial hypothesis for cuda decoder #4101
Conversation
Great!
…On Fri, Jun 12, 2020 at 1:07 PM Hugo Braun wrote:
Code is ready, but setting the WIP flag because we need to finish testing.
The cuda decoder can now generate partial hypotheses on the fly. The
compute cost of generating those partial hypotheses is very low, and the
work is done asynchronously while the GPU is working. This work has been
integrated into the online pipeline; for the end user, getting partial
hypotheses just requires adding an argument to the DecodeBatch call:
void DecodeBatch(const std::vector<CorrelationID> &corr_ids,
const std::vector<SubVector<BaseFloat>> &wave_samples,
const std::vector<bool> &is_first_chunk,
const std::vector<bool> &is_last_chunk,
std::vector<std::string *> *partial_hypotheses = NULL);
If partial_hypotheses is not null, the vector will contain the partial
hypotheses. The pointers contained in that vector must be used before the
next DecodeBatch call.
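For illustration, a minimal sketch of a caller's per-chunk step (the class
and namespace names are taken from the files this PR touches; the wrapper
function and its name are my assumptions, not code from this PR):
#include <string>
#include <vector>
#include "base/kaldi-common.h"  // KALDI_LOG
#include "cudadecoder/batched-threaded-nnet3-cuda-online-pipeline.h"

// Run DecodeBatch with the extra argument, then consume the partial
// hypotheses before the next call invalidates the pointers.
void DecodeChunkAndPrintPartials(
    kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline *pipeline,
    const std::vector<kaldi::cuda_decoder::CorrelationID> &corr_ids,
    const std::vector<kaldi::SubVector<kaldi::BaseFloat>> &wave_samples,
    const std::vector<bool> &is_first_chunk,
    const std::vector<bool> &is_last_chunk) {
  std::vector<std::string *> partial_hypotheses;
  pipeline->DecodeBatch(corr_ids, wave_samples, is_first_chunk,
                        is_last_chunk, &partial_hypotheses);
  // One partial hypothesis per correlation id; the pointed-to strings are
  // only valid until the next DecodeBatch call, so print (or copy) them now.
  for (size_t i = 0; i < corr_ids.size(); ++i)
    KALDI_LOG << "CORR_ID #" << corr_ids[i] << " : " << *partial_hypotheses[i];
}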
The easiest way to test is to run the
cudadecoderbin/batched-wav-nnet3-cuda-online binary with
--print-partial-hypotheses=true. You probably want to reduce the number
of parallel channels with --num-parallel-streaming-channels to be able to
read the output.
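For concreteness, an invocation would look something like this (a sketch:
the channel count is an arbitrary example value, and the binary's usual
model, config, and wav arguments are elided):
cudadecoderbin/batched-wav-nnet3-cuda-online \
  --print-partial-hypotheses=true \
  --num-parallel-streaming-channels=10 \
  ...  (plus the binary's usual model, config, and wav-rspecifier arguments)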
Sample output:
main():batched-wav-nnet3-cuda-online.cc:351) ========== BEGIN OF PARTIAL HYPOTHESES ==========
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #3 : HELLO
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #4 : NUMBER
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #5 : THE MUSIC
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #6 : THE
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #7 : A
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #8 : THE
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #9 : AT
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #0 : HE HOPED
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #1 :
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #2 : AFTER
main():batched-wav-nnet3-cuda-online.cc:356) =========== END OF PARTIAL HYPOTHESES ===========
main():batched-wav-nnet3-cuda-online.cc:351) ========== BEGIN OF PARTIAL HYPOTHESES ==========
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #3 : HELLO
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #4 : NUMBER DEN
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #5 : THE MUSIC CAME
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #6 : THE DULL
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #7 : A COLD
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #8 : THE CHAOS
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #9 : AT MOST
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #0 : HE HOPED THERE WOULD
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #1 : STUFFED INTO
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #2 : AFTER EARLY
main():batched-wav-nnet3-cuda-online.cc:356) =========== END OF PARTIAL HYPOTHESES ===========
Right now only the text version of the partial hypothesis can be returned
by the online pipeline. If it's useful to also return the int olabels, we
can, but I'd like to keep the high-level API as simple as possible.
Endpointing will be handled directly in the decoder.
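As an aside, if a caller needs integer word ids before any richer API
exists, one user-side workaround is to map the partial text back through
the word symbol table used for decoding (a sketch under that assumption;
the function name is hypothetical, and Find returns -1 for tokens missing
from the table):
#include <sstream>
#include <string>
#include <vector>
#include "fst/symbol-table.h"

// Map each word of a partial hypothesis back to its integer id via the
// word symbol table (words.txt) used to build the decoding graph.
std::vector<int64> PartialTextToWordIds(const std::string &partial_text,
                                        const fst::SymbolTable &word_syms) {
  std::vector<int64> ids;
  std::istringstream iss(partial_text);
  std::string word;
  while (iss >> word)
    ids.push_back(word_syms.Find(word));  // -1 (kNoSymbol) if not found
  return ids;
}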
Commit Summary
- Partial hypotheses
File Changes
- M src/cudadecoder/batched-threaded-nnet3-cuda-online-pipeline.cc (22)
- M src/cudadecoder/batched-threaded-nnet3-cuda-online-pipeline.h (16)
- M src/cudadecoder/cuda-decoder-common.h (100)
- M src/cudadecoder/cuda-decoder-kernels.cu (3)
- M src/cudadecoder/cuda-decoder.cc (248)
- M src/cudadecoder/cuda-decoder.h (46)
- M src/cudadecoderbin/batched-wav-nnet3-cuda-online.cc (19)
- M src/cudadecoderbin/batched-wav-nnet3-cuda2.cc (2)
Thank you for your work! This is a really important feature for me.
@al-zatv glad to hear! Were you able to give it a try? Testing is OK on our side.
This is mostly about the public API, which will be harder to change later without breaking folks' code.
Feel free to apply changes to #4146 if that's easier for you; I'll merge one immediately after the other then.
Thanks for this—I wish I understood CUDA programming one tenth as well as you do! :)
@kkm000 thanks for the great review! I'll push a new commit early next week.
LMK when you guys think it's ready to merge. Or merge yourself, @kkm
We've just found a bug while testing with a new dataset - let's hold for a few days until we fix it.
@hugovbraun I am assuming you mean close this PR.. can you do it yourself if that's what you mean?