[src] Partial hypothesis for cuda decoder #4101
Conversation
Great!
…On Fri, Jun 12, 2020 at 1:07 PM Hugo Braun wrote:
Code is ready, but setting the WIP flag because we need to finish testing.
The cuda decoder can now generate partial hypotheses on the fly. The
compute cost of generating those partial hypotheses is very low, and the
work is done asynchronously while the GPU is working. This work has been
integrated into the online pipeline; for the end user, getting partial
hypotheses just requires adding an argument to the DecodeBatch call:
void DecodeBatch(const std::vector<CorrelationID> &corr_ids,
const std::vector<SubVector<BaseFloat>> &wave_samples,
const std::vector<bool> &is_first_chunk,
const std::vector<bool> &is_last_chunk,
std::vector<std::string *> *partial_hypotheses = NULL);
If partial_hypotheses is not null, the vector will contain the partial
hypotheses. The pointers contained in that vector must be used before the
next DecodeBatch call.
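For illustration, a minimal sketch of a caller's per-chunk step (the class
and namespace names are taken from the files this PR touches; the wrapper
function and its name are my assumptions, not code from this PR):
#include <string>
#include <vector>
#include "base/kaldi-common.h"  // KALDI_LOG
#include "cudadecoder/batched-threaded-nnet3-cuda-online-pipeline.h"

// Run DecodeBatch with the extra argument, then consume the partial
// hypotheses before the next call invalidates the pointers.
void DecodeChunkAndPrintPartials(
    kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline *pipeline,
    const std::vector<kaldi::cuda_decoder::CorrelationID> &corr_ids,
    const std::vector<kaldi::SubVector<kaldi::BaseFloat>> &wave_samples,
    const std::vector<bool> &is_first_chunk,
    const std::vector<bool> &is_last_chunk) {
  std::vector<std::string *> partial_hypotheses;
  pipeline->DecodeBatch(corr_ids, wave_samples, is_first_chunk,
                        is_last_chunk, &partial_hypotheses);
  // One partial hypothesis per correlation id; the pointed-to strings are
  // only valid until the next DecodeBatch call, so print (or copy) them now.
  for (size_t i = 0; i < corr_ids.size(); ++i)
    KALDI_LOG << "CORR_ID #" << corr_ids[i] << " : " << *partial_hypotheses[i];
}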
The easiest way to test is to run the
cudadecoderbin/batched-wav-nnet3-cuda-online binary with
--print-partial-hypotheses=true. You probably want to reduce the number
of parallel channels with --num-parallel-streaming-channels to be able to
read the output.
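For concreteness, an invocation would look something like this (a sketch:
the channel count is an arbitrary example value, and the binary's usual
model, config, and wav arguments are elided):
cudadecoderbin/batched-wav-nnet3-cuda-online \
  --print-partial-hypotheses=true \
  --num-parallel-streaming-channels=10 \
  ...  (plus the binary's usual model, config, and wav-rspecifier arguments)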
Sample output:
main():batched-wav-nnet3-cuda-online.cc:351) ========== BEGIN OF PARTIAL HYPOTHESES ==========
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #3 : HELLO
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #4 : NUMBER
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #5 : THE MUSIC
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #6 : THE
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #7 : A
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #8 : THE
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #9 : AT
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #0 : HE HOPED
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #1 :
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #2 : AFTER
main():batched-wav-nnet3-cuda-online.cc:356) =========== END OF PARTIAL HYPOTHESES ===========
main():batched-wav-nnet3-cuda-online.cc:351) ========== BEGIN OF PARTIAL HYPOTHESES ==========
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #3 : HELLO
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #4 : NUMBER DEN
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #5 : THE MUSIC CAME
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #6 : THE DULL
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #7 : A COLD
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #8 : THE CHAOS
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #9 : AT MOST
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #0 : HE HOPED THERE WOULD
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #1 : STUFFED INTO
main():batched-wav-nnet3-cuda-online.cc:353) CORR_ID #2 : AFTER EARLY
main():batched-wav-nnet3-cuda-online.cc:356) =========== END OF PARTIAL HYPOTHESES ===========
Right now only the text version of the partial hypothesis can be returned
by the online pipeline. If it's useful to also return the int olabels, we
can, but I'd like to keep the high-level API as simple as possible.
Endpointing will be handled directly in the decoder.
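As an aside, if a caller needs integer word ids before any richer API
exists, one user-side workaround is to map the partial text back through
the word symbol table used for decoding (a sketch under that assumption;
the function name is hypothetical, and Find returns -1 for tokens missing
from the table):
#include <sstream>
#include <string>
#include <vector>
#include "fst/symbol-table.h"

// Map each word of a partial hypothesis back to its integer id via the
// word symbol table (words.txt) used to build the decoding graph.
std::vector<int64> PartialTextToWordIds(const std::string &partial_text,
                                        const fst::SymbolTable &word_syms) {
  std::vector<int64> ids;
  std::istringstream iss(partial_text);
  std::string word;
  while (iss >> word)
    ids.push_back(word_syms.Find(word));  // -1 (kNoSymbol) if not found
  return ids;
}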
Commit Summary
- Partial hypotheses
File Changes
- M src/cudadecoder/batched-threaded-nnet3-cuda-online-pipeline.cc (22)
- M src/cudadecoder/batched-threaded-nnet3-cuda-online-pipeline.h (16)
- M src/cudadecoder/cuda-decoder-common.h (100)
- M src/cudadecoder/cuda-decoder-kernels.cu (3)
- M src/cudadecoder/cuda-decoder.cc (248)
- M src/cudadecoder/cuda-decoder.h (46)
- M src/cudadecoderbin/batched-wav-nnet3-cuda-online.cc (19)
- M src/cudadecoderbin/batched-wav-nnet3-cuda2.cc (2)
Thank you for your work! This is a really important feature for me.
@al-zatv glad to hear! Were you able to give it a try? Testing is OK on our side.
This is mostly about the public API, which will be harder to change later without breaking folks' code.
Feel free to apply changes to #4146 if that's easier for you; I'll merge one immediately after the other then.
Thanks for this—I wish I understood CUDA programming one tenth as well as you do! :)
@kkm000 thanks for the great review! I'll push a new commit early next week.
LMK when you guys think it's ready to merge. Or merge yourself, @kkm
We've just found a bug while testing with a new dataset - let's hold for a few days until we fix it.
@hugovbraun I am assuming you mean close this PR.. can you do it yourself if that's what you mean?