Add doc about model export (#618)
* Add doc about model export

* fix typos
csukuangfj authored Oct 14, 2022
1 parent c39cba5 commit 11bff57
Showing 9 changed files with 381 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -21,6 +21,7 @@ speech recognition recipes using `k2 <https://github.com/k2-fsa/k2>`_.
   :caption: Contents:

   installation/index
   model-export/index
   recipes/index
   contributing/index
   huggingface/index
@@ -0,0 +1,21 @@
2022-10-13 19:09:02,233 INFO [pretrained.py:265] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'encoder_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'decoder_dim': 512, 'joiner_dim': 512, 'model_warm_step': 3000, 'env_info': {'k2-version': '1.21', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '4810e00d8738f1a21278b0156a42ff396a2d40ac', 'k2-git-date': 'Fri Oct 7 19:35:03 2022', 'lhotse-version': '1.3.0.dev+missing.version.file', 'torch-version': '1.10.0+cu102', 'torch-cuda-available': False, 'torch-cuda-version': '10.2', 'python-version': '3.8', 'icefall-git-branch': 'onnx-doc-1013', 'icefall-git-sha1': 'c39cba5-dirty', 'icefall-git-date': 'Thu Oct 13 15:17:20 2022', 'icefall-path': '/k2-dev/fangjun/open-source/icefall-master', 'k2-path': '/k2-dev/fangjun/open-source/k2-master/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-fj/fangjun/open-source-2/lhotse-jsonl/lhotse/__init__.py', 'hostname': 'de-74279-k2-test-4-0324160024-65bfd8b584-jjlbn', 'IP address': '10.177.74.203'}, 'checkpoint': './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/pretrained-iter-1224000-avg-14.pt', 'bpe_model': './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500/bpe.model', 'method': 'greedy_search', 'sound_files': ['./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav'], 'sample_rate': 16000, 'beam_size': 4, 'beam': 4, 'max_contexts': 4, 'max_states': 8, 'context_size': 2, 'max_sym_per_frame': 1, 'simulate_streaming': False, 'decode_chunk_size': 16, 'left_context': 64, 
'dynamic_chunk_training': False, 'causal_convolution': False, 'short_chunk_size': 25, 'num_left_chunks': 4, 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
2022-10-13 19:09:02,233 INFO [pretrained.py:271] device: cpu
2022-10-13 19:09:02,233 INFO [pretrained.py:273] Creating model
2022-10-13 19:09:02,612 INFO [train.py:458] Disable giga
2022-10-13 19:09:02,623 INFO [pretrained.py:277] Number of model parameters: 78648040
2022-10-13 19:09:02,951 INFO [pretrained.py:285] Constructing Fbank computer
2022-10-13 19:09:02,952 INFO [pretrained.py:295] Reading sound files: ['./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav']
2022-10-13 19:09:02,957 INFO [pretrained.py:301] Decoding started
2022-10-13 19:09:06,700 INFO [pretrained.py:329] Using greedy_search
2022-10-13 19:09:06,912 INFO [pretrained.py:388]
./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav:
AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROTHELS

./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav:
GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONORED BOSOM TO CONNECT HER PARENT FOREVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN

./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav:
YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION


2022-10-13 19:09:06,912 INFO [pretrained.py:390] Decoding Done
135 changes: 135 additions & 0 deletions docs/source/model-export/export-model-state-dict.rst
@@ -0,0 +1,135 @@
Export model.state_dict()
=========================

When to use it
--------------

During model training, we save checkpoints periodically to disk.

A checkpoint contains the following information:

- ``model.state_dict()``
- ``optimizer.state_dict()``
- and some other information related to training

When we need to resume the training process from some point, we need a checkpoint.
However, if we want to publish the model for inference, then only
``model.state_dict()`` is needed. In this case, we need to strip all other information
except ``model.state_dict()`` to reduce the file size of the published model.
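
As a rough illustration of the stripping step described above, the sketch below
keeps only the model weights from a training checkpoint. It assumes the
checkpoint is a dict saved with ``torch.save()`` whose weights are stored under
the key ``"model"``; that key name and the helper ``strip_checkpoint`` are
assumptions made for this example, not an interface provided by the recipes.

.. code-block:: python

   from pathlib import Path

   import torch


   def strip_checkpoint(src: Path, dst: Path) -> None:
       """Keep only the model weights from a training checkpoint.

       Assumption: the checkpoint is a dict whose "model" entry holds
       model.state_dict(); all other entries (optimizer state, etc.)
       are dropped.
       """
       checkpoint = torch.load(src, map_location="cpu")
       torch.save({"model": checkpoint["model"]}, dst)

The optimizer state is typically of a size comparable to the model weights, so
dropping it is where most of the saving comes from.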

How to export
-------------

Every recipe contains a file ``export.py`` that you can use to
export ``model.state_dict()`` by taking some checkpoints as inputs.

.. hint::

   Each ``export.py`` contains well-documented usage information.

In the following, we use
`<https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless3/export.py>`_
as an example.

.. note::

   The steps for other recipes are almost the same.

.. code-block:: bash

   cd egs/librispeech/ASR

   ./pruned_transducer_stateless3/export.py \
     --exp-dir ./pruned_transducer_stateless3/exp \
     --bpe-model data/lang_bpe_500/bpe.model \
     --epoch 20 \
     --avg 10

It will generate a file ``pruned_transducer_stateless3/exp/pretrained.pt``, which
is a dict containing ``{"model": model.state_dict()}`` saved by ``torch.save()``.
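
To load such a file outside of the recipe scripts, something along the
following lines should work. ``TinyModel`` and ``load_pretrained`` are
hypothetical names used only for this sketch; in practice the model has to be
constructed with the same architecture and hyperparameters that were used for
training, because the exported file stores weights only.

.. code-block:: python

   import torch
   import torch.nn as nn


   class TinyModel(nn.Module):
       """Hypothetical stand-in for the real transducer model."""

       def __init__(self) -> None:
           super().__init__()
           self.proj = nn.Linear(80, 4)

       def forward(self, x: torch.Tensor) -> torch.Tensor:
           return self.proj(x)


   def load_pretrained(model: nn.Module, path: str) -> nn.Module:
       # The exported file is a dict of the form {"model": model.state_dict()}.
       state = torch.load(path, map_location="cpu")
       model.load_state_dict(state["model"])
       model.eval()  # switch to evaluation mode before inference
       return model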

How to use the exported model
-----------------------------

For each recipe, we provide pretrained models hosted on Hugging Face.
You can find links to pretrained models in the ``RESULTS.md`` of each dataset.

In the following, we demonstrate how to use the pretrained model from
`<https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13>`_.

.. code-block:: bash

   cd egs/librispeech/ASR

   git lfs install
   git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13

After cloning the repo with ``git lfs``, you will find several files in the folder
``icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp``
that have a prefix ``pretrained-``. Those files contain ``model.state_dict()``
exported by the above ``export.py``.

In each recipe, there is also a file ``pretrained.py``, which can use
``pretrained-xxx.pt`` to decode sound files. The following is an example:

.. code-block:: bash

   cd egs/librispeech/ASR

   ./pruned_transducer_stateless3/pretrained.py \
     --checkpoint ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/pretrained-iter-1224000-avg-14.pt \
     --bpe-model ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500/bpe.model \
     --method greedy_search \
     ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1089-134686-0001.wav \
     ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0001.wav \
     ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/test_wavs/1221-135766-0002.wav

The above commands show how to use the exported model with ``pretrained.py`` to
decode multiple sound files. Its output is given as follows for reference:

.. literalinclude:: ./code/export-model-state-dict-pretrained-out.txt

Use the exported model to run decode.py
---------------------------------------

When we publish a model, we always record its WERs on the test
datasets in ``RESULTS.md``. This section describes how to use the
pretrained model to reproduce those WERs.

.. code-block:: bash

   cd egs/librispeech/ASR
   git lfs install
   git clone https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13

   cd icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp
   ln -s pretrained-iter-1224000-avg-14.pt epoch-9999.pt
   cd ../..

We create a symlink named ``epoch-9999.pt`` that points to ``pretrained-iter-1224000-avg-14.pt``,
so that we can pass ``--epoch 9999 --avg 1`` to ``decode.py`` in the following
command:

.. code-block:: bash

   ./pruned_transducer_stateless3/decode.py \
     --epoch 9999 \
     --avg 1 \
     --exp-dir ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp \
     --lang-dir ./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/data/lang_bpe_500 \
     --max-duration 600 \
     --decoding-method greedy_search

You will find the decoding results in
``./icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13/exp/greedy_search``.
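
If the symlink step needs to be scripted, for example when evaluating several
pretrained models in a row, it could be written in Python roughly as follows.
``link_as_epoch`` is a hypothetical helper for this sketch; the ``ln -s``
command shown earlier does the same job.

.. code-block:: python

   import os
   from pathlib import Path


   def link_as_epoch(exp_dir: Path, pretrained_name: str, epoch: int = 9999) -> Path:
       """Create exp_dir/epoch-<epoch>.pt as a symlink to pretrained_name."""
       target = exp_dir / pretrained_name
       if not target.is_file():
           raise FileNotFoundError(target)

       link = exp_dir / f"epoch-{epoch}.pt"
       if link.is_symlink():
           link.unlink()  # replace a stale link from an earlier run
       elif link.exists():
           # Never overwrite a real checkpoint file.
           raise FileExistsError(link)

       # A relative target mirrors `ln -s pretrained-xxx.pt epoch-9999.pt`.
       os.symlink(pretrained_name, link)
       return link

Refusing to overwrite a regular file is a deliberate choice here, so that a
real ``epoch-9999.pt`` checkpoint is never lost by accident.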

.. caution::

   For some recipes, you also need to pass ``--use-averaged-model False``
   to ``decode.py``. The reason is that the exported pretrained model is already
   the averaged one.

.. hint::

   Before running ``decode.py``, we assume that you have already run
   ``prepare.sh`` to prepare the test dataset.
12 changes: 12 additions & 0 deletions docs/source/model-export/export-ncnn.rst
@@ -0,0 +1,12 @@
Export to ncnn
==============

We support exporting LSTM transducer models to `ncnn <https://github.com/tencent/ncnn>`_.

Please refer to :ref:`export-model-for-ncnn` for details.

We also provide `sherpa-ncnn <https://github.com/k2-fsa/sherpa-ncnn>`_,
which performs speech recognition using ``ncnn`` with exported models.
It has been tested on Linux, macOS, Windows, and Raspberry Pi. The project is
self-contained and can be statically linked to produce a binary containing
everything needed.
69 changes: 69 additions & 0 deletions docs/source/model-export/export-onnx.rst
@@ -0,0 +1,69 @@
Export to ONNX
==============

In this section, we describe how to export models to ONNX.

.. hint::

   Only non-streaming conformer transducer models are tested.


When to use it
--------------

Use this method if you want to run the pretrained model with an
inference framework that supports ONNX.


How to export
-------------

We use
`<https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/pruned_transducer_stateless3>`_
as an example in the following.

.. code-block:: bash

   cd egs/librispeech/ASR
   epoch=14
   avg=2

   ./pruned_transducer_stateless3/export.py \
     --exp-dir ./pruned_transducer_stateless3/exp \
     --bpe-model data/lang_bpe_500/bpe.model \
     --epoch $epoch \
     --avg $avg \
     --onnx 1

It will generate the following files inside ``pruned_transducer_stateless3/exp``:

- ``encoder.onnx``
- ``decoder.onnx``
- ``joiner.onnx``
- ``joiner_encoder_proj.onnx``
- ``joiner_decoder_proj.onnx``
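
As a quick sanity check after export, one might verify that all five files
exist before handing them to a decoder. The sketch below pairs each file with
the corresponding ``onnx_pretrained.py`` flag used in this section;
``ONNX_FILES`` and ``onnx_model_args`` are hypothetical names introduced for
this example.

.. code-block:: python

   from pathlib import Path
   from typing import Dict, List

   # Flag -> file name, following the onnx_pretrained.py command in this section.
   ONNX_FILES: Dict[str, str] = {
       "--encoder-model-filename": "encoder.onnx",
       "--decoder-model-filename": "decoder.onnx",
       "--joiner-model-filename": "joiner.onnx",
       "--joiner-encoder-proj-model-filename": "joiner_encoder_proj.onnx",
       "--joiner-decoder-proj-model-filename": "joiner_decoder_proj.onnx",
   }


   def onnx_model_args(exp_dir: Path) -> List[str]:
       """Return the model-related flags, failing early if a file is missing."""
       args: List[str] = []
       for flag, name in ONNX_FILES.items():
           path = exp_dir / name
           if not path.is_file():
               raise FileNotFoundError(f"missing exported file: {path}")
           args += [flag, str(path)]
       return args

The returned list can be passed to ``subprocess.run()`` together with the
script path, ``--bpe-model``, and the sound files.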

You can use ``./pruned_transducer_stateless3/onnx_pretrained.py`` to decode
sound files with the generated files:

.. code-block:: bash

   ./pruned_transducer_stateless3/onnx_pretrained.py \
     --bpe-model ./data/lang_bpe_500/bpe.model \
     --encoder-model-filename ./pruned_transducer_stateless3/exp/encoder.onnx \
     --decoder-model-filename ./pruned_transducer_stateless3/exp/decoder.onnx \
     --joiner-model-filename ./pruned_transducer_stateless3/exp/joiner.onnx \
     --joiner-encoder-proj-model-filename ./pruned_transducer_stateless3/exp/joiner_encoder_proj.onnx \
     --joiner-decoder-proj-model-filename ./pruned_transducer_stateless3/exp/joiner_decoder_proj.onnx \
     /path/to/foo.wav \
     /path/to/bar.wav \
     /path/to/baz.wav

How to use the exported model
-----------------------------

We also provide `sherpa-onnx <https://github.com/k2-fsa/sherpa-onnx>`_,
which performs speech recognition using `onnxruntime <https://github.com/microsoft/onnxruntime>`_
with exported models.
It has been tested on Linux, macOS, and Windows.
58 changes: 58 additions & 0 deletions docs/source/model-export/export-with-torch-jit-script.rst
@@ -0,0 +1,58 @@
.. _export-model-with-torch-jit-script:

Export model with torch.jit.script()
====================================

In this section, we describe how to export a model via
``torch.jit.script()``.

When to use it
--------------

If we want to use our trained model with TorchScript,
we can use ``torch.jit.script()``.

.. hint::

   See :ref:`export-model-with-torch-jit-trace`
   if you want to use ``torch.jit.trace()``.

How to export
-------------

We use
`<https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/pruned_transducer_stateless3>`_
as an example in the following.

.. code-block:: bash

   cd egs/librispeech/ASR
   epoch=14
   avg=1

   ./pruned_transducer_stateless3/export.py \
     --exp-dir ./pruned_transducer_stateless3/exp \
     --bpe-model data/lang_bpe_500/bpe.model \
     --epoch $epoch \
     --avg $avg \
     --jit 1

It will generate a file ``cpu_jit.pt`` in ``pruned_transducer_stateless3/exp``.

.. caution::

   Don't be confused by ``cpu`` in ``cpu_jit.pt``. We move all parameters
   to CPU before saving it into a ``pt`` file; that's why we use ``cpu``
   in the filename.

How to use the exported model
-----------------------------

Please refer to the following pages for usage:

- `<https://k2-fsa.github.io/sherpa/python/streaming_asr/emformer/index.html>`_
- `<https://k2-fsa.github.io/sherpa/python/streaming_asr/conv_emformer/index.html>`_
- `<https://k2-fsa.github.io/sherpa/python/streaming_asr/conformer/index.html>`_
- `<https://k2-fsa.github.io/sherpa/python/offline_asr/conformer/index.html>`_
- `<https://k2-fsa.github.io/sherpa/cpp/offline_asr/gigaspeech.html>`_
- `<https://k2-fsa.github.io/sherpa/cpp/offline_asr/wenetspeech.html>`_
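
Outside of sherpa, a TorchScript file such as ``cpu_jit.pt`` can also be loaded
directly in Python with ``torch.jit.load()``. The helper below is a minimal
sketch; ``load_scripted`` is a hypothetical name, and no model methods are
called because the methods a scripted model exposes depend on the recipe.

.. code-block:: python

   import torch


   def load_scripted(path: str, device: str = "cpu") -> torch.jit.ScriptModule:
       """Load a TorchScript file such as cpu_jit.pt and place it on `device`."""
       # map_location remaps the stored parameters while loading, so a file
       # saved from CPU can also be used on e.g. "cuda:0".
       model = torch.jit.load(path, map_location=device)
       model.eval()
       return model

No Python class definition is needed at load time, since the TorchScript file
carries the model code along with the weights.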
69 changes: 69 additions & 0 deletions docs/source/model-export/export-with-torch-jit-trace.rst
@@ -0,0 +1,69 @@
.. _export-model-with-torch-jit-trace:

Export model with torch.jit.trace()
===================================

In this section, we describe how to export a model via
``torch.jit.trace()``.

When to use it
--------------

If we want to use our trained model with TorchScript,
we can use ``torch.jit.trace()``.

.. hint::

   See :ref:`export-model-with-torch-jit-script`
   if you want to use ``torch.jit.script()``.
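
For readers new to tracing, the toy sketch below shows what
``torch.jit.trace()`` needs: a module plus an example input, on which the
module is run once while the executed operations are recorded. The
``nn.Linear`` layer is only a hypothetical stand-in for a real encoder, and the
shapes are chosen for illustration.

.. code-block:: python

   import torch
   import torch.nn as nn

   # Hypothetical stand-in for a real encoder: 80-dim features in, 4-dim out.
   toy = nn.Linear(80, 4).eval()

   # Tracing runs the module once on the example input and records the ops.
   example = torch.randn(1, 10, 80)  # (batch, frames, feature_dim)
   traced = torch.jit.trace(toy, example)

   # The traced module is called like the original one.
   output = traced(torch.randn(2, 20, 80))

Because only the operations executed on the example input are recorded,
data-dependent Python control flow is not captured by tracing; this is one
reason to consider ``torch.jit.script()`` for models that rely on it.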

How to export
-------------

We use
`<https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/lstm_transducer_stateless2>`_
as an example in the following.

.. code-block:: bash

   iter=468000
   avg=16

   cd egs/librispeech/ASR

   ./lstm_transducer_stateless2/export.py \
     --exp-dir ./lstm_transducer_stateless2/exp \
     --bpe-model data/lang_bpe_500/bpe.model \
     --iter $iter \
     --avg $avg \
     --jit-trace 1

It will generate three files inside ``lstm_transducer_stateless2/exp``:

- ``encoder_jit_trace.pt``
- ``decoder_jit_trace.pt``
- ``joiner_jit_trace.pt``

You can use
`<https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/lstm_transducer_stateless2/jit_pretrained.py>`_
to decode sound files with the following commands:

.. code-block:: bash

   cd egs/librispeech/ASR

   ./lstm_transducer_stateless2/jit_pretrained.py \
     --bpe-model ./data/lang_bpe_500/bpe.model \
     --encoder-model-filename ./lstm_transducer_stateless2/exp/encoder_jit_trace.pt \
     --decoder-model-filename ./lstm_transducer_stateless2/exp/decoder_jit_trace.pt \
     --joiner-model-filename ./lstm_transducer_stateless2/exp/joiner_jit_trace.pt \
     /path/to/foo.wav \
     /path/to/bar.wav \
     /path/to/baz.wav

How to use the exported models
------------------------------

Please refer to
`<https://k2-fsa.github.io/sherpa/python/streaming_asr/lstm/index.html>`_
for its usage in `sherpa <https://k2-fsa.github.io/sherpa/python/streaming_asr/lstm/index.html>`_.
You can also find pretrained models there.
14 changes: 14 additions & 0 deletions docs/source/model-export/index.rst
@@ -0,0 +1,14 @@
Model export
============

In this section, we describe various ways to export models.



.. toctree::

   export-model-state-dict
   export-with-torch-jit-trace
   export-with-torch-jit-script
   export-onnx
   export-ncnn