Feat: Docker environment for remote speech to text evaluation #110

Epic-Eric · 2024-08-11T20:24:36Z

Description

Docker provides a container in which the operating system and dependencies are uniform, which streamlines the setup process.

To build the Docker image, first change into the speech_to_text example directory from the SimulEval root folder:

cd examples/speech_to_text

Then, build the Docker image with:

docker build -t simuleval-speech-to-text:1.0 .

Next, run the remote evaluation server using the Docker image:

docker run -p 8888:8888 simuleval-speech-to-text:1.0

This binds port 8888 of the container (server) to port 8888 on the local machine (client).
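
Before starting the client, it can help to confirm that the published port is reachable from the host. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is hypothetical and not part of SimulEval. A successful connection only shows that something accepts TCP connections on localhost:8888 (with Docker this can be the port mapping itself), not that the evaluation server has finished starting.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; not part of SimulEval.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8888 is the host port published by `docker run -p 8888:8888`.
    if port_is_open("localhost", 8888):
        print("Port 8888 accepts connections; the client can be started.")
    else:
        print("Nothing is listening on localhost:8888 yet.")
```

If the check fails, the container is most likely not running or was started without the -p 8888:8888 mapping.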

To pass data to the server and run the remote evaluation, open another terminal and change its directory to examples/speech_to_text. You can then access the server with, for instance, the following command:

Example input

simuleval --remote-eval --remote-port 8888 \
    --source-segment-size 500 \
    --source source.txt --target reference/transcript.txt \
    --source-type speech --target-type text \
    --output output --quality-metrics WER
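
For readers unfamiliar with the metric requested by --quality-metrics WER: word error rate is conventionally the word-level edit distance between hypothesis and reference, divided by the number of reference words. The sketch below illustrates that textbook definition only. The function name word_error_rate is hypothetical, and SimulEval's own scorer may normalize text or scale the value differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: word-level Levenshtein distance / number of reference words.

    Illustrative only; this is not SimulEval's scorer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over six words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.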

Example output

Type of change

New feature (non-breaking change which adds functionality)
This change requires a documentation update

How Has This Been Tested?

Tested locally first, because the upstream repository had some bugs, as detailed in #109.
By using COPY . /Simuleval instead of RUN git clone https://github.com/facebookresearch/SimulEval in the Dockerfile, I tested my local changes and ensured they worked as expected.

xutaima · 2024-08-14T18:25:12Z

.github/workflows/main.yml

@@ -34,6 +34,7 @@ jobs:
          pip install sentencepiece
          pip install -e .
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+          python -c "import nltk; nltk.download('averaged_perceptron_tagger_eng')"


Just curious: why is nltk added here?

Epic-Eric · 2024-08-14

Without the nltk download, test_tree_pipeline_cmd in test_agent_pipeline.py failed. This only started happening recently; it wasn't an issue when I submitted the visualization PR. I don't know why, but the error message told me to run nltk.download, so I did.

Epic-Eric and others added 11 commits July 31, 2024 18:24

Add Dockerfile

7dddc3d

added dependecies for docker

09b18be

dummy docker bug fix

db7ee92

dummy docker bug fix

50fdd59

Change dockerfile to do remote evaluation

ca7c6d1

Merge branch 'facebookresearch:main' into docker

1ca0865

Merge branch 'main' of https://github.com/facebookresearch/SimulEval …

9eca342

…into docker

finalize dockerfile

9a7bbaf

debug segments.py

1397f21

test docker locally, works

7a63288

Merge branch 'docker' of https://github.com/Epic-Eric/SimulEval into …

acd55aa

…docker Update to local remote changes

facebook-github-bot added the CLA Signed label Aug 11, 2024

Epic-Eric added 4 commits August 11, 2024 13:29

download nltk for pytest

4c02ed8

add docker documentation for speech to text model

b30f311

add | to formatting

d8c8ef2

formatting change

f5497b7

xutaima reviewed Aug 14, 2024

take out nltk

f173da9

xutaima merged commit 6101ba1 into facebookresearch:main Aug 14, 2024
2 checks passed
