
Feat: Visualization tool #107

Merged: 37 commits merged into facebookresearch:main on Aug 14, 2024
Conversation

@Epic-Eric (Collaborator) commented Jul 21, 2024

Description

Use Matplotlib to generate graphs that allow users to visualize speech transcription & translation data.

1st graph: Staircase graph.
Horizontal arrows represent the wait-k delays from reading the source (x-axis, in seconds); vertical arrows represent the output words from writing to the target (y-axis, in words).

2nd graph: Waveform graph.
The waveform is taken from the provided audio and is displayed below the staircase graph with a shared x-axis (delay time), which allows convenient comparison and lookup of timestamps of interest.
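For a rough idea of how the two panels fit together, here is a minimal Matplotlib sketch using synthetic delays and a synthetic waveform; the variable names and values are illustrative and are not taken from simuleval/utils/visualize.py.

import numpy as np
import matplotlib.pyplot as plt

# Illustrative per-word emission delays (seconds) and the emitted target words.
delays = [1.5, 2.0, 2.5, 3.0, 3.5, 4.5]
words = ["hello", "world", "this", "is", "a", "test"]

# Synthetic 22.05 kHz waveform standing in for the source audio.
sample_rate = 22050
t = np.linspace(0, 5.0, int(5.0 * sample_rate), endpoint=False)
waveform = 0.3 * np.sin(2 * np.pi * 220 * t) * np.exp(-t / 3)

fig, (ax_steps, ax_wave) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))

# Staircase: horizontal segments are source reads (wait-k delay),
# vertical jumps are target writes.
ax_steps.step(delays, range(1, len(words) + 1), where="post")
for i, word in enumerate(words):
    ax_steps.annotate(word, (delays[i], i + 1), xytext=(4, 4), textcoords="offset points")
ax_steps.set_ylabel("target words written")

# Waveform on the shared x-axis so timestamps line up with the staircase.
ax_wave.plot(t, waveform, linewidth=0.5)
ax_wave.set_xlabel("delay / time (s)")
ax_wave.set_ylabel("amplitude")

fig.savefig("visual_example.png", dpi=150)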

Related issues: #15, #84

Example inputs

  1. Simply add --visualize to the command-line arguments, for example:
simuleval \
    --agent whisper_waitk.py \
    --source-segment-size 500 \
    --waitk-lagging 3 \
    --source source.txt --target reference/transcript.txt \
    --output output --quality-metrics WER --visualize
  2. Visualization also works with the --score-only command, which reads data from instances.log without running inference and saves time if you just want the scores.
simuleval --score-only --output output --visualize

Both commands will generate the corresponding graphs in the output/visual directory.

Example output

[example output image: staircase graph with the waveform below]

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • test_visualization.py: both modes (inference and --score-only) pass, with the expected number of graphs in the output/visual directory (see the sketch after this list).
  • A variety of audio files were visualized to confirm that the format stays consistent and the words are easy to read.
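As a rough illustration of the kind of check described above (not the PR's actual test_visualization.py), a test could assert that the visual directory exists and contains generated figures:

from pathlib import Path

def test_visual_outputs_present():
    # Assumes the CLI has already been run with --visualize and --output output.
    visual_dir = Path("output") / "visual"
    assert visual_dir.is_dir(), "output/visual directory was not created"
    assert list(visual_dir.glob("*.png")), "expected at least one generated graph"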

Note

  • Only audio files with a 22 kHz (22050 Hz) sample rate have been tested to work. If you use iPhone's Voice Memos, which record at 44.1 kHz, lower the sample rate first. Install sox (macOS users can use Homebrew):
    brew install sox

Then:

pip install sox
sox test.wav -r 22050 test_22k.wav

Then, put test_22k.wav in source.txt and provide its transcript as the reference text in reference/transcript.txt.
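If you prefer to resample from Python, pip install sox installs the pysox wrapper, which exposes the same conversion (it still needs the sox binary on your PATH). A minimal sketch, with the file names matching the example above:

import sox

# Downsample a 44.1 kHz recording to the 22.05 kHz rate expected above.
tfm = sox.Transformer()
tfm.rate(samplerate=22050)
tfm.build("test.wav", "test_22k.wav")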

Special thanks to:

My MLH Fellowship mentor: @xutaima

@facebook-github-bot (Contributor) commented:

Hi @Epic-Eric!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@facebook-github-bot added the "CLA Signed" label on Jul 21, 2024. (This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)
@facebook-github-bot (Contributor) commented:

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@xutaima self-requested a review on July 24, 2024 at 15:21.
Review threads (now outdated/resolved) were opened on the following files:
.gitignore
examples/speech_to_text/output/config.yaml
examples/speech_to_text/reference/transcript.txt
examples/speech_to_text/whisper_waitk.py
simuleval/cli.py
simuleval/data/dataloader/dataloader.py
simuleval/evaluator/evaluator.py
simuleval/utils/agent.py
simuleval/utils/visualize.py
@xutaima (Contributor) commented Jul 31, 2024

Hi @Epic-Eric, thanks for the PR! It looks good in general! A few suggestions:

  • Could you clean up the code a little bit? E.g., remove the comments and debug files.
  • Make sure the test cases all pass.
  • Add your own test here.

@xutaima mentioned this pull request on Jul 31, 2024.
@@ -125,8 +125,8 @@ class IterableDataloader:

     @abstractmethod
     def __iter__(self):
-        ...
+        pass
@Epic-Eric (Collaborator, Author) commented:
Changed to pass, since the Black formatter versions used under Python 3.7 and 3.8 treat the ellipsis differently. The keyword pass also implies coming back to this code later on.

pytest simuleval/test/test_evaluator.py
pytest simuleval/test/test_remote_evaluation.py
pytest simuleval/test/test_s2s.py
pytest simuleval/test/test_visualize.py
@Epic-Eric (Collaborator, Author) commented:

Black's formatting

@@ -40,6 +40,8 @@
         "bitarray==2.6.0",
         "yt-dlp",
         "pydub",
+        "openai-whisper",
+        "editdistance",
     ],
@Epic-Eric (Collaborator, Author) commented:

Needed for running Whisper.

@xutaima (Contributor) commented:

Hi Eric, could you remove the dependency on openai-whisper and pip install openai-whisper in the test plan?

@Epic-Eric (Collaborator, Author) commented:

Sure! Just curious, why don't we put it in setup.py so users can run custom audio files?

@Epic-Eric (Collaborator, Author) commented:

You mean to put it in main.yaml instead, right?

@Epic-Eric (Collaborator, Author) commented:

Done!

open(self.output / "instances.log", "a")
if self.output
else contextlib.nullcontext()
) as file:
system.reset()
@Epic-Eric (Collaborator, Author) commented:

Black's formatting
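For context, the snippet above uses a conditional context manager: append to instances.log only when an output directory is set, otherwise enter a no-op context. A standalone sketch of that pattern (the path and the write below are illustrative, not the evaluator's actual code):

import contextlib
from pathlib import Path

output = Path("output")  # stands in for self.output; None when no --output is given
if output:
    output.mkdir(parents=True, exist_ok=True)

with (
    open(output / "instances.log", "a") if output else contextlib.nullcontext()
) as file:
    # nullcontext() yields None, so guard before writing.
    if file is not None:
        file.write("instance record\n")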

@xutaima (Contributor) left a review comment:

Hi Eric, it looks great! Could you address the comments on openai-whisper? After that we can merge the PR.


@xutaima merged commit 7b45f68 into facebookresearch:main on Aug 14, 2024.
2 checks passed