
Benchmarks #773

Merged

Conversation

trungkienbkhn
Collaborator

Benchmarks

This PR introduces benchmarking functionality for memory usage, Word Error Rate (WER), and speed in faster-whisper.

1. Memory

GPU

Use a Python thread with the py3nvml module to monitor GPU memory. This approach continuously tracks GPU memory consumption at set intervals, then returns the maximum memory used during the inference function.
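As a rough illustration of the polling-thread idea (a hypothetical `MaxMemoryMonitor` helper, not the PR's actual code; the real implementation reads GPU memory via py3nvml):

```python
import threading
import time

class MaxMemoryMonitor:
    """Polls a sampling function on a background thread at a fixed
    interval and records the peak value observed. Illustrative sketch;
    the PR's actual monitor may be structured differently."""

    def __init__(self, sample_fn, interval=0.5):
        self.sample_fn = sample_fn  # e.g. reads GPU memory used, in bytes
        self.interval = interval
        self.max_value = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            self.max_value = max(self.max_value, self.sample_fn())
            time.sleep(self.interval)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

# With NVML bindings, the sampler could be (requires an NVIDIA GPU):
#   pynvml.nvmlInit()
#   handle = pynvml.nvmlDeviceGetHandleByIndex(0)
#   sampler = lambda: pynvml.nvmlDeviceGetMemoryInfo(handle).used
```

The context-manager form makes it easy to wrap a single inference call and read `monitor.max_value` afterwards.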

RAM

Use the memory_profiler module to measure the maximum increase in memory usage.
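The PR uses memory_profiler for this; as a stdlib-only analogue for illustration, tracemalloc can report the peak allocation during a call (it counts Python-level allocations rather than process RSS, so the numbers will differ from memory_profiler's):

```python
import tracemalloc

def max_memory_increase(func, *args, **kwargs):
    """Run func and return (result, peak_bytes), where peak_bytes is
    the peak Python-level allocation during the call. Sketch only; the
    PR's measurement uses memory_profiler, roughly:
        memory_usage((func, args), max_usage=True)
    which tracks process RSS instead."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```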

2. WER

Evaluate the faster-whisper model on the LibriSpeech validation-clean dataset in streaming mode, meaning no audio data has to be downloaded to your local device.
Use the jiwer and evaluate modules from Hugging Face to calculate WER.
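WER is the word-level edit distance divided by the number of reference words. A minimal dependency-free version, for illustration only (jiwer additionally applies text normalization before scoring):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance over the number
    of reference words. Minimal sketch; the PR scores with jiwer."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, 1):
            curr[j] = min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (0 if words match)
            )
        prev = curr
    return prev[-1] / len(ref)
```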

3. Speed

Take the minimum over args.repeat runs, where each run is the average time of 10 inference function calls.
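That scheme can be sketched as follows (illustrative names; `args.repeat` is the script's CLI option, and the stdlib `timeit.repeat` implements essentially the same min-of-averages pattern):

```python
import time

def benchmark(func, repeat=3, number=10):
    """Return the best (minimum) average runtime in seconds: each of
    `repeat` rounds averages `number` calls. The minimum is taken
    because slower rounds reflect system noise, not the code."""
    rounds = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            func()
        rounds.append((time.perf_counter() - start) / number)
    return min(rounds)
```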

Feel free to edit/update with more benchmarking methods!

@BBC-Esq
Contributor

BBC-Esq commented Apr 1, 2024

I'd like to suggest using the psutil library directly instead of memory_profiler, which uses psutil under the hood anyway and hasn't been updated in years. I'd also suggest using the official "pynvml" library, which is actually named nvidia-ml-py on PyPI; it's more up to date.

https://pypi.org/project/nvidia-ml-py/

Here's an example of how I'm using it:

https://github.com/BBC-Esq/VectorDB-Plugin-for-LM-Studio/blob/main/src/metrics_bar.py
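For what it's worth, the psutil side of the suggestion is nearly a one-liner; a minimal sketch (helper names are mine, and a real monitor would sample on a background thread to catch the peak rather than just diffing before/after):

```python
import psutil

def rss_mb() -> float:
    """Resident set size of the current process in MiB, read directly
    with psutil -- the same call memory_profiler wraps internally."""
    return psutil.Process().memory_info().rss / (1024 ** 2)

def rss_increase(func, *args, **kwargs):
    # Coarse before/after comparison; misses transient peaks inside
    # the call, which is why the PR polls from a separate thread.
    before = rss_mb()
    result = func(*args, **kwargs)
    return result, max(0.0, rss_mb() - before)
```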

@trungkienbkhn
Collaborator Author

trungkienbkhn commented Apr 2, 2024

@BBC-Esq, thanks for your suggestion. But I think your proposed pynvml module is not official; it is built from this repo. I tried both (py3nvml and nvidia-ml-py) and they gave the same results.
I used py3nvml the same way as the transformers repo from HF.

@BBC-Esq
Contributor

BBC-Esq commented Apr 2, 2024

This screenshot says that it's maintained by NVIDIA, so I'm not sure what the confusion is...

[screenshot: nvidia-ml-py page on PyPI, listing NVIDIA Corporation as the maintainer]

"py3nvml" on PyPI says that some individual maintains it...

[screenshot: py3nvml page on PyPI, listing an individual maintainer]

If it works, it works... but when I went to the py3nvml repo, the last time it was updated was a while ago...

Any feedback on psutil versus memory_profiler? All the same, perhaps?

@BBC-Esq
Contributor

BBC-Esq commented Apr 2, 2024

As an aside, it's my understanding that "nvidia-ml-py" (i.e. the one I'm using) imports as "pynvml", NOT as "nvidia-ml-py", which is what confused me initially. Multiple forks exist, some with similarly named PyPI packages, so... I'm about 85%-95% sure I have the official one. ;-) The only way it'd matter is if, for example, NVIDIA updates nvidia-ml-py (e.g. to support modern GPUs) while an unofficial fork doesn't... Can't speak to Hugging Face; perhaps they're using an unofficial fork that's regularly updated by its maintainer. Either way, just thought I'd raise the issue in case it matters to you guys!

@BBC-Esq
Contributor

BBC-Esq commented Apr 2, 2024

Don't know if it's helpful, but here's a script of mine that collects more than just GPU VRAM usage... it might be useful for benchmarking. I've found that GPU/CUDA utilization is a useful metric, as is power usage...

https://github.com/BBC-Esq/Nvidia_Gpu_Monitor/blob/main/metrics_pynvml.py

@BBC-Esq
Contributor

BBC-Esq commented Apr 2, 2024

If CTranslate2 ever supports ROCm, that'd be great, and if you want to benchmark on AMD GPUs, I've struggled unsuccessfully to do it. I don't own an AMD GPU (I almost bought one solely to program with), but anyway, here's where my research left off...

BBC-Esq/VectorDB-Plugin-for-LM-Studio#130

@trungkienbkhn
Collaborator Author

trungkienbkhn commented Apr 3, 2024

> This screenshot says that it's maintained by NVIDIA, so I'm not sure what the confusion is...

"NVIDIA Corporation" can simply be set in the author field of the setup.py file, so that alone doesn't prove it's official.

> Don't know if it's helpful, but here's a script of mine that collects more than just GPU VRAM usage... might be helpful for benchmarking

Thanks, I will look at this.

> If CTranslate2 ever supports ROCm, that'd be great, and if you want to benchmark on AMD GPUs, I've struggled unsuccessfully to do it. I don't own an AMD GPU (I almost bought one solely to program with), but anyway, here's where my research left off...

Unfortunately, I don't have one either.

@trungkienbkhn trungkienbkhn force-pushed the faster_whisper_benchmark branch from ba937f9 to 563a51c Compare April 3, 2024 04:43
@trungkienbkhn trungkienbkhn merged commit 6eec077 into SYSTRAN:master May 4, 2024
3 checks passed