This multilingual transcription web application uses the Whisper speech-to-text model, fine-tuned with LoRA and bitsandbytes quantization for better accuracy and performance. Built with FastAPI, it offers real-time, scalable transcription in multiple languages.

mayssakorbi/WebApp-Multilingual-Speech-Recognition

WebApp-Multilingual-Speech-Recognition

A web application for multilingual speech recognition and transcription, built on a fine-tuned Whisper model.

About the Project

This web application transcribes uploaded audio files into text and supports multiple languages. It relies on a fine-tuned Whisper model for accuracy. Typical uses include generating accessible content and transcribing recorded meetings or other multilingual audio files.

Features

Speech-to-text transcription

Support for multiple languages

Intuitive and responsive web interface

AI-powered for high transcription accuracy

Technologies Used

Frontend: HTML, CSS, JavaScript

Backend: FastAPI, WebSockets

AI Model: Whisper (fine-tuned version)

Other Tools: Python, Hugging Face, LoRA, bitsandbytes
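
As a rough illustration of how LoRA and bitsandbytes are commonly combined when fine-tuning Whisper with the Hugging Face libraries, the sketch below loads a checkpoint in 8-bit and attaches LoRA adapters. It is not taken from this repository: the checkpoint name `openai/whisper-small`, the 8-bit setting, the hyperparameter values, and the helper name `build_lora_whisper` are all assumptions chosen for the example.

```python
# Illustrative LoRA hyperparameters (assumed; the README does not state
# the values actually used for this project).
LORA_SETTINGS = {
    "r": 8,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    # Attention projections in the Hugging Face Whisper implementation.
    "target_modules": ["q_proj", "v_proj"],
}


def build_lora_whisper(model_id="openai/whisper-small"):
    """Load Whisper in 8-bit and wrap it with LoRA adapters (sketch only)."""
    # Imported lazily so this file can be imported without the heavy
    # dependencies (transformers, peft, bitsandbytes) being installed.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import BitsAndBytesConfig, WhisperForConditionalGeneration

    # bitsandbytes performs the 8-bit quantization at load time.
    model = WhisperForConditionalGeneration.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",
    )
    # Prepare the quantized model so adapter weights can be trained stably.
    model = prepare_model_for_kbit_training(model)

    # Only the small LoRA matrices are trained; the base weights stay frozen.
    lora_config = LoraConfig(bias="none", **LORA_SETTINGS)
    return get_peft_model(model, lora_config)
```

Calling `build_lora_whisper()` needs a GPU runtime such as the Colab environment listed under Prerequisites. The returned PEFT model exposes `print_trainable_parameters()`, which is a quick way to confirm that only a small fraction of the weights will be updated.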

Prerequisites

Python 3.9+: Ensure that Python is installed. You can download it from python.org.

Visual Studio Code: A lightweight and powerful editor with extensions like Python and Prettier. Download it from code.visualstudio.com.

Google Colab Pro: Use this for fine-tuning the Whisper model; it offers enhanced computing resources.

Usage

Download the files to your local machine.

Launch the application in Visual Studio Code.

Choose a file to transcribe and upload it through the web interface.

View the transcription results once the file is processed.
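
The upload and transcription steps above could be served by a FastAPI endpoint along the following lines. This is a minimal sketch under stated assumptions, not the repository's actual code: the `/transcribe` route, the `create_app` and `is_supported_audio` helpers, the list of accepted extensions, and the injected `transcribe` callable (any function that takes a file path and returns text) are all hypothetical.

```python
import os
import tempfile

# Assumed set of accepted formats; the README does not list them.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def is_supported_audio(filename):
    """Return True if the filename has one of the accepted audio extensions."""
    ext = os.path.splitext(filename or "")[1].lower()
    return ext in SUPPORTED_EXTENSIONS


def create_app(transcribe):
    """Build a FastAPI app whose upload route delegates to `transcribe(path)`."""
    # Imported lazily so the helper above works without FastAPI installed.
    from fastapi import FastAPI, File, HTTPException, UploadFile

    app = FastAPI()

    @app.post("/transcribe")
    async def transcribe_upload(file: UploadFile = File(...)):
        if not is_supported_audio(file.filename):
            raise HTTPException(status_code=400, detail="Unsupported audio format")

        # Persist the upload to a temporary file so the model can read it by path.
        suffix = os.path.splitext(file.filename)[1]
        with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
            tmp.write(await file.read())
            path = tmp.name
        try:
            text = transcribe(path)
        finally:
            os.remove(path)  # Always clean up the temporary file.
        return {"filename": file.filename, "text": text}

    return app
```

File uploads in FastAPI require the `python-multipart` package. If the sketch were saved as a hypothetical `main.py` with `app = create_app(my_transcribe)` at module level, it could be served with `uvicorn main:app`. Because `transcribe` runs synchronously inside the handler, a long audio file would block other requests, so a production setup would likely move that call to a worker thread or task queue.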

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments
