Speech synthesis (TTS) in low-resource languages by training FastPitch from scratch and fine-tuning HiFi-GAN
Updated Dec 5, 2023 - Python
Free AI & Community powered Learning Experience
Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine
Extractive Question-Answering with BERT on SQuAD v2.0 (Stanford Question Answering Dataset) using NVIDIA PyTorch Lightning
📄 SmartSRT is a command-line tool for generating accurate subtitles with per-word timestamps. It uses WhisperAI for speech transcription, NVIDIA NeMo for diarization, and OpenCV for face recognition. The program is good at creating high-accuracy subtitles. 🎧💻⚙️
Post-training quantization of an NVIDIA NeMo ASR model
Automatic transcriber built with the NVIDIA NeMo AI toolkit. It transcribes speech to text in real time from any source. Running it on a local machine requires a CUDA-capable GPU. When set up with virtual audio cables, it can transcribe whatever audio is playing in real time, with no other requirements.
This bootcamp is designed to give NLP researchers an end-to-end overview of the fundamentals of the NVIDIA NeMo framework, a complete solution for building large language models. It also includes hands-on exercises complemented by tutorials, code snippets, and presentations to help researchers get started with NeMo LLM Service and Guardrails.
LLM tutorial materials, including but not limited to NVIDIA NeMo, TensorRT-LLM, Triton Inference Server, and NeMo Guardrails.
The simplest & most comprehensible tutorial on speaker identification with NVIDIA's `Nemo`.
Training and tuning a text-to-speech model with NVIDIA NeMo and Weights and Biases
Module for Russian speech recognition using NVIDIA NeMo.
Generative AI with NVIDIA NeMo
Audio profanity detector desktop app developed with PyQt5 and NVIDIA NeMo.
Implementation of a Kazakh speech-to-text model using the NVIDIA NeMo toolkit for efficient transcription of spoken Kazakh into text.
PodcastProject Analytics Toolkit - a project that generates analytics from various input data. The exported data is intended for use on a PodcastProject website.