PAVoice-Vietnamese-Text-To-Speech-dataset

PAVoice - Vietnamese Text-To-Speech Dataset for Community

Overview

PAVoice is a comprehensive Vietnamese Text-To-Speech dataset recorded in March 2022 by a 23-year-old amateur female vocalist from the North of Vietnam. This dataset is provided to the community for research and development purposes. It offers a substantial collection of spoken sentences, totaling 1.9GB in size and 18.15 hours of audio.

Dataset Composition

The dataset includes 11,256 sentences of varying length covering diverse topics. The sentences were selected from online newspapers to represent real-world use cases, along with novels and short stories by Vu Trong Phung, whose works are in the public domain and carry no copyright restrictions.

Data Format

Audio Format: The audio data is recorded in '.wav' format at 44.1 kHz and subsequently downsampled to 16 kHz. It uses a single channel with 16 bits per sample.
Noise Reduction: The speech audio has been processed using the Facebook Denoiser to improve audio quality.
Phonetic Alignment: The audio files have been passed through Montreal Forced Aligner (MFA) to obtain a .TextGrid file containing timestamps for each character in the text.
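
To read the alignment timestamps without extra dependencies, a minimal sketch along the following lines may help. It assumes the .TextGrid files use Praat's long text format (the format MFA usually writes) and are UTF-8 encoded; neither is confirmed by this README. The function name `read_intervals` is hypothetical, and the sketch returns intervals from all tiers together without distinguishing them.

```python
import re

# Matches one interval entry in a long-format Praat TextGrid:
#   intervals [n]:
#       xmin = <start>
#       xmax = <end>
#       text = "<label>"
# Assumption: long text format. Short-format TextGrids will not match.
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([-+\d.eE]+)\s*'
    r'xmax = ([-+\d.eE]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def read_intervals(textgrid_text):
    """Return (start, end, label) tuples for every non-empty interval.

    Intervals from all tiers are returned in file order. Empty labels
    (silence) are skipped.
    """
    intervals = []
    for match in INTERVAL_RE.finditer(textgrid_text):
        # Praat escapes a double quote inside a label by doubling it.
        label = match.group(3).replace('""', '"')
        if label:
            intervals.append((float(match.group(1)), float(match.group(2)), label))
    return intervals
```

For a file on disk, the text can be loaded with `open(path, encoding="utf-8").read()` and passed to `read_intervals`.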

Audio Samples

For a quick glimpse of the dataset, here's a sample:

Sample 1:
- Audio: 20310-20544_117_1408736_1409894.wav
- Text: sao em lại cười

Download

You can access the PAVoice dataset through the following link: Download PAVoice Dataset

Dataset Directory Structure

The dataset is organized as follows:

dataset
└── wav
    ├── 000000.wav
    ├── 000000.TextGrid
    ├── 000001.wav
    ├── 000001.TextGrid
    ...
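
Given the layout above, each clip can be matched with its alignment file by swapping the file extension. The following is a small sketch, assuming every .wav and its .TextGrid share the same base name in the same folder as shown in the tree; `paired_files` is a hypothetical helper name.

```python
from pathlib import Path


def paired_files(wav_dir):
    """Yield (wav_path, textgrid_path) for every clip that has both files.

    Clips without a matching .TextGrid are skipped silently.
    """
    wav_dir = Path(wav_dir)
    for wav_path in sorted(wav_dir.glob("*.wav")):
        textgrid_path = wav_path.with_suffix(".TextGrid")
        if textgrid_path.exists():
            yield wav_path, textgrid_path
```

Calling `list(paired_files("dataset/wav"))` would then give the clip and alignment pairs in file-name order.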

Dataset Statistics

Number of Clips: 11,256
Total Audio Duration: 18.15 hours
Shortest Audio Clip: 2.17 seconds
Mean Clip Duration: 5.77 seconds
Longest Audio Clip: 12.66 seconds
Sampling Rate: 16 kHz
Bits Per Sample: 16 bits
Number of Channels: 1
Mean Pitch: 226.63 Hz
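
The duration and format figures above can be re-checked locally with Python's standard `wave` module. This is only a sketch, assuming the clips are plain PCM WAV files in a single folder; the helper names `clip_info` and `summarize` are hypothetical, and mean pitch is not computed here because it needs a pitch tracker.

```python
import wave
from pathlib import Path


def clip_info(path):
    """Return (duration_s, sample_rate_hz, bits_per_sample, channels) for one WAV file."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        duration = wf.getnframes() / rate
        return duration, rate, wf.getsampwidth() * 8, wf.getnchannels()


def summarize(wav_dir):
    """Compute clip count, duration statistics, and the set of audio formats found."""
    infos = [clip_info(p) for p in sorted(Path(wav_dir).glob("*.wav"))]
    if not infos:
        raise ValueError(f"no .wav files found in {wav_dir}")
    durations = [info[0] for info in infos]
    return {
        "clips": len(durations),
        "total_hours": sum(durations) / 3600,
        "shortest_s": min(durations),
        "mean_s": sum(durations) / len(durations),
        "longest_s": max(durations),
        # Distinct (sample_rate_hz, bits_per_sample, channels) combinations.
        # A uniform dataset yields a single entry, e.g. (16000, 16, 1).
        "formats": sorted({info[1:] for info in infos}),
    }
```

Running `summarize("dataset/wav")` on the full download should give values comparable to the table above. More than one entry in `formats` would indicate clips that differ from the stated 16 kHz, 16-bit, mono format.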

Screenshot

Please note that the audio data in this dataset is not intended for commercial use.

For any questions or concerns related to the dataset, you can contact [insert your contact information here]. We hope this dataset serves as a valuable resource for your research and development projects. Enjoy exploring the world of Vietnamese Text-To-Speech with PAVoice!
