The Greek-Multimodal-Speech-Dataset-Corpus-v1 is a comprehensive dataset designed for research and development in multimodal machine learning, speech recognition, and accessibility technologies.

dkourem/Multimodal_Greek_Sign_Language_and_Lip_Reading_v1

Multimodal_Greek_Sign_Language_and_Lip_Reading_v1

The Multimodal_Greek_Sign_Language_and_Lip_Reading_v1 is a comprehensive dataset designed for research and development in multimodal machine learning, speech recognition, and accessibility technologies.

Description

This dataset focuses on the Greek language and provides high-quality, synchronized data across three modalities:

  • Speech: High-resolution audio recordings of spoken Greek.
  • Lip Movement: Video recordings capturing detailed lip movements for lip-sync and speech alignment tasks.
  • Subtitles: Accurate text transcriptions and timestamps synchronized with audio and video.
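As an illustration of how the subtitle timestamps could be aligned with the audio and video, here is a minimal sketch of an .srt reader that converts each cue to seconds. It assumes the subtitle files follow the common SubRip layout (index line, `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then text); the `Cue` class and `parse_srt` function are hypothetical names, not part of this repository, and the sample text is made up.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One subtitle cue (hypothetical helper type, not from the dataset)."""
    index: int
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> List[Cue]:
    """Parse SubRip text into cues, skipping malformed blocks."""
    cues = []
    normalized = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->", 1)
        cues.append(
            Cue(
                index=int(lines[0].strip()),
                start=_to_seconds(start),
                end=_to_seconds(end.split()[0]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


# Made-up sample in SubRip layout, for illustration only.
sample = (
    "1\n00:00:01,000 --> 00:00:03,500\nΚαλημέρα σας\n\n"
    "2\n00:00:04,000 --> 00:00:06,250\nΕυχαριστώ πολύ\n"
)
cues = parse_srt(sample)
```

Each cue's `start` and `end` can then be used to slice the matching audio samples or video frames, for example by multiplying by the sample rate or frame rate of the recording.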

Suitable for Tasks Such As:

  • Automatic Speech Recognition (ASR)
  • Audio-Visual Speech Recognition (AVSR)
  • Text-to-Speech (TTS) and Speech-to-Text (STT) applications
  • Accessibility tools for the deaf and hard of hearing
  • Multimodal sentiment analysis and natural language processing
  • Sign language recognition and translation tasks (SLR, SLT)

Key Features:

  • Language: Greek (Hellenic)
  • Format: Includes .wav for audio, .mp4 for video, and .srt/.json for subtitles and timestamps.
  • High Quality: Captured in a professional studio environment for maximum clarity and precision.
  • Accessibility: Designed with a focus on inclusivity and use in accessibility research.
  • Sign Language: High-quality video recordings of Greek Sign Language capturing detailed signing and facial expressions, with synchronized subtitles and spoken-audio translation.
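Given the file formats listed above, one way to assemble samples is to group files that share a base name. The sketch below assumes that a recording's .wav, .mp4, and .srt/.json files share the same file stem, which this README does not confirm; `group_by_stem` and `complete_samples` are hypothetical helpers, and the directory layout of the actual dataset may differ.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable

# Map each extension named in the README to a modality label (labels are our own).
EXTENSIONS = {
    ".wav": "audio",
    ".mp4": "video",
    ".srt": "subtitles",
    ".json": "timestamps",
}


def group_by_stem(root) -> Dict[str, Dict[str, Path]]:
    """Group files under `root` by file stem, keyed by modality label.

    Assumes stems are unique per recording; files with the same stem in
    different subfolders would overwrite each other here.
    """
    samples: Dict[str, Dict[str, Path]] = defaultdict(dict)
    for path in sorted(Path(root).rglob("*")):
        kind = EXTENSIONS.get(path.suffix.lower())
        if kind and path.is_file():
            samples[path.stem][kind] = path
    return dict(samples)


def complete_samples(
    samples: Dict[str, Dict[str, Path]],
    required: Iterable[str] = ("audio", "video", "subtitles"),
) -> Dict[str, Dict[str, Path]]:
    """Keep only the samples that have every required modality."""
    required = tuple(required)
    return {
        stem: files
        for stem, files in samples.items()
        if all(kind in files for kind in required)
    }
```

For example, `complete_samples(group_by_stem("path/to/dataset"))` would return only the recordings for which audio, video, and subtitles were all found, where `path/to/dataset` stands for a local copy.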

Versioning

This is version 1.0 of the dataset. Future versions may include additional speakers, dialects, or extended annotations.

Citation

If you use this dataset, please cite it as: Dimitris Kouremenos, Klimis Ntalianis. (2024). Greek-Multimodal-Speech-Dataset-Corpus-v1. Zenodo. https://doi.org/[Insert DOI Here]

Acknowledgments

Special thanks to the contributors, linguists, and accessibility experts involved in the dataset creation process.

Dataset repository

The dataset is located at https://huggingface.co/datasets/dkourem/Multimodal_Greek_Sign_Language_and_Lip_Reading

License and Usage Restrictions

The Multimodal_Greek_Sign_Language_and_Lip_Reading_v1 is protected under the following terms:

Copyright

Copyright © 2024 Dr. Dimitris Kouremenos and Prof. Klimis Ntalianis.
All rights reserved.

Usage Restrictions

  • Any use, reproduction, modification, or distribution of this dataset and the associated code requires prior written permission from the authors.
  • The dataset and code are intended for academic and research purposes only. Commercial use is strictly prohibited unless explicitly authorized and approved in writing by the authors.

Requesting Permission

For inquiries and permissions, please contact:

  • Dr. Dimitris Kouremenos: [dkourem[alpha]@gmail.com]
  • Prof. Klimis Ntalianis: [kntal[alpha]uniwa.gr]

By accessing this repository or dataset, you agree to comply with the terms stated above.
