Persian_g2p: A seq-to-seq model for Persian (Farsi) Grapheme To Phoneme mapping

forked from g2pE, training is implemented using pytorch; but numpy is used in the inference.

Persian_g2p converts Persian (Farsi) graphemes (spelling) to phonemes (pronunciation), which can be used as the pre-processing step for text-to-speech (TTS). Persian language, alike English, is in the category of complex languages, as Persian words are not always written in the way that they are pronounced.

Thanks to the useful repositories that are helpful in surviving Persian language and making this repo:

Tihu dictionary is a Persian pronunciation dictionary, useful for g2p mapping. However, it does not contain all the words. Thus, in this repo a sequence-to-sequence network with cross-entropy loss is trained on the tihu dictionary to handle OOVs.
hazm is used for the Persian text pre-processing and normalization. It was helpful in Word tokenizing and text cleaning, e.g. replacing ي and ك with ی and ک , respectively.
num2fawords package converts numbers to Persian words, e.g. 12.4 --> دوازده و چهار دهم

usage

installing dependencies

install python 3.
install requirements, as follows:

pip install -r requirements.txt

using a pre-trained model

Run the following command with your desired <Persian-text> to see its pronunciation.

python g2p.py --text <Persian-text>

For instance, this is the output for replacing the <Persian-text> with «زان یار دلنوازم شکریست با شکایت»:

python g2p.py --text زان یار دلنوازم شکریست با شکایت
>>> zAn yAr delnavAzam Sokrist bA SekAyat

Please note that the pronunciation of the words that are already available comes from the dictionary; while the pronunciation of the new words (OOV) are predicted using the trained model i.e. «دلنوازم» in this example.

You can also specify your checkpoint with --checkpoint option, as follows:

python g2p.py --text <Persian-text> --checkpoint <path-to-the-checkpoint>

training model using alternative word-pronunciation dictionary

The network architecture is actually a seq2seq GRU-RNN, in which the last hidden vector of the encoder is fed to the decoder, as the below figure.

You may take a look at the hparams.py for default network parameters. The following command starts the training process, making a log folder, naming logs-<run-name>, for storing checkpoint, results, as well as logs of loss and PERs.

 python train.py --name <run-name> --dictionary <path-to-the-word-pronunciation-dictionary>

TODO:

improving the performance; currently the PER is 3.9% on test set but not enough.
adding informal words (slang, such as چاکرتیم- واس ماس and broken language, e.g. نمیشه- میرم- بگم-همشون) to the dictionary

References

If you use this code for research, please cite:

@misc{Persian_g2p2019,
  author = {Rabiee, Azam},
  title = {Persian g2p},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/AzamRabiee/Persian_G2P}}
}

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
imgs		imgs
logs-01		logs-01
README.md		README.md
data_feeder.py		data_feeder.py
expand.py		expand.py
g2p.py		g2p.py
hparams.py		hparams.py
model.py		model.py
requirements.txt		requirements.txt
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Persian_g2p: A seq-to-seq model for Persian (Farsi) Grapheme To Phoneme mapping

usage

installing dependencies

using a pre-trained model

training model using alternative word-pronunciation dictionary

TODO:

References

About

Releases

Packages

Languages

AzamRabiee/Persian_G2P

Folders and files

Latest commit

History

Repository files navigation

Persian_g2p: A seq-to-seq model for Persian (Farsi) Grapheme To Phoneme mapping

usage

installing dependencies

using a pre-trained model

training model using alternative word-pronunciation dictionary

TODO:

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages