sanskrit-sandhi/sanskrit

sanskrit

Sanskrit computational linguistics

pythonTransliteration: Framework for transliteration using Python (the code is incomplete; it contains only the framework).

sandhiToolEvaluation: Contains Python code for automatically evaluating sandhi tools, namely jnu, inria, and uoh.

workspace: Contains a Java framework for transliteration and for merging the uoh sandhi corpus.

astadhyay: Contains the splits extracted from http://sanskritdocuments.org/learning_tools/ashtadhyayi/

dictionary: Contains all words extracted from the Sanskrit-Sanskrit and Sanskrit-English dictionaries at http://www.sanskrit-lexicon.uni-koeln.de/

goldCorpus: Contains manually created sandhis based on 271 rules, together with their outputs from different sandhi-splitting tools.

presentation: Midterm presentation.

uohCorpus: Manually created sandhi corpus from the uoh website, http://sanskrit.uohyd.ac.in/Corpus/

- Files with extension '.txt' are the original files.
- Files with extension '.out' are the files after merging multiple lines.
- Files with extension '.single.out' contain the words where the left side (merged word) is of length one.
- Files with extension '.txt.out.dict' are the files where all the right-side words (split words) are in the dictionary.
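The two filters described for the '.single.out' and '.txt.out.dict' files could be sketched roughly as follows. This is only an illustration, not the code in this repository: the line format `merged => part1 + part2`, the function names `parse_entry` and `filter_entries`, and the reading of "length one" as "a single whitespace-separated token" are all assumptions made for the example.

```python
def parse_entry(line, sep="=>", joiner="+"):
    """Split one corpus line into (merged_word, [split_words]).

    The 'merged => part1 + part2' layout is a hypothetical format,
    not necessarily the one used by the uoh corpus files.
    """
    left, right = line.split(sep, 1)
    parts = [p.strip() for p in right.split(joiner) if p.strip()]
    return left.strip(), parts


def filter_entries(lines, dictionary, sep="=>"):
    """Return (single, in_dict) lists of (merged, parts) pairs.

    single:  entries whose left side is one token (assumed meaning of
             "length one").
    in_dict: entries whose right-side words are all in `dictionary`.
    """
    single, in_dict = [], []
    for line in lines:
        if sep not in line:
            continue  # skip lines that do not look like an entry
        merged, parts = parse_entry(line, sep=sep)
        if len(merged.split()) == 1:
            single.append((merged, parts))
        if parts and all(p in dictionary for p in parts):
            in_dict.append((merged, parts))
    return single, in_dict


if __name__ == "__main__":
    sample = ["tatraiva => tatra + eva", "a b => a + b", "not an entry"]
    words = {"tatra", "eva", "a"}
    single, in_dict = filter_entries(sample, words)
    print(single)
    print(in_dict)
```

Loading the dictionary into a `set` keeps each membership check cheap, which matters if the word list extracted into the dictionary directory is large.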

goldCorpus.xls: Outputs of the manual and automated tools.
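As a rough illustration of how tool outputs might be compared against the manually created splits, the sketch below computes exact-match accuracy per tool. It is not the evaluation code from sandhiToolEvaluation: the metric, the in-memory dictionary layout, and the names `normalize`, `evaluate`, `tool_a`, and `tool_b` are assumptions chosen for the example.

```python
def normalize(split):
    """Turn a list of split words into a comparable tuple."""
    return tuple(part.strip() for part in split)


def evaluate(gold, predictions):
    """Return exact-match accuracy per tool.

    gold:        {merged_word: [expected split words]}
    predictions: {tool_name: {merged_word: [predicted split words]}}
    A word the tool did not split at all counts as wrong (an assumption).
    """
    scores = {}
    for tool, outputs in predictions.items():
        correct = sum(
            1
            for word, expected in gold.items()
            if word in outputs
            and normalize(outputs[word]) == normalize(expected)
        )
        scores[tool] = correct / len(gold) if gold else 0.0
    return scores


if __name__ == "__main__":
    gold = {
        "tatraiva": ["tatra", "eva"],
        "devendra": ["deva", "indra"],
    }
    predictions = {
        "tool_a": {"tatraiva": ["tatra", "eva"], "devendra": ["deva", "indra"]},
        "tool_b": {"tatraiva": ["tatra", "iva"]},
    }
    for tool, score in evaluate(gold, predictions).items():
        print(tool, score)
```

Exact match is the strictest choice; a tool that returns several candidate splits per word would need a different rule, such as counting a hit when any candidate matches.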
