Sanskrit computational linguistics
pythonTransliteration: Framework for transliteration in Python (the code is incomplete and contains only the framework).
sandhiToolEvaluation: Contains Python code for automatically evaluating sandhi tools, namely jnu, inria, and uoh.
workspace: Contains a Java framework for transliteration and for merging the uoh sandhi corpus.
astadhyay: Contains the splits extracted from http://sanskritdocuments.org/learning_tools/ashtadhyayi/
dictionary: Contains all words extracted from the Sanskrit-Sanskrit and Sanskrit-English dictionaries at http://www.sanskrit-lexicon.uni-koeln.de/
goldCorpus: Contains manually created sandhis based on 271 rules, together with the outputs of different sandhi splitting tools for them.
presentation: midterm presentation
uohCorpus: Manually created sandhi corpus from the uoh website http://sanskrit.uohyd.ac.in/Corpus/
- Files with extension '.txt' are the original files.
- Files with extension '.out' are the files after merging multiple lines.
- Files with extension '.single.out' contain the words where the left side (merged word) is of length one.
- Files with extension '.txt.out.dict' are the files where all the right-side words (split words) are in the dictionary.
goldCorpus.xls: Outputs of the manual splitting and of the automated tools.
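
The dictionary filter behind the '.txt.out.dict' files in uohCorpus could look roughly like the sketch below. It is not the code in this repository. The line format `merged=part1+part2`, the separators, and the function names `parse_entry` and `filter_by_dictionary` are all assumptions made for illustration; the real corpus files may be laid out differently.

```python
def parse_entry(line, sep="=", joiner="+"):
    """Split a line such as 'merged=part1+part2' into (merged, [parts]).

    The 'merged=part1+part2' layout is a hypothetical format, not
    necessarily the one used by the uoh corpus files.
    """
    merged, _, parts = line.strip().partition(sep)
    return merged.strip(), [p.strip() for p in parts.split(joiner) if p.strip()]


def filter_by_dictionary(lines, dictionary):
    """Keep only entries whose split words are all present in the dictionary."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        merged, parts = parse_entry(line)
        # Entries with no split words, or with any unknown word, are dropped.
        if parts and all(p in dictionary for p in parts):
            kept.append((merged, parts))
    return kept


if __name__ == "__main__":
    dictionary = {"deva", "indra"}
    sample = ["devendra=deva+indra", "mahotsava=mahA+utsava"]
    print(filter_by_dictionary(sample, dictionary))
```

With the sample above, only the `devendra` entry is kept, because `mahA` and `utsava` are not in the toy dictionary. Passing the dictionary as a `set` keeps each membership check cheap even for a large word list.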