Add Idea dump
NirantK authored Apr 8, 2018
1 parent 62001fe commit 545dd0f
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions README.md
@@ -12,10 +12,15 @@ State-of-the-Art Language Modeling and Text Classification in Hindi Language

### TODO
- [x] Language modeling based on Wikipedia dump
- [x] Release Language Models: [Hindi Language Model](https://www.dropbox.com/s/4xef1wcaoon1wd4/hindi2vec-models.7z?dl=0)**
- [ ] Create text classification data
- [x] Release Language Models: [Hindi Language Model](https://www.dropbox.com/s/4xef1wcaoon1wd4/hindi2vec-models.7z?dl=0)
- [ ] Create Text classification Datasets
- [ ] Benchmark text classification with FastText
- [ ] Fine-tuning model for text classification
- [ ] Add a leaderboard and allow submission, similar to SQuAD

#### Idea Dump
- [ ] Repurpose the custom head for transliteration instead of classification: Hindi script (Devanagari) to English script (Roman)
- [ ] Multi-task learning (MTL) tasks for training and inference using custom heads
- [ ] Text-to-speech, using datasets from news recordings or Hindi subtitles of dubbed movies

**Special thanks to Jeremy, Rachel and other contributors to [fastai](https://github.com/fastai/fastai)**. This work reproduces their English-language work for Hindi. Thanks to @cstorm125 for [thai2vec](https://github.com/cstorm125/thai2vec), which inspired this work.
