Note: This repo is updated regularly as I learn. If you want to learn NLP, just start from the first point and work your way to the bottom. Everything is arranged hierarchically, from basic concepts to advanced.
- A comprehensive article on tokenization (covers subword tokenization, Byte Pair Encoding, Unigram Subword Tokenization, WordPiece, SentencePiece); a minimal BPE sketch follows this list
- Subword Tokenization
- Byte Pair Encoding
- Byte Pair Encoding - Wikipedia (simplest explanation)
- Byte Pair Encoding - Paper
- Byte Pair Encoding (video) - Abhishek Thakur
- Comprehensive notebook
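To make the BPE merge loop concrete, here is a minimal training sketch in plain Python, adapted from the algorithm described in the paper above (the toy corpus and the number of merges are made up for illustration):

```python
import re
from collections import Counter

def get_pair_counts(vocab):
    """Count how often each adjacent symbol pair occurs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Merge every occurrence of `pair` into a single new symbol."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Toy corpus: each word is pre-split into characters plus an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}

for _ in range(10):  # learn 10 merge rules
    pairs = get_pair_counts(vocab)
    best = max(pairs, key=pairs.get)
    print("merge:", best)
    vocab = merge_pair(best, vocab)

print(vocab)
```

Frequent character sequences like `es` and `est` get merged into single tokens first, which is exactly why BPE handles rare words gracefully: they fall back to smaller, known subwords.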
spaCy comes in really handy for performing NLP tasks at state-of-the-art speeds. Here are some tutorials to get familiar with it
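As a quick taste before the tutorials, here is a minimal sketch of a spaCy pipeline (it assumes the small English model has been installed with `python -m spacy download en_core_web_sm`):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Tokenization, part-of-speech tags, and lemmas come out of one pipeline pass.
for token in doc:
    print(token.text, token.pos_, token.lemma_)

# Named entities are available on the same Doc object.
for ent in doc.ents:
    print(ent.text, ent.label_)
```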
Great, after implementing a basic project, it's time to get a bit mathematical
Watch the first lecture of the most sought-after NLP course (CS224N by Stanford)
- Efficient Estimation of Word Representations in Vector space - original word2vec paper
- Distributed Representations of Words and Phrases and their Compositionality - negative sampling paper
After watching the above lecture and going through the suggested readings (Stanford CS224N), let's understand more about word2vec
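A minimal sketch of training a skip-gram word2vec model with negative sampling, matching the two papers above. It assumes gensim >= 4.0; the tiny corpus is only for illustration, as real training needs far more data:

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "cat", "sat", "on", "the", "mat"],
]

# sg=1 selects skip-gram; negative=5 enables negative sampling.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, negative=5)

print(model.wv["king"].shape)          # (50,) dense vector for "king"
print(model.wv.most_similar("king"))   # nearest neighbours in embedding space
```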
Great, now it's time to do one more project to solidify the concepts learnt so far
Suggested Readings
- GloVe: Global Vectors for Word Representation (original GloVe paper)
- Improving Distributional Similarity with Lessons Learned from Word Embeddings
- Evaluation methods for unsupervised word embeddings
Additional Readings
- Linear Algebraic Structure of Word Senses, with Applications to Polysemy
- On the Dimensionality of Word Embedding
- A Latent Variable Model Approach to PMI-based Word Embeddings
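To experiment with the ideas from the readings above, you can load pretrained GloVe vectors through gensim's downloader API (a sketch; `glove-wiki-gigaword-100` is one of the pretrained sets published via gensim-data, and the first call downloads it):

```python
import gensim.downloader as api

# Pretrained 100-dimensional GloVe vectors trained on Wikipedia + Gigaword.
glove = api.load("glove-wiki-gigaword-100")

print(glove["water"].shape)                  # (100,)
print(glove.most_similar("water", topn=3))   # nearest neighbours

# The classic analogy query: king - man + woman ~ queen
print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```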
So far we have seen word embeddings applied in NLP to get vector representations of words. Now let's try them on a tabular dataset with categorical features: we will convert the categorical features into embeddings rather than using traditional approaches like one-hot encoding or label encoding.
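A minimal PyTorch sketch of the idea (the column, the category count, and the embedding size are hypothetical): each label-encoded category index is mapped to a dense, trainable vector instead of a sparse one-hot vector.

```python
import torch
import torch.nn as nn

# Hypothetical "city" column with 4 distinct categories, each mapped
# to a dense 3-d vector instead of a 4-d one-hot vector.
num_categories, emb_dim = 4, 3
embedding = nn.Embedding(num_categories, emb_dim)

# Label-encode the column first (string -> integer index), then embed.
city_ids = torch.tensor([0, 2, 1, 3, 2])   # a batch of 5 rows
city_vectors = embedding(city_ids)          # shape: (5, 3), learned during training

print(city_vectors.shape)
```

The embedding weights are trained jointly with the rest of the model, so categories that behave similarly for the task end up with similar vectors.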
Let's now move on to the deep learning part of NLP.
Suggested Readings
- Matrix Calculus Notes
- Review of differential calculus
- CS231N notes on network architecture
- CS231N notes on backprop (a worked backprop sketch follows these readings)
- Derivatives, Backpropagation, and Vectorization
- Learning representations by back-propagating errors
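To tie the calculus notes together, here is a hand-written backprop sketch for a tiny one-hidden-layer regression network in NumPy (the sizes and learning rate are arbitrary toy choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # 4 examples, 3 features
y = rng.normal(size=(4, 1))   # regression targets
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 1))

for step in range(100):
    # Forward pass
    h = np.maximum(0, X @ W1)          # ReLU hidden layer
    y_hat = h @ W2
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: chain rule, layer by layer
    d_yhat = 2 * (y_hat - y) / len(y)  # dL/dy_hat
    dW2 = h.T @ d_yhat
    dh = d_yhat @ W2.T
    dh[h <= 0] = 0                     # gradient of ReLU
    dW1 = X.T @ dh

    # Gradient descent update
    W1 -= 0.01 * dW1
    W2 -= 0.01 * dW2

print(round(float(loss), 4))
```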
Additional Readings
This material will get you started with RNNs
- Will add resources here soon (forgot to save them while learning :( )
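Until those links land, here is a minimal PyTorch sketch showing the shapes involved in a single-layer RNN (all sizes are arbitrary):

```python
import torch
import torch.nn as nn

# A single-layer Elman RNN over a toy batch.
rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(2, 5, 8)     # batch of 2 sequences, 5 timesteps, 8 features
h0 = torch.zeros(1, 2, 16)   # initial hidden state: (layers, batch, hidden)

output, hn = rnn(x, h0)
print(output.shape)          # (2, 5, 16): hidden state at every timestep
print(hn.shape)              # (1, 2, 16): final hidden state
```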
- Seq to Seq Models with Attention - Jay Alammar
- Attention in detail
- Attention Mechanism in Deep Learning
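The articles above cover several attention variants; here is a sketch of the simplest scoring scheme, scaled dot-product attention, in PyTorch (the shapes are toy choices):

```python
import torch
import torch.nn.functional as F

def dot_product_attention(query, keys, values):
    """Weight the values by how well each key matches the query."""
    d_k = query.size(-1)
    scores = query @ keys.transpose(-2, -1) / d_k ** 0.5  # similarity to each key
    weights = F.softmax(scores, dim=-1)                    # normalize to a distribution
    return weights @ values, weights

# 1 query attending over 4 encoder states of size 8.
q = torch.randn(1, 8)
K = torch.randn(4, 8)
V = torch.randn(4, 8)

context, attn = dot_product_attention(q, K, V)
print(context.shape, attn.shape)  # torch.Size([1, 8]) torch.Size([1, 4])
```

In a seq2seq decoder, the query is the current decoder state, the keys and values are the encoder outputs, and the returned context vector is what gets fed into the next decoding step.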