
Natural Language Processing

Explain the differences between the loss functions used for logistic regression and the perceptron.
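
As a starting point for comparing the two, here is a minimal sketch of both losses for a single example. It assumes labels in {-1, +1} and a raw linear score w·x; the function names are illustrative and not taken from any course code.

```python
import math

def logistic_loss(score: float, y: int) -> float:
    """Negative log-likelihood log(1 + exp(-y * score)) for y in {-1, +1}."""
    m = -y * score
    # Numerically stable form of log(1 + exp(m)).
    return max(m, 0.0) + math.log1p(math.exp(-abs(m)))

def perceptron_loss(score: float, y: int) -> float:
    """Perceptron criterion: zero when the sign is right, linear in the margin otherwise."""
    return max(0.0, -y * score)

# A correctly classified example (score and label agree in sign):
# the perceptron loss is exactly zero, the logistic loss is small but positive.
correct = (logistic_loss(2.0, 1), perceptron_loss(2.0, 1))
# A misclassified example: both losses are positive.
wrong = (logistic_loss(-2.0, 1), perceptron_loss(-2.0, 1))
```

Under these assumptions, the logistic loss is smooth and keeps pushing correctly classified examples toward larger margins, while the perceptron loss stops producing updates as soon as an example falls on the correct side.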

Explain how the chain rule and the Markov assumption are used to compute the maximum likelihood estimate of the probability of a word sequence.
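
One way to make this concrete: the chain rule factors P(w1 ... wn) into a product of P(wi | w1 ... wi-1), a first-order Markov assumption shortens each history to the previous word, and maximum likelihood estimates each factor as count(wi-1, wi) / count(wi-1). The sketch below assumes a bigram model, a toy two-sentence corpus, and hypothetical `<s>` and `</s>` boundary tokens.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams, padding each sentence with <s> and </s>."""
    context_counts, bigram_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def sequence_probability(sentence, context_counts, bigram_counts):
    """Product of MLE bigram probabilities (chain rule + first-order Markov)."""
    tokens = ["<s>"] + sentence + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0  # context never observed, so the MLE is undefined; treat as 0
        prob *= bigram_counts[(prev, cur)] / context_counts[prev]
    return prob

# Toy corpus, made up for illustration only.
corpus = [["i", "like", "nlp"], ["i", "like", "tea"]]
contexts, bigrams = train_bigram(corpus)
p = sequence_probability(["i", "like", "nlp"], contexts, bigrams)
# P(i|<s>) * P(like|i) * P(nlp|like) * P(</s>|nlp) = 1 * 1 * 1/2 * 1 = 0.5
```

Any sequence containing an unseen bigram gets probability zero under this estimate, which is the problem smoothing is meant to address.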

Explain the differences between Laplace smoothing and discount smoothing.
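
The two schemes can be compared on a toy unigram distribution. This is a sketch under assumptions: "discount smoothing" is taken here to mean absolute discounting (subtract a fixed amount `d` from every seen count and share the freed mass equally among unseen words), which may differ from the variant intended above. The function names, counts, and vocabulary are made up for illustration.

```python
from collections import Counter

def laplace_probs(counts, vocab, alpha=1.0):
    """Add alpha to every count, seen or unseen, then renormalize."""
    total = sum(counts.get(w, 0) for w in vocab) + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / total for w in vocab}

def absolute_discount_probs(counts, vocab, d=0.5):
    """Assumed variant: take d from each seen count, split the freed mass over unseen words.

    Requires 0 < d <= 1 and at least one observed word.
    """
    total = sum(counts.get(w, 0) for w in vocab)
    seen = [w for w in vocab if counts.get(w, 0) > 0]
    unseen = [w for w in vocab if counts.get(w, 0) == 0]
    if not unseen:
        # Nothing to redistribute to, so fall back to the unsmoothed estimate.
        return {w: counts[w] / total for w in seen}
    probs = {w: (counts[w] - d) / total for w in seen}
    freed = d * len(seen) / total
    for w in unseen:
        probs[w] = freed / len(unseen)
    return probs

counts = Counter({"a": 5, "b": 3})
vocab = ["a", "b", "c"]  # "c" is never observed
laplace = laplace_probs(counts, vocab)             # a: 6/11, b: 4/11, c: 1/11
discount = absolute_discount_probs(counts, vocab)  # a: 4.5/8, b: 2.5/8, c: 1/8
```

In this sketch Laplace smoothing adds mass to every word and grows the denominator with the vocabulary size, whereas discounting keeps the denominator fixed and moves mass only from seen words to unseen ones.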