This repository contains my work for Stanford's CS224n: Natural Language Processing with Deep Learning course. I follow the 2021 versions of Assignments 1 and 2, the Winter 2024 version of Assignment 3, and the Spring 2024 version of Assignment 4. All the material is available at https://web.stanford.edu/class/cs224n/
Enjoy the code and the math 😉
Problem statement and my solution 📜
This assignment explores the idea of representing words as vectors. Several word embedding methods are put to the test and visualized using dimensionality reduction. The examined methodologies are:
- co-occurrence word embeddings
- prediction-based word embeddings (word2vec, GloVe)
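A minimal sketch of the co-occurrence approach, assuming a toy corpus, a window size of 1, and truncated SVD as the dimensionality reduction; the function name `build_cooccurrence` and the example sentences are illustrative, not the assignment's actual interface.

```python
# Build a word-by-word co-occurrence matrix, then reduce it to 2-D with SVD.
import numpy as np

def build_cooccurrence(corpus, window=1):
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    M[idx[w], idx[sent[j]]] += 1
    return M, vocab

corpus = [["all", "that", "glitters", "is", "not", "gold"],
          ["all", "is", "well", "that", "ends", "well"]]
M, vocab = build_cooccurrence(corpus)

# Truncated SVD keeps the top-k singular directions as low-dimensional embeddings.
U, S, Vt = np.linalg.svd(M)
embeddings = U[:, :2] * S[:2]          # each row is a 2-D word vector
print(dict(zip(vocab, embeddings.round(2))))
```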
This assignment studies two loss functions for training prediction-based word embeddings: 1) naive softmax and 2) negative sampling. The derivatives of each loss with respect to all the parameters are then computed, which makes it possible to implement stochastic gradient descent and build a word embedding model from scratch.
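A hedged sketch of the two losses for a single (center word, outside word) pair; the variable names and shapes below (`U` as the matrix of outside vectors, `v_c` as the center vector) are assumptions for illustration, not the assignment's exact signatures.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def naive_softmax_loss(v_c, o, U):
    """U: (V, d) outside vectors, v_c: (d,) center vector, o: outside word index."""
    scores = U @ v_c                              # (V,) dot products
    y_hat = np.exp(scores - scores.max())
    y_hat /= y_hat.sum()                          # softmax over the vocabulary
    loss = -np.log(y_hat[o])
    grad_vc = U.T @ (y_hat - np.eye(len(U))[o])   # dJ/dv_c = U^T (y_hat - y)
    return loss, grad_vc

def neg_sampling_loss(v_c, o, U, neg_indices):
    """Same arguments, plus K sampled negative word indices."""
    pos = sigmoid(U[o] @ v_c)                     # score of the true outside word
    negs = sigmoid(-U[neg_indices] @ v_c)         # scores of the negative samples
    return -np.log(pos) - np.log(negs).sum()
```

Negative sampling replaces the full softmax normalization over the vocabulary with a handful of sampled negatives, which is why it scales much better during training.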
LaTeX-typeset PDF answering the written portion of the assignment 📜
Implementation of stochastic gradient descent ⛰️
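A minimal sketch of the stochastic gradient descent loop: repeatedly step the parameters against the gradient. The callable `loss_and_grad`, the learning rate, and the quadratic example are placeholders, not the repository's actual code.

```python
import numpy as np

def sgd(loss_and_grad, x0, lr=0.3, iterations=100):
    x = x0.copy()
    for _ in range(iterations):
        loss, grad = loss_and_grad(x)
        x -= lr * grad                 # move against the gradient
    return x

# Example: minimize f(x) = ||x||^2, whose gradient is 2x.
x_final = sgd(lambda x: (np.sum(x ** 2), 2 * x), np.array([3.0, -2.0]))
print(x_final)                         # close to [0, 0]
```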
This assignment contains a theoretical and a practical part.
- Theoretical: we explore how learning can be improved using the Adam optimizer and dropout.
- Practical: we build a neural network that predicts the dependencies between words in a sentence. The dependency parser finds these dependencies using a transition mechanism: the model predicts the next transition to apply given the current parser state (see the sketch after this list).
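An illustrative sketch of the transition mechanism on a parse state, assuming the standard SHIFT / LEFT-ARC / RIGHT-ARC transitions; the class and field names are assumptions for illustration rather than the assignment's exact interface.

```python
class PartialParse:
    def __init__(self, sentence):
        self.stack = ["ROOT"]          # partially processed words
        self.buffer = list(sentence)   # words still to be read
        self.dependencies = []         # (head, dependent) arcs found so far

    def parse_step(self, transition):
        if transition == "S":          # SHIFT: move the next word onto the stack
            self.stack.append(self.buffer.pop(0))
        elif transition == "LA":       # LEFT-ARC: second-from-top depends on top
            dependent = self.stack.pop(-2)
            self.dependencies.append((self.stack[-1], dependent))
        elif transition == "RA":       # RIGHT-ARC: top depends on second-from-top
            dependent = self.stack.pop()
            self.dependencies.append((self.stack[-1], dependent))

pp = PartialParse(["parse", "this", "sentence"])
for t in ["S", "S", "S", "LA", "RA", "RA"]:
    pp.parse_step(t)
print(pp.dependencies)   # [('sentence', 'this'), ('parse', 'sentence'), ('ROOT', 'parse')]
```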
Write-up of the assignment containing my answers 📜
Transition mechanism between the parser states
Neural network predicting which transition to apply 🧠
Model training code using Adam optimizer 🏋️
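A hedged sketch of the training setup: a small feed-forward classifier with a dropout layer, trained with Adam on cross-entropy loss. The layer sizes, dropout rate, and learning rate below are illustrative placeholders, not the repository's exact hyperparameters.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(36 * 50, 200),   # parser features -> hidden layer (sizes are placeholders)
    nn.ReLU(),
    nn.Dropout(p=0.5),         # dropout regularizes the hidden layer
    nn.Linear(200, 3),         # hidden -> 3 transitions (SHIFT, LEFT-ARC, RIGHT-ARC)
)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(features, gold_transitions):
    model.train()
    optimizer.zero_grad()
    logits = model(features)
    loss = loss_fn(logits, gold_transitions)
    loss.backward()
    optimizer.step()           # Adam adapts the step size per parameter
    return loss.item()
```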
Below is a table summarizing the model's performance on the Unlabeled Attachment Score (UAS) metric. The model exceeds the target by 1.8 points on the validation set.
| | UAS | Target UAS |
|---|---|---|
| Validation set | 88.80 | 87 |
| Test set | 89.06 | N/A |
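For reference, a minimal sketch of how UAS is computed: the percentage of words whose predicted head matches the gold head. The toy head lists below are made up for illustration.

```python
def uas(predicted_heads, gold_heads):
    correct = sum(p == g for p, g in zip(predicted_heads, gold_heads))
    return 100.0 * correct / len(gold_heads)

print(uas([0, 1, 1, 2], [0, 1, 4, 2]))   # 75.0 — three of four heads are correct
```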