The Deep_Learning_RNNs.ipynb file is divided into two parts: coding and theory.
- LSTM cell
- Vanilla RNN cell
- GRU cell
- Regular RNN model
- Bidirectional RNN model

The theory part answers the following questions:

- What is the vanishing gradient problem, and why does it occur? Which activation functions are more or less affected by it, and why?
- Why do LSTMs help address the vanishing gradient problem compared to a vanilla RNN?
- Given three training curves (performance vs. epoch), which curve belongs to which type of RNN (vanilla, GRU, or LSTM)?
- When might you choose to use each of the three different types of models (vanilla, GRU, LSTM)?
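
For orientation, the three cell types listed above can be sketched as single-step update functions in NumPy. This is only an illustration, not the notebook's code: the helper names (`init_params`, `affine`, `vanilla_step`, `gru_step`, `lstm_step`), the sizes, and the weight initialization are all assumptions, and the GRU follows one common formulation (reset gate applied to the previous state before the recurrent matrix).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_gates, input_size, hidden_size, rng):
    """Hypothetical helper: one (W, U, b) triple per gate or candidate."""
    return [
        (rng.normal(0.0, 0.1, (hidden_size, input_size)),   # input weights W
         rng.normal(0.0, 0.1, (hidden_size, hidden_size)),  # recurrent weights U
         np.zeros(hidden_size))                             # bias b
        for _ in range(n_gates)
    ]

def affine(p, x, h):
    W, U, b = p
    return W @ x + U @ h + b

def vanilla_step(params, x, h):
    (p,) = params
    return np.tanh(affine(p, x, h))

def gru_step(params, x, h):
    p_z, p_r, p_n = params
    z = sigmoid(affine(p_z, x, h))         # update gate
    r = sigmoid(affine(p_r, x, h))         # reset gate
    W, U, b = p_n
    n = np.tanh(W @ x + U @ (r * h) + b)   # candidate state
    return (1.0 - z) * n + z * h           # blend candidate and old state

def lstm_step(params, x, h, c):
    p_i, p_f, p_o, p_g = params
    i = sigmoid(affine(p_i, x, h))         # input gate
    f = sigmoid(affine(p_f, x, h))         # forget gate
    o = sigmoid(affine(p_o, x, h))         # output gate
    g = np.tanh(affine(p_g, x, h))         # candidate cell update
    c_new = f * c + i * g                  # additive cell-state update
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Run each cell over the same random sequence (sizes are arbitrary).
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5
xs = rng.normal(size=(seq_len, input_size))

vanilla_params = init_params(1, input_size, hidden_size, rng)
gru_params = init_params(3, input_size, hidden_size, rng)
lstm_params = init_params(4, input_size, hidden_size, rng)

h_v = np.zeros(hidden_size)
h_g = np.zeros(hidden_size)
h_l = np.zeros(hidden_size)
c_l = np.zeros(hidden_size)
for x in xs:
    h_v = vanilla_step(vanilla_params, x, h_v)
    h_g = gru_step(gru_params, x, h_g)
    h_l, c_l = lstm_step(lstm_params, x, h_l, c_l)

print(h_v.shape, h_g.shape, h_l.shape, c_l.shape)
```

The line `c_new = f * c + i * g` is the part usually pointed to when explaining why LSTMs cope better with vanishing gradients: the cell state is carried forward additively, scaled by the forget gate, rather than being squashed through an activation at every step.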
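
The vanishing gradient question can also be explored numerically. The sketch below tracks the spectral norm of the Jacobian of the hidden state with respect to the initial state for a simple recurrence `h_t = act(W h_{t-1} + x_t)`, once with tanh and once with sigmoid. Everything here is an assumption made for illustration: the function name `jacobian_norms`, the sizes, the random inputs, and in particular the rescaling of `W` to spectral norm 0.9, which guarantees shrinkage (with larger weights the same product can grow instead).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def jacobian_norms(activation, derivative, W, xs):
    """Spectral norm of d h_t / d h_0 after each step of h_t = act(W h_{t-1} + x_t)."""
    h = np.zeros(W.shape[0])
    J = np.eye(W.shape[0])
    norms = []
    for x in xs:
        pre = W @ h + x
        h = activation(pre)
        # Chain rule: d h_t / d h_{t-1} = diag(act'(pre)) @ W
        J = np.diag(derivative(pre)) @ W @ J
        norms.append(np.linalg.norm(J, 2))
    return np.array(norms)

rng = np.random.default_rng(0)
hidden_size, steps = 8, 50
W = rng.normal(size=(hidden_size, hidden_size))
W *= 0.9 / np.linalg.norm(W, 2)   # assumption: rescale W to spectral norm 0.9
xs = rng.normal(scale=0.5, size=(steps, hidden_size))

tanh_norms = jacobian_norms(np.tanh, lambda a: 1.0 - np.tanh(a) ** 2, W, xs)
sig_norms = jacobian_norms(sigmoid, lambda a: sigmoid(a) * (1.0 - sigmoid(a)), W, xs)

print("tanh   :", tanh_norms[[0, 9, 49]])
print("sigmoid:", sig_norms[[0, 9, 49]])
```

Because the derivative of tanh is at most 1 and the derivative of sigmoid is at most 0.25, each step multiplies the gradient by a factor of at most 0.9 and 0.225 respectively in this setup, so both norms decay geometrically with the number of time steps.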