14th Aug 2022
TODOS
- build the validation tensorboard
- simplify the model forward prop code
- test & validate the training process with the mini-batch GDS
- add L2 regularisation to the embedding space
- check the outcome of not using a softmax activation function
- debug the MHAs, as the gradient does not change across children
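
Note on the L2 item: a rough sketch of what the penalty could look like, assuming it is the plain sum of squared embedding entries scaled by a coefficient and added to the task loss. The names `l2_penalty`, `regularised_loss` and `weight_decay` are placeholders, not taken from the model code, and the embedding table is a plain list of lists just for illustration.

```python
def l2_penalty(embeddings, weight_decay):
    """Return weight_decay * (sum of squared entries of the embedding table)."""
    # embeddings: list of rows, one row of floats per embedded item (assumed layout)
    return weight_decay * sum(v * v for row in embeddings for v in row)


def regularised_loss(task_loss, embeddings, weight_decay=1e-4):
    """Add the L2 penalty on the embeddings to an already computed task loss."""
    return task_loss + l2_penalty(embeddings, weight_decay)


# Toy values only: two items with 2-dimensional embeddings.
embeddings = [[1.0, 2.0], [0.0, -1.0]]
total = regularised_loss(0.5, embeddings, weight_decay=0.5)
print(total)
```

In the real model the same term would be computed on the embedding weights inside the training step, so that it contributes to the gradient.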