
SAINT on Ednet data #4

Open
clara2911 opened this issue Jun 3, 2021 · 12 comments

@clara2911

Would you be able to add your code for running your implementation of SAINT on EdNet as well, besides the example on random data?

@kwonmha

kwonmha commented Feb 16, 2022

It would be great to have the entire code for reproducing the results from the paper as well, because I failed to reach that performance with my implementation.
Mine was about 55, compared to 78 from the paper.

@Nino-SEGALA

Hi kwonmha,
I'm also trying to reproduce the paper :)

This implementation of SAINT is not completely finished.
For example, dropout is not added, the position embeddings are wrongly added in every layer (instead of only at the input to the first encoder/decoder layer), and the LayerNorm should be placed as in the Attention Is All You Need paper (after the multi-head attention and after the FFN).
You can also change the position encoding to the sinusoidal one from Attention Is All You Need.
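For reference, a minimal sketch of that sinusoidal encoding, added once to the input embeddings rather than in every layer, with dropout applied to the sum (assuming PyTorch; the module name is hypothetical):

    import math
    import torch
    import torch.nn as nn

    class SinusoidalPositionalEncoding(nn.Module):
        """Fixed sinusoidal encoding from Attention Is All You Need,
        added a single time at the model input."""
        def __init__(self, d_model, max_len=5000, dropout=0.1):
            super().__init__()
            self.dropout = nn.Dropout(dropout)
            position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
            div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                                 * (-math.log(10000.0) / d_model))
            pe = torch.zeros(max_len, d_model)
            pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
            pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
            self.register_buffer("pe", pe)

        def forward(self, x):  # x: (batch, seq_len, d_model)
            x = x + self.pe[: x.size(1)]
            return self.dropout(x)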

All the rest is correct :) I reach AUC=0.76 with it, but I'm not able to get the last 2%. Also, my metrics crash if I use a dimension_model of 512 as in the paper (it only works with a smaller model).
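And for the LayerNorm and dropout placement mentioned above, a sketch of the post-LN sublayer ordering (again assuming PyTorch; this is illustrative, not the repository's actual code):

    import torch.nn as nn

    class PostLNEncoderLayer(nn.Module):
        """Post-LN ordering: x = LayerNorm(x + Dropout(Sublayer(x))),
        applied after the multi-head attention and after the FFN."""
        def __init__(self, d_model, n_heads, d_ff, dropout=0.1):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads,
                                              dropout=dropout, batch_first=True)
            self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                     nn.Linear(d_ff, d_model))
            self.norm1 = nn.LayerNorm(d_model)
            self.norm2 = nn.LayerNorm(d_model)
            self.drop = nn.Dropout(dropout)

        def forward(self, x, attn_mask=None):
            attn_out, _ = self.attn(x, x, x, attn_mask=attn_mask)
            x = self.norm1(x + self.drop(attn_out))     # LayerNorm after attention
            x = self.norm2(x + self.drop(self.ffn(x)))  # LayerNorm after the FFN
            return x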

@kwonmha

kwonmha commented Apr 14, 2022

Hi, @Nino-SEGALA
Thanks for the information.

In my case, I think the problem lies in the data processing or the data itself, not in the modeling, because my model works fine with the EdNet data from Kaggle.

Do you have any plans to upload your code to your GitHub?

@Nino-SEGALA

I will try to upload it here with a pull request :)

I don't understand: it works with EdNet from Kaggle, but not with EdNet from the paper?
What is the difference between them?
Can you link both datasets? :)

@kwonmha

kwonmha commented Apr 18, 2022

@Nino-SEGALA Here's the link to the dataset I mentioned.
https://github.com/riiid/ednet
It's KT-1, and you also need to download the content data.

@Nino-SEGALA

Yes, I also use this one (and get 0.76 AUC with dim_model=128; if I use a larger model with dim_model=512, I get AUC=0.5 too :/).

Maybe you can try with a smaller model.

And which dataset did you mean by "my model works fine with Ednet data from Kaggle"? :)

@kwonmha

kwonmha commented Apr 19, 2022

Thanks for the information!

@Nino-SEGALA

@kwonmha
Here's the corrected code for SAINT:
#6

Let me know if you manage to train SAINT with a large model dimension (d_model=512) :D

@kwonmha

kwonmha commented May 25, 2022

@Nino-SEGALA Have you tried applying the Noam learning-rate scheduling scheme mentioned in the paper?
It's in the Training Details section.

I had the same problem where the AUC stays around 0.5 with dimensions 256 and 512, and the validation AUC goes above 0.7 with the Noam scheme.
It looks necessary for training a large Transformer model.

Noam scheduler code link
I used Lina Achaji's code, and for convenience I added the following method to the scheduler class:

    def zero_grad(self):
        # delegate to the wrapped optimizer so the scheduler can be
        # used as a drop-in replacement in the training loop
        self.optimizer.zero_grad()


Since it changes the learning rate with respect to the step count, batch_size looks important, as it affects the number of steps in training.

I got 0.7746 AUC with dim 256 and 0.7727 with dim 512.
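For anyone reproducing this, a minimal sketch of such a Noam wrapper, using the rate formula from Attention Is All You Need (an illustration under those assumptions, with hypothetical names, not Lina Achaji's actual code):

    class NoamScheduler:
        """Noam schedule: lr = d_model^-0.5 * min(step^-0.5, step * warmup^-1.5),
        recomputed at every optimizer step."""
        def __init__(self, optimizer, d_model, warmup=4000):
            self.optimizer = optimizer
            self.d_model = d_model
            self.warmup = warmup
            self.step_num = 0

        def step(self):
            self.step_num += 1
            lr = self.d_model ** -0.5 * min(self.step_num ** -0.5,
                                            self.step_num * self.warmup ** -1.5)
            for group in self.optimizer.param_groups:
                group["lr"] = lr  # overwrite lr before the actual update
            self.optimizer.step()

        def zero_grad(self):
            # the convenience delegate mentioned above
            self.optimizer.zero_grad()

A typical use would be sched = NoamScheduler(torch.optim.Adam(model.parameters(), lr=0, betas=(0.9, 0.98), eps=1e-9), d_model=512), calling sched.zero_grad() and sched.step() each batch; the Adam settings and warmup=4000 are the Attention Is All You Need defaults, not necessarily what was used here.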

@Nino-SEGALA

Nino-SEGALA commented Jun 4, 2022

Thanks a lot for your comment, kwonmha!

I originally trained without the Noam scheme, and after implementing it I hadn't retried the big trainings until now.
It is indeed what makes the difference!
I didn't reach metrics as high as yours, but my model didn't train until convergence (it stopped a bit early). I'll let you know when I have my final results :D

@kwonmha could you also share your ACC, RMSE, and BCE loss if you have them?

@kwonmha

kwonmha commented Jun 8, 2022

Sorry, but I haven't measured metrics other than AUC so far.

@Nino-SEGALA

I got 0.7666 AUC with dim 256 and 0.7537 with dim 512.
And my dim-512 training crashed afterwards (AUC=0.6), even though it now uses the Noam scheme :/
