Reproducibility with baal active learning | Hugging Face | Classification #250
Unanswered
nitish1295
asked this question in Q&A
Replies: 1 comment 1 reply
-
I don't think it's due to #247; my hunch would be that the seed is not the same for both trainings. If you have a quick unit test for this, I could debug more easily.
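For reference, a minimal sketch of what such a unit test could look like, assuming a hypothetical `train_and_evaluate` helper that re-initializes the model, fine-tunes it on the current train set, and returns the evaluation metrics (the helper and its arguments are illustrative, not part of Baal's API):

```python
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    # Seed every RNG the training pipeline may touch.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


def test_same_seed_gives_same_metrics(train_set, eval_set):
    # Two runs with identical seeds, data, and initial weights
    # should produce identical metrics.
    seed_everything(42)
    metrics_a = train_and_evaluate(train_set, eval_set)

    seed_everything(42)
    metrics_b = train_and_evaluate(train_set, eval_set)

    assert metrics_a == metrics_b
```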
-
Consider the following scenario
Experiment Setting:
The 2 experiments are as follows:
Exp_A: You start your active learning loop, similar to what we have in the PR Baal in Production Notebook | Classification | NLP | Hugging Face #245, and you label 5 samples, so your train set is now 10 samples and the pool is 15 samples. You reinitialize the weights, train/fine-tune the model on those 10 samples, and evaluate the model on the 20 evaluation samples to get metrics_A. You also save the updated pool and train sets somewhere on the system; this is relevant for the next experiment.
Exp_B: Now suppose you read the updated training data from the system at the start of this experiment, which means we train the model on 10 samples (the same set of labelled samples we have at the end of Exp_A) and evaluate on the same 20 evaluation samples, before we get to the active learning part. With the same weights for the initial model, we get evaluation metrics metrics_B. (A rough sketch of both experiments follows below.)
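Roughly, the two experiments look like this; the `label_samples`, `fine_tune`, and `evaluate` helpers, the pickle-based save/load, and the variable names are all placeholders for what the notebook / API setup actually does:

```python
import pickle

# ---- Exp_A: inside the long-running notebook session ----
initial_weights = {k: v.clone() for k, v in model.state_dict().items()}

label_samples(pool, train, n=5)          # train: 10 samples, pool: 15 samples
model.load_state_dict(initial_weights)   # re-initialize the weights
fine_tune(model, train)
metrics_a = evaluate(model, eval_set)    # the 20 evaluation samples

with open("train.pkl", "wb") as f:       # persist the updated train set
    pickle.dump(train, f)

# ---- Exp_B: a fresh process, e.g. behind an API call ----
with open("train.pkl", "rb") as f:
    train = pickle.load(f)

model.load_state_dict(initial_weights)   # same initial weights
fine_tune(model, train)
metrics_b = evaluate(model, eval_set)    # same 20 evaluation samples
```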
Ideally
assert metrics_A == metrics_B
should pass, but based on what I have tried so far, this does not happen.
I am reading some more about reproducibility in PyTorch at pytorch/pytorch#7068 and will try to add a gist to this discussion as well to show what is happening.
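For reference, the knobs that issue usually points at are roughly the following; this is a generic PyTorch reproducibility sketch, not something specific to Baal or the notebook:

```python
import torch
from torch.utils.data import DataLoader

# Force deterministic kernels where PyTorch supports them
# (this may raise an error for ops without a deterministic implementation).
torch.use_deterministic_algorithms(True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Make DataLoader shuffling reproducible across runs.
train_set = list(range(100))  # placeholder dataset
g = torch.Generator()
g.manual_seed(42)
loader = DataLoader(train_set, batch_size=8, shuffle=True, generator=g)
```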
Q/A
I wanted to check this since it might impact #247: if we cannot get this reproducibility, there are questions around the active learning process, as suggested in #247.
To be more precise, the model we evaluate (model.eval()) in our Jupyter notebook setup (which runs continuously; this is Exp_A) might be different from the one we get after running the API call setup (since we re-initialize the setup again, but this time with the updated train data; this is Exp_B).
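One way to narrow this down could be to check whether the two setups really start from bit-identical weights before fine-tuning; a small comparison helper (hypothetical, not from Baal) might look like this:

```python
import torch


def state_dicts_match(model_a, model_b) -> bool:
    # True only if both models have identical parameter/buffer names and values.
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    if sd_a.keys() != sd_b.keys():
        return False
    return all(torch.equal(sd_a[k], sd_b[k]) for k in sd_a)


# e.g. compare the notebook model (Exp_A) against the freshly created
# API model (Exp_B) before training starts:
# assert state_dicts_match(notebook_model, api_model)
```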
FYI, this is also one of the reasons I have not merged my PR yet. I am not sure why this is happening, though.
I will try to solve this on my end (maybe I am doing something incorrectly). Please let me know if you have any suggestions.