
Not using entire data in MultiLabelClassification #1302

Closed
Sanjeet-panda-ssp opened this issue Nov 11, 2021 · 12 comments
Labels: stale (This issue has become stale)


Sanjeet-panda-ssp commented Nov 11, 2021

[screenshot: trans1]

I made 54,000 data points to train a multilabel classifier, but it is only using 110 data points for training. To verify, I tried this with other datasets and example code available on the net, and in each case I observed that only part of the data was being used.

[screenshot: trans2]


IS5882 commented Nov 22, 2021

Did you find the reason for this? I am facing the same issue.

@Sanjeet-panda-ssp (Author)

Did you find the reason for this? I am facing the same issue.

Not yet


IS5882 commented Nov 22, 2021

So when I added those two lines to my args, I no longer get the red bar at 0%, so supposedly it is training and evaluating on all the data. But the performance (output results) is the same, which makes me skeptical! Did this actually fix the issue? Is there a way to double-check that my model is training on all the data I provided?

 "use_multiprocessing":False,
 "use_multiprocessing_for_evaluation":False,

@Sanjeet-panda-ssp (Author)

So when I added those two lines to my args, I no longer get the red bar at 0%, so supposedly it is training and evaluating on all the data. But the performance (output results) is the same, which makes me skeptical! Did this actually fix the issue? Is there a way to double-check that my model is training on all the data I provided?

 "use_multiprocessing":False,
 "use_multiprocessing_for_evaluation":False,

It is highly likely not working, because the loading time of the dataset is almost the same in my case: I have 54,000 examples and that part takes almost the same time as before. The red bar at 0% no longer showing doesn't mean it has used the entire dataset.


IS5882 commented Nov 23, 2021

It is highly likely not working, because the loading time of the dataset is almost the same in my case: I have 54,000 examples and that part takes almost the same time as before. The red bar at 0% no longer showing doesn't mean it has used the entire dataset.

Yes, I am also skeptical. The thing is, I had that 0% red bar on both model.train_model and model.eval_model. It shows that it is evaluating 4/1946 sentences and gives an F-measure of 99%, yet it also reports 10 false positives and 9 false negatives (the rest of the 1946 are either TN or TP). So it does mean that it is evaluating the whole test set even with the 0% bar! (I would assume the same for training, although I can't verify that.)

What I also did to verify the 99% F-measure: I evaluated my test set sentence by sentence using model.predict(sentence) and counted the total number of label mismatches. I got 19 mismatches in total, which is the same as the FN + FP I got with model.eval_model, so the evaluation via model.eval_model was correct even with the red bar at 0% showing 4/1946.
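
A minimal sketch of that per-sentence cross-check (assuming a trained multi-label model and a pandas DataFrame test_df with "text" and multi-hot "labels" columns; the variable names are illustrative):

# Re-count label mismatches one sentence at a time and compare the total
# against the FP + FN reported by model.eval_model(test_df).
n_mismatches = 0
for text, gold in zip(test_df["text"], test_df["labels"]):
    predictions, _raw_outputs = model.predict([text])
    pred = predictions[0]  # multi-hot list, e.g. [1, 0, 0, 1, 0]
    n_mismatches += sum(int(p != g) for p, g in zip(pred, gold))

print("Total label mismatches:", n_mismatches)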

Yet I am not comfortable using SimpleTransformers, because I am still skeptical that something might be wrong.

@ThilinaRajapakse (Owner)

I've never encountered this issue myself. If I had to guess, it's probably something to do with the Jupyter environment, tqdm (the progress bar library), and multiprocessing not playing well together. But, it seems to be a problem with the progress bar updating rather than the training/evaluation itself.
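
One progress-bar-independent way to double-check (a sketch, not from this thread): compare the optimizer step count against what the dataset size implies. Recent simpletransformers versions return the global step from train_model, but that return signature is an assumption and may vary by version.

import math

num_examples = len(train_df)                      # e.g. 54000
batch_size = model.args.train_batch_size          # simpletransformers default: 8
epochs = model.args.num_train_epochs              # default: 1
accum = model.args.gradient_accumulation_steps    # default: 1

expected_steps = math.ceil(num_examples / batch_size / accum) * epochs

# Assumption: train_model returns (global_step, training_details) in recent versions.
global_step, _details = model.train_model(train_df)
print(f"expected ~{expected_steps} optimizer steps, got {global_step}")
# If these roughly agree, the model iterated over the full dataset regardless of
# what the notebook progress bar displayed.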


glher commented Dec 6, 2021

I'm getting the exact same issue on Google Colab.

[Screenshot: feature conversion progress bar]

from simpletransformers.classification import ClassificationModel

# model_args, df_train_90, and validation_df come from earlier cells (not shown).
model = ClassificationModel('roberta', 'roberta-base', num_labels=2, args=model_args)
print('Training on {:,} samples...'.format(len(df_train_90)))
# Train the model, testing against the validation set periodically.
model.train_model(df_train_90, eval_df=validation_df)

My model_args are all default for multiprocessing, etc. Considering the results, the plotted WandB outputs (number of FN, TN, etc.), the fact that all reports of this issue involve Google Colab/Jupyter, and the significant size of the cached file, I find it very likely that, as @ThilinaRajapakse says, it's a display problem. It would be fantastic to have definite proof, though!

Note that I tested two different datasets on my Google Colab, and the displayed progress bar stopped at 0.2% both times. This is exactly the same as @IS5882 (4 out of 1946). That is unlikely to be a coincidence, and unless there's something in the code about 0.2%, it does seem more like a notebook display issue.

Out of curiosity, and as a quick sanity check for peace of mind: for those who have run this without encountering the issue, how long would you expect the feature conversion to take for around 10,000 features? A few seconds, or 30 minutes, as shown in the screenshot?

I looked at it more, and the cached file that is created (e.g. cached_train_roberta_128_2_2) does contain the entire dataset. One can test this by downloading the cached file and doing:

import torch

# Inspect the cached features to confirm how many examples were tokenized.
data = torch.load('cached_train_roberta_128_2_2')
print(data[0]['input_ids'].size())


IS5882 commented Dec 11, 2021

@glher As @ThilinaRajapakse said, it is just a display issue, so I would assume your model is training fine. What I did to validate that (as I mentioned in my comment above) was to use model.predict on each sentence and manually count the number of FP, FN, TP, and TN, which all matched model.eval_model, so it is training and testing correctly.

@tarikaltuncu

Have you actually checked the GPU stats when it stalls? I checked mine and saw that GPU power consumption drops to idle while the model is still in GPU memory. I do not think it will continue training even if I wait forever.

[screenshots of GPU stats]


HponeMK commented Mar 24, 2022

I'm experiencing the same issue here.

[screenshot of the epoch progress bar]

The whole dataset is ~48k words, but the epoch bar only shows 6k.

@ThilinaRajapakse (Owner)

@tarikaltuncu Your issue seems to be different. The estimated time for tokenization is 131 hours for some reason. The GPU is idle because training hasn't started yet.
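
A rough way to check whether tokenization alone explains a stall like that (a sketch, not from this thread; the checkpoint name, max_length, and the texts list are assumptions): time the tokenizer on a small sample and extrapolate to the full dataset.

import time
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # illustrative checkpoint

sample = texts[:1000]  # `texts` is assumed to be a list of raw strings
start = time.perf_counter()
tokenizer(sample, truncation=True, max_length=128, padding="max_length")
elapsed = time.perf_counter() - start

# Extrapolate to the full dataset to see whether an hours-long estimate is plausible.
est_total = elapsed / len(sample) * len(texts)
print(f"{elapsed:.2f}s for {len(sample)} texts -> ~{est_total / 60:.1f} min for {len(texts)}")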


stale bot commented Jun 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the "stale" label on Jun 12, 2022
stale bot closed this as completed on Nov 2, 2022