
Not using entire data in MultiLabelClassification #1302

Closed
Sanjeet-panda-ssp opened this issue Nov 11, 2021 · 12 comments
Labels: stale (This issue has become stale)


Sanjeet-panda-ssp commented Nov 11, 2021

[screenshot: trans1]

I made 54,000 data points to train a multilabel classifier, but it is only using 110 data points for training. To verify, I tried this with other datasets and example code available on the net, and in each case I observed that only part of the data was being used.

[screenshot: trans2]


IS5882 commented Nov 22, 2021

Did you find the reason for this? I am facing the same issue.

@Sanjeet-panda-ssp (Author)

Did you find the reason for this? I am facing the same issue.

Not yet


IS5882 commented Nov 22, 2021

So when I added those two lines to my args, I no longer get the red bar at 0%, so supposedly it is training and evaluating on all the data. But the performance (output results) is the same, which makes me skeptical! Did this actually fix the issue? Is there a way to double-check that my model is training on all the data I provided?

 "use_multiprocessing":False,
 "use_multiprocessing_for_evaluation":False,

@Sanjeet-panda-ssp (Author)

So when I added those two lines to my args, I no longer get the red bar at 0%, so supposedly it is training and evaluating on all the data. But the performance (output results) is the same, which makes me skeptical! Did this actually fix the issue? Is there a way to double-check that my model is training on all the data I provided?

 "use_multiprocessing":False,
 "use_multiprocessing_for_evaluation":False,

It is highly likely not working, because the loading time of the dataset is almost the same in my case: I have 54,000 examples and that part takes almost the same time as before. The red bar at 0% no longer showing doesn't mean it has used the entire dataset.


IS5882 commented Nov 23, 2021

It is highly likely not working, because the loading time of the dataset is almost the same in my case: I have 54,000 examples and that part takes almost the same time as before. The red bar at 0% no longer showing doesn't mean it has used the entire dataset.

Yes, I am also skeptical. The thing is, I had that 0% red bar on both model.train_model and model.eval_model. It shows that it is evaluating 4/1946 sentences and gives an F-measure of 99%, yet it also reports 10 false positives and 9 false negatives (the rest of the 1946 are either TN or TP). So it does mean that it is evaluating the whole test set even with the 0% bar! (I would assume the same for training, although I can't verify that.)

What I also did to verify the 99% F-measure: I evaluated my test set sentence by sentence using model.predict(sentence) and counted the total number of label mismatches. I got 19 mismatches in total, which is the same as the FN + FP I got with model.eval_model, so the evaluation via model.eval_model was correct even with the red bar at 0% showing 4/1946.
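
A minimal sketch of that per-sentence cross-check (assuming a trained multi-label model and a pandas DataFrame test_df with "text" and multi-hot "labels" columns; the variable names are illustrative):

# Re-count label mismatches one sentence at a time and compare the total
# against the FP + FN reported by model.eval_model(test_df).
n_mismatches = 0
for text, gold in zip(test_df["text"], test_df["labels"]):
    predictions, _raw_outputs = model.predict([text])
    pred = predictions[0]  # multi-hot list, e.g. [1, 0, 0, 1, 0]
    n_mismatches += sum(int(p != g) for p, g in zip(pred, gold))

print("Total label mismatches:", n_mismatches)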

Yet I am not comfortable using SimpleTransformers, because I am still skeptical that something might be wrong.

@ThilinaRajapakse (Owner)

I've never encountered this issue myself. If I had to guess, it's probably something to do with the Jupyter environment, tqdm (the progress bar library), and multiprocessing not playing well together. But, it seems to be a problem with the progress bar updating rather than the training/evaluation itself.
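
One progress-bar-independent way to double-check (a sketch, not from this thread): compare the optimizer step count against what the dataset size implies. Recent simpletransformers versions return the global step from train_model, but that return signature is an assumption and may vary by version.

import math

num_examples = len(train_df)                      # e.g. 54000
batch_size = model.args.train_batch_size          # simpletransformers default: 8
epochs = model.args.num_train_epochs              # default: 1
accum = model.args.gradient_accumulation_steps    # default: 1

expected_steps = math.ceil(num_examples / batch_size / accum) * epochs

# Assumption: train_model returns (global_step, training_details) in recent versions.
global_step, _details = model.train_model(train_df)
print(f"expected ~{expected_steps} optimizer steps, got {global_step}")
# If these roughly agree, the model iterated over the full dataset regardless of
# what the notebook progress bar displayed.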


glher commented Dec 6, 2021

I'm getting the exact same issue on Google Colab.

[Screenshot: feature conversion progress bar]

from simpletransformers.classification import ClassificationModel

# model_args, df_train_90, and validation_df come from earlier cells (not shown).
model = ClassificationModel('roberta', 'roberta-base', num_labels=2, args=model_args)
print('Training on {:,} samples...'.format(len(df_train_90)))
# Train the model, testing against the validation set periodically.
model.train_model(df_train_90, eval_df=validation_df)

My model_args are all default for multiprocessing, etc. Considering the results, the plotted WandB outputs (number of FN, TN, etc.), the fact that all reports of this issue involve Google Colab/Jupyter, and the significant size of the cached file, I find it very likely that, as @ThilinaRajapakse says, it's a display problem. It would be fantastic to have definite proof, though!

Note that I tested two different datasets on my Google Colab, and the displayed progress bar stopped at 0.2% both times. This is exactly the same as @IS5882 (4 out of 1946). That is unlikely to be a coincidence, and unless there's something in the code about 0.2%, it does seem more like a notebook display issue.

Out of curiosity, and as a quick sanity check for peace of mind: for those who have run this without encountering the issue, how long would you expect the feature conversion to take for around 10,000 features? A few seconds, or 30 minutes, as shown in the screenshot?

I looked at it more, and the cached file that is created (e.g. cached_train_roberta_128_2_2) does contain the entire dataset. One can test this by downloading the cached file and doing:

import torch

# Inspect the cached features to confirm how many examples were tokenized.
data = torch.load('cached_train_roberta_128_2_2')
print(data[0]['input_ids'].size())


IS5882 commented Dec 11, 2021

@glher As @ThilinaRajapakse said, it is just a display issue, so I would assume your model is training fine. What I did to validate that (as I mentioned in my comment above) was to use model.predict on each sentence and manually count the number of FP, FN, TP, and TN, which all matched model.eval_model, so it is training and testing correctly.

@tarikaltuncu

Have you actually checked the GPU stats when it stalls? I checked mine and saw that GPU power consumption drops to idle while the model is still in GPU memory. I do not think it will continue training even if I wait forever.

[screenshots of GPU stats]


HponeMK commented Mar 24, 2022

I'm experiencing the same issue here.

[screenshot of the epoch progress bar]

The whole dataset is ~48k words, but the epoch bar only shows 6k.

@ThilinaRajapakse (Owner)

@tarikaltuncu Your issue seems to be different. The estimated time for tokenization is 131 hours for some reason. The GPU is idle because training hasn't started yet.
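
A rough way to check whether tokenization alone explains a stall like that (a sketch, not from this thread; the checkpoint name, max_length, and the texts list are assumptions): time the tokenizer on a small sample and extrapolate to the full dataset.

import time
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # illustrative checkpoint

sample = texts[:1000]  # `texts` is assumed to be a list of raw strings
start = time.perf_counter()
tokenizer(sample, truncation=True, max_length=128, padding="max_length")
elapsed = time.perf_counter() - start

# Extrapolate to the full dataset to see whether an hours-long estimate is plausible.
est_total = elapsed / len(sample) * len(texts)
print(f"{elapsed:.2f}s for {len(sample)} texts -> ~{est_total / 60:.1f} min for {len(texts)}")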


stale bot commented Jun 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the "stale" label on Jun 12, 2022
stale bot closed this as completed on Nov 2, 2022