Feature/setfithead multi target #272

Merged (17 commits, Jan 19, 2023)

Conversation

Yongtae723
Copy link
Contributor

I tried to resolve the merge conflicts and fix some errors.

Please check whether this PR is what you intended.

Also, some tests failed for me... and I cannot figure out the reason...

Yongtae and others added 2 commits January 14, 2023 05:01
Member

@tomaarsen tomaarsen left a comment


Thanks for these changes! I think we're getting really close. I made some comments based on some recent PRs that got merged since #212 was made. In particular, removing support for numpy arrays in the differentiable head, and removing out_features=1: 2 is now the minimum.

I've made these changes and pushed them to this PR. Make sure to git pull them if you want to make more changes of your own. In short, all of the comments that I made with this review are now resolved (but you can still look at them if you want details on why I made some changes in 36f65bb).

As for your comments regarding the trainer.freeze(), I'm not sure what caused the issue, but it seems to be gone after I made my changes.

@tomaarsen
Member

I ran some experiments using multi-label classification with the different heads.

Dataset

Dataset generation script
from setfit import SetFitModel, SetFitTrainer, sample_dataset
from datasets import load_dataset

dataset = load_dataset("SetFit/hate_speech_offensive")


def to_multiclass(sample):
    """
    from
        (0: 'hate-speech', 1: 'offensive-language' or 2: 'neither')
    to
        ([1, 0]: 'hate-speech', [0, 1]: 'offensive-language' or [0, 0]: 'neither')
    """
    label = sample["label"]
    sample["label"] = [1 if label == 0 else 0, 1 if label == 1 else 0]
    return sample


# Simulate the few-shot regime by sampling 8 examples per class
train_dataset = sample_dataset(dataset["train"], label_column="label", num_samples=8).map(to_multiclass)
eval_dataset = dataset["test"].map(to_multiclass)
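As a quick sanity check (not part of the original script), the converted labels can be inspected directly; the values shown are illustrative and assume the script above has been run:

# Illustrative check, assuming train_dataset/eval_dataset were built as above:
# each label is now a 2-element binary vector instead of a single class id.
print(train_dataset[0]["label"])  # e.g. [1, 0] ('hate-speech'), [0, 1] ('offensive-language') or [0, 0] ('neither')
print(eval_dataset[0]["label"])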

I want to point out that this isn't a very natural multi-label use of the dataset. That said, I couldn't find an actual multi-label dataset on the Hub.

Training Scripts

Logistic Regression head
trainer = SetFitTrainer(
    model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
metrics = trainer.evaluate()
Differentiable head
trainer = SetFitTrainer(
    model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

# Train and evaluate
trainer.freeze() # Freeze the head
trainer.train() # Train only the body

# Unfreeze the head and freeze the body -> head-only training
trainer.unfreeze(keep_body_frozen=True)

trainer.train(
    num_epochs=25, # The number of epochs to train the head or the whole model (body and head)
    batch_size=16,
    body_learning_rate=1e-5, # The body's learning rate
    learning_rate=1e-2, # The head's learning rate
    l2_weight=0.0, # Weight decay on **both** the body and head. If `None`, will use 0.01.
)

And the model for testing was "sentence-transformers/paraphrase-mpnet-base-v2".
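For completeness, here is roughly how model would have been instantiated for the two setups above; this is a sketch based on the SetFit API of that era (multi_target_strategy, use_differentiable_head, head_params), not a verbatim excerpt from the experiment:

from setfit import SetFitModel

# Logistic regression head with a multi-target strategy
# ("multi-output", "one-vs-rest" or "classifier-chain"):
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    multi_target_strategy="one-vs-rest",
)

# Differentiable SetFitHead with two output targets:
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    use_differentiable_head=True,
    multi_target_strategy="multi-output",
    head_params={"out_features": 2},
)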

Results

Model                                                  Evaluation Accuracy
Multilabel Logistic Regression ("multi-output")        0.5550 (0.0000)
Multilabel Logistic Regression ("one-vs-rest")         0.5550 (0.0000)
Multilabel Logistic Regression ("classifier-chain")    0.5580 (0.0000)
Multilabel Differentiable Head                         0.5849 (0.0044)

Notes:

  1. Evaluation accuracy is displayed as the mean (and standard deviation) over 5 executions.
  2. The same seed is used in SetFitTrainer and sample_dataset across all executions, so all executions work on the same training data.
  3. The Logistic Regression performance was identical across executions (perhaps due to the shared seed).
  4. For the differentiable head, there is no difference between the "multi-output" and "one-vs-rest" strategies, and "classifier-chain" is unsupported, so I only ran experiments with "multi-output".

To me, this indicates that the multilabel differentiable head performs equivalently to the logistic regression head. In other words, this PR seems to have successfully added multi-label classification support to SetFitHead! 🎉

  • Tom Aarsen

@tomaarsen tomaarsen added the "enhancement" (New feature or request) label on Jan 14, 2023
@Yongtae723
Contributor Author

Yongtae723 commented Jan 15, 2023

I'm always thankful for your thoughtful comments and edits (also your comments on previous issues).
Not only that, you even ran an experiment for us! Thank you!!!!

I will check your edits!
Thank you!

@Yongtae723
Contributor Author

@tomaarsen
I also think the README should be updated.

I think it would be better to submit the README edits in a separate PR.
What do you think?

@Yongtae723
Contributor Author

Yongtae723 commented Jan 15, 2023

I confirmed your changes! Thank you!

Also, I ran a similar multi-label experiment and got similar results.

So I think you can merge this into main!

Thank you!

@tomaarsen
Member

tomaarsen commented Jan 15, 2023

@tomaarsen I also think the README should be updated.

I think it would be better to submit the README edits in a separate PR. What do you think?

If the changes you have planned for the README relate to the changes from this PR, then I think they should be included in this PR. That way, the code and README get updated at the same time.

I'm glad to hear that your experiments work too!

@Yongtae723
Contributor Author

I got it!

I edited the README to reflect our changes.
Since I am not a native English speaker, my written English might be strange.
So please feel free to rewrite the README if my English is not correct.

@tomaarsen tomaarsen linked an issue Jan 18, 2023 that may be closed by this pull request
Member

@tomaarsen tomaarsen left a comment


I'm satisfied with almost everything in this PR, with one exception. I'm not sure what the best course of action is, nor what the "normal" approach for this is. Perhaps we could provide the SetFitDataset with a label_postprocessing function that converts the labels either to floats or to longs, depending on what is needed?
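To make that idea concrete, here is a rough sketch of what such a hook could look like; the names and the constructor wiring are hypothetical and only illustrate the suggestion, not what was eventually merged:

import torch

# Hypothetical label_postprocessing callables that SetFitDataset could accept
# and apply inside collate_fn, depending on the head/loss in use:
def to_long(labels: torch.Tensor) -> torch.Tensor:
    return labels.long()   # single-label classification (CrossEntropyLoss expects long class ids)

def to_float(labels: torch.Tensor) -> torch.Tensor:
    return labels.float()  # multi-label classification (BCEWithLogitsLoss expects float targets)

# Hypothetical usage: SetFitDataset(x_train, y_train, tokenizer, label_postprocessing=to_float)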

@Yongtae723
Contributor Author

Yongtae723 commented Jan 19, 2023

I understand your concern and agree with you.

I fixed some code to do what you want.

@@ -277,6 +277,7 @@ def collate_fn(batch):

     # convert to tensors
     features = {k: torch.Tensor(v).int() for k, v in features.items()}
-    labels = torch.Tensor(labels).long()
+    labels = torch.Tensor(labels)
+    labels = labels.long() if isinstance(label, int) else labels.float()
Contributor Author


My suggestion is to use the type of the label.

The type of a label should be 'int' for single-label classification, but 'List' for multi-label classification.

We could write

labels = torch.Tensor(labels).long() if isinstance(label, int) else torch.Tensor(labels).float()

but I felt that is too long, so I fixed it as pushed.

Contributor Author

@Yongtae723 Yongtae723 Jan 19, 2023


Or

labels = labels.long() if len(labels.size()) == 1 else labels.float()

whichever you want!

Member


I think the second solution is best!

labels = labels.long() if len(labels.size()) == 1 else labels.float()

That should accurately measure whether we are in a multitarget situation, even if the user accidentally supplies floats instead of integers.
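As a small standalone illustration (not code from the PR) of why the dimensionality check distinguishes the two cases even when the labels arrive as floats:

import torch

# Single-label batch: one class id per example -> 1-D tensor -> cast to long
single = torch.Tensor([0, 2, 1])
print(single.size())  # torch.Size([3])
print((single.long() if len(single.size()) == 1 else single.float()).dtype)  # torch.int64

# Multi-label batch: one binary vector per example -> 2-D tensor -> keep as float
multi = torch.Tensor([[1, 0], [0, 1], [0, 0]])
print(multi.size())  # torch.Size([3, 2])
print((multi.long() if len(multi.size()) == 1 else multi.float()).dtype)  # torch.float32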

Contributor Author


Oh, I see.

Your suggestion makes sense to me!
I will fix that!

Member

@tomaarsen tomaarsen left a comment


Looks good to me now! Thanks for making all of these changes! 🎉

Labels: enhancement (New feature or request)
Projects: None yet
Development: successfully merging this pull request may close the issue "I want to change the loss during multi label classification."
3 participants