Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error when cls_pooling="mean" or cls_pooling="max" for TransformerDocumentEmbeddings #3558

Conversation

fkdosilovic
Copy link
Contributor

@fkdosilovic fkdosilovic commented Oct 19, 2024

  • Refactor "cls" pooling option into a function.
  • Return result from document_mean_pooling and document_max_pooling functions.
  • Add unit tests for testing each cls pooling option.
  • Refactor test_default_embeddings_stay_the_same_after_saving_and_loading and test_embeddings_stay_the_same_after_saving_and_loading in BaseEmbeddingsTest to use torch.allclose

Fixes #3552.

@fkdosilovic
Copy link
Contributor Author

fkdosilovic commented Oct 29, 2024

I thought that the two failing tests (test_default_embeddings_stay_the_same_after_saving_and_loading for TransformerWordEmbeddings and TransformerDocumentEmbeddings) were somehow due to my changes, but this doesn't seem to be the case.

When running tests locally, every test passes. Then I ran the ci.yml using act to simulate the CI job locally. I ran the act -j 'test' and got those tests to fail on master.

Looking at the output, the problem was in comparing the floating point values (tensors) with == instead of using something as torch.allclose.

I changed this == to use torch.allclose in both test_default_embeddings_stay_the_same_after_saving_and_loading and test_embeddings_stay_the_same_after_saving_and_loading.

Did you encounter similar problems before?

tests/embedding_test_utils.py Outdated Show resolved Hide resolved
@helpmefindaname
Copy link
Collaborator

Thank you for the bugfix @fkdosilovic
I will merge as soon as the tests complete

@helpmefindaname helpmefindaname merged commit 9d2c1ed into flairNLP:master Nov 29, 2024
1 check passed
@fkdosilovic fkdosilovic deleted the bug-fix-3552-document-embeddings-with-cls-pooling branch November 29, 2024 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Error when cls_pooling="mean" or cls_pooling="max"
2 participants