Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NLP Components to Benchmarking #213

Merged
merged 5 commits into from
May 4, 2023
Merged

Add NLP Components to Benchmarking #213

merged 5 commits into from
May 4, 2023

Conversation

wistuba
Copy link
Contributor

@wistuba wistuba commented May 3, 2023

Add RenateModule for Hugging Face transformer and make datasets and transformers available in benchmarking script. Tests are failing for other reasons :(

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions
Copy link

github-actions bot commented May 3, 2023

Coverage report

The coverage rate went from 85.68% to 85.78% ⬆️

95.65% of new lines are covered.

Diff Coverage details (click to unfold)

src/renate/benchmark/datasets/nlp_datasets.py

100% of new lines are covered (90% of the complete file).

src/renate/benchmark/experiment_config.py

100% of new lines are covered (97.4% of the complete file).

src/renate/memory/buffer.py

100% of new lines are covered (94.11% of the complete file).

src/renate/training/training.py

100% of new lines are covered (85.12% of the complete file).

src/renate/benchmark/models/transformer.py

85.71% of new lines are covered (85.71% of the complete file).
Missing lines: 40, 43

src/renate/defaults.py

100% of new lines are covered (99.01% of the complete file).

@wistuba wistuba requested a review from lballes May 3, 2023 15:36
from renate.models import RenateModule


class HuggingFaceSequenceClassificationTransformer(RenateModule):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that necessary or could we just use a RenateWrapper, as I did in the NLP example? I'm fine with having it, as it might be a bit more convenient. But there's always a trade-off with additional code to maintain etc...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the RenateWrapper does not allow to save any additional arguments as in this case pretrained_model_name and num_outputs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

@@ -67,4 +67,4 @@

# Noticed different accuracy scores across Mac and GitHub Actions Workflows (which run on Linux)
# TODO see if we can align the Mac and Linux results
assert pytest.approx(test_config["expected_accuracy"]) == accuracies
assert pytest.approx(test_config["expected_accuracy"]) == accuracies, accuracies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this syntax?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if fails, prints accuracies

@@ -13,26 +12,24 @@
from renate.models.renate_module import RenateWrapper


def model_fn(model_state_url: Optional[Union[Path, str]] = None) -> RenateModule:
def model_fn(model_state_url: Optional[str] = None) -> RenateModule:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're disallowing path objects here, should we also do that in the other examples?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the use of complex types as Path is no longer supported for our *_fn functions since 0.2.0. I can check for more occurrences but I should have removed most of them before

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see occurrences of Path in most example config files. If you don't want to remove then now, we should create an issue to do it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the remaining config files as well as the documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks!

@lballes lballes self-requested a review May 4, 2023 13:23
lballes
lballes previously approved these changes May 4, 2023
@wistuba wistuba merged commit 523099f into dev May 4, 2023
@wistuba wistuba deleted the mw-nlp branch May 4, 2023 15:39
lballes pushed a commit that referenced this pull request May 24, 2023
lballes added a commit that referenced this pull request May 24, 2023
* Add NLP Components to Benchmarking (#213)

* Robust Integration Tests (#214)

* Update Renate Config Example (#226)

* Make Wild Time Available in Benchmarking (#187)

* Fix `target_column` bug in `HuggingFaceTextDataModule` (#233)

* Add MMD covariate shift detector (#237)

* Add KS covariate shift detector (#242)

* Update dependabot.yml (#248)

* Update versions of some requirements (#247)

* Add doc page and example for shift detection (#244)

* Bump version (#252)

---------

Co-authored-by: Lukas Balles <lukas.balles@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants