feat: Allow passing model and tokenizer to ArgillaTrainer directly #3751
The test failures seem unrelated, all
@tomaarsen, we did not but Gabri just mentioned to rerun it.
Looks good! Some tiny remarks.
Co-authored-by: David Berenstein <david.m.berenstein@gmail.com>
Also removed tokenizer from setfit (where it didn't do anything) and updated some docstrings
…illa-io/argilla into feat/trainer_model_tokenizer
The URL of the deployed environment for this PR is https://argilla-quickstart-pr-3751-ki24f765kq-no.a.run.app
Hello!

# Argilla Community Growers

Ever since #3751, `model` can also be an already initialized model. This edge case was being missed before. This should help with the test failures on #3911.

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing functionality)
- [ ] Improvement (change adding some improvement to an existing functionality)
- [ ] Documentation update

**How Has This Been Tested**

`pytest .\tests\integration\client\feedback\training\test_trainer.py::test_argilla_trainer_text_classification_with_model_tokenizer`

**Checklist**

- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [ ] I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

---

- Tom Aarsen
Hello!
Description
Closes #3631.
This gives users the freedom to set up their model and tokenizer exactly as they need before handing them to `ArgillaTrainer`. This is required e.g. for SFT with TRL.
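To illustrate the behaviour described above, here is a minimal sketch of how a trainer could accept either a model name or an already initialized model, plus an optional pre-configured tokenizer. It is not the actual `ArgillaTrainer` implementation: `DummyModel`, `DummyTokenizer` and `resolve_model_and_tokenizer` are hypothetical stand-ins used only to show the branching logic.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class DummyModel:
    """Hypothetical stand-in for an already initialized model."""
    name: str


@dataclass
class DummyTokenizer:
    """Hypothetical stand-in for an already initialized tokenizer."""
    name: str
    pad_token: Optional[str] = None


def resolve_model_and_tokenizer(
    model: Union[str, DummyModel],
    tokenizer: Optional[DummyTokenizer] = None,
) -> Tuple[DummyModel, DummyTokenizer]:
    """Return an initialized (model, tokenizer) pair.

    Assumption for this sketch: a string is treated as a model name to load,
    anything else is treated as an already initialized model and used as-is.
    """
    if isinstance(model, str):
        # A name was passed: "load" the model from it.
        model_name = model
        model = DummyModel(model_name)
    else:
        # An initialized model was passed: keep the user's object untouched.
        model_name = model.name

    if tokenizer is None:
        # No tokenizer given: fall back to a default one for this model.
        tokenizer = DummyTokenizer(model_name)

    return model, tokenizer


# By name: both objects are created for the user.
model, tokenizer = resolve_model_and_tokenizer("tiny-model")

# Pre-configured: the user's own objects are passed through unchanged.
custom_tokenizer = DummyTokenizer("tiny-model", pad_token="<pad>")
custom_model = DummyModel("my-finetuned-model")
model2, tokenizer2 = resolve_model_and_tokenizer(custom_model, custom_tokenizer)
```

The point of the second call is that a tokenizer with custom settings (here a hypothetical `pad_token`) survives untouched, which is the kind of control the description says is needed for SFT with TRL.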
Type of change
How Has This Been Tested
Updated the relevant tests (TRL, Transformers) to also train with the passed model & tokenizer.
Checklist
CHANGELOG.md file (See https://keepachangelog.com/)
TODO: