ValueError when using prepare_for_training() on multi-label annotated Argilla dataset with single annotations #2665
Comments
davidberenstein1957 added a commit that referenced this issue on Apr 10, 2023
This was referenced Apr 10, 2023
davidberenstein1957 added a commit that referenced this issue on May 3, 2023
…es (#2691)

# Description

Updated the argilla.training integration

Closes #2658
Closes #2665
Closes #2659

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [X] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing functionality)
- [ ] Improvement (change adding some improvement to an existing functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And ideally, reference `tests`)

argilla/tests/training/*

**Checklist**

- [ ] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

---------

Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Alvaro Bartolome <alvarobartt@yahoo.com>
@m-newhauser thanks again for reporting this. This was resolved in 1.7.0.
frascuchon added a commit that referenced this issue on May 9, 2023
# Description

Updated the argilla.training integration

Closes #2658
Closes #2665
Closes #2659

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [X] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing functionality)
- [ ] Improvement (change adding some improvement to an existing functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And ideally, reference `tests`)

argilla/tests/training/*

**Checklist**

- [ ] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

---------

Co-authored-by: david <david.m.berenstein@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Alvaro Bartolome <alvarobartt@yahoo.com>
frascuchon added a commit that referenced this issue on May 10, 2023
## [1.7.0](v1.6.0...v1.7.0)

### Added

- add `max_retries` and `num_threads` parameters to `rg.log` to run data logging request concurrently with backoff retry policy. See [#2458](#2458) and [#2533](#2533)
- `rg.load` accepts `include_vectors` and `include_metrics` when loading data. Closes [#2398](#2398)
- Added `settings` param to `prepare_for_training` ([#2689](#2689))
- Added `prepare_for_training` for `openai` ([#2658](#2658))
- Added `ArgillaOpenAITrainer` ([#2659](#2659))
- Added `ArgillaSpanMarkerTrainer` for Named Entity Recognition ([#2693](#2693))
- Added `ArgillaTrainer` CLI support. Closes ([#2809](#2809))

### Changed

- Argilla quickstart image dependencies are externalized into `quickstart.requirements.txt`. See [#2666](#2666)
- bulk endpoints will upsert data when record `id` is present. Closes [#2535](#2535)
- moved from `click` to `typer` CLI support. Closes ([#2815](#2815))
- Argilla server docker image is built with PostgreSQL support. Closes [#2686](#2686)
- The `rg.log` computes all batches and raise an error for all failed batches.
- The default batch size for `rg.log` is now 100.

### Fixed

- `argilla.training` bugfixes and unification ([#2665](#2665))
- Resolved several small bugs in the `ArgillaTrainer`.

### Deprecated

- The `rg.log_async` function is deprecated and will be removed in next minor release.
Describe the bug
I'm getting a `ValueError` when trying to use the `prepare_for_training()` method to prepare a multi-label annotated Argilla dataset for training. I'm only getting the error when a given record has just a single annotation. Everything works fine if all records have more than one assigned annotation label.
To Reproduce
Generates the following error:
Expected behavior
Expect the method to return a `Dataset` that is ready for training.
Environment (please complete the following information):
Additional context
The method works properly when the given record is multi-label AND has more than one annotation label, for example:
Returns:
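To make the expected behavior in the report concrete, here is a minimal plain-Python sketch of how multi-label annotations could be encoded so that a record with a single label is handled the same way as a record with several. This is not Argilla's implementation, and the issue does not state the root cause of the `ValueError`; the helper names (`normalize_annotation`, `to_multi_hot`) and the label set are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Sequence, Union

# An annotation may be missing, a bare label, or a sequence of labels.
Annotation = Optional[Union[str, Sequence[str]]]


def normalize_annotation(annotation: Annotation) -> List[str]:
    """Return the annotation as a list of labels, even when it holds one label."""
    if annotation is None:
        return []
    if isinstance(annotation, str):
        # A bare string is treated as a single label, not as a sequence of characters.
        return [annotation]
    return list(annotation)


def to_multi_hot(annotations: Sequence[Annotation], labels: Sequence[str]) -> List[List[int]]:
    """Encode each record's labels as a fixed-length 0/1 vector over `labels`."""
    index = {label: position for position, label in enumerate(labels)}
    rows = []
    for annotation in annotations:
        row = [0] * len(labels)
        for label in normalize_annotation(annotation):
            row[index[label]] = 1
        rows.append(row)
    return rows


# Hypothetical label set and records: two labels, one label in a list, one bare label.
labels = ["negative", "neutral", "positive"]
annotations = [["positive", "neutral"], ["negative"], "positive"]
encoded = to_multi_hot(annotations, labels)
print(encoded)
```

Every row has the same length regardless of how many labels the record carries, which is the property a multi-label training set needs. Whether `prepare_for_training()` uses this kind of encoding internally is an assumption, not something the thread confirms.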