add `prepare_for_training` methods for `openai` #2658

davidberenstein1957 · 2023-04-06T06:54:07Z

Is your feature request related to a problem? Please describe.
I want to fine-tune models on annotated data.

Describe the solution you'd like

- [ ] ENTITY EXTRACTION
  - KEYWORDS
  - ENTITIES
- [ ] CLASSIFICATION
  - MULTI LABEL
  - SINGLE LABEL
- [ ] TEXT2TEXT
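
For illustration only, here is a minimal sketch of what the classification case could look like, assuming the target is the prompt/completion JSONL layout used for OpenAI fine-tuning at the time. The helper names (`to_prompt_completion`, `to_jsonl`) and the `PROMPT_END` separator are hypothetical and are not part of argilla's API; the actual `prepare_for_training` implementation may differ.

```python
import json

# Hypothetical separator marking the end of a prompt; an assumption, not argilla's API.
PROMPT_END = "\n\n###\n\n"


def to_prompt_completion(text, labels):
    """Turn one annotated record into a prompt/completion pair."""
    if isinstance(labels, str):  # single-label annotation
        labels = [labels]
    return {
        "prompt": text.strip() + PROMPT_END,
        # Completion starts with a space; labels are sorted for stable output.
        "completion": " " + ", ".join(sorted(labels)),
    }


def to_jsonl(records):
    """Serialize (text, labels) pairs as JSON Lines, one example per line."""
    return "\n".join(
        json.dumps(to_prompt_completion(text, labels)) for text, labels in records
    )


annotated = [
    ("I love this product", "positive"),  # single label
    ("Late delivery, rude support", ["negative", "complaint"]),  # multi label
]
jsonl = to_jsonl(annotated)
print(jsonl)
```

Single-label and multi-label records go through the same path here, which keeps the sketch short; entity extraction and text2text would need their own completion formats.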

Describe alternatives you've considered
Not using an LLM

Additional context
N.A.


# Description

Updated the argilla.training integration

Closes #2658
Closes #2665
Closes #2659

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [X] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing functionality)
- [ ] Improvement (change adding some improvement to an existing functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And ideally, reference `tests`)

argilla/tests/training/*

**Checklist**

- [ ] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

---------

Co-authored-by: david <david.m.berenstein@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Alvaro Bartolome <alvarobartt@yahoo.com>

## [1.7.0](v1.6.0...v1.7.0)

### Added

- Added `max_retries` and `num_threads` parameters to `rg.log` to run data logging requests concurrently with a backoff retry policy. See [#2458](#2458) and [#2533](#2533)
- `rg.load` accepts `include_vectors` and `include_metrics` when loading data. Closes [#2398](#2398)
- Added `settings` param to `prepare_for_training` ([#2689](#2689))
- Added `prepare_for_training` for `openai` ([#2658](#2658))
- Added `ArgillaOpenAITrainer` ([#2659](#2659))
- Added `ArgillaSpanMarkerTrainer` for Named Entity Recognition ([#2693](#2693))
- Added `ArgillaTrainer` CLI support. Closes ([#2809](#2809))

### Changed

- Argilla quickstart image dependencies are externalized into `quickstart.requirements.txt`. See [#2666](#2666)
- Bulk endpoints will upsert data when record `id` is present. Closes [#2535](#2535)
- Moved from `click` to `typer` CLI support. Closes ([#2815](#2815))
- Argilla server docker image is built with PostgreSQL support. Closes [#2686](#2686)
- `rg.log` computes all batches and raises an error for all failed batches.
- The default batch size for `rg.log` is now 100.

### Fixed

- `argilla.training` bugfixes and unification ([#2665](#2665))
- Resolved several small bugs in the `ArgillaTrainer`.

### Deprecated

- The `rg.log_async` function is deprecated and will be removed in the next minor release.

davidberenstein1957 added the `type: enhancement` label Apr 6, 2023

davidberenstein1957 added a commit that referenced this issue Apr 10, 2023

chore: added support for training prep with openai for textcat #2658

ec435d0

davidberenstein1957 mentioned this issue Apr 13, 2023

Feat/2658 add argilla training module for openai with several bug fixes #2691

Merged

14 tasks

davidberenstein1957 added a commit that referenced this issue Apr 28, 2023

chore: #2658 included tests for openai

69ce246

davidberenstein1957 closed this as completed May 4, 2023

frascuchon mentioned this issue May 9, 2023

feat: update the argilla training integration #2858

Merged

14 tasks

frascuchon mentioned this issue May 9, 2023

Release v1.7.0 #2817

Merged
