Feat/3347 feature add unification support for the rankingquestion #3364
Conversation
Overall LGTM 👍🏻 Just some minor comments and some things to clarify in order to have more context!
Hi! I have limited bandwidth, but this is a key feature for Reward Modeling + RLHF. Could someone add an example (with code snippets) to show the expected output of this prepare-for-training method? Specifically, I want to double-check whether we're making it easy (or targeting) to prepare data for Reward Modeling, i.e., creating pairs of chosen/rejected responses (as we've shown in this tutorial). If this is not tackled yet, we need to prepare a brief spec for supporting it.
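To make the request concrete, here is a minimal sketch of the kind of output being asked about: expanding a single best-first ranking into chosen/rejected pairs for reward modeling. This is not the Argilla API; `ranking_to_pairs` and its arguments are hypothetical names used purely for illustration.

```python
from itertools import combinations

def ranking_to_pairs(prompt, ranked_responses):
    """Turn a best-first ranking into preference pairs for reward modeling."""
    pairs = []
    for chosen, rejected in combinations(ranked_responses, 2):
        # combinations() preserves input order, so `chosen` always
        # outranks `rejected` when the list is ordered best-first.
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs

pairs = ranking_to_pairs(
    "Explain RLHF in one sentence.",
    ["best response", "middle response", "worst response"],
)
# -> 3 pairs: (best, middle), (best, worst), (middle, worst)
```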
Hi Dani, I think this is not directly related to this part. We should add additional class methods to the TrainingTaskMapping. For now we only support .for_text_classification(), but we should also add things like .for_reward_modelling() for certain frameworks. Similarly, other tasks ought to be supported. If you search for "ArgillaTrainer" in our issues, you should already be able to find some aligned with the HF mapping. Cheers, David
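A hedged sketch of the class-method pattern described above: only .for_text_classification() exists at this point, and .for_reward_modelling() is an assumed future addition. The class name and fields below are illustrative stand-ins, not the real TrainingTaskMapping.

```python
from dataclasses import dataclass, field

@dataclass
class TaskMappingSketch:
    # Illustrative stand-in for TrainingTaskMapping; not the real class.
    task: str
    columns: dict = field(default_factory=dict)

    @classmethod
    def for_text_classification(cls, text: str, label: str) -> "TaskMappingSketch":
        # Existing pattern: map a text field and a (unified) question to a label.
        return cls(task="text-classification", columns={"text": text, "label": label})

    @classmethod
    def for_reward_modelling(cls, prompt: str, chosen: str, rejected: str) -> "TaskMappingSketch":
        # Assumed future method: map fields to chosen/rejected preference pairs.
        return cls(
            task="reward-modelling",
            columns={"prompt": prompt, "chosen": chosen, "rejected": rejected},
        )

mapping = TaskMappingSketch.for_reward_modelling(
    prompt="prompt", chosen="response-a", rejected="response-b"
)
```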
Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>
chore: updated changelog
chore: added docstrings
@tomaarsen I also made some changes here that might be relevant, but I saw this was not merged yet.
…-for-the-rankingquestion
Codecov Report
Patch coverage:
Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #3364      +/-   ##
===========================================
+ Coverage    90.13%   90.45%   +0.31%
===========================================
  Files          233      243      +10
  Lines        12493    13223     +730
===========================================
+ Hits         11261    11961     +700
- Misses        1232     1262      +30
Description
- Renamed typing.py to types.py to avoid import errors
- Added RankingQuestionStrategy and RankingQuestionUnification for the RankingQuestion
- Added support for the .for_text_classification method for the TrainingTaskMapping
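A hedged usage sketch of how these pieces could fit together, assuming RankingQuestionUnification follows the question-plus-strategy pattern of the existing question-unification classes. The field/question names, the "mean" strategy value, and the exact import surface are assumptions, not confirmed API.

```python
import argilla as rg

# A RankingQuestion asking annotators to order two responses.
ranking = rg.RankingQuestion(
    name="response_ranking",
    title="Rank the responses from best to worst",
    values=["response-1", "response-2"],
)

# Assumed constructor: collapse multiple annotators' rankings into one
# label per record, using a RankingQuestionStrategy value such as "mean".
unified = rg.RankingQuestionUnification(question=ranking, strategy="mean")

# Assumed wiring: the unified ranking serves as the label for the
# .for_text_classification task mapping added in this PR.
task_mapping = rg.TrainingTaskMapping.for_text_classification(
    text=rg.TextField(name="text"),  # illustrative field name
    label=unified,
)
```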
Closes #3347
Type of change
(Please delete options that are not relevant. Remember to title the PR according to the type of change)
How Has This Been Tested
(Please describe the tests that you ran to verify your changes. And ideally, reference tests.)
- tests/client/feedback/test_schemas.py:test_ranking_question_strategy
- tests/client/feedback/training/test_schemas.py:test_task_mapping_for_text_classification
Checklist