
feat: add RankingQuestion in the Python client #3275

Merged 20 commits from feat/add-ranking-question into develop on Jun 28, 2023

Conversation

@alvarobartt (Member) commented Jun 27, 2023

Description

This PR adds the RankingQuestion, introduced in #3232, to the Python client, so that users can create datasets with RankingQuestions and submit responses for them.

Usage

import argilla as rg

rg.init(
    api_url="<ARGILLA_API_URL>",
    api_key="<ARGILLA_API_KEY>",
)

ds = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="prompt-1"),
        rg.TextField(name="prompt-2"),  
    ],
    questions=[
        rg.RankingQuestion(
            name="prompt-ranking",
            description="Rank the prompts from most to least natural.",
            required=True,
            values=["prompt-1", "prompt-2"],
        ),
    ],
)

ds.add_records(
    [
        rg.FeedbackRecord(
            fields={
                "prompt-1": "Explain to a broad audience why banana bread is so fluffy.",
                "prompt-2": "Explain banana banana banana.",
            },
            responses=[
                {
                    "values": {
                        "prompt-ranking": {"value": [{"value": "prompt-1", "rank": 1}, {"value": "prompt-2", "rank": 2}]},
                    },
                    "status": "submitted"
                },
            ],
        ), 
    ]
)

ds.push_to_argilla(name="new-dataset", workspace="new-workspace")
ds = rg.FeedbackDataset.from_argilla(name="new-dataset", workspace="new-workspace")

ds.push_to_huggingface(repo_id="<HUGGINGFACE_REPO_ID>", token="<HUGGINGFACE_TOKEN>")
ds = rg.FeedbackDataset.from_huggingface(repo_id="<HUGGINGFACE_REPO_ID>", token="<HUGGINGFACE_TOKEN>")
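
As a side note (not part of the PR), the ranking payload can also be built programmatically instead of being written by hand. The sketch below only reuses the response structure shown in the example above; the variable names are illustrative and not part of the Argilla client API.

# Hypothetical sketch: build the "prompt-ranking" response value from an
# ordered list of field names. This mirrors the dict structure of the usage
# example above; `ranked_prompts` and `ranking_value` are illustrative names.
ranked_prompts = ["prompt-1", "prompt-2"]  # ordered from most to least natural
ranking_value = {
    "value": [
        {"value": prompt, "rank": position}
        for position, prompt in enumerate(ranked_prompts, start=1)
    ]
}

responses = [{"values": {"prompt-ranking": ranking_value}, "status": "submitted"}]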

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested

  • Added unit tests to cover the new RankingQuestion

Checklist

  • I added relevant documentation
  • My code follows the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I filled out the contributor form (see text above)
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@gabrielmbmb (Member) left a comment

LGTM! Just two minor comments

Review threads (resolved):
  • src/argilla/client/feedback/dataset.py (outdated)
  • src/argilla/client/feedback/schemas.py

codecov bot commented Jun 27, 2023

Codecov Report

Patch coverage: 84.03% and project coverage change: -0.86 ⚠️

Comparison is base (51751ac) 90.91% compared to head (32555c9) 90.05%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3275      +/-   ##
===========================================
- Coverage    90.91%   90.05%   -0.86%     
===========================================
  Files          215      233      +18     
  Lines        11304    12410    +1106     
===========================================
+ Hits         10277    11176     +899     
- Misses        1027     1234     +207     
Flag Coverage Δ
pytest 90.05% <84.03%> (-0.86%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/argilla/__init__.py 86.66% <ø> (+3.33%) ⬆️
...illa/client/feedback/training/frameworks/openai.py 0.00% <0.00%> (ø)
...rgilla/client/feedback/training/frameworks/peft.py 0.00% <0.00%> (ø)
...client/feedback/training/frameworks/span_marker.py 0.00% <0.00%> (ø)
src/argilla/server/contexts/datasets.py 96.01% <ø> (ø)
src/argilla/server/seeds.py 0.00% <ø> (ø)
src/argilla/tasks/users/create.py 91.11% <ø> (-4.45%) ⬇️
src/argilla/training/autotrain_advanced.py 0.00% <0.00%> (ø)
src/argilla/training/peft.py 0.00% <0.00%> (ø)
src/argilla/training/openai.py 42.66% <50.00%> (+0.20%) ⬆️
... and 60 more

... and 5 files with indirect coverage changes


@alvarobartt (Member Author)

The failing tests are unrelated (as is usually the case). I guess we could work on a PR to add HTTP retries, or split the integration tests from the unit tests, so the CI/CD runs stop failing so often. WDYT @frascuchon @gabrielmbmb?
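
For reference, a minimal sketch of what "HTTP retries" could look like on the client side, using the standard requests/urllib3 retry machinery; this is not Argilla's actual client code and all parameter values are illustrative.

# Hypothetical sketch: retry transient HTTP failures with requests + urllib3.
# Not Argilla's client implementation; parameters are illustrative only.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])
session = requests.Session()
session.mount("http://", HTTPAdapter(max_retries=retries))
session.mount("https://", HTTPAdapter(max_retries=retries))

# Requests made through `session` now retry transient server errors with backoff.
response = session.get("<ARGILLA_API_URL>")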

... labels={"cat-1": "Category 1", "cat-2": "Category 2"},
... required=False,
... visible_labels=4
... ),
Member

Do we want to add an rg.RankingQuestion here? It's the only one missing.
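
A hypothetical sketch of the entry the comment refers to, mirroring the doctest-style example quoted above; the name, description, and values are illustrative only.

... rg.RankingQuestion(
...     name="ranking",  # illustrative name, not taken from the PR
...     description="Rank the options from most to least preferred.",
...     required=False,
...     values=["option-1", "option-2"],
... ),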

@alvarobartt (Member Author)

Indeed, I think we should shorten the rest of them instead: there is so much information that it ends up being longer than the actual code and hard to navigate. I think we can tackle the docstrings in the next release with a clearer approach.

@alvarobartt alvarobartt added this to the v1.12.0 milestone Jun 28, 2023
@gabrielmbmb gabrielmbmb merged commit c1f7aac into develop Jun 28, 2023
@gabrielmbmb gabrielmbmb deleted the feat/add-ranking-question branch June 28, 2023 14:40