refactor: add `FeedbackDatasetBase` and `RemoteFeedbackDataset` while keeping `FeedbackDataset` just for local #3465
Merged
Done to ensure consistency with the current approach, even though in upcoming releases no remapping will be done and the user will need to capture `push_to_argilla` method's output
…/split-local-and-remote-dataset
10 tasks
# Description

This PR adds a function in the SDK named `get_metrics` that calls `GET /api/v1/me/datasets/{dataset_id}/metrics` to get the metrics of a certain dataset from a certain user, meaning that each user would just see what can be seen according to its permissions.

This PR is created on top of #3465 since we will need to get the `records.count` to override the magic method `__len__` in `_ArgillaFeedbackDataset` so that the length of a dataset is directly retrieved from the record count of the dataset, since we're no longer keeping local data when working with a remote dataset.

**Type of change**

- [X] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

- [X] Add unit tests for the `get_metrics` function

**Checklist**

- [ ] I added relevant documentation
- [X] follows the style guidelines of this project
- [X] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [X] My changes generate no new warnings
- [X] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [x] I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

---------

Co-authored-by: Francisco Aranda <francis@argilla.io>
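As a rough illustration of the `__len__` idea described in that commit message, here is a minimal sketch. It does not use the argilla SDK: `RemoteLengthSketch` and `fake_get_metrics` are hypothetical names, and the response shape `{"records": {"count": N}}` is an assumption inferred from the `records.count` wording above.

```python
from typing import Any, Callable, Dict


class RemoteLengthSketch:
    """Hypothetical stand-in for a remote dataset; not the real argilla class."""

    def __init__(self, dataset_id: str, fetch_metrics: Callable[[str], Dict[str, Any]]):
        self.dataset_id = dataset_id
        # Injected callable that plays the role of a `get_metrics`-style call.
        self._fetch_metrics = fetch_metrics

    def __len__(self) -> int:
        # No records are kept locally, so the length is asked from the server
        # every time. The payload shape is an assumption for this sketch.
        metrics = self._fetch_metrics(self.dataset_id)
        return int(metrics["records"]["count"])


def fake_get_metrics(dataset_id: str) -> Dict[str, Any]:
    # Stand-in for the HTTP request; returns a fixed, made-up payload.
    return {"records": {"count": 3}}


length_ds = RemoteLengthSketch("some-dataset-id", fake_get_metrics)
print(len(length_ds))
```

Injecting the metrics callable keeps the sketch testable without a running server; the actual PR may wire this up differently.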
The code below works fine, the only missing piece is
import argilla as rg
ds = rg.FeedbackDataset(...) # Initialises `FeedbackDataset` as always
records = [...]
ds.add_records(records) # Adds records locally
for record in ds.records: # Loops over the local records
    print(record)
print(ds.records[0]) # Prints the record with index 0 locally
rg.init(
api_url="...",
api_key="...",
)
ds.push_to_argilla("my-dataset", workspace="alvaro") # Pushes the `FeedbackDataset` to Argilla, and remaps the class instance to be now of type `_ArgillaFeedbackDataset`, also returns it, but to keep backwards compatibility we're also remapping it for the moment
ds = rg.FeedbackDataset.from_argilla("my-dataset", workspace="alvaro") # Retrieves the `FeedbackDataset` from Argilla and returns an `_ArgillaFeedbackDataset` instance
for record in ds: # Iters over the records of the `FeedbackDataset` in Argilla (yield not return)
    print(record.id)
print(ds[0]) # Prints the record with index 0 in the `FeedbackDataset` in Argilla
print(ds[1:4]) # Prints the records within the specified slice in the `FeedbackDataset` in Argilla
print(len(ds)) # Prints the total number of records in the `FeedbackDataset` in Argilla
ds.add_records(...) # Adds records directly in Argilla
ds.push_to_argilla("my-dataset", workspace="alvaro") # Raises a `DeprecationWarning` and mentions that updates are automatic
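The remote half of the walkthrough above (lazy iteration, indexing, slicing, `len`, and the deprecated second push) could be sketched roughly as follows. This is not the argilla implementation: `FakeRecordsAPI`, `RemoteRecordsSketch`, the paging parameters, and the warning text are all hypothetical, and an in-memory list stands in for the server.

```python
import warnings
from typing import Any, Dict, Iterator, List, Union

Record = Dict[str, Any]


class FakeRecordsAPI:
    """Hypothetical in-memory stand-in for the server-side record listing."""

    def __init__(self, rows: List[Record]):
        self._rows = list(rows)

    def list_records(self, offset: int, limit: int) -> List[Record]:
        return self._rows[offset:offset + limit]

    def count(self) -> int:
        return len(self._rows)


class RemoteRecordsSketch:
    def __init__(self, api: FakeRecordsAPI, page_size: int = 2):
        self._api = api
        self._page_size = page_size

    def __len__(self) -> int:
        return self._api.count()

    def __iter__(self) -> Iterator[Record]:
        # Yield records page by page instead of loading everything up front.
        offset = 0
        while True:
            page = self._api.list_records(offset, self._page_size)
            if not page:
                return
            yield from page
            offset += len(page)

    def __getitem__(self, key: Union[int, slice]) -> Union[Record, List[Record]]:
        size = len(self)
        if isinstance(key, slice):
            start, stop, step = key.indices(size)
            if step != 1:
                raise ValueError("this sketch only supports contiguous slices")
            return self._api.list_records(start, max(stop - start, 0))
        if key < 0:
            key += size
        if not 0 <= key < size:
            raise IndexError("record index out of range")
        return self._api.list_records(key, 1)[0]

    def push_to_argilla(self, *args: Any, **kwargs: Any) -> "RemoteRecordsSketch":
        # Already remote: nothing to push, so only warn and return self.
        warnings.warn(
            "already pushed; updates are applied automatically",
            DeprecationWarning,
            stacklevel=2,
        )
        return self


rows = [{"id": i} for i in range(5)]
remote_ds = RemoteRecordsSketch(FakeRecordsAPI(rows))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    remote_ds.push_to_argilla("my-dataset", workspace="alvaro")
```

Each `len`, index, or slice here triggers a call to the fake API, which mirrors the trade-off of not keeping a local copy of the records.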
alvarobartt added the labels `type: breaking changes` (This issue or PR may include breaking changes in the code) and `type: deprecation` (Indicates a feature that will be deprecated and/or support will be dropped) Jul 28, 2023
gabrielmbmb
reviewed
Aug 2, 2023
frascuchon
reviewed
Aug 3, 2023
alvarobartt changed the title from "refactor: add `FeedbackDatasetBase` and `_ArgillaFeedbackDataset` while keeping `FeedbackDataset` just for local" to "refactor: add `FeedbackDatasetBase` and `RemoteFeedbackDataset` while keeping `FeedbackDataset` just for local" Aug 3, 2023
frascuchon
reviewed (7 more times)
Aug 3, 2023
Co-authored-by: Francis Aranda <francis@argilla.io>
Co-authored-by: Gabriel Martin <gabriel@argilla.io>
frascuchon
reviewed
Aug 3, 2023
Co-author-by: Gabriel Martin <gabriel@argilla.io>
Co-authored-by: Francisco Aranda <francis@argilla.io>
for more information, see https://pre-commit.ci
gabrielmbmb
approved these changes
Aug 3, 2023
Co-authored-by: Gabriel Martín Blázquez <gmartinbdev@gmail.com>
for more information, see https://pre-commit.ci
Due to the same message being formatted differently depending on the Python version e.g. `abstract methods records` in Python 3.10, but `abstract method records` in Python 3.7
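One common way to keep such an assertion stable is to match the error text with a regular expression rather than an exact string. The sketch below is an illustration only: `BaseSketch`, `Incomplete`, and `PATTERN` are hypothetical names and are not taken from the PR's test suite.

```python
import re
from abc import ABC, abstractmethod


class BaseSketch(ABC):
    """Hypothetical base class with an abstract `records` property."""

    @property
    @abstractmethod
    def records(self):
        ...


class Incomplete(BaseSketch):
    """Deliberately leaves `records` unimplemented."""


# Tolerates "method" or "methods", and optional quotes around the name,
# since the wording of this TypeError differs between Python versions.
PATTERN = r"abstract methods? '?records'?"


def instantiation_error_message() -> str:
    try:
        Incomplete()
    except TypeError as exc:
        return str(exc)
    raise AssertionError("expected a TypeError for the abstract class")


message = instantiation_error_message()
print(bool(re.search(PATTERN, message)))
```

With pytest, the same pattern can be passed as `pytest.raises(TypeError, match=PATTERN)`, since `match` is applied as a regular-expression search.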
Description
This PR is still in progress. The main idea is to move the functionality shared by local and remote (Argilla) datasets into `FeedbackDatasetBase`, keep `FeedbackDataset` for working with these datasets locally, and add `_ArgillaFeedbackDataset`, which is instantiated internally via `FeedbackDataset.from_argilla` and returned by `FeedbackDataset.push_to_argilla`. This splits the behaviour of operations such as `add_records`: locally it appends the records to a local list, while remotely it pushes them to Argilla, which avoids having to call `push_to_argilla` right after every record addition.

Some more things are tackled as part of this refactoring and will be listed down below as soon as the PR is out of draft!
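The split described above might look roughly like the following sketch. It is a simplified illustration under stated assumptions, not the code in this PR: all class names, the `push` method, the field validation, and `FakeServer` (an in-memory stand-in for Argilla) are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

Record = Dict[str, Any]


class FakeServer:
    """Hypothetical in-memory stand-in for the Argilla server."""

    def __init__(self) -> None:
        self.store: List[Record] = []


class DatasetBaseSketch(ABC):
    """Shared behaviour: here, only a simple field validation."""

    def __init__(self, fields: List[str]):
        self.fields = fields

    def _validate(self, record: Record) -> None:
        missing = [name for name in self.fields if name not in record]
        if missing:
            raise ValueError(f"record is missing fields: {missing}")

    @property
    @abstractmethod
    def records(self) -> List[Record]:
        ...

    @abstractmethod
    def add_records(self, records: List[Record]) -> None:
        ...


class RemoteDatasetSketch(DatasetBaseSketch):
    def __init__(self, fields: List[str], server: FakeServer):
        super().__init__(fields)
        self._server = server

    @property
    def records(self) -> List[Record]:
        return list(self._server.store)

    def add_records(self, records: List[Record]) -> None:
        # Remote: every addition goes straight to the server.
        for record in records:
            self._validate(record)
        self._server.store.extend(records)


class LocalDatasetSketch(DatasetBaseSketch):
    def __init__(self, fields: List[str]):
        super().__init__(fields)
        self._records: List[Record] = []

    @property
    def records(self) -> List[Record]:
        return self._records

    def add_records(self, records: List[Record]) -> None:
        # Local: additions only touch the in-memory list.
        for record in records:
            self._validate(record)
        self._records.extend(records)

    def push(self, server: FakeServer) -> RemoteDatasetSketch:
        # Pushing returns a remote-backed instance, as the description suggests.
        server.store.extend(self._records)
        return RemoteDatasetSketch(self.fields, server)


server = FakeServer()
local = LocalDatasetSketch(["text"])
local.add_records([{"text": "a"}])
remote = local.push(server)
remote.add_records([{"text": "b"}])  # no second push needed
```

Keeping validation in the base class means both variants reject malformed records the same way, while each decides where accepted records end up.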
Closes #3456
Type of change
How Has This Been Tested
FeedbackDataset
Checklist