Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] push_to_huggingface sometimes include tags in resulting argilla.yml file #4089

Closed
gabrielmbmb opened this issue Oct 30, 2023 · 3 comments · Fixed by #4172
Closed

[BUG] push_to_huggingface sometimes include tags in resulting argilla.yml file #4089

gabrielmbmb opened this issue Oct 30, 2023 · 3 comments · Fixed by #4172
Assignees
Labels
team: ml Indicates that the issue or pull request is owned by the Machine Learning (ML) team type: bug Indicates an unexpected problem or unintended behavior
Milestone

Comments

@gabrielmbmb
Copy link
Member

Describe the bug
In some environments, when using FeedbackDataset.push_to_argilla, it generates an argilla.yaml file that includes tags that later prevents the from_huggingface function to work correctly as they yaml cannot be loaded back because of the tags.

allow_extra_metadata: true
fields:
- name: prompt
  required: true
  title: Prompt
  type: !!python/object/apply:argilla.client.feedback.schemas.enums.FieldTypes # <-- tag here
  - text
  use_markdown: true

Expected behavior
No matter what environment is used, the push_to_argilla should generate the yaml without tags.

Environment:

  • Argilla Version [e.g. 1.0.0]: 1.18.0
  • ElasticSearch Version [e.g. 7.10.2]: not relevant
  • Docker Image (optional) [e.g. argilla:v1.0.0]: not relevant
@gabrielmbmb gabrielmbmb added type: bug Indicates an unexpected problem or unintended behavior client labels Oct 30, 2023
@gabrielmbmb gabrielmbmb added this to the v1.19.0 milestone Oct 30, 2023
@davidberenstein1957 davidberenstein1957 added the team: ml Indicates that the issue or pull request is owned by the Machine Learning (ML) team label Oct 30, 2023
@davidberenstein1957
Copy link
Member

@jfcalvo
Copy link
Member

jfcalvo commented Nov 8, 2023

I add a link here to a possible solution suggested by @frascuchon that it's use use_enum_values in the Pydantic config for the affected model.

@jfcalvo
Copy link
Member

jfcalvo commented Nov 8, 2023

I couldn't reproduce the problem locally but use_enum_values looks like working:

In [17]: class Response(BaseModel):
    ...:     status: ResponseStatus
    ...:

In [18]: Response(status=ResponseStatus.draft)
Out[18]: Response(status=<ResponseStatus.draft: 'draft'>)

In [19]: class Response(BaseModel):
    ...:     status: ResponseStatus
    ...:     class Config:
    ...:         use_enum_values = True
    ...:

In [20]: Response(status=ResponseStatus.draft)
Out[20]: Response(status='draft')

frascuchon added a commit that referenced this issue Nov 8, 2023
…DatasetConfig` serialization (#4172)

# Description

I couldn't reproduce the error locally but looks like `use_enum_values`
could potentially solve the problem:

```python
In [17]: class Response(BaseModel):
    ...:     status: ResponseStatus
    ...:

In [18]: Response(status=ResponseStatus.draft)
Out[18]: Response(status=<ResponseStatus.draft: 'draft'>)

In [19]: class Response(BaseModel):
    ...:     status: ResponseStatus
    ...:     class Config:
    ...:         use_enum_values = True
    ...:

In [20]: Response(status=ResponseStatus.draft)
Out[20]: Response(status='draft')
```

Closes #4089 

**Type of change**

- [x] Bug fix (non-breaking change which fixes an issue)

**How Has This Been Tested**

If it's really difficult to test because it was happening only in some
specific environments. Even when environments with the same dependencies
and configuration.

**Checklist**

- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [x] I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: Francisco Aranda <francis@argilla.io>
frascuchon added a commit that referenced this issue Nov 8, 2023
…DatasetConfig` serialization (#4172)

# Description

I couldn't reproduce the error locally but looks like `use_enum_values`
could potentially solve the problem:

```python
In [17]: class Response(BaseModel):
    ...:     status: ResponseStatus
    ...:

In [18]: Response(status=ResponseStatus.draft)
Out[18]: Response(status=<ResponseStatus.draft: 'draft'>)

In [19]: class Response(BaseModel):
    ...:     status: ResponseStatus
    ...:     class Config:
    ...:         use_enum_values = True
    ...:

In [20]: Response(status=ResponseStatus.draft)
Out[20]: Response(status='draft')
```

Closes #4089 

**Type of change**

- [x] Bug fix (non-breaking change which fixes an issue)

**How Has This Been Tested**

If it's really difficult to test because it was happening only in some
specific environments. Even when environments with the same dependencies
and configuration.

**Checklist**

- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [x] I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: Francisco Aranda <francis@argilla.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team: ml Indicates that the issue or pull request is owned by the Machine Learning (ML) team type: bug Indicates an unexpected problem or unintended behavior
Projects
None yet
4 participants