feat: 🚀 responses and suggestion filter #4160

Merged
merged 125 commits into develop from feature/responses-and-suggestion-filter
Nov 28, 2023

damianpumar
Contributor

No description provided.

github-actions bot commented Nov 7, 2023

The URL of the deployed environment for this PR is https://argilla-quickstart-pr-4160-ki24f765kq-no.a.run.app

damianpumar and others added 25 commits November 24, 2023 16:03
# Description

This is a WIP PR that integrates the new `filters` and `sort` attributes,
added to the query for searching records, with the search engine.

The idea is that it will allow us to check if everything is working as
expected in an environment. After this PR we can start working on
refactors and necessary changes.

Ref #4228 

**Type of change**

- [x] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [ ] Test A
- [ ] Test B

**Checklist**

- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [ ] I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: Paco Aranda <francis@argilla.io>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…argilla-io/argilla into feature/responses-and-suggestion-filter

* 'feature/responses-and-suggestion-filter' of github.com:argilla-io/argilla:
  feat: integrate search engine with search endpoint (#4310)
<!-- Thanks for your contribution! As part of our Community Growers
initiative 🌱, we're donating Justdiggit bunds in your name to reforest
sub-Saharan Africa. To claim your Community Growers certificate, please
contact David Berenstein in our Slack community or fill in this form
https://tally.so/r/n9XrxK once your PR has been merged. -->

# Description

This PR adds support for indexing suggestions when indexing records in
the search index.

To make sure that all record attributes are passed when indexing
records, a workaround forces the record, its responses, and its
suggestions to refresh before they are indexed. Otherwise, it is hard to
add that information when loading records in the current code base.

This behavior must be reviewed and simplified cc @gabrielmbmb @jfcalvo 

Changes related to tests will be moved into a separate PR since extra
refactoring work must be done. Once that extra PR is merged here, this
PR will be marked as ready for review. Related PR: #4318


Refs: #3849 

Closes #4230

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: José Francisco Calvo <jose@argilla.io>
Co-authored-by: José Francisco Calvo <josefranciscocalvo@gmail.com>
…apping (#4330)

# Description

After testing suggestion creation and indexing, we found an error when
creating suggestions for ranking question types. Specifying an
explicit set of properties in the mapping fixes the problem.
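
A minimal sketch of what an explicit mapping for ranking values could look like, written as a Python dict. The `nested` type, the field types, and both function names are assumptions for illustration, based only on the `rank`/`value` shape of ranking suggestions; the mapping actually added in #4330 may differ.

```python
from typing import Any, Dict, List


def ranking_value_mapping() -> Dict[str, Any]:
    """Hypothetical mapping fragment for ranking values (not the PR's code)."""
    return {
        "type": "nested",  # assumption: each ranking entry is indexed as its own object
        "properties": {
            "rank": {"type": "integer"},
            "value": {"type": "keyword"},
        },
    }


def matches_mapping(entries: List[Dict[str, Any]]) -> bool:
    """Check that every ranking entry only uses the explicitly mapped keys."""
    allowed = set(ranking_value_mapping()["properties"])
    return all(set(entry) <= allowed for entry in entries)


ranking = [{"rank": 1, "value": "ranking-1"}, {"rank": 2, "value": "ranking-2"}]
print(matches_mapping(ranking))
```

Listing the properties explicitly means the search engine does not have to infer field types from the first indexed document, which is the kind of inference that can fail for list-of-object values.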

This is the script used to test the suggestions index:

```python
import argilla as rg

dataset = rg.FeedbackDataset(
  fields=[rg.TextField(name="text")],
  questions=[
    rg.RatingQuestion(name="rating-question", values=[1, 2, 3, 4]),
    rg.TextQuestion(name="text-question"),
    rg.LabelQuestion(name="label-question", labels=["one", "two"]),
    rg.MultiLabelQuestion(name="multi-label-question", labels=["one", "two", "three"]),
    rg.RankingQuestion(name="ranking-question", values=["ranking-1", "ranking-2", "ranking-3"]),
  ],
)

dataset.add_records(
  records=[
    rg.FeedbackRecord(
      fields={"text": "record-1"},
      suggestions=[
        {"question_name": "rating-question", "value": 1},
        {"question_name": "text-question", "value": "suggestion-1"},
      ]
    ),
    rg.FeedbackRecord(
      fields={"text": "record-2"},
      suggestions=[
        {"question_name": "label-question", "value": "one"},
        {"question_name": "multi-label-question", "value": ["one", "two"]},
        {
          "question_name": "ranking-question",
          "value": [
            {"rank": 1, "value": "ranking-1"},
            {"rank": 2, "value": "ranking-2"},
            {"rank": 3, "value": "ranking-3"}
          ]
        }
      ]
    ),
  ],
)

dataset.push_to_argilla(name="test-dataset")
```

Refs #4230

Kudos to @gabrielmbmb for helping me do the manual testing using the
SDK.

**Type of change**

- [x] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

- [x] Manually creating suggestions using SDK and checking integration
with the search engine.

**Checklist**

- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [ ] I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
…/responses-and-suggestion-filter' of github.com:argilla-io/argilla into feature/responses-and-suggestion-filter

* 'feature/responses-and-suggestion-filter' of github.com:argilla-io/argilla:
  fix: add ranking question type explicit properties to search engine mapping (#4330)

…argilla-io/argilla into feature/responses-and-suggestion-filter

* 'feature/responses-and-suggestion-filter' of github.com:argilla-io/argilla:
  fix: handling errors for non-existing endpoints (#4325)
# Description

This PR integrates the `POST /api/v1/dataset/:dataset_id/records/search`
endpoint with the search engine to support the `filter` and `sort` body
attributes.

Also, the base filter classes are decorated as `dataclass` in order to
provide an easy way to compare args in tests (`dataclass` provides an
automatic `__eq__` method).
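
As a rough illustration of why `dataclass` helps in tests, the sketch below uses a hypothetical `TermsFilter` class and a mocked engine. The class names, fields, and the `search` method are assumptions made for this example, not the PR's actual classes.

```python
from dataclasses import dataclass
from typing import List
from unittest.mock import Mock


@dataclass
class TermsFilter:
    """Hypothetical filter class; dataclass generates __eq__ from the fields."""
    field: str
    values: List[str]


class PlainTermsFilter:
    """Same data without dataclass: equality falls back to object identity."""
    def __init__(self, field: str, values: List[str]) -> None:
        self.field = field
        self.values = values


# Two dataclass instances with equal fields compare equal.
assert TermsFilter("status", ["submitted"]) == TermsFilter("status", ["submitted"])
# Two plain instances with equal fields do not.
assert PlainTermsFilter("status", ["submitted"]) != PlainTermsFilter("status", ["submitted"])

# In a test, this lets a mock assertion compare arguments by value.
engine = Mock()  # stand-in for the search engine (assumption)
engine.search(filter=TermsFilter("status", ["submitted"]))
engine.search.assert_called_once_with(filter=TermsFilter("status", ["submitted"]))
```

Without the generated `__eq__`, the last assertion would fail because the expected and actual filter objects are distinct instances.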

Refs #4227

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] New feature (non-breaking change which adds functionality)
- [X] Refactor (change restructuring the codebase without changing
functionality)
- [X] Improvement (change adding some improvement to an existing
functionality)


**Checklist**

- [X] I added relevant documentation
- [X] I followed the style guidelines of this project
- [X] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [X] My changes generate no new warnings
- [X] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [X] I have added relevant notes to the `CHANGELOG.md` file (See
https://keepachangelog.com/)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…argilla-io/argilla into feature/responses-and-suggestion-filter

* 'feature/responses-and-suggestion-filter' of github.com:argilla-io/argilla:
  feat: add new filter support for search records endpoint (#4327)
@frascuchon frascuchon merged commit 5f7ad22 into develop Nov 28, 2023
@frascuchon frascuchon deleted the feature/responses-and-suggestion-filter branch November 28, 2023 11:43
davidberenstein1957 pushed a commit that referenced this pull request Nov 29, 2023
#4160)

Co-authored-by: leire <leire@recogn.ai>
Co-authored-by: José Francisco Calvo <jose@argilla.io>
Co-authored-by: Gabriel Martín Blázquez <gmartinbdev@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Francisco Aranda <francis@argilla.io>
Co-authored-by: leiyre <leire@argilla.io>
Co-authored-by: José Francisco Calvo <josefranciscocalvo@gmail.com>
leiyre pushed a commit that referenced this pull request Nov 29, 2023
* develop: (30 commits)
  chore: increase dev version release to 1.21.0
  fix: responses and suggestions filter QA (#4337)
  feat: delete suggestion from record on search engine (#4336)
  feat: update suggestion from record on search engine (#4339)
  bug: fix bug and update test (#4341)
  fix: preserve `TextClassificationSettings.label_schema` order (#4332)
  Update issue templates
  feat: 🚀 support for filtering and sorting by responses and suggestions (#4160)
  fix: handling errors for non-existing endpoints (#4325)
  feat: adding utils module and functions (#4121)
  Update labels in github workflows (#4315)
  fix: correct unification implementation for `RankingQuestionStrategy` (#4295)
  fix: update to solve the error of integration tests in CI (#4314)
  docs: revisit install process (#4261)
  feat: increase timeout minutes for python tests (#4307)
  docs: docs export dataset does not apply coloring for code snippets (#4296)
  docs: update final section of the rag haystack blog post (#4294)
  feat: add multi_modal templates and update vector setting (#4283)
  feat: better logging bar for FeedbackDataset (#4267)
  refactor: ArgillaTrainer for unified variable usage (#4214)
  ...

# Conflicts:
#	frontend/v1/infrastructure/repositories/RecordRepository.ts
leiyre pushed a commit that referenced this pull request Dec 5, 2023
* develop: (41 commits)
  chore: update dev version
  chore: update CHANGELOG.md before release v1.20.0 (#4357)
  docs: temporal update to indicate persistent storage (#4355)
  docs: add suggestions and responses filters and sorting (#4345)
  docs: add end2end example on creating a basic text-classification dataset (#4208)
  Fix/responses suggestions filter fine tune (#4356)
  Fix/responses suggestions filter fine tune (#4356)
  fix: Accept draft responses on dataset records creation (#4354)
  Feature/responses operator (#4352)
  Feature/responses operator (#4352)
  chore: increase dev version release to 1.21.0
  chore: remove dev suffix for release branch
  fix: responses and suggestions filter QA (#4337)
  feat: delete suggestion from record on search engine (#4336)
  feat: update suggestion from record on search engine (#4339)
  bug: fix bug and update test (#4341)
  fix: preserve `TextClassificationSettings.label_schema` order (#4332)
  Update issue templates
  feat: 🚀 support for filtering and sorting by responses and suggestions (#4160)
  fix: handling errors for non-existing endpoints (#4325)
  ...

# Conflicts:
#	frontend/v1/domain/entities/question/Question.ts
#	frontend/v1/domain/entities/record/Record.ts
Labels
size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Allow filtering and sorting using suggestions and responses
4 participants