feat: Specific view for Dataset settings #2442

Merged
frascuchon merged 41 commits into develop from feature/front-dataset-settings
Mar 16, 2023

keithCuniah
Contributor

@keithCuniah keithCuniah commented Feb 28, 2023

Description

Dataset settings feature

Please include a summary of the changes and the related issue. Please also include relevant motivation and context. List any dependencies that are required for this change.

Do not merge before:

Closes #2003
Closes #2210
Closes #2368
Closes #2369
Closes #2370
Closes #2371
Closes #2372

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (change restructuring the codebase without changing functionality)
  • Improvement (change adding some improvement to an existing functionality)
  • Documentation update

How Has This Been Tested

(Please describe the tests that you ran to verify your changes. And ideally, reference tests)

  • TextClassification
  • TokenClassification
  • Text2Text

Checklist

  • I have merged the original branch into my forked branch
  • I added relevant documentation
  • follows the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works

keithCuniah and others added 14 commits January 30, 2023 16:10
# Description

This PR adds a new description component and some classes to simplify
text styling

Closes #2372

**Type of change**

- [x] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

- [x] DatasetDescription 

**Checklist**

- [x] I have merged the original branch into my forked branch
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
# Description
Fetch and store dataset labels as proper label schema

Closes #2210 

**Type of change**
- [x] remove obsolete code 
  -  _getDatasetSettings() in Dataset.js
  - initialize() in TextClassification.js and TokenClassification.js
  - initialize() call in the fetchByName() function in Dataset.js
- [x] get labels from api for TextClassification and TokenClassification
- [x] store labels as GlobalEntities in specific vuexorm (see "Ideal
Orm" in this
[link](https://www.notion.so/argilla/Refactoring-60c997e026644a28a2ff6a20c4cd90c3))
- [x] homogenized "AvailableRecords" (from TextClassificationHeader.vue)
and "visibleEntities" (from TokenClassificationHeader) so both use the
same vuexOrm query
- [x] prepare vuexorm queries to be able to sort globalLabels by any
param (by text, id, ...), ascending or descending
- [x] add more feedback through the toast component: when a label is
saved, when saving fails, or when the user tries to add a label that
already exists
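
The sorting behaviour described in the list above can be illustrated with a
small language-neutral sketch. It is written in Python rather than as a vuex
ORM query, and the `GlobalLabel` fields and the `sort_labels` helper are
assumptions made for illustration, not code from this PR.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class GlobalLabel:
    # Hypothetical shape: field names are assumed from the PR description.
    id: str
    text: str
    dataset_id: str
    color_id: int


def sort_labels(labels, key="text", descending=False):
    """Return a new list of labels ordered by the given attribute."""
    return sorted(labels, key=attrgetter(key), reverse=descending)


labels = [
    GlobalLabel(id="2", text="positive", dataset_id="ds", color_id=1),
    GlobalLabel(id="1", text="negative", dataset_id="ds", color_id=0),
]
by_text = sort_labels(labels)                            # ascending by text
by_id_desc = sort_labels(labels, key="id", descending=True)
```

Passing the attribute name as a parameter keeps one helper for every sort
order instead of one query per field.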



Note:
- With these updates, globalLabels from TokenClassification and
TextClassification have the same shape. **This means they both have a
"color_id" param, which can be used to color labels in
TextClassification.**
- The connection between globalEntities and the dataset table is made
through the "dataset_id" attribute of globalEntities. I didn't implement
a oneToMany relationship, to avoid performance problems like those in the
Weak labelling implementation. This could be a refactoring idea for later.
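
As a rough illustration of linking labels to a dataset through a plain
"dataset_id" attribute instead of a declared oneToMany relationship, the
sketch below groups flat label rows by that key. It is Python rather than
vuex ORM code, and the row shape and the `group_labels_by_dataset` helper
are hypothetical.

```python
from collections import defaultdict


def group_labels_by_dataset(labels):
    """Index flat label rows by their dataset_id attribute."""
    grouped = defaultdict(list)
    for label in labels:
        grouped[label["dataset_id"]].append(label)
    return dict(grouped)


# Assumed row shape: only "dataset_id" is taken from the note above.
rows = [
    {"id": "a", "text": "PER", "dataset_id": "ds1"},
    {"id": "b", "text": "LOC", "dataset_id": "ds1"},
    {"id": "c", "text": "spam", "dataset_id": "ds2"},
]
labels_by_dataset = group_labels_by_dataset(rows)
```

The lookup is resolved on demand from the flat rows, so no relationship has
to be kept in sync on the dataset side.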

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Refactor (change restructuring the codebase without changing
functionality)
- [x] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

- [x] TokenClassification task
- [x] TextClassification task
- N/A Text2Text task
 
**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: keithCuniah <keithcuniah@gmail.comh>
@keithCuniah keithCuniah changed the title Feature/front dataset settings Feat : front dataset settings Feb 28, 2023
keithCuniah and others added 7 commits February 28, 2023 16:56
# Description
Label editing component

To merge after #2210 

Closes #2371 

**Type of change**

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

- [x] Token classification
- [x] Text classification
- N/A Text2Text 

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: keithCuniah <keithcuniah@gmail.comh>
# Description

This PR adds a new base-card component and a dataset-delete component to
allow deleting a dataset from settings, and removes the old logic from
the dataset list

Closes #2370

**Type of change**

- [x] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

- [x] Test for BaseCard component
- [x] test for DatasetDeleteComponent

**Checklist**

- [x] I have merged the original branch into my forked branch
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
# Description
add generic layout for dataset settings page

Closes #2368 

**Type of change**

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

- [x] all tasks

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- N/A I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: keithCuniah <keithcuniah@gmail.comh>
# Description

This PR includes an icon in the dataset list to go to dataset settings

See [#2003](#2003)

**Type of change**

- [x] New feature (non-breaking change which adds functionality)

**Checklist**

- [x] I have merged the original branch into my forked branch
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] My changes generate no new warnings
Add extra custom validation so that only admin/superuser users can manage
dataset settings.

For example, if a non-admin user tries to update the label schema,
this validation returns a 403 response.
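
A minimal sketch of this kind of check, assuming a user object with an
`is_superuser` flag. The `User`, `ForbiddenError` and
`validate_settings_update` names are hypothetical and are not the
validators added in this commit; a real API layer would map the error to
the HTTP 403 response.

```python
from dataclasses import dataclass


class ForbiddenError(Exception):
    """Hypothetical error that an API layer would translate into HTTP 403."""

    status_code = 403


@dataclass
class User:
    # Assumed user shape, for illustration only.
    username: str
    is_superuser: bool = False


def validate_settings_update(user: User, dataset_name: str) -> None:
    """Raise ForbiddenError unless the user may manage dataset settings."""
    if not user.is_superuser:
        raise ForbiddenError(
            f"User '{user.username}' cannot update settings "
            f"for dataset '{dataset_name}'"
        )
```

Running the check before any settings change means a rejected request never
reaches the code that saves the label schema.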
@codecov

codecov bot commented Mar 2, 2023

Codecov Report

Patch coverage: 88.13% and project coverage change: -0.04 ⚠️

Comparison is base (d687380) 93.53% compared to head (d9f64d4) 93.49%.

❗ Current head d9f64d4 differs from pull request most recent head 8a2602a. Consider uploading reports for the commit 8a2602a to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2442      +/-   ##
===========================================
- Coverage    93.53%   93.49%   -0.04%     
===========================================
  Files          157      158       +1     
  Lines         7839     7861      +22     
===========================================
+ Hits          7332     7350      +18     
- Misses         507      511       +4     
| Flag | Coverage Δ | |
|---|---|---|
| pytest | 93.49% <88.13%> (-0.04%) | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/argilla/client/apis/datasets.py | 90.40% <66.66%> (-2.23%) | ⬇️ |
| src/argilla/server/commons/telemetry.py | 88.67% <88.88%> (+0.54%) | ⬆️ |
| src/argilla/_constants.py | 100.00% <100.00%> (ø) | |
| ...0/handlers/text_classification_dataset_settings.py | 100.00% <100.00%> (ø) | |
| .../handlers/token_classification_dataset_settings.py | 100.00% <100.00%> (ø) | |
| src/argilla/server/apis/v0/helpers.py | 100.00% <100.00%> (ø) | |
| src/argilla/server/apis/v0/validators/commons.py | 100.00% <100.00%> (ø) | |
| ...a/server/apis/v0/validators/text_classification.py | 96.00% <100.00%> (+0.16%) | ⬆️ |
| .../server/apis/v0/validators/token_classification.py | 94.23% <100.00%> (+0.23%) | ⬆️ |
| src/argilla/server/settings.py | 76.38% <100.00%> (ø) | |

... and 6 files with indirect coverage changes

keithCuniah and others added 2 commits March 14, 2023 10:47
# Description
Create the dataset settings page

Closes #2369 

**Type of change**

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

- [x] TokenClassification
- [x] TextClassification
- [x] Text2Text

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: keithCuniah <keithcuniah@gmail.comh>
Co-authored-by: leire <leire@recogn.ai>
# Description

This PR includes style changes and changes to the component logic after QA.
IMPORTANT: Before merging this branch,
`feature/create-page-dataset-settings` must be merged

**Type of change**

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

- [ ] Test A
- [ ] Test B

**Checklist**

- [ ] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: keithCuniah <keithcuniah@gmail.comh>
Co-authored-by: keithCuniah <keithcuniah@gmail.com>
@keithCuniah keithCuniah marked this pull request as ready for review March 14, 2023 12:19
@frascuchon frascuchon changed the title Feat : front dataset settings feat: Specifi view for Dataset settings Mar 16, 2023
@frascuchon frascuchon changed the title feat: Specifi view for Dataset settings feat: Specific view for Dataset settings Mar 16, 2023
@frascuchon frascuchon self-requested a review March 16, 2023 15:26
@frascuchon frascuchon merged commit ff81ca9 into develop Mar 16, 2023
@frascuchon frascuchon deleted the feature/front-dataset-settings branch March 16, 2023 15:27
frascuchon added a commit that referenced this pull request Mar 22, 2023
## [1.5.0](v1.4.0...v1.5.0) - 2023-03-21

### Added

- Add the fields to retrieve when loading the data from argilla.
`rg.load` takes too long because of the vector field, even when users
don't need it. Closes
[#2398](#2398)
- Add new page and components for dataset settings. Closes
[#2442](#2003)
- Add ability to show an image in records (for TokenClassification and
TextClassification) if a URL is passed in metadata with the key
\_image_url
- Non-searchable fields support in metadata.
[#2570](#2570)

### Changed

- Labels are now centralized in a specific vuex ORM model called
GlobalLabel Model, see #2210. This model
is the same for TokenClassification and TextClassification (so both tasks
have labels with color_id and shortcuts parameters in the vuex ORM)
- The shortcuts improvement for labels
[#2339](#2339) has been moved
to the vuex ORM in the dataset settings feature
[#2444](eb37c3b)
- Update "Define a labeling schema" section in docs.
- The record inputs are sorted alphabetically in UI by default.
[#2581](#2581)

### Fixes

- Allow URL to be clickable in Jupyter notebook again. Closes
[#2527](#2527)

### Removed

- Removing some data scan deprecated endpoints used by old clients. This
change will break compatibility with client `<v1.3.0`
- Stop using old scan deprecated endpoints in python client. This logic
will break client compatibility with server version `<1.3.0`
- Remove the previous way to add labels through the dataset page. Now
labels can be added only through dataset settings page.



### As always, thanks to our amazing contributors!
- Documentation update: tutorial for text classification models
comparison (#2426) by @embonhomme
- Docs: fix little typo (#2522) by @anakin87
- Docs: Tutorial on image classification (#2420) by @burtenshaw
3 participants