Documentation update: tutorial for text classification models comparison #2426

embonhomme · 2023-02-27T13:11:48Z

Description

Context: #2068
This PR adds a new tutorial: model comparison for text classification. It follows up on the work done during PyConFr in Bordeaux.

Closes #2068

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactor (change restructuring the codebase without changing functionality)
Improvement (change adding some improvement to an existing functionality)
Documentation update

How Has This Been Tested

(Please describe the tests that you ran to verify your changes. And ideally, reference tests)

Test A
Test B

Checklist

I have merged the original branch into my forked branch
I added relevant documentation
My code follows the style guidelines of this project
I did a self-review of my code
I made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works

embonhomme · 2023-02-27T13:16:39Z

This PR is a WIP because I haven't figured out how to add the notebook to docs/source/tutorials.

dvsrepo · 2023-03-01T12:55:35Z

Hi @embonhomme this is super cool and useful!!

To make it even more useful, would it be possible to use SetFit's zero-shot model instead of the few-shot classy model? We've just published a tutorial showing how easy it is to use SetFit, and many people are asking how it compares with the zero-shot HF pipeline, so this tutorial would make for an even better comparison: https://docs.argilla.io/en/latest/tutorials/notebooks/labelling-textclassification-setfit-zeroshot.html#%F0%9F%94%AB-Zero-shot-predictions-with-SetFit

We'd be happy to walk you through if you have questions.

embonhomme · 2023-03-06T20:48:44Z

Hello @dvsrepo :) Thank you for the feedback. The new commit adds the comparison with SetFit zero-shot.
Let me know if it is relevant.

dvsrepo · 2023-03-06T21:00:11Z

This is looking just perfect!

The only remaining change is to review the leftover mentions of few-shot and classy-classification and replace them with zero-shot and SetFit. Then we are good to go!

We'd love to share this next week via LinkedIn and Twitter. If you'd like us to mention you as the author, send me an email at daniel @ argilla.io

davidberenstein1957 · 2023-03-07T09:04:42Z

@embonhomme Awesome, looks great to me too :)

embonhomme · 2023-03-07T11:19:57Z

Thank you! Yes, sorry, I totally forgot to change the description part. It should be better now.
Also, I have only added a Jupyter notebook here; I didn't figure out how it works with modal.md, dvc.md, ...

I will send you an email with my LinkedIn :)

dvsrepo · 2023-03-07T11:34:54Z

Great stuff @embonhomme!
In case you can tackle this, the process is:

1. Create a folder here with the filename, like this one: https://github.com/argilla-io/argilla/tree/develop/docs/_source/_static/tutorials/training-textclassification-setfit-fewshot
2. Add the reference to the tutorial in each of these files:
   - https://github.com/argilla-io/argilla/blob/develop/docs/_source/tutorials/libraries/setfit.md
   - https://github.com/argilla-io/argilla/blob/develop/docs/_source/tutorials/steps/4_monitoring.md
   - https://github.com/argilla-io/argilla/blob/develop/docs/_source/tutorials/tasks/text_classification.md
   - https://github.com/argilla-io/argilla/blob/develop/docs/_source/tutorials/techniques/few_shot.md

I would say this tutorial is about Monitoring, TextClassification, few-shot

Otherwise, let us know and @davidberenstein1957 might be able to help.

embonhomme · 2023-03-08T10:14:21Z

Thank you, I did the integration :)

davidberenstein1957

Hi @embonhomme, could you rename everything to monitoring-textclassification-setfit-explainability? After that, everything should be fine :) Also, did you want to participate in the LinkedIn shoutout and our community program w.r.t. offsetting? https://www.argilla.io/blog/introducing-argilla-community-growers/

embonhomme · 2023-03-15T18:00:28Z

Hi @davidberenstein1957, I renamed everything :)
Yes, I would like to participate in the LinkedIn shoutout!

davidberenstein1957 · 2023-03-21T06:39:00Z

Lovely!

@embonhomme

## [1.5.0](v1.4.0...v1.5.0) - 2023-03-21

### Added

- Add the fields to retrieve when loading the data from argilla. `rg.load` takes too long because of the vector field, even when users don't need it. Closes [#2398](#2398)
- Add new page and components for dataset settings. Closes [#2442](#2003)
- Add ability to show image in records (for TokenClassification and TextClassification) if an URL is passed in metadata with the key \_image_url
- Non-searchable fields support in metadata. [#2570](#2570)

### Changed

- Labels are now centralized in a specific vuex ORM called GlobalLabel Model, see #2210. This model is the same for TokenClassification and TextClassification (so both task have labels with color_id and shortcuts parameters in the vuex ORM)
- The shortcuts improvement for labels [#2339](#2339) have been moved to the vuex ORM in dataset settings feature [#2444](eb37c3b)
- Update "Define a labeling schema" section in docs.
- The record inputs are sorted alphabetically in UI by default. [#2581](#2581)

### Fixes

- Allow URL to be clickable in Jupyter notebook again. Closes [#2527](#2527)

### Removed

- Removing some data scan deprecated endpoints used by old clients. This change will break compatibility with client `<v1.3.0`
- Stop using old scan deprecated endpoints in python client. This logic will break client compatibility with server version `<1.3.0`
- Remove the previous way to add labels through the dataset page. Now labels can be added only through dataset settings page.

### As always, thanks to our amazing contributors!

- Documentation update: tutorial for text classification models comparison (#2426) by @embonhomme
- Docs: fix little typo (#2522) by @anakin87
- Docs: Tutorial on image classification (#2420) by @burtenshaw

initiate tutorial for text classification models comparison

435c1d4

embonhomme marked this pull request as draft February 27, 2023 13:15

adding zero-shot text classification with SetFit model to tutorial model-comparison-for-text-classification

323949b

dvsrepo changed the base branch from develop to main March 6, 2023 20:56

dvsrepo changed the base branch from main to develop March 6, 2023 20:56

dvsrepo self-requested a review March 6, 2023 21:00

correct classy-classifier -> SetFit in text explanation

7d437eb

embonhomme marked this pull request as ready for review March 7, 2023 11:15

Add modal.md for model-comparison-for-text-classification and to tutorials for integration

f9541ac

Update few_shot.md

a6acc32

davidberenstein1957 reviewed Mar 13, 2023

rename model-comparison to monitoring-textclassification-set-explainability

bcaa61a

davidberenstein1957 approved these changes Mar 21, 2023

davidberenstein1957 removed the request for review from dvsrepo March 21, 2023 06:40

davidberenstein1957 merged commit ae2d65b into argilla-io:develop Mar 21, 2023

frascuchon mentioned this pull request Mar 21, 2023

Releases/1.5.0 #2585

Merged
