
Bump release #768

Merged · 1,005 commits · Jan 16, 2025
Changes from all commits
Commits
2626bc5
Update chapters/ru/chapter7/5.mdx
artyomboyko Jan 10, 2024
cb440a6
Update chapters/ru/chapter7/5.mdx
artyomboyko Jan 10, 2024
068217e
Update chapters/ru/chapter7/5.mdx
artyomboyko Jan 10, 2024
fa12024
Update 5.mdx
artyomboyko Jan 10, 2024
07373e6
Update chapters/ru/chapter7/4.mdx
artyomboyko Jan 10, 2024
95bda7c
Update 2.mdx
artyomboyko Jan 10, 2024
913a9b1
Update 2.mdx
artyomboyko Jan 10, 2024
5f5d4aa
Update 3.mdx
artyomboyko Jan 10, 2024
30ca446
Update 3.mdx
artyomboyko Jan 10, 2024
8c57d55
Update chapters/ru/chapter7/3.mdx
artyomboyko Jan 10, 2024
e1993b5
Update chapters/ru/chapter7/3.mdx
artyomboyko Jan 10, 2024
c2ad2f6
Update chapters/ru/chapter7/3.mdx
artyomboyko Jan 10, 2024
1a19575
Update 3.mdx
artyomboyko Jan 10, 2024
743ea97
Update chapters/ru/chapter7/3.mdx
artyomboyko Jan 10, 2024
184cceb
Update chapters/ru/chapter7/4.mdx
artyomboyko Jan 10, 2024
5eb4f4b
Update 4.mdx
artyomboyko Jan 10, 2024
e7c59a5
Update 5.mdx
artyomboyko Jan 10, 2024
e09613a
Update 5.mdx
artyomboyko Jan 11, 2024
489dcd7
Merge pull request #653 from blademoon/main
MKhalusova Jan 11, 2024
efbd59f
fixed links to other chapters
MKhalusova Jan 11, 2024
0b03f5c
fixed links to chapters' intros
MKhalusova Jan 11, 2024
8f0e044
I added myself to the Languages and translations table.
artyomboyko Jan 13, 2024
5da07e8
Deleted unnecessary folder automatically created by JupyterLab.
artyomboyko Jan 13, 2024
bc2832e
Merge pull request #658 from blademoon/main
MKhalusova Jan 15, 2024
4966980
Fix links to HF docs
mariosasko Jan 16, 2024
2898462
Merge pull request #660 from huggingface/fix-doc-links
mariosasko Jan 17, 2024
fd85628
Finalizing the translation of chapter 7.
artyomboyko Jan 18, 2024
0afa500
Update 6.mdx
artyomboyko Jan 18, 2024
4a5a73d
Update 7.mdx
artyomboyko Jan 18, 2024
386f429
Merge pull request #656 from MKhalusova/links-fix
MKhalusova Jan 19, 2024
818c74c
Update chapters/ru/chapter7/6.mdx
artyomboyko Jan 19, 2024
ee2e1ce
Update chapters/ru/chapter7/6.mdx
artyomboyko Jan 19, 2024
45a369c
Update chapters/ru/chapter7/6.mdx
artyomboyko Jan 19, 2024
5863024
Update chapters/ru/chapter7/7.mdx
artyomboyko Jan 19, 2024
074d4c5
Update chapters/ru/chapter7/6.mdx
artyomboyko Jan 19, 2024
3e9107e
Update chapters/ru/chapter7/7.mdx
artyomboyko Jan 19, 2024
8e74a57
Update chapters/ru/chapter7/7.mdx
artyomboyko Jan 19, 2024
18b0fdf
Update chapters/ru/chapter7/8.mdx
artyomboyko Jan 19, 2024
5bfa31b
Update 7.mdx
artyomboyko Jan 19, 2024
9af6080
Update 6.mdx
artyomboyko Jan 19, 2024
f667bea
Update chapters/ru/chapter7/7.mdx
artyomboyko Jan 19, 2024
844825d
Update 6.mdx
artyomboyko Jan 19, 2024
991c4ad
Update chapters/ru/chapter7/6.mdx
artyomboyko Jan 19, 2024
9ca1013
Update chapters/ru/chapter7/7.mdx
artyomboyko Jan 19, 2024
3540d5f
Update chapters/ru/chapter7/6.mdx
artyomboyko Jan 19, 2024
6c1f1f8
Merge pull request #661 from blademoon/main
MKhalusova Jan 19, 2024
0373629
8/1-2 done
pdumin Jan 27, 2024
5a822a9
8/3 finished
pdumin Jan 27, 2024
e34e68d
8/4 finished
pdumin Jan 27, 2024
db79001
fix typo
pdumin Jan 27, 2024
582cada
toc update
pdumin Jan 27, 2024
91b35f3
typos fixed
pdumin Jan 27, 2024
6d161ca
removed english text
pdumin Jan 27, 2024
b1a8e02
8/5 finished
pdumin Jan 27, 2024
8c015d3
8/6-7 finished
pdumin Jan 27, 2024
eb16954
fix and update toc
pdumin Jan 28, 2024
a7a0dac
chapter8/1 fixed
pdumin Jan 30, 2024
dd7bb8a
chapter8/2 fixed
pdumin Jan 30, 2024
d24a0c1
chapter8/3 fixed
pdumin Jan 30, 2024
b4bd049
chapter8/4 fixed
pdumin Jan 30, 2024
47c809a
chapter8/5 fixed
pdumin Jan 30, 2024
5893e53
fix title 8/5
pdumin Jan 30, 2024
3f5df27
fix title 8/5 in toc
pdumin Jan 30, 2024
d46939c
Update _toctree.yml title 8
pdumin Jan 30, 2024
e4eae4c
Bump black (#671)
lewtun Jan 31, 2024
e825efb
Merge branch 'huggingface:main' into main
pdumin Jan 31, 2024
b973ea8
fix unexpected token in quiz
pdumin Jan 31, 2024
58ca113
8/2 fixed
pdumin Jan 31, 2024
b62a3a4
8/3 fixed
pdumin Jan 31, 2024
0a33e61
8/4_tf fixed
pdumin Jan 31, 2024
0c72c69
Merge pull request #667 from pdumin/main
MKhalusova Jan 31, 2024
b509066
Update 3b.mdx
yanisallouch Jan 31, 2024
1185e7e
Added translation of chapter 9 and Course Events.
artyomboyko Jan 31, 2024
e3edacb
Added translation of chapter 9 and Course Events.
artyomboyko Jan 31, 2024
e819ed2
Update 5.mdx
yanisallouch Feb 1, 2024
5b4e2bd
Update 7.mdx
yanisallouch Feb 1, 2024
23c121b
Update 10.mdx
yanisallouch Feb 1, 2024
953d8ed
Update chapters/ru/chapter9/6.mdx
artyomboyko Feb 1, 2024
2e98f09
Update chapters/ru/chapter9/7.mdx
artyomboyko Feb 1, 2024
7c70c6b
Update chapters/ru/chapter9/7.mdx
artyomboyko Feb 1, 2024
aebdc91
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
9742e7b
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
05114e9
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
93c9387
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
7fe2432
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
6a8abe0
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
f1719b3
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
e87666a
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
2bba211
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
2991db0
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
d5551b8
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
9c0fce1
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
b9eab04
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
b289071
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
cc95da3
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
d48ca79
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
033ef13
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
4bf3abb
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
c86fb1e
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
e5a6afb
Update chapters/ru/events/1.mdx
artyomboyko Feb 1, 2024
10fb52f
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
198b1c3
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
7ed5ad4
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
6a18d5d
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
8a3edbf
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
f998d19
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
6b24983
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
8e44e89
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
dcac23a
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
b5a52b6
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
41ef527
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
33ddd73
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
2157d04
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
c572649
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
6695d36
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
162a912
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
3eb7d92
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
a044b40
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
9c9fa42
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
ea5b060
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
8a94691
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
95127a1
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
e048955
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
4658529
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
0ea1589
Update chapters/ru/events/2.mdx
artyomboyko Feb 1, 2024
f44d5cb
Merge pull request #673 from blademoon/main
MKhalusova Feb 1, 2024
f9a0ca5
Capture the current state of the translation. Two files are translate…
artyomboyko Feb 2, 2024
5ff4f2c
Made a full translation of Chapter 2.
artyomboyko Feb 4, 2024
fb8c1f8
Fix problem in .
artyomboyko Feb 4, 2024
fe42b4f
Deleting JupyterLab backup files.
artyomboyko Feb 4, 2024
afc8211
Update 8.mdx
artyomboyko Feb 4, 2024
bd349ce
Update 8.mdx
artyomboyko Feb 4, 2024
f8af3dd
remove original sentence
fabridamicelli Feb 13, 2024
84d5caf
Update chapters/ru/chapter2/2.mdx
artyomboyko Feb 13, 2024
ca51532
Update chapters/ru/chapter2/2.mdx
artyomboyko Feb 13, 2024
772cf68
Update chapters/ru/chapter2/4.mdx
artyomboyko Feb 13, 2024
2626239
Update chapters/ru/chapter2/5.mdx
artyomboyko Feb 13, 2024
5b7d0a5
Update chapters/ru/chapter2/5.mdx
artyomboyko Feb 13, 2024
cd84491
Update chapters/ru/chapter2/5.mdx
artyomboyko Feb 13, 2024
316f484
Update chapters/ru/chapter2/5.mdx
artyomboyko Feb 13, 2024
e5c1cfe
Update chapters/ru/chapter2/2.mdx
artyomboyko Feb 13, 2024
aef95c1
Update 2.mdx
artyomboyko Feb 13, 2024
f00ba32
Update 2.mdx
artyomboyko Feb 13, 2024
2a19ee2
Update chapters/ru/chapter2/2.mdx
artyomboyko Feb 13, 2024
8c6a58c
Update 2.mdx
artyomboyko Feb 13, 2024
c177017
Update chapters/ru/chapter2/2.mdx
artyomboyko Feb 13, 2024
7889838
Update chapters/ru/chapter2/3.mdx
artyomboyko Feb 13, 2024
98b0f47
Update chapters/ru/chapter2/4.mdx
artyomboyko Feb 13, 2024
3c686e5
Update 4.mdx
artyomboyko Feb 13, 2024
6dfa7e1
Merge pull request #678 from blademoon/main
MKhalusova Feb 13, 2024
dcab735
Merge pull request #640 from fabridamicelli/de_ch_4_part2
MKhalusova Feb 13, 2024
eaca7dd
Minor edits to the table of contents, and the titles of the final tes…
artyomboyko Feb 19, 2024
e63a663
Merge pull request #685 from blademoon/main
MKhalusova Feb 21, 2024
4acf20c
chapter 2's introduction hackable French translation
FlorentFlament Mar 4, 2024
460d07a
Chapter 2.2 - doesn't make sense French translation
FlorentFlament Mar 4, 2024
c3c4ea0
[doc-fix] Add accelerate required for notebook 3/3
dimaioksha Mar 10, 2024
2aed933
translation fix
brealid Mar 15, 2024
b1e8a6d
fix: typo
qcgzxw Mar 20, 2024
eeacec0
fix formatting
qcgzxw Mar 20, 2024
eef7403
remove unnecessary translate
qcgzxw Mar 20, 2024
cb3fce6
fix typo and formatting in Chinese translation
qcgzxw Mar 20, 2024
f983af5
fix zh-cn translation for chapter 3-6
ruochenhua Mar 21, 2024
bd546fd
fix formatting issue for zh-cn on chapter 5-8
ruochenhua Mar 26, 2024
755ece4
Merge branch 'main' of https://github.com/ruochenhua/huggingface_course
ruochenhua Mar 26, 2024
d01e9c0
fix zh-cn translation on chapter 4-6
ruochenhua Mar 26, 2024
08ae78b
docs: mdx typo
Nagi-ovo Mar 28, 2024
52c1ff0
fix deactivate
qgallouedec Mar 31, 2024
2715b60
tip to install datasets
tal7aouy Apr 24, 2024
194a6a0
Fix translation (修正翻译)
buqieryul Apr 24, 2024
742d174
fix zh
buqieryul Apr 25, 2024
b79da5d
fix zh
buqieryul Apr 25, 2024
9f55f69
Changing tokenized_dataset to tokenized_datasets
jpodivin May 5, 2024
b1d79d1
Update 9.mdx
partrita May 8, 2024
43ce993
Update 2.mdx
partrita May 9, 2024
b340fbf
[zh-CN/TW] bugfix of broken image link: Companies using Hugging Face
jks-liu May 15, 2024
4b729b9
[zh-CN/TW] pipeline name translation is not needed
jks-liu May 15, 2024
446da36
[zh-CN/TW] translation of `Hub`: 集线器(集線器) => 模型中心
jks-liu May 15, 2024
231491f
correct zh translation
May 16, 2024
f52d6d8
correct zh-TW translation
May 16, 2024
7065a5f
Fix typo in korean translation
osanseviero May 17, 2024
93e49a8
Changing tokenized_dataset to tokenized_datasets
osanseviero May 17, 2024
60ad585
Fix deactivate command
osanseviero May 17, 2024
bb1c442
Fix typo
osanseviero May 17, 2024
a91486c
Add accelerate required for notebook
osanseviero May 17, 2024
d7e3812
Merge pull request #706 from memorylorry/correct_zh_doc
xianbaoqian May 21, 2024
a51913e
Merge pull request #705 from jks-liu/pr-zh
xianbaoqian May 21, 2024
c8572a9
Merge branch 'main' into main
xianbaoqian May 21, 2024
2cc0ab7
Merge pull request #701 from buqieryul/main
xianbaoqian May 21, 2024
47e5ad7
Merge branch 'main' into main
xianbaoqian May 21, 2024
0c2e096
Merge pull request #696 from ruochenhua/main
xianbaoqian May 21, 2024
3aaa344
Merge branch 'main' into chapter0
xianbaoqian May 21, 2024
bf4f283
Merge pull request #695 from qcgzxw/chapter0
xianbaoqian May 21, 2024
7d9fd84
Merge pull request #693 from qcgzxw/main
xianbaoqian May 21, 2024
80eb463
Merge branch 'main' into main
xianbaoqian May 21, 2024
3f2ad53
Merge pull request #691 from brealid/main
xianbaoqian May 21, 2024
cfb66ae
Merge pull request #700 from tal7aouy/tal7aouy
lunarflu Aug 21, 2024
d6f1620
Merge pull request #688 from FlorentFlament/doesnt-make-sense
lunarflu Aug 21, 2024
647d1ad
Merge pull request #687 from FlorentFlament/main
lunarflu Aug 21, 2024
472f531
Merge pull request #676 from yanisallouch/patch-7
lunarflu Aug 22, 2024
c368427
Merge pull request #675 from yanisallouch/patch-6
lunarflu Aug 22, 2024
b966bcc
Merge pull request #674 from yanisallouch/patch-5
lunarflu Aug 22, 2024
2f54cd0
Merge pull request #672 from yanisallouch/patch-4
lunarflu Aug 22, 2024
9585b2a
Merge pull request #589 from gyheo/main
lunarflu Aug 23, 2024
ea35ade
Merge pull request #533 from askfor/patch-2
lunarflu Aug 23, 2024
a38ba80
Merge pull request #514 from nvoorhies/main
lunarflu Sep 5, 2024
bfd0947
finish review chapter7 8 9
yaoqih Sep 10, 2024
cadd0ab
format fr/chapter9/4.mdx
yaoqih Sep 10, 2024
d388a7e
Merge branch 'main' into main
yaoqih Sep 10, 2024
7806601
Update 7.mdx
yaoqih Sep 10, 2024
1652378
Merge branch 'main' of https://github.com/yaoqih/course
yaoqih Sep 10, 2024
685e7f2
create rum folder
eduard-balamatiuc Sep 22, 2024
1bda24c
Add Steven as reviewer (#746)
lewtun Sep 23, 2024
0313528
update toctree
eduard-balamatiuc Sep 24, 2024
6c1cc5f
translate chapter0-1
eduard-balamatiuc Sep 24, 2024
f4f11b9
update course0-1
eduard-balamatiuc Sep 26, 2024
5a6d86a
remove files and folders that are not updated
eduard-balamatiuc Sep 26, 2024
b330c3a
add the rum folder in the build documentation
eduard-balamatiuc Sep 26, 2024
0dbfc37
Merge pull request #1 from SigmoidAI/translation-chapter-0
eduard-balamatiuc Sep 26, 2024
e5248d2
Introduction to Argilla
nataliaElv Oct 22, 2024
c5cfaaa
Set up Argilla
nataliaElv Oct 22, 2024
dc6251f
Merge pull request #753 from SigmoidAI/main
stevhliu Oct 29, 2024
ef81c0b
Remove mention of chapters 10-12
erinys Nov 5, 2024
6998b7d
Merge pull request #754 from huggingface/ann-intro-cleanup
erinys Nov 6, 2024
14e8f4e
finish review ZH-CN chapter1-6
yaoqih Nov 19, 2024
9623b46
code_format for chaper 1-6
yaoqih Nov 19, 2024
ffbbcc8
Fixed wrong full width colon
yaoqih Nov 19, 2024
b91e806
Initial draft
nataliaElv Nov 20, 2024
5ce7a3d
Merge pull request #743 from yaoqih/main
xianbaoqian Nov 20, 2024
d32647b
Merge branch 'huggingface:main' into argilla-chapter
nataliaElv Nov 20, 2024
d279b9f
Fix
nataliaElv Nov 20, 2024
4d47d89
Corrections section 2
nataliaElv Nov 20, 2024
5b03fdd
Section 3 improvements
nataliaElv Nov 20, 2024
3f54096
More improvements
nataliaElv Nov 20, 2024
b2bf23b
Images & apply review comments
nataliaElv Nov 21, 2024
12d67bb
Apply suggestions from code review
nataliaElv Nov 21, 2024
20f7313
Fix style
nataliaElv Nov 22, 2024
944b877
Updated images and banners
nataliaElv Nov 22, 2024
9bb7a59
More screenshots
nataliaElv Nov 22, 2024
ea07f11
Fix quiz inline code
nataliaElv Nov 22, 2024
cf1d194
More improvements from reviews
nataliaElv Nov 25, 2024
c2a74e4
Added chapter 0 and initiated _toctree for Nepali Language
CRLannister Dec 1, 2024
d87ee46
Added Nepali language code in the workflow github actions
CRLannister Dec 1, 2024
d06b4cc
Ran make styles without any errors!
CRLannister Dec 1, 2024
384735e
Update chapters/ne/chapter0/1.mdx
CRLannister Dec 2, 2024
55353ab
Update chapters/ne/chapter0/1.mdx
CRLannister Dec 2, 2024
474d65e
Made same codeblocks for activate and deactivate
CRLannister Dec 2, 2024
476dc72
Merge pull request #761 from CRLannister/main
stevhliu Dec 4, 2024
4a49e48
New chapter: Argilla (#756)
pcuenca Jan 8, 2025
d7de7f9
Merge branch 'release' into bump_release
lewtun Jan 16, 2025
1097cdd
Merge branch 'release' into bump_release
lewtun Jan 16, 2025
2 changes: 1 addition & 1 deletion .github/workflows/build_documentation.yml
@@ -14,6 +14,6 @@ jobs:
package: course
path_to_docs: course/chapters/
additional_args: --not_python_module
-languages: ar bn de en es fa fr gj he hi id it ja ko pt ru th tr vi zh-CN zh-TW
+languages: ar bn de en es fa fr gj he hi id it ja ko ne pt ru rum th tr vi zh-CN zh-TW
secrets:
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
2 changes: 1 addition & 1 deletion .github/workflows/build_pr_documentation.yml
@@ -16,4 +16,4 @@ jobs:
package: course
path_to_docs: course/chapters/
additional_args: --not_python_module
-languages: ar bn de en es fa fr gj he hi id it ja ko pt ru th tr vi zh-CN zh-TW
+languages: ar bn de en es fa fr gj he hi id it ja ko ne pt ru rum th tr vi zh-CN zh-TW
2 changes: 1 addition & 1 deletion README.md
@@ -110,7 +110,7 @@ pip install -r requirements.txt
make style
```

-Once that's run, commit any changes, open a pull request, and tag [@lewtun](https://github.com/lewtun) for a review. Congratulations, you've now completed your first translation 🥳!
+Once that's run, commit any changes, open a pull request, and tag [@lewtun](https://github.com/lewtun) and [@stevhliu](https://github.com/stevhliu) for a review. If you also know other native-language speakers who are able to review the translation, tag them as well for help. Congratulations, you've now completed your first translation 🥳!

> 🚨 To build the course on the website, double-check your language code exists in `languages` field of the `build_documentation.yml` and `build_pr_documentation.yml` files in the `.github` folder. If not, just add them in their alphabetical order.

2 changes: 1 addition & 1 deletion chapters/ar/chapter0/1.mdx
@@ -105,7 +105,7 @@ ls -a
source .env/bin/activate

# Deactivate the virtual environment
-source .env/bin/deactivate
+deactivate
```

<div dir="rtl" style="direction:rtl;text-align:right;">
2 changes: 1 addition & 1 deletion chapters/bn/chapter0/1.mdx
@@ -87,7 +87,7 @@ ls -a
source .env/bin/activate

# virtual environment টি deactivate করার কমান্ড
-source .env/bin/deactivate
+deactivate
```

`which python` কমান্ড চালিয়ে নিশ্চিত করতে পারেন যে virtual environment টি activate হয়েছে কিনা।
2 changes: 1 addition & 1 deletion chapters/de/chapter0/1.mdx
@@ -86,7 +86,7 @@ Mit den Skripten "activate" und "deactivate" kannst du in deine virtuelle Umgebu
source .env/bin/activate

# Deaktivieren der virtuellen Umgebung
-source .env/bin/deactivate
+deactivate
```

Du kannst dich vergewissern, dass die Umgebung aktiviert ist, indem du den Befehl `which python` ausführst: Wenn er auf die virtuelle Umgebung verweist, dann hast du sie erfolgreich aktiviert!
3 changes: 3 additions & 0 deletions chapters/de/chapter3/2.mdx
@@ -87,6 +87,9 @@ In diesem Abschnitt verwenden wir den MRPC-Datensatz (Microsoft Research Paraphr
Das Hub enthält nicht nur Modelle; Es hat auch mehrere Datensätze in vielen verschiedenen Sprachen. Du kannst die Datensätze [hier](https://huggingface.co/datasets) durchsuchen, und wir empfehlen, einen weiteren Datensatz zu laden und zu verarbeiten, sobald Sie diesen Abschnitt abgeschlossen haben (die Dokumentation befindet sich [hier](https://huggingface.co/docs/datasets/loading)). Aber jetzt konzentrieren wir uns auf den MRPC-Datensatz! Dies ist einer der 10 Datensätze, aus denen sich das [GLUE-Benchmark](https://gluebenchmark.com/) zusammensetzt. Dies ist ein akademisches Benchmark, das verwendet wird, um die Performance von ML-Modellen in 10 verschiedenen Textklassifizierungsaufgaben zu messen.

Die Bibliothek 🤗 Datasets bietet einen leichten Befehl zum Herunterladen und Caching eines Datensatzes aus dem Hub. Wir können den MRPC-Datensatz wie folgt herunterladen:
<Tip>
⚠️ **Warnung** Stelle sicher, dass `datasets` installiert ist, indem du `pip install datasets` ausführst. Dann lade den MRPC-Datensatz und drucke ihn aus, um zu sehen, was er enthält.
</Tip>

```py
from datasets import load_dataset
20 changes: 20 additions & 0 deletions chapters/en/_toctree.yml
@@ -191,6 +191,26 @@
title: End-of-chapter quiz
quiz: 9

- title: 10. Curate high-quality datasets
new: true
subtitle: How to use Argilla to create amazing datasets
sections:
- local: chapter10/1
title: Introduction to Argilla
- local: chapter10/2
title: Set up your Argilla instance
- local: chapter10/3
title: Load your dataset to Argilla
- local: chapter10/4
title: Annotate your dataset
- local: chapter10/5
title: Use your annotated dataset
- local: chapter10/6
title: Argilla, check!
- local: chapter10/7
title: End-of-chapter quiz
quiz: 10

- title: Course Events
sections:
- local: events/1
2 changes: 1 addition & 1 deletion chapters/en/chapter0/1.mdx
@@ -86,7 +86,7 @@ You can jump in and out of your virtual environment with the `activate` and `dea
source .env/bin/activate

# Deactivate the virtual environment
-source .env/bin/deactivate
+deactivate
```

You can make sure that the environment is activated by running the `which python` command: if it points to the virtual environment, then you have successfully activated it!
2 changes: 1 addition & 1 deletion chapters/en/chapter1/1.mdx
@@ -23,7 +23,7 @@ Here is a brief overview of the course:

- Chapters 1 to 4 provide an introduction to the main concepts of the 🤗 Transformers library. By the end of this part of the course, you will be familiar with how Transformer models work and will know how to use a model from the [Hugging Face Hub](https://huggingface.co/models), fine-tune it on a dataset, and share your results on the Hub!
- Chapters 5 to 8 teach the basics of 🤗 Datasets and 🤗 Tokenizers before diving into classic NLP tasks. By the end of this part, you will be able to tackle the most common NLP problems by yourself.
-- Chapters 9 to 12 go beyond NLP, and explore how Transformer models can be used to tackle tasks in speech processing and computer vision. Along the way, you'll learn how to build and share demos of your models, and optimize them for production environments. By the end of this part, you will be ready to apply 🤗 Transformers to (almost) any machine learning problem!
+- Chapter 9 goes beyond NLP to cover how to build and share demos of your models on the 🤗 Hub. By the end of this part, you will be ready to showcase your 🤗 Transformers application to the world!

This course:

26 changes: 26 additions & 0 deletions chapters/en/chapter10/1.mdx
@@ -0,0 +1,26 @@
# Introduction to Argilla[[introduction-to-argilla]]

<CourseFloatingBanner
chapter={10}
classNames="absolute z-10 right-0 top-0"
/>

In Chapter 5 you learnt how to build a dataset using the 🤗 Datasets library and in Chapter 7 you explored how to fine-tune models for some common NLP tasks. In this chapter, you will learn how to use [Argilla](https://argilla.io) to **annotate and curate datasets** that you can use to train and evaluate your models.

The key to training models that perform well is to have high-quality data. Although there are some good datasets in the Hub that you could use to train and evaluate your models, these may not be relevant for your specific application or use case. In this scenario, you may want to build and curate a dataset of your own. Argilla will help you to do this efficiently.

<img src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter10/signin-hf-page.png" alt="Argilla sign in page."/>

With Argilla you can:

- turn unstructured data into **structured data** to be used in NLP tasks.
- curate a dataset to go from a low-quality dataset to a **high-quality dataset**.
- gather **human feedback** for LLMs and multi-modal models.
- invite experts to collaborate with you in Argilla, or crowdsource annotations!

Here are some of the things that you will learn in this chapter:

- How to set up your own Argilla instance.
- How to load a dataset and configure it based on some popular NLP tasks.
- How to use the Argilla UI to annotate your dataset.
- How to use your curated dataset and export it to the Hub.
55 changes: 55 additions & 0 deletions chapters/en/chapter10/2.mdx
@@ -0,0 +1,55 @@
# Set up your Argilla instance[[set-up-your-argilla-instance]]

<CourseFloatingBanner chapter={10}
classNames="absolute z-10 right-0 top-0"
notebooks={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/en/chapter10/section2.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/en/chapter10/section2.ipynb"},
]} />

To start using Argilla, you will need to set up your own Argilla instance first. Then you will need to install the Python SDK so that you can manage Argilla using Python code.

## Deploy the Argilla UI

The easiest way to set up your Argilla instance is through Hugging Face Spaces. To create your Argilla Space, simply follow [this form](https://huggingface.co/new-space?template=argilla%2Fargilla-template-space). If you need further guidance, check the [Argilla quickstart](https://docs.argilla.io/latest/getting_started/quickstart/).
<img src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter10/space_config.png" alt="Space configuration form."/>

>[!WARNING]
> ⚠️ You may want to enable **Persistent storage** so the data isn't lost if the Space is paused or restarted.
> You can do that from the Settings of your Space.

Once Argilla is up and running, you can log in with your credentials.

## Install and connect the Python SDK

Now you can go to your Python environment or notebook and install the argilla library:

`!pip install argilla`

Let's connect to our Argilla instance. To do that, you will need the following information:

- **Your API URL**: This is the URL where Argilla is running. If you are using a Space, you can open the Space, click on the three dots in the top right corner, then "Embed this Space" and copy the **Direct URL**. It should look something like `https://<your-username>-<space-name>.hf.space`.
- **Your API key**: To get your key, log in to your Argilla instance and go to "My Settings", then copy the API key.
- **Your HF token**: If your Space is private, you will need an Access Token with write permissions from your Hugging Face Hub account.

```python
import argilla as rg

HF_TOKEN = "..." # only for private spaces

client = rg.Argilla(
api_url="...",
api_key="...",
headers={"Authorization": f"Bearer {HF_TOKEN}"}, # only for private spaces
)
```

To check that everything is working properly, we'll call `me`. This should return our user:

```python
client.me
```

If this worked, your Argilla instance is up and running and you're connected to it! Congrats!

We can now get started with loading our first dataset to Argilla.
108 changes: 108 additions & 0 deletions chapters/en/chapter10/3.mdx
@@ -0,0 +1,108 @@
# Load your dataset to Argilla[[load-your-dataset-to-argilla]]

<CourseFloatingBanner chapter={10}
classNames="absolute z-10 right-0 top-0"
notebooks={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/en/chapter10/section3.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/en/chapter10/section3.ipynb"},
]} />

Depending on the NLP task that you're working with and the specific use case or application, your data and the annotation task will look different. For this section of the course, we'll use [a dataset collecting news](https://huggingface.co/datasets/SetFit/ag_news) to complete two tasks: a text classification task on the topic of each text and a token classification task to identify the named entities mentioned.

<iframe
src="https://huggingface.co/datasets/SetFit/ag_news/embed/viewer/default/train"
frameborder="0"
width="100%"
height="560px"
></iframe>

It is possible to import datasets from the Hub using the Argilla UI directly, but we'll be using the SDK to learn how we can make further edits to the data if needed.

## Configure your dataset

The first step is to connect to our Argilla instance as we did in the previous section:

```python
import argilla as rg

HF_TOKEN = "..." # only for private spaces

client = rg.Argilla(
api_url="...",
api_key="...",
headers={"Authorization": f"Bearer {HF_TOKEN}"}, # only for private spaces
)
```

We can now think about the settings of our dataset in Argilla. These represent the annotation tasks we'll perform on our data. First, we can load the dataset from the Hub and inspect its features, so that we can make sure that we configure the dataset correctly.

```python
from datasets import load_dataset

data = load_dataset("SetFit/ag_news", split="train")
data.features
```

These are the features of our dataset:

```python out
{'text': Value(dtype='string', id=None),
'label': Value(dtype='int64', id=None),
'label_text': Value(dtype='string', id=None)}
```

It contains a `text` field and also some initial labels for the text classification. We'll add those to our dataset settings together with a `spans` question for the named entities:

```python
settings = rg.Settings(
fields=[rg.TextField(name="text")],
questions=[
rg.LabelQuestion(
name="label", title="Classify the text:", labels=data.unique("label_text")
),
rg.SpanQuestion(
name="entities",
title="Highlight all the entities in the text:",
labels=["PERSON", "ORG", "LOC", "EVENT"],
field="text",
),
],
)
```

Let's dive a bit deeper into what these settings mean. First, we've defined **fields**; these contain the information that we'll be annotating. In this case, we only have one field and it comes in the form of text, so we've chosen a `TextField`.

Then, we define **questions** that represent the tasks that we want to perform on our data:

- For the text classification task we've chosen a `LabelQuestion` and we used the unique values of the `label_text` column as our labels, to make sure that the question is compatible with the labels that already exist in the dataset.
- For the token classification task, we'll need a `SpanQuestion`. We've defined a set of labels that we'll be using for that task, plus the field on which we'll be drawing the spans.

To learn more about all the available types of fields and questions and other advanced settings, like metadata and vectors, go to the [Argilla docs](https://docs.argilla.io/latest/how_to_guides/dataset/#define-dataset-settings).
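
For illustration, here is a minimal sketch of how the settings above could be extended with metadata and vectors. It reuses the `rg` and `data` objects from this section; the `source` metadata property and the 384-dimension `text_embedding` vector are illustrative assumptions, not part of the ag_news setup.

```python
# A sketch only: the metadata property and vector field below are illustrative
# assumptions, not something required by the ag_news example.
settings_with_extras = rg.Settings(
    fields=[rg.TextField(name="text")],
    questions=[
        rg.LabelQuestion(
            name="label", title="Classify the text:", labels=data.unique("label_text")
        ),
    ],
    metadata=[
        # lets annotators filter and sort records by extra information
        rg.TermsMetadataProperty(name="source"),
    ],
    vectors=[
        # enables similarity search in the UI once vectors are added to records
        rg.VectorField(name="text_embedding", dimensions=384),
    ],
)
```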

## Upload the dataset

Now that we've defined some settings, we can create the dataset:

```python
dataset = rg.Dataset(name="ag_news", settings=settings)

dataset.create()
```

The dataset now appears in our Argilla instance, but you will see that it's empty:

<img src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter10/empty_dataset.png" alt="Screenshot of the empty dataset."/>

Now we need to add the records that we'll be annotating, i.e., the rows in our dataset. To do that, we simply need to log the data as records and provide a mapping for the elements that don't have the same name in the Hub and Argilla datasets:

```python
dataset.records.log(data, mapping={"label_text": "label"})
```

In our mapping, we've specified that the `label_text` column in the dataset should be mapped to the question with the name `label`. In this way, we'll use the existing labels in the dataset as pre-annotations so we can annotate faster.
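
As a side note, suggestions like these can also be attached when you build records yourself with the SDK. Here is a minimal sketch assuming the `rg.Record` and `rg.Suggestion` classes; the text and label value are made up for illustration, and the label must match one of the labels defined in the question.

```python
# A sketch only: log a single hand-built record with a pre-annotation
# ("suggestion") for the question named "label". Text and label are made up.
record = rg.Record(
    fields={"text": "Global markets rallied after the latest central bank announcement."},
    suggestions=[
        rg.Suggestion(question_name="label", value="Business"),
    ],
)

dataset.records.log([record])
```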

While the records are still being logged, you can already start working with your dataset in the Argilla UI. At this point, it should look like this:

<img src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter10/argilla_initial_dataset.png" alt="Screenshot of the dataset in Argilla."/>

Now our dataset is ready to start annotating!
44 changes: 44 additions & 0 deletions chapters/en/chapter10/4.mdx
@@ -0,0 +1,44 @@
# Annotate your dataset[[annotate-your-dataset]]

<CourseFloatingBanner
chapter={10}
classNames="absolute z-10 right-0 top-0"
/>

Now it is time to start working from the Argilla UI to annotate our dataset.

## Align your team with annotation guidelines

Before you start annotating your dataset, it is always good practice to write some guidelines, especially if you're working as part of a team. This will help you align on the task and the use of the different labels, and resolve questions or conflicts when they come up.

In Argilla, you can go to your dataset settings page in the UI and modify the guidelines and the descriptions of your questions to help with alignment.

<img src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter10/argilla_dataset_settings.png" alt="Screenshot of the Dataset Settings page in Argilla."/>

If you want to dive deeper into the topic of how to write good guidelines, we recommend reading [this blogpost](https://argilla.io/blog/annotation-guidelines-practices) and the bibliographical references mentioned there.

## Distribute the task

In the dataset settings page, you can also change the dataset distribution settings. This will help you annotate more efficiently when you're working as part of a team. The default value for the minimum submitted responses is 1, meaning that as soon as a record has 1 submitted response it will be considered complete and count towards the progress in your dataset.

Sometimes, you want to have more than one submitted response per record, for example, if you want to analyze the inter-annotator agreement in your task. In that case, make sure to change this setting to a higher number, but always smaller than or equal to the total number of annotators. If you're working on the task alone, you want this setting to be 1.
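
If you prefer to configure this programmatically when creating a dataset, here is a minimal sketch, assuming the installed Argilla SDK exposes `rg.TaskDistribution`; the field, question, and placeholder labels are illustrative only.

```python
import argilla as rg

# A sketch only: require two submitted responses per record, for example to
# measure inter-annotator agreement. Assumes rg.TaskDistribution is available.
settings = rg.Settings(
    fields=[rg.TextField(name="text")],
    questions=[
        rg.LabelQuestion(
            name="label",
            title="Classify the text:",
            labels=["placeholder-label-a", "placeholder-label-b"],
        ),
    ],
    distribution=rg.TaskDistribution(min_submitted=2),
)
```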

## Annotate records

>[!TIP]
>💡 If you are deploying Argilla in a Hugging Face Space, any team members will be able to log in using the Hugging Face OAuth. Otherwise, you may need to create users for them following [this guide](https://docs.argilla.io/latest/how_to_guides/user/).

When you open your dataset, you will see that the first question is already filled in with some suggested labels. That's because in the previous section we mapped our question called `label` to the `label_text` column in the dataset, so we simply need to review and correct the existing labels:

<img src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter10/argilla_initial_dataset.png" alt="Screenshot of the dataset in Argilla."/>

For the token classification, we'll need to add all labels manually, as we didn't include any suggestions. This is how it might look after the span annotations:

<img src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter10/argilla_dataset_with_spans.png" alt="Screenshot of the dataset in Argilla with spans annotated."/>

As you move through the different records, there are different actions you can take:
- submit your responses, once you're done with the record.
- save them as a draft, in case you want to come back to them later.
- discard them, if the record shouldn't be part of the dataset or you won't provide responses for it.

In the next section, you will learn how you can export and use those annotations.