This is a code-mixed Indonesian-Javanese-English dataset for token-level language identification, named IJELID (Indonesian-Javanese-English Language Identification). The dataset contains tokenized tweets in which each token is paired with its language label. There are seven language labels in the dataset, namely: ID (Indonesian), JV (Javanese), EN (English),
Dataloader name:
ijelid/ijelid.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?ijelid
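
To make the token-level layout described above concrete, here is a minimal Python sketch of one tweet as a list of (token, label) pairs. It is only an illustration: the example tokens are invented and not taken from IJELID, the helper names `split_columns` and `label_counts` are hypothetical, and only the three labels named in the description (ID, JV, EN) are used, since the remaining labels are not listed here. The on-disk format of the dataset is not assumed.

```python
from collections import Counter
from typing import List, Tuple

# Labels named in the description; the dataset has more labels
# that are not listed there, so they are left out of this sketch.
KNOWN_LABELS = {"ID", "JV", "EN"}

# Invented example (NOT from IJELID): one tweet as (token, label) pairs.
example: List[Tuple[str, str]] = [
    ("ora", "JV"),
    ("bisa", "ID"),
    ("login", "EN"),
    ("sekarang", "ID"),
]


def split_columns(pairs: List[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split (token, label) pairs into parallel token and label lists."""
    tokens = [token for token, _ in pairs]
    labels = [label for _, label in pairs]
    return tokens, labels


def label_counts(pairs: List[Tuple[str, str]]) -> Counter:
    """Count how many tokens carry each language label."""
    return Counter(label for _, label in pairs)


tokens, labels = split_columns(example)
counts = label_counts(example)
```

Keeping tokens and labels as parallel lists is a common shape for sequence-labeling data, which is why the sketch exposes both views; a dataloader for this dataset may choose a different representation.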