SwissBERT is a masked language model for processing Switzerland-related text. It has been trained on more than 21 million Swiss news articles retrieved from Swissdox@LiRI.
The model is based on X-MOD, which was pre-trained with language adapters in 81 languages. SwissBERT contains adapters for the national languages of Switzerland: German, French, Italian, and Romansh Grischun. In addition, it uses a Switzerland-specific subword vocabulary.
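Each adapter is addressed by a language code. A quick way to check which adapters a checkpoint ships with is to inspect its configuration; here is a minimal sketch using the Hugging Face `transformers` X-MOD config (the codes in the comment follow the model card and should be verified there):

```python
from transformers import AutoConfig

# Fetch the SwissBERT configuration from the Hugging Face hub and list
# the language adapters it was saved with.
config = AutoConfig.from_pretrained("ZurichNLP/swissbert")
print(config.languages)  # per the model card: ['de_CH', 'fr_CH', 'it_CH', 'rm_CH']
```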
The easiest way to use SwissBERT is via the `transformers` library and the Hugging Face model hub: https://huggingface.co/ZurichNLP/swissbert
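For illustration, a minimal masked-language-modeling sketch is shown below. It assumes the X-MOD API in `transformers` (`set_default_language`) and the `de_CH` adapter code from the model card; the example sentence and decoding logic are ours, not taken from the official documentation:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("ZurichNLP/swissbert")
model = AutoModelForMaskedLM.from_pretrained("ZurichNLP/swissbert")
model.eval()

# Route inputs through the German adapter; use "fr_CH", "it_CH" or
# "rm_CH" for the other national languages.
model.set_default_language("de_CH")

text = "Der Bundesrat hat seinen Sitz in <mask>."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the highest-scoring token at the masked position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```

Switching to another language at inference time is a single `set_default_language` call before the forward pass.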
More information on the model design and evaluation is provided in our paper "SwissBERT: The Multilingual Language Model for Switzerland" (SwissText 2023).
License:
- Code in this repository: MIT License
- Model: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Pretraining and evaluation code:
- Pretraining: see `pretraining`
- SwissNER: see `evaluation/swissner/notebook.ipynb`
- HIPE-2022: see `evaluation/hipe2022/notebook.ipynb`
- x-stance: see `evaluation/xstance/notebook.ipynb`
- Romansh alignment: see `evaluation/romansh_alignment/notebook.ipynb`
Citation:

@inproceedings{vamvas-etal-2023-swissbert,
    title = "{S}wiss{BERT}: The Multilingual Language Model for {S}witzerland",
    author = {Vamvas, Jannis and
      Gra{\"e}n, Johannes and
      Sennrich, Rico},
    editor = {Ghorbel, Hatem and
      Sokhn, Maria and
      Cieliebak, Mark and
      H{\"u}rlimann, Manuela and
      de Salis, Emmanuel and
      Guerne, Jonathan},
    booktitle = "Proceedings of the 8th edition of the Swiss Text Analytics Conference",
    month = jun,
    year = "2023",
    address = "Neuch{\^a}tel, Switzerland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.swisstext-1.6",
    pages = "54--69",
}