NameTag 3.0.0

Released by @strakova on 05 Feb 12:14.

NameTag 3.0 is an open-source tool for both flat and nested named entity recognition (NER). NameTag 3 identifies proper names in text and classifies them into a set of predefined categories, such as names of persons, locations, and organizations.

NameTag 3.0 offers state-of-the-art or near state-of-the-art performance in English, German, Spanish, Dutch, Czech and Ukrainian.

NameTag 3.0 is free software licensed under the Mozilla Public License 2.0. The linguistic models are free for non-commercial use and distributed under the CC BY-NC-SA license, although for some models the original data used to create the model may impose additional licensing conditions. NameTag is versioned using Semantic Versioning.

Copyright 2024 Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Czech Republic.

Current Release

NameTag 3.0 can be used either as a command-line tool or through the NameTag web service.

The NameTag 3.0 source code is available on GitHub.

The NameTag website provides download links for both the released packages and the trained models, hosts the documentation, and links to the demo and the online web service.
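As a rough illustration of the web service route, the sketch below builds an HTTP request for a NER call using only the Python standard library. The endpoint URL, the parameter names (`data`, `model`, `output`), and the `result` field of the JSON response are assumptions, not taken from this release note; check them against the NameTag web service documentation before relying on them. The helper names `build_request` and `recognize` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: endpoint of the public NameTag web service; verify in the docs.
API_URL = "https://lindat.mff.cuni.cz/services/nametag/api/recognize"


def build_request(text: str, model: str = "", output: str = "vertical") -> urllib.request.Request:
    """Prepare (but do not send) a form-encoded POST request."""
    # ASSUMPTION: parameter names "data", "model", and "output".
    params = {"data": text, "output": output}
    if model:
        # An empty model name leaves the choice of model to the service.
        params["model"] = model
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, method="POST")


def recognize(text: str, model: str = "", output: str = "vertical") -> str:
    """Send the request and return the recognized entities as text."""
    request = build_request(text, model, output)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # ASSUMPTION: the JSON response carries the output under "result".
    return payload["result"]
```

Calling `recognize("Charles University is in Prague.")` requires network access; `build_request` alone can be used to inspect what would be sent.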

How to Cite

If you use this software, please give us credit by referencing Straková et al. (2019):

@inproceedings{strakova-etal-2019-neural,
    title = "Neural Architectures for Nested {NER} through Linearization",
    author = "Strakov{\'a}, Jana  and
      Straka, Milan  and
      Hajic, Jan",
    editor = "Korhonen, Anna  and
      Traum, David  and
      M{\`a}rquez, Llu{\'\i}s",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P19-1527",
    doi = "10.18653/v1/P19-1527",
    pages = "5326--5331",
}

Versions

In contrast to NameTag 2, NameTag 3 is a fine-tuned large language model (LLM) with either a classification head for flat NEs (e.g., the CoNLL-2003 English data) or a seq2seq decoding head for nested NEs (e.g., the CNEC 2.0 Czech data). The seq2seq decoding head is the one proposed by Straková et al. (2019).
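To give an intuition for how nested entities can be turned into a sequence that a seq2seq head decodes, here is a minimal sketch of label linearization in the spirit of Straková et al. (2019). It is a simplified illustration, not the NameTag 3 implementation: the span representation, the function names, the BILOU tag strings, and the `<eow>` end-of-word marker as used here are assumptions made for the example. Entities that start earlier come first, and among entities with the same start, longer ones come first.

```python
from typing import List, Tuple

# Hypothetical span format: (first_token, last_token_inclusive, label).
Span = Tuple[int, int, str]


def linearize(num_tokens: int, spans: List[Span]) -> List[List[str]]:
    """Return, for every token, its BILOU labels ordered from outer to inner."""
    # Priority: earlier start first; for equal starts, longer span first.
    ordered = sorted(spans, key=lambda s: (s[0], -(s[1] - s[0])))
    labels: List[List[str]] = [[] for _ in range(num_tokens)]
    for start, end, label in ordered:
        for i in range(start, end + 1):
            if start == end:
                tag = "U"  # single-token entity
            elif i == start:
                tag = "B"  # first token
            elif i == end:
                tag = "L"  # last token
            else:
                tag = "I"  # inside
            labels[i].append(f"{tag}-{label}")
    # Tokens covered by no entity get the outside label.
    return [tags if tags else ["O"] for tags in labels]


def to_decoder_targets(token_labels: List[List[str]]) -> List[str]:
    """Flatten per-token labels into one sequence, closing each token with <eow>."""
    targets: List[str] = []
    for tags in token_labels:
        targets.extend(tags)
        targets.append("<eow>")
    return targets


# "Charles University in Prague": an ORG that contains a nested PER, plus a LOC.
example_spans = [(0, 1, "ORG"), (0, 0, "PER"), (3, 3, "LOC")]
example_labels = linearize(4, example_spans)
example_targets = to_decoder_targets(example_labels)
```

For this example, the first token receives two labels (`B-ORG` followed by `U-PER`), so the decoder emits a variable number of labels per token and uses `<eow>` to move on to the next one.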