ro_legal_fl

Pre-release
@senisioi released this on 16 Sep, 21:48 · 16 commits to main since this release

A spaCy Package for Legal Document Processing & Other Resources

A spaCy language model for Romanian with floret embeddings, trained on legal documents and equipped with a legal NER component.
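
As a minimal sketch of how such a pipeline is typically used, the snippet below loads the model and lists the entities it finds. It assumes the package has already been installed under the name `ro_legal_fl` and that a compatible spaCy version is present; the helper names and the example sentence are hypothetical and not part of the release.

```python
def load_model(name="ro_legal_fl"):
    """Load the installed pipeline (assumes the package is installed under this name)."""
    # spaCy is imported lazily so extract_entities stays usable on its own.
    import spacy
    return spacy.load(name)


def extract_entities(nlp, text):
    """Run the pipeline on text and return (entity text, entity label) pairs."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def main():
    nlp = load_model()
    # Hypothetical example sentence, chosen only for illustration.
    text = "Legea nr. 31/1990 a fost publicată în Monitorul Oficial."
    for ent_text, label in extract_entities(nlp, text):
        print(label, ent_text)
```

Calling `main()` prints one label and entity span per line; which entities are found depends on the installed model.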

| Feature | Description |
| --- | --- |
| **Name** | `ro_legal_fl` |
| **Version** | `3.6.1` |
| **spaCy** | `>=3.6.1,<3.7.0` |
| **Default Pipeline** | `tok2vec`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
| **Components** | `tok2vec`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
| **Vectors** | -1 keys, 100000 unique vectors (280 dimensions) |
| **Sources** | MARCELL legislative corpus, LegalNeRo, RoNEC |
| **License** | CC4R https://constantvzw.org/wefts/cc4r.en.html |
| **Author** | Sergiu Nisioi |
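
The spaCy requirement above (`>=3.6.1,<3.7.0`) can be checked before loading the model. The following is a rough sketch using only the standard library; the function names are hypothetical, and it deliberately looks only at the leading `major.minor.patch` part of the version string, ignoring any pre-release suffix.

```python
import re

MIN_VERSION = (3, 6, 1)  # inclusive lower bound from the table
MAX_VERSION = (3, 7, 0)  # exclusive upper bound from the table


def parse_version(version):
    """Return the leading major.minor.patch of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_compatible(version):
    """True if the version satisfies >=3.6.1,<3.7.0."""
    return MIN_VERSION <= parse_version(version) < MAX_VERSION
```

With spaCy installed, `is_compatible(spacy.__version__)` indicates whether the running version falls inside the supported range.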

Accuracy

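| Type | Score |
| --- | --- |
| `TOK` | 99.84 |
| `TAG` | 96.27 |
| `POS` | 97.12 |
| `MORPH` | 96.42 |
| `LEMMA` | 95.73 |
| `UAS` | 89.15 |
| `LAS` | 82.46 |
| `SENT_P` | 94.94 |
| `SENT_R` | 95.20 |
| `SENT_F` | 95.07 |
| `ENTS_F` | 78.35 |
| `ENTS_P` | 79.51 |
| `ENTS_R` | 77.23 |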
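
The `_F` rows are F1 scores, that is, the harmonic mean of the matching `_P` (precision) and `_R` (recall) rows. A small sketch of that relationship, using the overall scores above (the function name is hypothetical):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall scores taken from the accuracy table.
ents_f = round(f1(79.51, 77.23), 2)  # ENTS_P, ENTS_R
sent_f = round(f1(94.94, 95.20), 2)  # SENT_P, SENT_R
```

Here `ents_f` and `sent_f` reproduce the reported `ENTS_F` (78.35) and `SENT_F` (95.07). Because the table values are already rounded, recomputing F from rounded P and R may differ from a reported score in the last digit.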

NER per type

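| Type | P | R | F |
| --- | --- | --- | --- |
| `MONEY` | 88.52 | 72.32 | 79.61 |
| `DATETIME` | 85.31 | 84.58 | 84.94 |
| `PERSON` | 76.71 | 72.40 | 74.49 |
| `QUANTITY` | 89.27 | 84.55 | 86.85 |
| `NUMERIC` | 86.53 | 81.72 | 84.06 |
| `LEGAL` | 71.24 | 83.85 | 77.03 |
| `ORG` | 69.24 | 71.96 | 70.58 |
| `ORDINAL` | 89.14 | 89.14 | 89.14 |
| `PERIOD` | 84.39 | 74.11 | 78.92 |
| `NAT_REL_POL` | 85.09 | 77.46 | 81.10 |
| `GPE` | 81.95 | 82.75 | 82.35 |
| `WORK_OF_ART` | 39.15 | 28.14 | 32.74 |
| `LOC` | 55.28 | 52.35 | 53.78 |
| `EVENT` | 54.89 | 43.20 | 48.34 |
| `LANGUAGE` | 80.28 | 78.08 | 79.17 |
| `FACILITY` | 60.14 | 47.98 | 53.38 |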
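
Per-type scores vary widely, so downstream code may want to treat labels differently. The sketch below is one possible way to do that: it copies the F column into a dictionary and filters entity pairs by a minimum F score. The function names and any threshold passed to them are illustrative assumptions, not a recommendation from the release.

```python
# F scores per entity type, copied from the NER per type table.
NER_F_SCORES = {
    "MONEY": 79.61, "DATETIME": 84.94, "PERSON": 74.49, "QUANTITY": 86.85,
    "NUMERIC": 84.06, "LEGAL": 77.03, "ORG": 70.58, "ORDINAL": 89.14,
    "PERIOD": 78.92, "NAT_REL_POL": 81.10, "GPE": 82.35, "WORK_OF_ART": 32.74,
    "LOC": 53.78, "EVENT": 48.34, "LANGUAGE": 79.17, "FACILITY": 53.38,
}


def labels_at_least(threshold):
    """Return the set of labels whose reported F score is >= threshold."""
    return {label for label, f in NER_F_SCORES.items() if f >= threshold}


def weakest_labels(n):
    """Return the n labels with the lowest reported F score, weakest first."""
    return sorted(NER_F_SCORES, key=NER_F_SCORES.get)[:n]


def keep_reliable(entities, threshold):
    """Keep only (text, label) pairs whose label meets the F threshold."""
    keep = labels_at_least(threshold)
    return [(text, label) for text, label in entities if label in keep]
```

For example, `keep_reliable(pairs, 80.0)` would drop `WORK_OF_ART`, `EVENT`, `FACILITY` and `LOC` spans, which are the weakest types in the table, along with every other label scoring below 80.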