A TermSuite launcher for ISTEX documents.
- Download the latest termsuite-istex jar,
- Run the ISTEX launcher:
$ java -cp termsuite-istex-1.1.2.jar \
fr.univnantes.termsuite.istex.cli.IstexLauncher \
-t /path/to/tagger \
-l en \
--tsv istex-termino.tsv \
--doc-id F697EDBD85006E482CD1AC91DE9D40F6C629727A,15101397F055B3A872D495F7405D0A3F3E195E0F
Exactly one of --doc-id or --id-file must be passed.
--id-file FILE
: A file containing the list of ISTEX document ids of the corpus
--doc-id STRING
: The ","-separated list of ids of ISTEX documents
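The two input options carry the same information in different forms, so a list of ids kept in a file can be turned into a value for --doc-id. The sketch below assumes a file with one id per line; that layout is an assumption, since the expected format of the --id-file file is not described here. The temporary file and the variable names are illustrative only.

```shell
# Illustrative temporary file holding the ids (one per line is an assumption).
ids_file=$(mktemp)
printf '%s\n' \
  F697EDBD85006E482CD1AC91DE9D40F6C629727A \
  15101397F055B3A872D495F7405D0A3F3E195E0F > "$ids_file"

# Join the lines with commas to build the ","-separated list used by --doc-id.
doc_ids=$(paste -sd, "$ids_file")
echo "$doc_ids"

# Remove the temporary file.
rm -f "$ids_file"
```

The resulting value can then be passed as `--doc-id "$doc_ids"` in the launcher command shown above.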
At least one of --tsv, --json, --tbx must be passed.
--json FILE
: Outputs terminology to JSON file
--tbx FILE
: Outputs terminology to TBX file
--tsv FILE
: Outputs terminology to TSV file
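The two constraints on options (exactly one input option, at least one output option) can be checked before starting the JVM. The function below is a minimal sketch of such a pre-flight check; `check_istex_args` is a hypothetical helper that is not part of termsuite-istex, and it only counts flag names, so it would miscount if an option value happened to equal one of the flag names.

```shell
# Hypothetical helper, not shipped with termsuite-istex.
# Returns 0 when the arguments satisfy both constraints, 1 otherwise.
check_istex_args() {
  inputs=0
  outputs=0
  for arg in "$@"; do
    case "$arg" in
      --doc-id|--id-file) inputs=$((inputs + 1)) ;;
      --tsv|--json|--tbx) outputs=$((outputs + 1)) ;;
    esac
  done
  if [ "$inputs" -ne 1 ]; then
    echo "error: pass exactly one of --doc-id or --id-file" >&2
    return 1
  fi
  if [ "$outputs" -lt 1 ]; then
    echo "error: pass at least one of --tsv, --json, --tbx" >&2
    return 1
  fi
  return 0
}
```

For example, `check_istex_args -l en --tsv istex-termino.tsv --doc-id ID1,ID2` succeeds, while the same call without `--tsv` reports an error and returns a non-zero status.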
Many additional configuration options are available (TSV output configuration, filtering, extraction pipeline configuration, etc.). All options available for the TermSuite script TerminoExtractorCLI are also available with IstexLauncher. See the official TerminoExtractorCLI documentation for details.
The main advantage of using the Docker container for termsuite-istex is that you no longer need to install and configure an external tagger.
See termsuite-istex-docker for more information.
to come