tei2slob

This is a tool to convert TEI P5 dictionaries to the slob format. Some free TEI P5 dictionaries are available at http://freedict.org/.

Installation

Create a Python 3 virtual environment and install slob.py as described at http://github.com/itkach/slob/.

In this virtual environment, run:

pip install git+https://github.com/itkach/tei2slob.git

Usage

Download a dictionary archive and unpack it. For example:

wget http://downloads.sourceforge.net/project/freedict/English%20-%20German/0.3.6/freedict-eng-deu-0.3.6.src.tar.bz2
tar -xvf freedict-eng-deu-0.3.6.src.tar.bz2

Then run the converter:

tei2slob eng-deu/eng-deu.tei

eng-deu-0.3.6.slob will be created in the same directory.

The converter attempts to populate dictionary tags from information in the .tei header section. This may fail, because the way some elements (such as the license name) are recorded is not standardized and varies across dictionaries, so be sure to check the tags:

slob info eng-deu-0.3.6.slob
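
When a tag looks wrong, it can help to see what the .tei header actually contains. The following is a minimal sketch using only the Python standard library. It does not reproduce how tei2slob reads the header; it only prints the title and the availability text, where license information is commonly recorded in TEI P5 headers. The `header_summary` function and the `SAMPLE_TEI` document are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# TEI P5 documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, heavily trimmed header used only for this illustration.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample English-German Dictionary</title></titleStmt>
      <publicationStmt>
        <availability>
          <p>Available under the GNU General Public License.</p>
        </availability>
      </publicationStmt>
    </fileDesc>
  </teiHeader>
</TEI>
"""


def _text_of(element):
    """Return the element's text with whitespace collapsed, or None."""
    if element is None:
        return None
    return " ".join("".join(element.itertext()).split())


def header_summary(tei_xml):
    """Extract the title and availability text from a TEI document string."""
    root = ET.fromstring(tei_xml)
    header = root.find("tei:teiHeader", TEI_NS)
    if header is None:
        return {"title": None, "availability": None}
    title = header.find(".//tei:titleStmt/tei:title", TEI_NS)
    availability = header.find(".//tei:publicationStmt/tei:availability", TEI_NS)
    return {"title": _text_of(title), "availability": _text_of(availability)}


summary = header_summary(SAMPLE_TEI)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For a real dictionary, read the file contents (for example from eng-deu/eng-deu.tei) and pass them to `header_summary`, then compare the result with the tags reported by slob info.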

Set tag values as necessary, for example:

slob tag -n license.name -v "GNU General Public License" eng-deu-0.3.6.slob
slob tag -n license.url -v "http://www.gnu.org/licenses/gpl.html" eng-deu-0.3.6.slob
slob tag -n created.by -v me@example.com eng-deu-0.3.6.slob
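
If several tags need correcting, the commands above can be generated from a mapping instead of typed one by one. This is a rough sketch: the `slob_tag_commands` helper is hypothetical and relies only on the `slob tag -n NAME -v VALUE FILE` form shown above. It builds the argument lists and prints them without running anything.

```python
import shlex


def slob_tag_commands(slob_file, tags):
    """Build one `slob tag` argument list per tag name/value pair."""
    return [
        ["slob", "tag", "-n", name, "-v", value, slob_file]
        for name, value in tags.items()
    ]


tags = {
    "license.name": "GNU General Public License",
    "license.url": "http://www.gnu.org/licenses/gpl.html",
    "created.by": "me@example.com",
}

commands = slob_tag_commands("eng-deu-0.3.6.slob", tags)
for command in commands:
    # shlex.join quotes values containing spaces so the line can be pasted into a shell.
    print(shlex.join(command))
```

Each argument list can also be passed to `subprocess.run(command, check=True)` inside the virtual environment where slob is installed.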

uri is an important tag. When different dictionaries have the same uri, it means they contain keys belonging to the same logical dictionary. So when compiling a new version of an existing dictionary, make sure the uri remains the same.

usage: tei2slob [-h] [-o OUTPUT_FILE] [-c {lzma2,zlib}] [-b BIN_SIZE]
                [-a CREATED_BY] [-w WORK_DIR]
                input_file

positional arguments:
  input_file            TEI file name

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Name of output slob file
  -c {lzma2,zlib}, --compression {lzma2,zlib}
                        Name of compression to use. Default: zlib
  -b BIN_SIZE, --bin-size BIN_SIZE
                        Minimum storage bin size in kilobytes. Default: 256
  -a CREATED_BY, --created-by CREATED_BY
                        Value for created.by tag. Identifier (e.g. name or
                        email) for slob file creator
  -w WORK_DIR, --work-dir WORK_DIR
                        Directory for temporary files created during
                        compilation. Default: .
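
To script conversions of several dictionaries, the options listed above can be assembled programmatically. The sketch below is one possible approach: the `tei2slob_command` helper and the output file name in the example are hypothetical, and only flags that appear in the usage text are used. It builds the argument list and leaves execution to the caller.

```python
import shlex

# Compression names accepted by the -c option, per the usage text.
COMPRESSIONS = ("lzma2", "zlib")


def tei2slob_command(input_file, output_file=None, compression=None,
                     bin_size=None, created_by=None, work_dir=None):
    """Build a tei2slob argument list; options left as None are omitted."""
    command = ["tei2slob"]
    if output_file is not None:
        command += ["-o", output_file]
    if compression is not None:
        if compression not in COMPRESSIONS:
            raise ValueError(f"compression must be one of {COMPRESSIONS}")
        command += ["-c", compression]
    if bin_size is not None:
        command += ["-b", str(bin_size)]  # minimum bin size in kilobytes
    if created_by is not None:
        command += ["-a", created_by]
    if work_dir is not None:
        command += ["-w", work_dir]
    command.append(input_file)
    return command


example = tei2slob_command(
    "eng-deu/eng-deu.tei",
    output_file="eng-deu.slob",  # hypothetical output name
    compression="lzma2",
    created_by="me@example.com",
)
print(shlex.join(example))
```

The resulting list can be run with `subprocess.run(example, check=True)` in the virtual environment where tei2slob is installed.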
