

# jp_tokenizer

This repository contains a tiny web service that lets you tokenize and lemmatize Japanese text.

The service is implemented by wrapping the MeCab tokenizer (paper) in a Sanic app.
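For orientation, here is a minimal sketch of what such a wrapper might look like. This is not the repository's actual source; it assumes the `mecab-python3` bindings and an IPAdic-style dictionary:

```python
# Hypothetical sketch of a MeCab-in-Sanic wrapper (not the repo's actual code).
# Assumes the mecab-python3 package and an IPAdic-style dictionary.
import MeCab
from sanic import Sanic
from sanic.response import text

app = Sanic("jp_tokenizer")
wakati = MeCab.Tagger("-Owakati")  # wakati mode: space-separated surface forms
tagger = MeCab.Tagger()            # full mode: surface + feature fields per token

@app.route("/tokenize", methods=["POST"])
async def tokenize(request):
    # MeCab appends a trailing newline; strip it before responding.
    return text(wakati.parse(request.body.decode("utf-8")).strip())

@app.route("/lemmatize", methods=["POST"])
async def lemmatize(request):
    node = tagger.parseToNode(request.body.decode("utf-8"))
    lemmas = []
    while node:
        if node.surface:  # skip BOS/EOS nodes, which have empty surfaces
            features = node.feature.split(",")
            # With IPAdic, field 7 holds the base (dictionary) form, or "*" if unknown.
            base = features[6] if len(features) > 6 else "*"
            lemmas.append(base if base != "*" else node.surface)
        node = node.next
    return text(" ".join(lemmas))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=80)
```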

## Usage

Ensure that your server has at least 2-3 GB of available RAM (e.g. an Azure Standard DS1_v2) and then run:

```sh
# start a container for the service and its dependencies
docker run -p 8080:80 cwolff/jp_tokenizer

# call the API
curl -X POST 'http://localhost:8080/tokenize' --data 'サザエさんは走った'
curl -X POST 'http://localhost:8080/lemmatize' --data 'サザエさんは走った'
```

The API will respond with a space-delimited string of tokens/lemmas.
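For example, a hypothetical Python client (assuming the container above is running locally) could call the service and split the response back into a list:

```python
# Hypothetical client for the service started above.
import requests

body = "サザエさんは走った".encode("utf-8")
tokens = requests.post("http://localhost:8080/tokenize", data=body).text.split(" ")
lemmas = requests.post("http://localhost:8080/lemmatize", data=body).text.split(" ")
# With an IPAdic-style dictionary, the results look roughly like:
#   tokens: サザエ さん は 走っ た
#   lemmas: サザエ さん は 走る た
# (the exact segmentation depends on the dictionary the image ships with)
```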