API ja

natsume-server API Manual

API Overview

natsume-server analyzes user input and, in general, responds with JSON.

Port

natsume-server: http://localhost:5011
cabocha-server: http://localhost:5010

API

Splitting text
- Text: /text/split
  - Input: text=1\n2\n3\n\n4
  - Output: nested JSON array [ [ "1", "2", "3" ], [ "4" ] ]
- Paragraph: /paragraph/split
  - Input: text=1\n2\n
  - Output: JSON array ["1", "2"]

Analyzing text
- Sentence: /sentence/analyze
  - Input:
    - text: string (text=なつめを食べる)
  - Output: JSON array of bunsetsu (phrase chunks) [bunsetsu1, bunsetsu2]
- Paragraph: /paragraph/analyze
  - Input:
    - text: string (text=なつめを食べる。\n\nお腹を壊す。)
  - Output: JSON array of the paragraph's bunsetsu, one inner array per sentence [ [ bunsetsu1, bunsetsu2 ], [ bunsetsu3, bunsetsu4 ] ]
- Text: /text/analyze
  - Input:
    - text: string (text=なつめを食べる。\n\nお腹を壊す。薬局にいく。)
    - filter: array (filter[]=orthBase&filter[]=registerScore)
    - positive: integer (positive=2)
    - negative: integer (negative=1)
  - Output: JSON array of the text's bunsetsu, nested by paragraph and then by sentence [ [ [ bunsetsu1, bunsetsu2 ] ], [ [ bunsetsu3, bunsetsu4 ], [ bunsetsu5, bunsetsu6 ] ] ]
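
The query parameters listed above can be assembled with the standard URLSearchParams class. The following is a minimal sketch, not part of natsume-server: the helper name buildAnalyzeUrl and the BASE_URL constant are assumptions made for illustration, while the endpoint, port, and parameter names are the ones listed above.

```javascript
// Hypothetical helper (not part of natsume-server): builds a GET URL for
// /text/analyze from the parameters documented above.
const BASE_URL = 'http://localhost:5011';

function buildAnalyzeUrl(text, options = {}) {
  const params = new URLSearchParams();
  params.append('text', text);
  // filter[] may be repeated, once per morpheme feature to keep.
  for (const feature of options.filter || []) {
    params.append('filter[]', feature);
  }
  if (options.positive !== undefined) {
    params.append('positive', String(options.positive));
  }
  if (options.negative !== undefined) {
    params.append('negative', String(options.negative));
  }
  return BASE_URL + '/text/analyze?' + params.toString();
}

const url = buildAnalyzeUrl('なつめを食べる。\n\nお腹を壊す。薬局にいく。', {
  filter: ['orthBase', 'registerScore'],
  positive: 2,
  negative: 1,
});
console.log(url);
```

URLSearchParams percent-encodes the text, including the newlines that separate sentences and paragraphs, so multi-paragraph input can be carried in the URL.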

API access points

Corpus information

curl localhost:5011/corpus/genres

[{"name":"検定教科書","id":1},{"name":"韻文","id":2}]
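
A client will often want to look up a genre name by its id. The following is a small sketch under that assumption; the helper name indexGenresById is hypothetical, and the sample data is copied from the curl output above.

```javascript
// Sample response copied from the curl output above.
const genres = JSON.parse('[{"name":"検定教科書","id":1},{"name":"韻文","id":2}]');

// Hypothetical helper: index the genre list by id for quick lookup.
function indexGenresById(list) {
  const byId = new Map();
  for (const genre of list) {
    byId.set(genre.id, genre.name);
  }
  return byId;
}

const genreNames = indexGenresById(genres);
console.log(genreNames.get(1));
```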

TODO

Usage examples from JavaScript

GET

Using jQuery

$.get('http://localhost:5011/text/split', {"text":"今日は、世界！\nHello world.\n\nŽivjo."}, function(d){console.log(d);});

Using d3

d3.json('http://localhost:5011/text/analyze?' + JSON.stringify({"text":"今日は、世界！\nHello world.\n\nŽivjo."}), function(d){console.log(d);});

POST

When the data is too large to send in a GET query, the POST body can be used instead. (TODO)
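
Since POST support is still marked TODO, the following is only a sketch of how such a request might be prepared. It assumes the server would accept a JSON body of the form {"text": ...} with a JSON content type, which this manual does not confirm; the helper name buildPostRequest is hypothetical. Nothing is sent by the code itself.

```javascript
// Hypothetical helper: prepares the arguments for fetch() without sending
// anything. The JSON body shape and the Content-Type header are assumptions.
function buildPostRequest(path, text) {
  return {
    url: 'http://localhost:5011' + path,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text: text }),
    },
  };
}

const request = buildPostRequest('/text/analyze', '今日は、世界！\nHello world.\n\nŽivjo.');
console.log(request.url);
console.log(request.options.body);

// To actually send it (requires a running server):
//   fetch(request.url, request.options).then((res) => res.json()).then(console.log);
```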

Usage examples from Perl

The following explanation assumes that the API is used from Perl.

Beginning of the Perl script

#!/usr/bin/env perl
use 5.014;
use warnings;

use JSON::PP;
use Encode qw(decode_utf8);
use Mojo::UserAgent;

my $json = JSON::PP->new->utf8;

my $ua = Mojo::UserAgent->new;

Reference: Mojo::UserAgent - Non-blocking I/O HTTP and WebSocket user agent

Data

my $long_body = decode_utf8 "パラグラフ１です。\n\n次のパラグラフです！文が続いてあります。\nまた文があります。";
my $body      = decode_utf8 'なつめを食べる';

FIXME: decode_utf8

Splitting text

Splitting a text into sentences

Using a JSON query:

say $json->pretty->encode(
    $ua->get('http://localhost:5011/text/split', encode_json({text => $long_body}))->res->json
    );

Using a URL query:

say $json->pretty->encode(
    $ua->get('http://localhost:5011/text/split?text=' . $long_body)->res->json
    );

Output:

[
   [
      "パラグラフ１です。"
   ],
   [
      "次のパラグラフです！",
      "文が続いてあります。",
      "また文があります。"
   ]
]
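
The nested result (paragraphs containing sentences) can be walked with ordinary array methods. As a minimal sketch, the hypothetical helper summarizeSplit below counts paragraphs and sentences; the sample data is copied from the output above.

```javascript
// Sample /text/split response copied from the output above.
const split = [
  ['パラグラフ１です。'],
  ['次のパラグラフです！', '文が続いてあります。', 'また文があります。'],
];

// Hypothetical helper: summarize the paragraph -> sentence nesting.
function summarizeSplit(paragraphs) {
  return {
    paragraphs: paragraphs.length,
    sentences: paragraphs.reduce((n, sentences) => n + sentences.length, 0),
    perParagraph: paragraphs.map((sentences) => sentences.length),
  };
}

const summary = summarizeSplit(split);
console.log(summary);
```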

Splitting a paragraph into sentences

Input:

say $json->pretty->encode(
    $ua->get('http://localhost:5011/paragraph/split', encode_json({text => "1\n2\n3"}))->res->json
    );

Output:

[
   "1",
   "2",
   "3"
]

Analyzing text

Input:

say $json->pretty->encode(
    $ua->get('http://localhost:5011/text/analyze', encode_json {text => $body})->res->json
    );

Output:

[
   [
      [
         {
            "link" : 1,
            "head" : 0,
            "head-pos" : "noun",
            "tokens" : [
               {
                  "ne" : "O",
                  "lForm" : "ナツメ",
                  "fForm" : "*",
                  "orthBase" : "なつめ",
                  "pos3" : "一般",
                  "aType" : "0",
                  "fType" : "*",
                  "pos1" : "名詞",
                  "aConType" : "C2",
                  "cForm" : "*",
                  "pos2" : "普通名詞",
                  "begin" : 0,
                  "pron" : "ナツメ",
                  "iType" : "*",
                  "goshu" : "和",
                  "fConType" : "*",
                  "pos4" : "*",
                  "IConType" : "*",
                  "formBase" : "ナツメ",
                  "iForm" : "*",
                  "lemma" : "棗",
                  "pronBase" : "ナツメ",
                  "end" : 3,
                  "kanaBase" : "ナツメ",
                  "kana" : "ナツメ",
                  "orth" : "なつめ",
                  "registerScore" : 12.3833984107791,
                  "aModType" : "*",
                  "cType" : "*"
               },
               {
                  "ne" : "O",
                  "lForm" : "ヲ",
                  "fForm" : "*",
                  "orthBase" : "を",
                  "pos3" : "*",
                  "aType" : "*",
                  "fType" : "*",
                  "pos1" : "助詞",
                  "aConType" : "動詞%F2@0,名詞%F1,形容詞%F2@-1",
                  "cForm" : "*",
                  "pos2" : "格助詞",
                  "begin" : 3,
                  "pron" : "オ",
                  "iType" : "*",
                  "goshu" : "和",
                  "fConType" : "*",
                  "pos4" : "*",
                  "IConType" : "*",
                  "formBase" : "ヲ",
                  "iForm" : "*",
                  "lemma" : "を",
                  "pronBase" : "オ",
                  "end" : 4,
                  "kanaBase" : "ヲ",
                  "kana" : "ヲ",
                  "orth" : "を",
                  "registerScore" : 0.183184737661352,
                  "aModType" : "*",
                  "cType" : "*"
               }
            ],
            "tail-tags" : [],
            "head-tags" : [],
            "prob" : 0,
            "tail-string" : "を",
            "tail" : 1,
            "id" : 0,
            "tail-pos" : "particle",
            "head-string" : "なつめ"
         },
         {
            "link" : -1,
            "head" : 0,
            "head-pos" : "verb",
            "tokens" : [
               {
                  "ne" : "O",
                  "lForm" : "タベル",
                  "fForm" : "*",
                  "orthBase" : "食べる",
                  "pos3" : "*",
                  "aType" : "2",
                  "fType" : "*",
                  "pos1" : "動詞",
                  "aConType" : "C1",
                  "cForm" : "終止形-一般",
                  "pos2" : "一般",
                  "begin" : 4,
                  "pron" : "タベル",
                  "iType" : "*",
                  "goshu" : "和",
                  "fConType" : "*",
                  "pos4" : "*",
                  "IConType" : "*",
                  "formBase" : "タベル",
                  "iForm" : "*",
                  "lemma" : "食べる",
                  "pronBase" : "タベル",
                  "end" : 7,
                  "kanaBase" : "タベル",
                  "kana" : "タベル",
                  "orth" : "食べる",
                  "registerScore" : 9.7443410811638,
                  "aModType" : "*",
                  "cType" : "下一段-バ行"
               }
            ],
            "tail-tags" : null,
            "head-tags" : [],
            "prob" : 0,
            "tail-string" : null,
            "tail" : null,
            "id" : 1,
            "tail-pos" : null,
            "head-string" : "食べる"
         }
      ]
   ]
]
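
To illustrate how the nested output can be consumed, the sketch below rebuilds each bunsetsu's surface string from its tokens and lists dependency pairs. It assumes that "link" holds the "id" of the bunsetsu being depended on and that -1 means no dependency, which matches the sample above and the usual CaboCha convention but is not stated in this manual. The helper names are hypothetical, and the sample is trimmed to the fields the code uses.

```javascript
// Trimmed copy of the /text/analyze output above (text > paragraph > sentence > bunsetsu).
const analysis = [[[
  {
    id: 0,
    link: 1,
    tokens: [
      { orth: 'なつめ', pos1: '名詞', lemma: '棗' },
      { orth: 'を', pos1: '助詞', lemma: 'を' },
    ],
  },
  {
    id: 1,
    link: -1,
    tokens: [{ orth: '食べる', pos1: '動詞', lemma: '食べる' }],
  },
]]];

// Hypothetical helper: surface string of one bunsetsu.
function chunkSurface(chunk) {
  return chunk.tokens.map((token) => token.orth).join('');
}

// Hypothetical helper: [dependent, head] surface pairs for one sentence.
// Assumes link is the id of the head bunsetsu and -1 means "no head".
function dependencyPairs(sentence) {
  const byId = new Map(sentence.map((chunk) => [chunk.id, chunk]));
  return sentence
    .filter((chunk) => chunk.link !== -1 && byId.has(chunk.link))
    .map((chunk) => [chunkSurface(chunk), chunkSurface(byId.get(chunk.link))]);
}

const firstSentence = analysis[0][0];
const pairs = dependencyPairs(firstSentence);
console.log(pairs);
```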

Analyzing text with per-morpheme register misuse detection

Input:

say $json->pretty->encode(
    $ua->get('http://localhost:5011/text/analyze?positive=1&negative=2', encode_json {text => $body})->res->json
    );

Note: the misuse detection is currently implemented as a simple computation of frequency differences.

Output:

[
   [
      [
         {
            "link" : 1,
            "head" : 0,
            "head-pos" : "noun",
            "tokens" : [
               {
                  ...
                  "registerScore" : 12.3833984107791,
                  ...
               },
               {
                  ...
                  "registerScore" : 0.183184737661352,
                  ...
               }
            ],
            "tail-tags" : [],
            ...
         },
         {
            "link" : -1,
            "head" : 0,
            "head-pos" : "verb",
            "tokens" : [
               {
                  ...
                  "registerScore" : 9.7443410811638,
                  ...
               }
            ],
            "tail-tags" : null,
            ...
         }
      ]
   ]
]
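
One way to work with these scores is to collect them per morpheme and compare them against a cutoff. The sketch below does that; the helper name collectScores, the THRESHOLD value, and the reading that larger scores deserve attention are all assumptions, since this manual only says the score is derived from frequency differences. The orth values are taken from the full output shown earlier.

```javascript
// Trimmed copy of the output above, keeping only orth and registerScore.
const scored = [[[
  {
    tokens: [
      { orth: 'なつめ', registerScore: 12.3833984107791 },
      { orth: 'を', registerScore: 0.183184737661352 },
    ],
  },
  {
    tokens: [{ orth: '食べる', registerScore: 9.7443410811638 }],
  },
]]];

// Hypothetical helper: flatten text > paragraph > sentence > bunsetsu > token.
function collectScores(text) {
  const rows = [];
  for (const paragraph of text) {
    for (const sentence of paragraph) {
      for (const chunk of sentence) {
        for (const token of chunk.tokens) {
          rows.push({ orth: token.orth, registerScore: token.registerScore });
        }
      }
    }
  }
  return rows;
}

const THRESHOLD = 5; // hypothetical cutoff, chosen only for this example
const flagged = collectScores(scored).filter((row) => row.registerScore > THRESHOLD);
console.log(flagged.map((row) => row.orth));
```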

Same as above, additionally filtering by morpheme feature

Input:

say $json->pretty->encode(
    $ua->get('http://localhost:5011/text/analyze?positive=2&negative=1&filter[]=lemma&filter[]=registerScore', encode_json {text => $body})->res->json
    );

Output:

[
   {
      "registerScore" : 13.9286901250108,
      "lemma" : "棗"
   },
   {
      "registerScore" : 3.42779566419561,
      "lemma" : "を"
   },
   {
      "registerScore" : 8.48627241448905,
      "lemma" : "食べる"
   }
]
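
Because the filtered result in this example is a flat list of small objects, it is easy to turn into a table. The sketch below renders it as tab-separated text; the helper name toTsv is hypothetical, and the data is copied from the output above.

```javascript
// Filtered output copied from above.
const filtered = [
  { registerScore: 13.9286901250108, lemma: '棗' },
  { registerScore: 3.42779566419561, lemma: 'を' },
  { registerScore: 8.48627241448905, lemma: '食べる' },
];

// Hypothetical helper: render rows as tab-separated text with a header line.
function toTsv(rows, columns) {
  const header = columns.join('\t');
  const lines = rows.map((row) => columns.map((column) => String(row[column])).join('\t'));
  return [header, ...lines].join('\n');
}

const tsv = toTsv(filtered, ['lemma', 'registerScore']);
console.log(tsv);
```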
