API ja
Bor Hodošček edited this page Nov 27, 2012 · 10 revisions
natsume-server parses user input and, in general, responds with JSON.
- natsume-server: http://localhost:5011
- cabocha-server: http://localhost:5010
Splitting text
- Text: /text/split
  - Input:
    text=1\n2\n3\n\n4
  - Output: nested JSON array
    [ [ "1", "2", "3" ], [ "4" ] ]
- Paragraph: /paragraph/split
  - Input:
    text=1\n2\n
  - Output: JSON array
    ["1", "2"]
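The nested array returned by /text/split can be walked paragraph by paragraph. The following is a minimal JavaScript sketch, assuming the response has already been parsed into an array shaped like the documented output. `summarizeSplit` is a hypothetical helper name and is not part of natsume-server.

```javascript
// Hypothetical helper: summarize a parsed /text/split response.
// Outer array = paragraphs, inner arrays = sentences.
function summarizeSplit(paragraphs) {
  return {
    paragraphs: paragraphs.length,
    sentences: paragraphs.reduce((n, p) => n + p.length, 0),
    // Longest paragraph measured in sentences; 0 for an empty text.
    longestParagraph: paragraphs.reduce((m, p) => Math.max(m, p.length), 0),
  };
}

// The documented output for text=1\n2\n3\n\n4
const splitExample = [["1", "2", "3"], ["4"]];
console.log(summarizeSplit(splitExample));
```

The same loop structure applies to /paragraph/split, whose output is a single flat array of sentences.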
Analyzing text
- Sentence: /sentence/analyze
  - Input:
    - text: string (text=なつめを食べる)
  - Output: JSON array of bunsetsu (phrase units)
    [bunsetsu1, bunsetsu2]
- Paragraph: /paragraph/analyze
  - Input:
    - text: string (text=なつめを食べる。\n\nお腹を壊す。)
  - Output: JSON array of the paragraph's bunsetsu
    [ [ bunsetsu1, bunsetsu2 ], [ bunsetsu3, bunsetsu4 ] ]
- Text: /text/analyze
  - Input:
    - text: string (text=なつめを食べる。\n\nお腹を壊す。薬局にいく。)
    - filter: array (filter[]=orthBase&filter[]=registerScore)
    - positive: integer (positive=2)
    - negative: integer (negative=1)
  - Output: JSON array of the text's bunsetsu
    [ [ [ bunsetsu1, bunsetsu2 ] ], [ [ bunsetsu3, bunsetsu4 ], [ bunsetsu5, bunsetsu6 ] ] ]
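The /text/analyze inputs combine a string, an array, and two integers, so the query string is easy to get wrong by hand. The sketch below builds such a URL with the standard `URLSearchParams` class. `NATSUME_BASE` and `buildAnalyzeUrl` are hypothetical names introduced for illustration, and the sketch assumes the server accepts percent-encoded `filter[]` keys like any form-encoded query.

```javascript
// Assumed base URL, taken from the server list at the top of this page.
const NATSUME_BASE = "http://localhost:5011";

// Hypothetical helper: build a GET URL for /text/analyze.
// `filter`, `positive` and `negative` are optional.
function buildAnalyzeUrl(text, { filter = [], positive, negative } = {}) {
  const params = new URLSearchParams();
  params.append("text", text);
  // Array parameters are sent as repeated filter[]=... pairs.
  for (const field of filter) params.append("filter[]", field);
  if (positive !== undefined) params.append("positive", String(positive));
  if (negative !== undefined) params.append("negative", String(negative));
  return `${NATSUME_BASE}/text/analyze?${params.toString()}`;
}

console.log(
  buildAnalyzeUrl("なつめを食べる", {
    filter: ["orthBase", "registerScore"],
    positive: 2,
    negative: 1,
  })
);
```

`URLSearchParams` percent-encodes newlines and non-ASCII characters, so paragraph breaks (`\n\n`) in the text survive the round trip.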
curl localhost:5011/corpus/genres
[{"name":"検定教科書","id":1},{"name":"韻文","id":2}]
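The genre list comes back as an array of `{name, id}` objects. A small lookup table makes it easier to resolve ids to names. This is a sketch only: `genresById` is a hypothetical helper, and the input is the response shown above rather than a live request.

```javascript
// Hypothetical helper: index a parsed /corpus/genres response by id.
function genresById(genres) {
  return new Map(genres.map((g) => [g.id, g.name]));
}

// The response shown above, parsed from JSON.
const genres = JSON.parse('[{"name":"検定教科書","id":1},{"name":"韻文","id":2}]');
const genreNames = genresById(genres);
console.log(genreNames.get(2)); // 韻文
```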
- Usage from jQuery
$.get('http://localhost:5011/text/split', {"text":"今日は、世界!\nHello world.\n\nŽivjo."}, function(d){console.log(d);});
- Usage from d3
d3.json('http://localhost:5011/text/analyze?text=' + encodeURIComponent("今日は、世界!\nHello world.\n\nŽivjo."), function(d){console.log(d);});
When the data is too large to send in a GET query, the POST body can be used instead. (TODO)
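Since POST support is marked TODO, the following is only a client-side sketch of how the choice between GET and POST might be made once it exists. Everything here is an assumption: the helper name `planRequest`, the length threshold, and the idea that a POST body would carry the same form-encoded `text` parameter.

```javascript
// Assumption: URLs longer than this are sent as POST. The value is illustrative only.
const MAX_GET_URL_LENGTH = 2000;

// Hypothetical helper: describe the request to send, without performing it.
function planRequest(endpoint, text, maxUrlLength = MAX_GET_URL_LENGTH) {
  const query = new URLSearchParams({ text }).toString();
  const getUrl = `${endpoint}?${query}`;
  if (getUrl.length <= maxUrlLength) {
    return { method: "GET", url: getUrl, body: null };
  }
  // Assumption: the POST body carries the same form-encoded parameters.
  return { method: "POST", url: endpoint, body: query };
}

console.log(planRequest("http://localhost:5011/text/split", "1\n2\n3\n\n4").method);
```

Japanese text grows quickly once percent-encoded (nine characters per kana or kanji), so long inputs reach any URL length limit sooner than their character count suggests.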
The following examples assume the API is used from Perl.
#!/usr/bin/env perl
use 5.014;
use warnings;
use JSON::PP;
use Encode qw(decode_utf8);
use Mojo::UserAgent;
my $json = JSON::PP->new->utf8;
my $ua = Mojo::UserAgent->new;
Reference: Mojo::UserAgent - Non-blocking I/O HTTP and WebSocket user agent
my $long_body = decode_utf8 "パラグラフ1です。\n\n次のパラグラフです!文が続いてあります。\nまた文があります。";
my $body = decode_utf8 'なつめを食べる';
- FIXME: decode_utf8
- Using a JSON query:
say $json->pretty->encode(
$ua->get('http://localhost:5011/text/split', encode_json({text => $long_body}))->res->json
);
- Using a URL query:
say $json->pretty->encode(
$ua->get('http://localhost:5011/text/split?text=' . $long_body)->res->json
);
- Output:
[
[
"パラグラフ1です。"
],
[
"次のパラグラフです!",
"文が続いてあります。",
"また文があります。"
]
]
- Input:
say $json->pretty->encode(
$ua->get('http://localhost:5011/paragraph/split', encode_json({text => "1\n2\n3"}))->res->json
);
- Output:
[
"1",
"2",
"3"
]
- Input:
say $json->pretty->encode(
$ua->get('http://localhost:5011/text/analyze', encode_json {text => $body})->res->json
);
- Output:
[
[
[
{
"link" : 1,
"head" : 0,
"head-pos" : "noun",
"tokens" : [
{
"ne" : "O",
"lForm" : "ナツメ",
"fForm" : "*",
"orthBase" : "なつめ",
"pos3" : "一般",
"aType" : "0",
"fType" : "*",
"pos1" : "名詞",
"aConType" : "C2",
"cForm" : "*",
"pos2" : "普通名詞",
"begin" : 0,
"pron" : "ナツメ",
"iType" : "*",
"goshu" : "和",
"fConType" : "*",
"pos4" : "*",
"IConType" : "*",
"formBase" : "ナツメ",
"iForm" : "*",
"lemma" : "棗",
"pronBase" : "ナツメ",
"end" : 3,
"kanaBase" : "ナツメ",
"kana" : "ナツメ",
"orth" : "なつめ",
"registerScore" : 12.3833984107791,
"aModType" : "*",
"cType" : "*"
},
{
"ne" : "O",
"lForm" : "ヲ",
"fForm" : "*",
"orthBase" : "を",
"pos3" : "*",
"aType" : "*",
"fType" : "*",
"pos1" : "助詞",
"aConType" : "動詞%F2@0,名詞%F1,形容詞%F2@-1",
"cForm" : "*",
"pos2" : "格助詞",
"begin" : 3,
"pron" : "オ",
"iType" : "*",
"goshu" : "和",
"fConType" : "*",
"pos4" : "*",
"IConType" : "*",
"formBase" : "ヲ",
"iForm" : "*",
"lemma" : "を",
"pronBase" : "オ",
"end" : 4,
"kanaBase" : "ヲ",
"kana" : "ヲ",
"orth" : "を",
"registerScore" : 0.183184737661352,
"aModType" : "*",
"cType" : "*"
}
],
"tail-tags" : [],
"head-tags" : [],
"prob" : 0,
"tail-string" : "を",
"tail" : 1,
"id" : 0,
"tail-pos" : "particle",
"head-string" : "なつめ"
},
{
"link" : -1,
"head" : 0,
"head-pos" : "verb",
"tokens" : [
{
"ne" : "O",
"lForm" : "タベル",
"fForm" : "*",
"orthBase" : "食べる",
"pos3" : "*",
"aType" : "2",
"fType" : "*",
"pos1" : "動詞",
"aConType" : "C1",
"cForm" : "終止形-一般",
"pos2" : "一般",
"begin" : 4,
"pron" : "タベル",
"iType" : "*",
"goshu" : "和",
"fConType" : "*",
"pos4" : "*",
"IConType" : "*",
"formBase" : "タベル",
"iForm" : "*",
"lemma" : "食べる",
"pronBase" : "タベル",
"end" : 7,
"kanaBase" : "タベル",
"kana" : "タベル",
"orth" : "食べる",
"registerScore" : 9.7443410811638,
"aModType" : "*",
"cType" : "下一段-バ行"
}
],
"tail-tags" : null,
"head-tags" : [],
"prob" : 0,
"tail-string" : null,
"tail" : null,
"id" : 1,
"tail-pos" : null,
"head-string" : "食べる"
}
]
]
]
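The analyze output nests three levels deep (paragraphs, sentences, bunsetsu), and each bunsetsu carries an `id` and a `link`. The sketch below walks that structure and lists which bunsetsu depends on which. It assumes that `link` holds the `id` of the governing bunsetsu within the same sentence and that `-1` marks the root, which matches the sample output but is not stated on this page. `dependencyPairs` is a hypothetical helper.

```javascript
// Hypothetical helper: list dependency pairs from a parsed /text/analyze response.
// Structure assumed from the sample output: paragraphs > sentences > bunsetsu.
function dependencyPairs(analyzed) {
  const surfaceOf = (b) => b.tokens.map((t) => t.orth).join("");
  const pairs = [];
  for (const paragraph of analyzed) {
    for (const sentence of paragraph) {
      // Index the bunsetsu of this sentence by their "id" field.
      const byId = new Map(sentence.map((b) => [b.id, b]));
      for (const b of sentence) {
        // Assumption: link === -1 marks the root of the sentence.
        const target = b.link === -1 ? undefined : byId.get(b.link);
        pairs.push({ from: surfaceOf(b), to: target ? surfaceOf(target) : null });
      }
    }
  }
  return pairs;
}

// Trimmed copy of the sample output (only the fields used here).
const analyzed = [[[
  { id: 0, link: 1, tokens: [{ orth: "なつめ" }, { orth: "を" }] },
  { id: 1, link: -1, tokens: [{ orth: "食べる" }] },
]]];
console.log(dependencyPairs(analyzed));
```

For the sample sentence this pairs "なつめを" with "食べる" and reports "食べる" as having no governor.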
- Input:
say $json->pretty->encode(
$ua->get('http://localhost:5011/text/analyze?positive=1&negative=2', encode_json {text => $body})->res->json
);
- [Note] Misuse detection is currently implemented as a simple frequency-difference calculation.
- Output:
[
[
[
{
"link" : 1,
"head" : 0,
"head-pos" : "noun",
"tokens" : [
{
...
"registerScore" : 12.3833984107791,
...
},
{
...
"registerScore" : 0.183184737661352,
...
}
],
"tail-tags" : [],
...
},
{
"link" : -1,
"head" : 0,
"head-pos" : "verb",
"tokens" : [
{
...
"registerScore" : 9.7443410811638,
...
}
],
"tail-tags" : null,
...
}
]
]
]
- Input:
say $json->pretty->encode(
$ua->get('http://localhost:5011/text/analyze?positive=2&negative=1&filter[]=lemma&filter[]=registerScore', encode_json {text => $body})->res->json
);
- Output:
[
{
"registerScore" : 13.9286901250108,
"lemma" : "棗"
},
{
"registerScore" : 3.42779566419561,
"lemma" : "を"
},
{
"registerScore" : 8.48627241448905,
"lemma" : "食べる"
}
]
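With `filter[]` set, the response is a flat array of objects holding only the requested fields, which makes simple ranking straightforward. The sketch below picks the entry with the largest `registerScore`. This page does not explain how the score should be interpreted, so the code only compares values. `highestRegisterScore` is a hypothetical helper.

```javascript
// Hypothetical helper: return the entry with the highest registerScore, or null if empty.
function highestRegisterScore(tokens) {
  return tokens.reduce(
    (best, t) => (best === null || t.registerScore > best.registerScore ? t : best),
    null
  );
}

// The filtered output shown above.
const filtered = [
  { registerScore: 13.9286901250108, lemma: "棗" },
  { registerScore: 3.42779566419561, lemma: "を" },
  { registerScore: 8.48627241448905, lemma: "食べる" },
];
console.log(highestRegisterScore(filtered).lemma); // 棗
```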