Merge pull request #29 from WorksApplications/update-dictionaries
Update dictionaries
kazuma-t committed Jun 9, 2021
2 parents c68accc + 96beb01 commit 1df5098
Showing 3 changed files with 248 additions and 259 deletions.
36 changes: 19 additions & 17 deletions README.md
@@ -7,6 +7,8 @@ A lexicon for Japanese tokenizer

Click [here](http://sudachi.s3-website-ap-northeast-1.amazonaws.com/sudachidict/) for pre-built dictionaries.

Pre-built synonym dictionaries for [Chikkar](https://github.com/WorksApplications/chikkar/) is [here](http://sudachi.s3-website-ap-northeast-1.amazonaws.com/sudachisynonym/).

### Python packages

You can install the dictionaries for [WorksApplications/SudachiPy](https://github.com/WorksApplications/SudachiPy), the Python version of Sudachi, as Python packages.
@@ -17,37 +19,36 @@ In SudachiPy v0.5.2 and later, you can specify a dictionary directly from a comm

please see the following links for more details on the dictionary option.

* english
* [https://github.com/WorksApplications/SudachiPy#dictionary-edition](https://github.com/WorksApplications/SudachiPy#dictionary-edition)
* japanese
* [https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#辞書の種類](https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#%E8%BE%9E%E6%9B%B8%E3%81%AE%E7%A8%AE%E9%A1%9E)
- english
- [https://github.com/WorksApplications/SudachiPy#dictionary-edition](https://github.com/WorksApplications/SudachiPy#dictionary-edition)
- japanese
- [https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#辞書の種類](https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#%E8%BE%9E%E6%9B%B8%E3%81%AE%E7%A8%AE%E9%A1%9E)

#### Install

```bash
$ pip install sudachidict_core
pip install sudachidict_core
```

```bash
$ pip install sudachidict_small
pip install sudachidict_small
```

```bash
$ pip install sudachidict_full
pip install sudachidict_full
```

* [SudachiDict-small · PyPI](https://pypi.org/project/SudachiDict-small/)
* [SudachiDict-core · PyPI](https://pypi.org/project/SudachiDict-core/)
* [SudachiDict-full · PyPI](https://pypi.org/project/SudachiDict-full/)

- [SudachiDict-small · PyPI](https://pypi.org/project/SudachiDict-small/)
- [SudachiDict-core · PyPI](https://pypi.org/project/SudachiDict-core/)
- [SudachiDict-full · PyPI](https://pypi.org/project/SudachiDict-full/)

## Dictionary types

Sudachi has three types of dictionaries.

* Small: includes only the vocabulary of UniDic
* Core: includes basic vocabulary (default)
* Full: includes miscellaneous proper nouns
- Small: includes only the vocabulary of UniDic
- Core: includes basic vocabulary (default)
- Full: includes miscellaneous proper nouns

## Build from sources

Expand All @@ -58,9 +59,9 @@ Git LFS and `git lfs pull`.
Building the dictionaries fails with a locale other than UTF-8.
Add `-Dfile.encoding=UTF-8` to `MAVEN_OPTS`.


## Licenses

```text
SudachiDict by Works Applications Co., Ltd. is licensed under the [Apache License, Version2.0](http://www.apache.org/licenses/LICENSE-2.0.html)
Copyright (c) 2017 Works Applications Co., Ltd.
@@ -78,6 +79,7 @@ SudachiDict by Works Applications Co., Ltd. is licensed under the [Apache Licens
limitations under the License.
This project includes UniDic and a part of NEologd.
```

- http://unidic.ninjal.ac.jp/
- https://github.com/neologd/mecab-ipadic-neologd
- <http://unidic.ninjal.ac.jp/>
- <https://github.com/neologd/mecab-ipadic-neologd>
23 changes: 11 additions & 12 deletions python/README.md
@@ -13,7 +13,6 @@ The dictionary files are not included in the packages; It will be downloaded upo

The version (e.g., `20200330`) and the edition (`small`, `core`, or `full`) is specified in `INFO.json`.


## Commands to download and set the dictionaries

In SudachiPy v0.5.2 and later, you can specify a dictionary directly from a command line or program.
@@ -22,30 +21,30 @@ In SudachiPy v0.5.2 and later, you can specify a dictionary directly from a comm

Please see the following links for more details on the dictionary option.

* english
* [https://github.com/WorksApplications/SudachiPy#dictionary-edition](https://github.com/WorksApplications/SudachiPy#dictionary-edition)
* japanese
* [https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#辞書の種類](https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#%E8%BE%9E%E6%9B%B8%E3%81%AE%E7%A8%AE%E9%A1%9E)
- english
- [https://github.com/WorksApplications/SudachiPy#dictionary-edition](https://github.com/WorksApplications/SudachiPy#dictionary-edition)
- japanese
- [https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#辞書の種類](https://github.com/WorksApplications/SudachiPy/blob/develop/docs/tutorial.md#%E8%BE%9E%E6%9B%B8%E3%81%AE%E7%A8%AE%E9%A1%9E)

### Install

```bash
$ pip install sudachidict_core
pip install sudachidict_core
```

```bash
$ pip install sudachidict_small
pip install sudachidict_small
```

```bash
$ pip install sudachidict_full
pip install sudachidict_full
```

### Dictionary option in SudachiPy before v0.5.2

In case you are using SudachiPy before v0.5.2, please visit the old SudachiPy documentation.

* english
* [https://github.com/WorksApplications/SudachiPy/tree/v0.5.1#dictionary-edition](https://github.com/WorksApplications/SudachiPy/tree/v0.5.1#dictionary-edition)
* japanese
* [https://github.com/WorksApplications/SudachiPy/blob/v0.5.1/docs/tutorial.md#%辞書の種類](https://github.com/WorksApplications/SudachiPy/blob/v0.5.1/docs/tutorial.md#%E8%BE%9E%E6%9B%B8%E3%81%AE%E7%A8%AE%E9%A1%9E)
- english
- [https://github.com/WorksApplications/SudachiPy/tree/v0.5.1#dictionary-edition](https://github.com/WorksApplications/SudachiPy/tree/v0.5.1#dictionary-edition)
- japanese
- [https://github.com/WorksApplications/SudachiPy/blob/v0.5.1/docs/tutorial.md#%辞書の種類](https://github.com/WorksApplications/SudachiPy/blob/v0.5.1/docs/tutorial.md#%E8%BE%9E%E6%9B%B8%E3%81%AE%E7%A8%AE%E9%A1%9E)