Optimize Native unsupervised FastText (#1742)
* adds fasttext extension to setup

* cythonizes training using skipgram with negative sampling

* loop over indexes using index iterator

* cythonizes training using skipgram with hierarchical softmax

* adds cython generated .c file

* resolves segmentation fault with multiple workers

* fixes accuracy issues due to reference counts of word_subwords becoming 0

* cythonizes fasttext cbow architecture

* cleans extra variables/values

* corrects parameters order for word_locks* in cbow

* fixes indentation, unused imports and logging warning for slow version

* splits long lines and removes redundant `import`/`else`

* minor: removes redundant `else`

* adds docstring

* changes docstrings style, splits long lines

* fix references in fasttext docstring

* adds deleted else in cbow-neg

* fixes docstring format

* adds missing docstrings

* add missing __getitem__ to rst

* add missing import to __init__ (`from gensim.models import FastText` instead of `from gensim.models.fasttext ...`)

* fix docs
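
The `word_subwords` mentioned in the commit message refers to the character n-grams that FastText associates with each word. As a rough illustration only, and not the code from this commit, the sketch below shows the usual way such n-grams are derived: the word is wrapped in `<` and `>` boundary markers and every substring of length `min_n` to `max_n` is collected. The function name `char_ngrams` and its defaults are assumptions made for this example; gensim's actual implementation may differ, for instance by hashing n-grams into a fixed number of buckets.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word` after wrapping it in '<' and '>'.

    Hypothetical helper for illustration; not part of gensim's API.
    """
    wrapped = "<" + word + ">"
    ngrams = []
    # Collect every substring whose length lies between min_n and max_n.
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[start:start + n])
    return ngrams


# Trigrams of "where", including the boundary markers.
print(char_ngrams("where", min_n=3, max_n=3))
# → ['<wh', 'whe', 'her', 'ere', 're>']
```

A word's vector is then typically built from the vectors of these n-grams, which is why losing the references to a word's subwords (the accuracy issue fixed above) would degrade training.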
manneshiva authored and menshikh-iv committed Dec 7, 2017
1 parent cf46f69 commit d2cb79c
Showing 7 changed files with 14,154 additions and 75 deletions.
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -10,3 +10,5 @@ include gensim/models/word2vec_inner.pyx
include gensim/models/word2vec_inner.pxd
include gensim/models/doc2vec_inner.c
include gensim/models/doc2vec_inner.pyx
+include gensim/models/fasttext_inner.c
+include gensim/models/fasttext_inner.pyx
1 change: 1 addition & 0 deletions docs/src/models/fasttext.rst
@@ -5,5 +5,6 @@
:synopsis: FastText model
:members:
:inherited-members:
+:special-members: __getitem__
:undoc-members:
:show-inheritance:
1 change: 1 addition & 0 deletions gensim/models/__init__.py
@@ -19,6 +19,7 @@
from .normmodel import NormModel # noqa:F401
from .atmodel import AuthorTopicModel # noqa:F401
from .ldaseqmodel import LdaSeqModel # noqa:F401
+from .fasttext import FastText # noqa:F401

from . import wrappers # noqa:F401

533 changes: 458 additions & 75 deletions gensim/models/fasttext.py

Large diffs are not rendered by default.
