Commit

Improve prune_at parameter description for `gensim.corpora.Dictionary` (#2128)

* Make clear `prune_at` documentation

According to the code, the `prune_at` parameter in `Dictionary.__init__` and `add_documents` exists only to reduce memory usage and gives no guarantee of correctness, but the documentation of this parameter was confusing to users.

* add link to method
yxonic authored and menshikh-iv committed Jul 31, 2018
1 parent 4d921da commit a6c4ea4
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions gensim/corpora/dictionary.py
@@ -56,7 +56,9 @@ def __init__(self, documents=None, prune_at=2000000):
documents : iterable of iterable of str, optional
Documents to be used to initialize the mapping and collect corpus statistics.
prune_at : int, optional
- Dictionary will keep no more than `prune_at` words in its mapping, to limit its RAM footprint.
+ Dictionary will try to keep no more than `prune_at` words in its mapping, to limit its RAM
+ footprint, the correctness is not guaranteed.
+ Use :meth:`~gensim.corpora.dictionary.Dictionary.filter_extremes` to perform proper filtering.
Examples
--------
@@ -172,7 +174,9 @@ def add_documents(self, documents, prune_at=2000000):
documents : iterable of iterable of str
Input corpus. All tokens should be already **tokenized and normalized**.
prune_at : int, optional
- Dictionary will keep no more than `prune_at` words in its mapping, to limit its RAM footprint.
+ Dictionary will try to keep no more than `prune_at` words in its mapping, to limit its RAM
+ footprint, the correctness is not guaranteed.
+ Use :meth:`~gensim.corpora.dictionary.Dictionary.filter_extremes` to perform proper filtering.
Examples
--------
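
To illustrate why a memory cap applied while documents are still being read cannot guarantee correct statistics, here is a minimal pure-Python sketch. It is a simplified model, not gensim's implementation: `ToyDictionary`, `check_every`, and the sample documents are hypothetical names and values chosen for this example. A token dropped during an early prune restarts from zero when it reappears, so its final count is too low, and the vocabulary can still exceed the cap between checks.

```python
from collections import Counter


class ToyDictionary:
    """Hypothetical stand-in for a vocabulary with a soft size cap."""

    def __init__(self, prune_at, check_every=2):
        self.prune_at = prune_at
        # Assumption for this sketch: the cap is only enforced periodically.
        self.check_every = check_every
        # token -> number of documents the token was counted in
        self.doc_freq = Counter()

    def add_documents(self, documents):
        for docno, document in enumerate(documents, start=1):
            self.doc_freq.update(set(document))
            over_cap = len(self.doc_freq) > self.prune_at
            if docno % self.check_every == 0 and over_cap:
                self._prune()

    def _prune(self):
        # Keep the `prune_at` most frequent tokens; dropped tokens lose their counts.
        kept = self.doc_freq.most_common(self.prune_at)
        self.doc_freq = Counter(dict(kept))


docs = [
    ["a", "b"],
    ["a", "b", "c"],  # cap check runs here: "c" is the rarest token and is dropped
    ["c"],            # "c" comes back, but its earlier occurrence is forgotten
]

capped = ToyDictionary(prune_at=2)
capped.add_documents(docs)

# Exact document frequencies, computed without any cap.
exact = Counter(token for doc in docs for token in set(doc))

print(dict(capped.doc_freq))
print(dict(exact))
```

In this toy run the capped count for `"c"` is 1 while its exact document frequency is 2, and the mapping ends with three tokens despite `prune_at=2`. This is the behavior the revised docstring warns about: the cap is a best-effort memory limit, and filtering that must be correct belongs in an explicit step after all documents have been counted, which in gensim is `filter_extremes`.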