Make clear prune_at documentation #2128

Merged 2 commits on Jul 31, 2018
8 changes: 6 additions & 2 deletions gensim/corpora/dictionary.py
@@ -56,7 +56,9 @@ def __init__(self, documents=None, prune_at=2000000):
documents : iterable of iterable of str, optional
Documents to be used to initialize the mapping and collect corpus statistics.
prune_at : int, optional
- Dictionary will keep no more than `prune_at` words in its mapping, to limit its RAM footprint.
+ Dictionary will try to keep no more than `prune_at` words in its mapping, to limit its RAM
+ footprint, the correctness is not guaranteed.
+ Use :meth:`~gensim.corpora.dictionary.Dictionary.filter_extremes` to perform proper filtering.

Examples
--------
@@ -172,7 +174,9 @@ def add_documents(self, documents, prune_at=2000000):
documents : iterable of iterable of str
Input corpus. All tokens should be already **tokenized and normalized**.
prune_at : int, optional
- Dictionary will keep no more than `prune_at` words in its mapping, to limit its RAM footprint.
+ Dictionary will try to keep no more than `prune_at` words in its mapping, to limit its RAM
+ footprint, the correctness is not guaranteed.
+ Use :meth:`~gensim.corpora.dictionary.Dictionary.filter_extremes` to perform proper filtering.

Examples
--------
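To illustrate why the new wording says correctness is not guaranteed, here is a toy model in plain Python. It is not gensim's code: the helper names `count_with_pruning` and `count_then_filter` and the `check_every` parameter are hypothetical, and the sketch only assumes that pruning happens periodically while documents are still being added. A token dropped early loses the counts it had collected, so it can end up ranked below a token that is actually rarer.

```python
from collections import Counter


def count_with_pruning(documents, prune_at, check_every):
    """Toy model (hypothetical, not gensim): count document frequencies,
    trimming to the `prune_at` most frequent tokens every `check_every` docs."""
    dfs = Counter()
    for docno, doc in enumerate(documents, start=1):
        dfs.update(set(doc))  # count each token once per document
        if docno % check_every == 0 and len(dfs) > prune_at:
            # Tokens dropped here lose the counts accumulated so far.
            dfs = Counter(dict(dfs.most_common(prune_at)))
    return dict(dfs)


def count_then_filter(documents, keep_n):
    """Count over the whole corpus first, then keep the top `keep_n` tokens."""
    dfs = Counter()
    for doc in documents:
        dfs.update(set(doc))
    return dict(dfs.most_common(keep_n))


docs = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["a", "c"],
    ["a"],
]

pruned = count_with_pruning(docs, prune_at=2, check_every=3)
exact = count_then_filter(docs, keep_n=2)

print(pruned)  # {'a': 6, 'b': 3}
print(exact)   # {'a': 6, 'c': 4}
```

In this toy corpus "c" appears in four documents and "b" in three, yet the pruned run keeps "b" because "c" was dropped at the first check and restarted from zero. Filtering once after the full pass, which is the role the docstring assigns to `filter_extremes`, avoids that.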