Fix misleading Doc2Vec.docvecs comment #2472

Merged: 3 commits, May 4, 2019
15 changes: 5 additions & 10 deletions gensim/models/doc2vec.py
@@ -447,18 +447,13 @@ class Doc2Vec(BaseWordEmbeddingsModel):
         directly to query those embeddings in various ways. See the module level docstring for examples.

     docvecs : :class:`~gensim.models.keyedvectors.Doc2VecKeyedVectors`
-        This object contains the paragraph vectors. Remember that the only difference between this model and
-        :class:`~gensim.models.word2vec.Word2Vec` is that besides the word vectors we also include paragraph embeddings
-        to capture the paragraph.
+        This object contains the paragraph vectors learned from the training data. There will be one such vector
+        for each unique document tag supplied during training. They may be individually accessed using the tag
+        as an indexed-access key. For example, if one of the training documents used a tag of 'doc003':

-        In this way we can capture the difference between the same word used in a different context.
-        For example we now have a different representation of the word "leaves" in the following two sentences ::
-
-            1. Manos leaves the office every day at 18:00 to catch his train
-            2. This season is called Fall, because leaves fall from the trees.
+        .. sourcecode:: pycon

-        In a plain :class:`~gensim.models.word2vec.Word2Vec` model the word would have exactly the same representation
-        in both sentences, in :class:`~gensim.models.doc2vec.Doc2Vec` it will not.
+            >>> model.docvecs['doc003']

     vocabulary : :class:`~gensim.models.doc2vec.Doc2VecVocab`
         This object represents the vocabulary (sometimes called Dictionary in gensim) of the model.
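The access pattern described by the revised docstring (one vector per unique document tag, looked up with the tag as an index key) can be illustrated with a small standalone sketch. This is not gensim's implementation: `TagVectors` and `add_document` are hypothetical names invented for the illustration, and the random numbers merely stand in for trained vectors.

```python
import random


class TagVectors:
    """Hypothetical stand-in for a tag-indexed vector store (not gensim's class)."""

    def __init__(self, vector_size, seed=0):
        self.vector_size = vector_size
        self._rng = random.Random(seed)
        self._vectors = {}  # maps each unique tag to its single vector

    def add_document(self, tags):
        # Allocate a vector the first time a tag is seen.
        # Documents that repeat a tag share the vector already allocated for it.
        for tag in tags:
            if tag not in self._vectors:
                self._vectors[tag] = [
                    self._rng.uniform(-0.5, 0.5) for _ in range(self.vector_size)
                ]

    def __getitem__(self, tag):
        # Indexed access by tag, mirroring the docstring's model.docvecs['doc003'].
        return self._vectors[tag]

    def __len__(self):
        return len(self._vectors)


docvecs = TagVectors(vector_size=4)
# Four documents, but 'doc003' appears twice, so only three vectors are stored.
for tags in (["doc001"], ["doc002"], ["doc003"], ["doc003"]):
    docvecs.add_document(tags)

vector = docvecs["doc003"]
```

In this sketch, asking for a tag that was never supplied raises `KeyError`, which is a choice made for the illustration rather than a statement about how gensim behaves.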