Fix WmdSimilarity documentation #2217

Merged: 3 commits, Oct 8, 2018
21 changes: 9 additions & 12 deletions gensim/similarities/docsim.py
@@ -826,7 +826,7 @@ def get_similarities(self, query):

Parameters
----------
- query : {list of (int, number), iterable of list of (int, number), :class:`scipy.sparse.csr_matrix`
+ query : {list of (int, number), iterable of list of (int, number), :class:`scipy.sparse.csr_matrix`}
Document or collection of documents.

Return
@@ -938,7 +938,7 @@ def get_similarities(self, query):

Parameters
----------
- query : {list of (int, number), iterable of list of (int, number)
+ query : {list of (int, number), iterable of list of (int, number)}
Document or collection of documents.

Return
@@ -978,7 +978,7 @@ def __str__(self):


class WmdSimilarity(interfaces.SimilarityABC):
- """Compute negative WMD similarity against a corpus of documents by storing the index matrix in memory.
+ """Compute negative WMD similarity against a corpus of documents.

See :class:`~gensim.models.keyedvectors.WordEmbeddingsKeyedVectors` for more information.
Also, tutorial `notebook
@@ -999,17 +999,14 @@ class WmdSimilarity(interfaces.SimilarityABC):
.. sourcecode:: pycon

>>> from gensim.test.utils import common_texts
- >>> from gensim.corpora import Dictionary
>>> from gensim.models import Word2Vec
>>> from gensim.similarities import WmdSimilarity
>>>
>>> model = Word2Vec(common_texts, size=20, min_count=1) # train word-vectors
- >>> dictionary = Dictionary(common_texts)
- >>> bow_corpus = [dictionary.doc2bow(document) for document in common_texts]
>>>
- >>> index = WmdSimilarity(bow_corpus, model)
+ >>> index = WmdSimilarity(common_texts, model)
>>> # Make query.
- >>> query = 'trees'
+ >>> query = ['trees']
>>> sims = index[query]

"""
@@ -1018,8 +1015,8 @@ def __init__(self, corpus, w2v_model, num_best=None, normalize_w2v_and_replace=T

Parameters
----------
- corpus: iterable of list of (int, float)
- A list of documents in the BoW format.
+ corpus: iterable of list of str
+ A list of documents, each of which is a list of tokens.
w2v_model: :class:`~gensim.models.word2vec.Word2VecTrainables`
A trained word2vec model.
num_best: int, optional
@@ -1058,7 +1055,7 @@ def get_similarities(self, query):

Parameters
----------
- query : {list of (int, number), iterable of list of (int, number)
+ query : {list of str, iterable of list of str}
Document or collection of documents.

Return
@@ -1194,7 +1191,7 @@ def get_similarities(self, query):

Parameters
----------
- query : {list of (int, number), iterable of list of (int, number), :class:`scipy.sparse.csr_matrix`
+ query : {list of (int, number), iterable of list of (int, number), :class:`scipy.sparse.csr_matrix`}
Document or collection of documents.

Return
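Note on the `WmdSimilarity` docstring change: a minimal sketch of why the example query moves from `'trees'` to `['trees']` and the corpus from BoW to token lists. It assumes, as the corrected parameter docs state, that each document is a list of string tokens, and that a WMD-style consumer keeps only the items it finds in a string-keyed vocabulary. The helper `tokens_seen` and the toy `vocabulary` are hypothetical names for illustration and are not part of gensim. The block is plain Python and does not call the library.

```python
# Hypothetical stand-in for a word-vector vocabulary keyed by string tokens.
vocabulary = {"trees", "graph", "minors"}


def tokens_seen(document):
    """Return the items yielded by iterating `document` that are in the vocabulary.

    Assumption: this mimics how a token-based similarity would consume a document.
    """
    return [token for token in document if token in vocabulary]


string_query = "trees"        # old example: a bare string iterates per character
token_query = ["trees"]       # corrected example: a list with one token
bow_query = [(0, 1), (1, 1)]  # old corpus format: (token_id, count) pairs

from_string = tokens_seen(string_query)
from_tokens = tokens_seen(token_query)
from_bow = tokens_seen(bow_query)

print(from_string)  # [] because 't', 'r', 'e', 's' are not vocabulary entries
print(from_tokens)  # ['trees']
print(from_bow)     # [] because tuples never match string tokens
```

Under these assumptions, only the token-list form leaves any words to compare, which is consistent with the documented types changing to `list of str`.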