The old code for the function _raw_word_count is breaking the word2vec model:
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/gensim/models/word2vec.py", line 744, in
return sum(len(sentence) for sentence in job)
TypeError: object of type 'map' has no len()
The solution was already implemented. Why is it not yet available?
From issue #535:
def _raw_word_count(self, job):
"""Return the number of words in a given job."""
return sum(len(sentence.words) for sentence in job)
What type of input are you passing as sentences? I'm really keen to reproduce the error. Do you have a code snippet?
This code has been unchanged for the last 12 months. In [word2vec](https://github.com/RaRe-Technologies/gensim/blame/2a70e3a726404cd4230542a35cfd2dc4d63da6f1/gensim/models/word2vec.py#L747), len(sentence) was added in #535. The change to len(sentence.words) only affects doc2vec, not word2vec.
I just realized the problem is that I'm using the 'map' function directly as sentences, instead of a list:
walks = [map(str, walk) for walk in walks]
model = Word2Vec(walks, size=args.dimensions, window=args.window_size, min_count=0, sg=1, workers=args.workers,
iter=args.iter)
model.save_word2vec_format(args.output)
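For anyone hitting the same error, here is a minimal sketch of the cause and one possible fix. In Python 3, map() returns a lazy iterator that has no len(), so each sentence has to be materialized as a list before it is handed to the model. The sample walks below are made up for illustration, and the gensim call itself is left out so the snippet runs on its own.

```python
# Hypothetical sample data standing in for the random walks in the post above.
walks = [[1, 2, 3], [4, 5]]

# Python 3: map() returns a lazy iterator, so len() on a sentence raises TypeError.
lazy_walks = [map(str, walk) for walk in walks]
try:
    sum(len(sentence) for sentence in lazy_walks)
    lazy_failed = False
except TypeError:
    lazy_failed = True

# Possible fix: build each sentence as a real list of strings.
list_walks = [[str(node) for node in walk] for walk in walks]

# The same word count that failed above now works.
word_count = sum(len(sentence) for sentence in list_walks)
print(lazy_failed, word_count)
```

Passing list_walks (or equivalently [list(map(str, walk)) for walk in walks]) as the sentences argument should avoid the TypeError without any change to gensim.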