Error when most_important_docs in summarizer.py is None #1597

Closed
shengyang998 opened this issue Sep 23, 2017 · 2 comments

In [0]: gensim.__version__
Out [0]: '2.3.0'

Description:
I was working on a set of Chinese sentences. When I called gensim.summarization.summarize(), the following error occurred:

  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/gensim/summarization/summarizer.py", line 215, in summarize
    extracted_sentences = _extract_important_sentences(sentences, corpus, most_important_docs, word_count)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/gensim/summarization/summarizer.py", line 114, in _extract_important_sentences
    important_sentences = _get_important_sentences(sentences, corpus, important_docs)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/gensim/summarization/summarizer.py", line 89, in _get_important_sentences
    return [sentences_by_corpus[tuple(important_doc)] for important_doc in important_docs]
TypeError: 'NoneType' object is not iterable

It seems that important_docs is None, and a NoneType cannot be iterated.
I am not very familiar with the TextRank algorithm and am trying to get on with my work. Can someone tell me what is happening?

PS: Sorry that I cannot provide the test case I was using, because it is full of Chinese names. (If someone asks me privately, maybe I could.) In any case, there is a bug: for some reason most_important_docs at line 212 of summarizer.py is None. This situation should be handled properly. I suggest that summarize() either return None or raise a more descriptive error when most_important_docs is None. Even better would be to improve the implementation of the TextRank algorithm, which is beyond my ability for now...
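To make the suggestion concrete, here is a rough sketch of the kind of guard I have in mind. It is not gensim's actual code: `get_important_sentences` is a hypothetical stand-in built around the list comprehension shown in the traceback, and returning an empty list is just one possible choice.

```python
def get_important_sentences(sentences_by_corpus, important_docs):
    """Hypothetical stand-in for the failing line in the traceback.

    sentences_by_corpus maps a tuple-ified document to its sentence;
    important_docs is the ranked list of documents, or None.
    """
    if important_docs is None:
        # Assumed handling: nothing was ranked, so there is nothing to extract.
        return []
    return [sentences_by_corpus[tuple(doc)] for doc in important_docs]
```

Raising a ValueError with a clear message instead of returning an empty list would also work, and would make the root cause easier to debug.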

>>> s = '`a string of more than one thousand different Chinese names`'
>>> import gensim.summarization as gsum
>>> gsum.summarize(s)
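
Until this is handled inside the library, a caller-side workaround might look like the sketch below. `safe_summarize` is a hypothetical helper, not part of gensim; it takes the summarizing function as an argument so it makes no assumptions about the gensim API beyond the call shown above.

```python
def safe_summarize(summarize_fn, text, **kwargs):
    """Call summarize_fn(text, **kwargs); return None if it raises TypeError."""
    try:
        return summarize_fn(text, **kwargs)
    except TypeError:
        # Covers the "'NoneType' object is not iterable" failure from the
        # traceback. Note that this also hides any other TypeError.
        return None
```

With the names from the snippet above, the call would be `safe_summarize(gsum.summarize, s)`. Catching TypeError this broadly can mask unrelated bugs, so it is only a stopgap.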

zsef123 commented Sep 24, 2017

check #1531

@shengyang998

OK, thank you! @zsef123
