Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #851, Error is raised instead of returning text [WiP] #902

Merged
merged 6 commits into from
Sep 29, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ Changes
- bigram construction can now support multiple bigrams within one sentence
* Fixed issue #838, RuntimeWarning: overflow encountered in exp (@markroxor, [#895](https://github.com/RaRe-Technologies/gensim/pull/895))
* Changed some log messages to warnings as suggested in issue #828. (@rhnvrm, [#884](https://github.com/RaRe-Technologies/gensim/pull/884))
* Fixed issue #851, In summarizer.py, check for single sentence as an input added to avoid ZeroDivionError, added test cases in test/test_summarization.py(@metalaman, #887)
* Fixed issue #851, In summarizer.py, RunTimeError is raised if single sentence input is provided to avoid ZeroDivionError. (@metalaman, #887)


0.13.2, 2016-08-19
Expand Down
5 changes: 2 additions & 3 deletions gensim/summarization/summarizer.py
Original file line number Diff line number Diff line change
Expand Up @@ -193,10 +193,9 @@ def summarize(text, ratio=0.2, word_count=None, split=False):
logger.warning("Input text is empty.")
return

# If only one sentence is present, the function return the input text (Avoids ZeroDivisionError).
# If only one sentence is present, the function raises an error (Avoids ZeroDivisionError).
if len(sentences) == 1:
logger.warning("Summarization not performed since the document has only one sentence.")
return text
raise ValueError("input must have more than one sentence")

# Warns if the text is too short.
if len(sentences) < INPUT_MIN_LENGTH:
Expand Down
4 changes: 2 additions & 2 deletions gensim/test/test_summarization.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ def test_text_summarization_raises_exception_on_short_input_text(self):
text = "\n".join(text.split('\n')[:8])

self.assertTrue(summarize(text) is not None)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No trailing whitespace please.

def test_text_summarization_returns_input_on_single_input_sentence(self):
pre_path = os.path.join(os.path.dirname(__file__), 'test_data')

Expand All @@ -97,7 +97,7 @@ def test_text_summarization_returns_input_on_single_input_sentence(self):
# Keeps the first sentence only.
text = text.split('\n')[0]

self.assertEqual(summarize(text),text)
self.assertRaises(ValueError,summarize,text)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8: space after comma.


def test_corpus_summarization_raises_exception_on_short_input_text(self):
pre_path = os.path.join(os.path.dirname(__file__), 'test_data')
Expand Down
6 changes: 3 additions & 3 deletions gensim/test/test_wikicorpus.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,17 +12,17 @@
import os
import sys
import types

import logging
import unittest

from gensim.corpora.wikicorpus import WikiCorpus



module_path = os.path.dirname(__file__) # needed because sample data files are located in the same folder
datapath = lambda fname: os.path.join(module_path, 'test_data', fname)
FILENAME = 'enwiki-latest-pages-articles1.xml-p000000010p000030302-shortened.bz2'

logger = logging.getLogger(__name__)

class TestWikiCorpus(unittest.TestCase):

Expand All @@ -31,7 +31,7 @@ def setUp(self):


def test_get_texts_returns_generator_of_lists(self):

logger.debug("Current Python Version is "+str(sys.version_info))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8: space around binary operators.

if sys.version_info < (2, 7, 0):
return
wc = WikiCorpus(datapath(FILENAME))
Expand Down