Windows 2.2.0 issues #1441

Closed
menshikh-iv opened this issue Jun 22, 2017 · 1 comment
Labels
bug (Issue described a bug), testing (Issue related with testing: code, documentation, etc)

Comments

@menshikh-iv
Contributor

menshikh-iv commented Jun 22, 2017

Several tests fail when I run the 2.2.0 test suite on Windows.

Two failures that seem to be connected with multithreading on Windows (???), introduced by #1349. They occur on both Python 2 and Python 3:

----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python35-x64\lib\site-packages\gensim\test\test_text_analysis.py", line 57, in test_occurrence_counting
    self.assertEqual(3, accumulator.get_occurrences("this"))
AssertionError: 3 != 0
-------------------- >> begin captured logging << --------------------
gensim.topic_coherence.text_analysis: INFO: 1 batches submitted to accumulate stats from 64 documents (3 virtual)
gensim.topic_coherence.text_analysis: INFO: 2 accumulators retrieved from output queue
gensim.topic_coherence.text_analysis: INFO: accumulated word occurrence stats for 4 virtual documents
--------------------- >> end captured logging << ---------------------
======================================================================
FAIL: test_occurrence_counting2 (gensim.test.test_text_analysis.TestParallelWordOccurrenceAccumulator)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python35-x64\lib\site-packages\gensim\test\test_text_analysis.py", line 67, in test_occurrence_counting2
    self.assertEqual(2, accumulator.get_occurrences("human"))
AssertionError: 2 != 0
-------------------- >> begin captured logging << --------------------
gensim.topic_coherence.text_analysis: INFO: 2 accumulators retrieved from output queue
gensim.topic_coherence.text_analysis: INFO: accumulated word occurrence stats for 10 virtual documents
--------------------- >> end captured logging << ---------------------
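
Some background that may help when reading the two failures above: on Windows, `multiprocessing` starts workers with the spawn method, so anything handed to or returned from a worker is a pickled copy, not shared memory. Whether that is what breaks these tests is not established in this report. The sketch below only illustrates the general pitfall; `ToyAccumulator` is a hypothetical stand-in, not gensim's class, and the pickle round trip stands in for a real process boundary.

```python
import pickle
from collections import Counter


class ToyAccumulator:
    """Hypothetical stand-in for a word-occurrence accumulator (not gensim's class)."""

    def __init__(self):
        self.counts = Counter()

    def accumulate(self, document):
        # Count every token in the document.
        self.counts.update(document)

    def merge(self, other):
        # Add the counts gathered by another accumulator.
        self.counts.update(other.counts)

    def get_occurrences(self, word):
        return self.counts[word]


parent = ToyAccumulator()

# Mimic sending the accumulator to a worker: the worker receives a pickled copy.
worker_copy = pickle.loads(pickle.dumps(parent))
worker_copy.accumulate(["this", "is", "this", "this"])

# Work done on the copy does not change the parent's state.
count_before_merge = parent.get_occurrences("this")

# Mimic the worker sending its result back; the parent has to merge it explicitly.
returned = pickle.loads(pickle.dumps(worker_copy))
parent.merge(returned)
count_after_merge = parent.get_occurrences("this")
```

If any state is updated only on the worker's copy and never merged back, the parent keeps reporting zero, which is the shape of the `3 != 0` and `2 != 0` assertions above.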

Three failures connected with encoding, introduced by #1402. They occur on Python 3 only:

======================================================================
ERROR: `Dictionary` can be loaded from textfile.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python35-x64\lib\site-packages\gensim\test\test_corpora_dictionary.py", line 234, in test_loadFromText
    d = Dictionary.load_from_text(tmpf)
  File "C:\Python35-x64\lib\site-packages\gensim\corpora\dictionary.py", line 358, in load_from_text
    line = utils.to_unicode(line)
  File "C:\Python35-x64\lib\site-packages\gensim\utils.py", line 235, in any2unicode
    return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 5: invalid continuation byte
======================================================================
ERROR: `Dictionary` can be loaded from textfile in legacy format.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python35-x64\lib\site-packages\gensim\test\test_corpora_dictionary.py", line 220, in test_loadFromText_legacy
    d = Dictionary.load_from_text(tmpf)
  File "C:\Python35-x64\lib\site-packages\gensim\corpora\dictionary.py", line 358, in load_from_text
    line = utils.to_unicode(line)
  File "C:\Python35-x64\lib\site-packages\gensim\utils.py", line 235, in any2unicode
    return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 5: invalid continuation byte
======================================================================
FAIL: `Dictionary` can be saved as textfile.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python35-x64\lib\site-packages\gensim\test\test_corpora_dictionary.py", line 197, in test_saveAsText
    self.assertEqual(serialized_lines[1][1:], "\tdruh�\t2\n")
AssertionError: '\tdruhé\t2\n' != '\tdruh�\t2\n'
- 	druhé	2
? 	    ^^
+ 	druh�	2
? 	    ^
-------------------- >> begin captured logging << --------------------
gensim.corpora.dictionary: INFO: adding document #0 to Dictionary(0 unique tokens: [])
gensim.corpora.dictionary: INFO: built Dictionary(3 unique tokens: ['druh�', 'prv�', 'slovo']) from 3 documents (total 6 corpus positions)
gensim.corpora.dictionary: INFO: saving dictionary mapping to C:\Users\appveyor\AppData\Local\Temp\1\save_dict_test.txt
--------------------- >> end captured logging << ---------------------
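
The byte `0xe9` in the tracebacks above is how `é` is stored in single-byte Windows code pages such as cp1252. A minimal sketch of how that byte produces the same `UnicodeDecodeError`, assuming (this is not confirmed by the report) that the test file was written with the platform default encoding and then decoded as UTF-8. The sample line and the file name `dict_sketch.txt` are made up for illustration.

```python
import os
import tempfile

# Illustrative dictionary line in "id<TAB>word<TAB>count" form (made up for this sketch).
line = "0\tprvé\t1\n"

# cp1252 stores 'é' as the single byte 0xE9.
raw = line.encode("cp1252")

# Decoding those bytes as UTF-8 fails: 0xE9 opens a multi-byte sequence,
# but the next byte (a tab) is not a valid continuation byte.
try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as err:
    decode_error = err

# Naming the encoding on both write and read avoids relying on the platform default.
# newline="" keeps '\n' from being translated to '\r\n' on Windows.
path = os.path.join(tempfile.mkdtemp(), "dict_sketch.txt")
with open(path, "w", encoding="utf-8", newline="") as fout:
    fout.write(line)
with open(path, "rb") as fin:
    round_tripped = fin.read().decode("utf-8")
```

Here `decode_error.start` points at the offending byte, and the explicit UTF-8 round trip returns the original line unchanged regardless of the operating system's default encoding.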
@menshikh-iv menshikh-iv added bug Issue described a bug testing Issue related with testing (code, documentation, etc) labels Jun 22, 2017
@menshikh-iv
Contributor Author

Ping @macks22 @vlejd

macks22 pushed a commit to macks22/gensim that referenced this issue Jun 25, 2017
macks22 pushed a commit to macks22/gensim that referenced this issue Jun 26, 2017
…ction by passing explicit `offset` parameter.
@menshikh-iv menshikh-iv reopened this Jul 6, 2017
saparina pushed a commit to saparina/gensim that referenced this issue Jul 9, 2017
 (piskvorky#1449)

* piskvorky#1441: Fix issues with `WordOccurenceAccumulator` on Windows.

* piskvorky#1441: Use pre-scipy0.17 version of `scipy.sparse.diags` function by passing explicit `offset` parameter.