Fix docstrings for gensim.test.utils (#1904)
* docstrings for test/utils.py and its rsts

* typos

* fix utils

* fix type reference
yurkai authored and menshikh-iv committed Feb 15, 2018
1 parent 2bd2092 commit fb42d80
Showing 3 changed files with 151 additions and 10 deletions.
1 change: 1 addition & 0 deletions docs/src/apiref.rst
@@ -73,6 +73,7 @@ Modules:
sklearn_api/text2bow
sklearn_api/tfidf
sklearn_api/w2vmodel
test/utils
topic_coherence/aggregation
topic_coherence/direct_confirmation_measure
topic_coherence/indirect_confirmation_measure
9 changes: 9 additions & 0 deletions docs/src/test/utils.rst
@@ -0,0 +1,9 @@
:mod:`test.utils` -- Common utils
===========================================================

.. automodule:: gensim.test.utils
:synopsis: Common utils
:members:
:inherited-members:
:undoc-members:
:show-inheritance:
151 changes: 141 additions & 10 deletions gensim/test/utils.py
@@ -1,10 +1,64 @@
#!/usr/bin/env python
# encoding: utf-8

"""Module contains common utilities used in automated code tests for Gensim modules.
Attributes:
-----------
module_path : str
Full path to this module directory.
common_texts : list of list of str
Toy dataset.
common_dictionary : :class:`~gensim.corpora.dictionary.Dictionary`
Dictionary of toy dataset.
common_corpus : list of list of (int, int)
Corpus of toy dataset.
Examples:
---------
It's easy to keep objects in a temporary folder and reuse them if needed:
>>> from gensim.models import word2vec
>>> from gensim.test.utils import get_tmpfile, common_texts
>>>
>>> model = word2vec.Word2Vec(common_texts, min_count=1)
>>> temp_path = get_tmpfile('toy_w2v')
>>> model.save(temp_path)
>>>
>>> new_model = word2vec.Word2Vec.load(temp_path)
>>> result = new_model.wv.most_similar("human", topn=1)
Let's print the first document in the toy dataset and then recreate it using its corpus and dictionary.
>>> from gensim.test.utils import common_texts, common_dictionary, common_corpus
>>> print(common_texts[0])
['human', 'interface', 'computer']
>>> assert common_dictionary.doc2bow(common_texts[0]) == common_corpus[0]
We can find our toy set in the test data directory.
>>> from gensim.test.utils import datapath
>>>
>>> with open(datapath("testcorpus.txt")) as f:
...     texts = [line.strip().split() for line in f]
>>> print(texts[0])
['computer', 'human', 'interface']
If you don't need to keep temporary objects on disk, use :func:`~gensim.test.utils.temporary_file`:
>>> from gensim.test.utils import temporary_file, common_corpus, common_dictionary
>>> from gensim.models import LdaModel
>>>
>>> with temporary_file("temp.txt") as tf:
...     lda = LdaModel(common_corpus, id2word=common_dictionary, num_topics=3)
...     lda.save(tf)
"""
Common utils for tests
"""

import contextlib
import tempfile
import os
@@ -16,27 +70,104 @@


def datapath(fname):
"""Return full path to the pre created file with test data (basically corpus)."""
"""Get the full path to file `fname` in the test data directory located in this module's directory.
Usually used to refer to a corpus file in the test_data directory.
Parameters
----------
fname : str
Name of file.
Returns
-------
str
Full path to `fname` in test_data folder.
Example
-------
Let's get the path of a test corpus file and iterate over its documents.
>>> from gensim.corpora import MmCorpus
>>> from gensim.test.utils import datapath
>>>
>>> corpus = MmCorpus(datapath("testcorpus.mm"))
>>> for document in corpus:
...     pass
"""
return os.path.join(module_path, 'test_data', fname)


def get_tmpfile(suffix):
"""
Return full path to temporary file with required suffix.
"""Get the full path to file `suffix` in the temporary folder.
This function doesn't create the file (it only generates a path).
Also, it may return different paths on consecutive calls.
Parameters
----------
suffix : str
Suffix of file.
Returns
-------
str
Path to `suffix` file in temporary folder.
Examples
--------
Using this function, we can get a path to a temporary file and use it, for example, to store a temporary model.
>>> from gensim.models import LsiModel
>>> from gensim.test.utils import get_tmpfile, common_dictionary, common_corpus
>>>
>>> tmp_f = get_tmpfile("toy_lsi_model")
>>>
>>> model = LsiModel(common_corpus, id2word=common_dictionary)
>>> model.save(tmp_f)
>>>
>>> loaded_model = LsiModel.load(tmp_f)
The function doesn't create the file. Calling it twice with the same suffix can return different paths.
"""
return os.path.join(tempfile.gettempdir(), suffix)


@contextlib.contextmanager
def temporary_file(name=""):
"""create a temporary directory and return a path to "name" in that directory
At the end of the context, the directory is removed.
"""This context manager creates a temporary directory and yields the full path to file `name` in it.
The temporary directory and any files in it will be deleted at the end of the context. Note that the file itself is not created.
Parameters
----------
name : str
Filename.
Yields
------
str
Path to file `name` in temporary directory.
Examples
--------
This example demonstrates that the created temporary directory (and the files
in it) will be deleted at the end of the context.
>>> import os
>>> from gensim.test.utils import temporary_file
>>> with temporary_file("temp.txt") as tf, open(tf, 'w') as outfile:
...     outfile.write("my extremely useful information")
...     print("Does this file exist? {}".format(os.path.exists(tf)))
...     print("Does this folder exist? {}".format(os.path.exists(os.path.dirname(tf))))
Does this file exist? True
Does this folder exist? True
>>>
>>> print("Does this file exist? {}".format(os.path.exists(tf)))
Does this file exist? False
>>> print("Does this folder exist? {}".format(os.path.exists(os.path.dirname(tf))))
Does this folder exist? False
The function doesn't create the file.
"""

# note : when dropping python2.7 support, we can use tempfile.TemporaryDirectory
tmp = tempfile.mkdtemp()
try:
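The diff is cut off at `try:`, so the rest of `temporary_file` is not visible here. As a rough sketch only, and not the committed implementation, a context manager with the documented behaviour (create a temporary directory, yield a path inside it, remove the directory on exit) could look like the following. The name `temporary_file_sketch` is hypothetical, and the use of `shutil.rmtree` for cleanup is an assumption.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temporary_file_sketch(name=""):
    """Yield a path to `name` inside a fresh temporary directory (hypothetical helper)."""
    tmp = tempfile.mkdtemp()  # the directory is created; the file itself is not
    try:
        yield os.path.join(tmp, name)
    finally:
        # Assumed cleanup: remove the directory and everything written into it.
        shutil.rmtree(tmp, ignore_errors=True)


with temporary_file_sketch("temp.txt") as path:
    with open(path, "w") as outfile:
        outfile.write("my extremely useful information")
    existed_inside = os.path.exists(path)

exists_after = os.path.exists(path)
```

Once Python 2.7 support is dropped, as the code comment in the diff suggests, `tempfile.TemporaryDirectory` would cover the same create-and-clean-up pattern without an explicit `try`/`finally`.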

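The `get_tmpfile` docstring says consecutive calls may return different paths, while the only return line visible in the diff joins `tempfile.gettempdir()` with `suffix`. A small check of that expression, copied into a hypothetical stand-alone function `get_tmpfile_sketch`, suggests that within one process the same suffix maps to the same path. This rests only on the line shown above and may not match other versions of the function.

```python
import os
import tempfile


def get_tmpfile_sketch(suffix):
    """Stand-alone copy of the return expression shown in the diff (hypothetical name)."""
    return os.path.join(tempfile.gettempdir(), suffix)


first = get_tmpfile_sketch("toy_lsi_model")
second = get_tmpfile_sketch("toy_lsi_model")

# gettempdir() caches its result for the process, so both calls agree.
same_path = first == second
# The returned path sits directly inside the system temporary directory.
inside_tmp = os.path.dirname(first) == tempfile.gettempdir()
```

If unique paths per call are wanted, `tempfile.mkdtemp()` is one standard-library way to get a fresh directory each time.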