Test docs code snippets with pytest doctests #362

kataev · 2017-05-11T09:20:56Z

I propose integrating doctest with the documentation's code snippets using py.test's doctest support.
This should help keep the code snippets up to date.

This has one aesthetic issue: snippets will carry the ... and >>> prefixes.

I rewrote an arbitrary file in the documentation to demonstrate the idea.

How to run doctests:

py.test --doctest-modules --doctest-glob='*.rst' --ignore=doc/build/changelog/ doc/build/core/custom_types.rst
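
For readers unfamiliar with the format, here is a minimal sketch of what testing an .rst snippet involves, using only the standard-library doctest module rather than the py.test integration. The .rst text, the `double` function, and the `run_rst_doctests` helper are all hypothetical and exist only for illustration.

```python
import doctest

# Hypothetical .rst content: prose mixed with an interactive example.
RST_TEXT = """\
Custom types
============

A plain function can be exercised straight from the documentation:

>>> def double(value):
...     return value * 2
>>> double(21)
42
"""


def run_rst_doctests(text, name="example.rst"):
    """Run every ``>>>`` example found in ``text``.

    Returns a (failed, attempted) tuple of example counts.
    """
    parser = doctest.DocTestParser()
    # An empty globals dict: the examples must define or import what they use.
    test = parser.get_doctest(text, {}, name, name, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


if __name__ == "__main__":
    print(run_rst_doctests(RST_TEXT))
```

The function definition and the call each count as one example, so both the `>>>` prompt and the `...` continuation prefix have to appear in the documentation source.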


zzzeek · 2017-05-11T14:39:29Z

OK, you may be aware that we do use doctests for our two tutorials at http://docs.sqlalchemy.org/en/latest/orm/tutorial.html and http://docs.sqlalchemy.org/en/latest/core/tutorial.html. The doctests for these are also integrated within our test suite. These are not using the py.test integration points, however, and I'm not familiar with what additional behaviors the py.test points bring. My comments below are based on what I know about doctest.

I am uncomfortable using doctest for 100% of .rst files for the following reasons:

The format of the code would be inconsistent with the code examples that are present within non-rst docstrings. Lots of the documentation comes from module docstrings, so the scale of the change here would need to accommodate all code examples in modules as well. Not sure if that was part of the plan.
The doctest format necessitates that code examples are verbose. The >>> / ... symbols, the full sets of imports, and the initialization of all variables can get in the way of what should be a simple demonstration. Example: the docstring for "column.in_()" wants to show a simple example of how to do "in":

some_column.in_([1, 2, 3])

Above, this is not acceptable for doctest. It must instead be:

>>> from sqlalchemy import Column, Integer
>>> some_column = Column("some_column", Integer)
>>> some_column.in_([1, 2, 3])

The reader is now confused. Does "in_()" only work with Integer? Is this Column part of a Table? That is, doctest requires that our example contain a lot more than just the thing being demonstrated, leading to confusion as to which part of the code is the thing being illustrated and which parts are there just to satisfy doctest.

From my experience with doctest (perhaps this can be changed somehow), doctest segments in an .rst file are codependent on each other. If I've imported or defined certain objects or variables in a previous example, they are already present in subsequent examples, leading to more conflicts and confusion. It means that doctest does not actually force us to have the correct imports in each example section; they only have to be in one of the example sections and can then be missing from subsequent ones. The tutorials do not suffer from this issue because they are written as continuous narratives over a single large example, but reference documentation is not organized in this way.
Doctest has strict requirements for output. I notice some of the print statements in the proposed changes are testing the resulting SQL string against one that is nicely formatted, rather than the actual format that would be produced. Doctest will reject this unless you enable NORMALIZE_WHITESPACE. Overall, getting all outputs to match perfectly is a very time-consuming job (having written the tutorials), and ongoing maintenance is also very time-consuming. Doctest everywhere will add significantly not only to the immediate workload of migrating hundreds (thousands?) of examples, but also to the workload forever going forward: all new examples would need to meet stringent output matching requirements, and minor changes in output format would need to be tended to throughout the entire documentation / module docstring base.
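
The last two points can be reproduced with the standard-library doctest module alone. The sketch below is illustrative only: the document text, the SQL-like string, and the `run` helper are hypothetical, and the py.test integration may behave differently in details. It shows that a later example can use a name imported only in an earlier one, and that output reflowed for readability fails unless the whitespace-relaxing flag NORMALIZE_WHITESPACE is enabled.

```python
import doctest

# Hypothetical documentation text with three separate example sections.
DOC_TEXT = """\
First section:

>>> from collections import OrderedDict
>>> settings = OrderedDict(a=1)

A later, unrelated section. It never imports OrderedDict itself:

>>> isinstance(settings, OrderedDict)
True

Expected output reflowed across several lines for readability:

>>> print("SELECT a, b FROM t WHERE a = 1")
SELECT a, b
FROM t
WHERE a = 1
"""


def run(text, optionflags=0):
    """Run all examples in ``text`` and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, {}, "doc.rst", "doc.rst", 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Send failure reports to a no-op writer so only the counts are returned.
    results = runner.run(test, out=lambda message: None)
    return results.failed, results.attempted


if __name__ == "__main__":
    # Strict matching: only the reflowed output fails. The second section
    # passes even though it is missing its own import.
    print(run(DOC_TEXT))
    # With NORMALIZE_WHITESPACE, runs of whitespace compare as equal.
    print(run(DOC_TEXT, doctest.NORMALIZE_WHITESPACE))
```

All examples in one file share a single namespace, which is why the second section passes without its import.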

Overall, the main reason I don't think "doctest everywhere" is viable for SQLAlchemy comes down to level of effort and ongoing maintenance burden in conjunction with the specific nature of doctest (i.e., the idea is good but in practice it needs to be more flexible). I do 95% of the work on SQLAlchemy myself and I simply do not have the time to maintain thousands of doctest code examples and to also meet these requirements when writing new code. They don't make the documentation clearer, they lead to confusion as to what's being illustrated and what's just boilerplate, and new documentation becomes much more time-consuming to write; documentation is already by far the most time-consuming thing I have to deal with.

That said, if there were some alternate form of "doctest" that could simply test a code example for Python syntax, pep8 compliance (which would be AWESOME), and symbol consistency, that would be helpful. The tool could be configured with common imports and symbols significant to SQLAlchemy examples and serve as a basic sanity check for code examples. As it is, when writing new documentation I have to organize and run the code in a separate .py file to make sure it does the right thing. So this is a real problem; it's just that my experience with doctest in writing the tutorials has shown me what it's good at and where it's likely to get in the way.
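
A rough sketch of what such a checker might look like, using only the standard-library ast module. Everything here is an assumption made for illustration: the `check_snippet` function, the `KNOWN_NAMES` set, and the message formats are hypothetical, the name check ignores the order of statements, and pep8 checking is not covered at all.

```python
import ast
import builtins

# Hypothetical set of names a documentation snippet may use without defining.
KNOWN_NAMES = {"Column", "Integer", "Table", "select", "some_column"}


def check_snippet(source, known_names=KNOWN_NAMES):
    """Return a list of problems found in a snippet, without executing it."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    defined = set(known_names) | set(dir(builtins))
    loaded = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)  # assignment target
            else:
                loaded.append((node.lineno, node.id))
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as x" binds "x".
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)  # function parameter

    # Names are collected first and checked afterwards, so a name defined
    # anywhere in the snippet counts as known regardless of position.
    return [
        f"line {lineno}: unknown name {name!r}"
        for lineno, name in sorted(loaded)
        if name not in defined
    ]


if __name__ == "__main__":
    print(check_snippet("some_column.in_([1, 2, 3])"))
    print(check_snippet("other_column.in_([1, 2, 3])"))
```

With `some_column` preconfigured as a known symbol, the short `in_()` example from earlier in the thread passes as written, without imports or prompt prefixes.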

kataev · 2017-05-14T18:40:43Z

Thank you for the detailed description of your position; I fully agree with your reasoning.

I will try to research something for the syntax and pep8 checks you proposed.

Commit 5ad9e08: Rewrited code snippets in documentation for compatibility with pytest doctests

kataev closed this May 14, 2017

kataev mentioned this pull request Aug 28, 2018

rst and docstrings code snippets syntax #469

Closed
