DOC: update validate_docstrings.py Validation script checks for tabs #20112

Merged
merged 1 commit into from
Mar 14, 2018

Conversation

mariocj89
Contributor

Add a check to validate that the docstrings don't contain tabs.
The documentation uses spaces only; adding the check will prevent
tabs from being added during the sprint or in future submissions.

Sample output (unchanged parts elided):

...

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	...
	Tabs found in the docstring, please use whitespace only
	...

@@ -455,6 +455,10 @@ def validate_one(func_name):
            if not rel_desc:
                errs.append('Missing description for '
                            'See Also "{}" reference'.format(rel_name))

    if "\t" in doc.raw_doc:
Contributor

Can you make this check for tabs only at the start of lines? We may have cases where we include the literal \t in the docstring (e.g. read_csv).
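
To illustrate the concern, here is a minimal sketch. The function `example` and its docstring are hypothetical and not taken from pandas. In a non-raw docstring, writing '\t' puts a real tab character in the middle of a line, so a whole-docstring check flags it while a start-of-line check does not.

```python
import re


def example(sep="\t"):
    """Hypothetical function, for illustration only.

    Parameters
    ----------
    sep : str, default '\t'
        Delimiter to use.
    """


doc = example.__doc__

# Checking the whole docstring flags the tab inside the parameter description.
has_any_tab = "\t" in doc

# Checking only the leading indentation of each line ignores that tab.
has_leading_tab = any(re.match(r" *\t", line) for line in doc.splitlines())

print(has_any_tab, has_leading_tab)
```

With the whole-docstring check, a docstring like this one would be reported even though its indentation uses spaces only.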

@pep8speaks

pep8speaks commented Mar 10, 2018

Hello @mariocj89! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 10, 2018 at 12:40 Hours UTC

@@ -455,6 +455,12 @@ def validate_one(func_name):
            if not rel_desc:
                errs.append('Missing description for '
                            'See Also "{}" reference'.format(rel_name))

    for line in doc.raw_doc.splitlines():
        if re.match(" *\t", line):
Contributor

I think the regex would be
if re.match(r"^\t", line)

Contributor Author

Indeed, I was missing the start of the line. Sorry about that. I was also checking that there are no spaces followed by tabs. Is that something we should check, or just a plain "starts with a tab"?

Contributor

Ahh, I see. So maybe ^ *\t? That's lines starting with 0 or more spaces followed by a tab.
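
A small sketch of how the patterns discussed here behave on a few made-up lines (the sample lines are assumptions, not taken from any pandas docstring). Since `re.match` only matches at the beginning of the string, the `^` should make no difference when the check runs line by line.

```python
import re

# Made-up sample lines covering the cases discussed in this thread.
lines = [
    "\tindented with a tab",
    "  \tspaces then a tab",
    "text with a\ttab inside",
    "    spaces only",
]

# re.match anchors at the start of the string, so these two lists agree.
with_anchor = [bool(re.match(r"^ *\t", line)) for line in lines]
without_anchor = [bool(re.match(r" *\t", line)) for line in lines]

# A plain "starts with a tab" check misses tabs that follow leading spaces.
tab_only = [bool(re.match(r"^\t", line)) for line in lines]

print(with_anchor)
print(without_anchor)
print(tab_only)
```

The difference between `^\t` and `^ *\t` only shows up on the second line, where spaces come before the tab.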

Contributor Author

Do you want me to add a test file for this script? I can create a tests folder with a test for this script that would 1) prevent future mistakes and 2) have made a PR like this one easier to review. What do you think?

Contributor Author

Actually, there is already another PR adding the test file :)

Add a check to validate the docstrings don't have tabs at the beginning of
the line.
@TomAugspurger
Contributor

@datapythonista @jorisvandenbossche if you could glance over this. It looks good to me.

@@ -455,6 +455,12 @@ def validate_one(func_name):
            if not rel_desc:
                errs.append('Missing description for '
                            'See Also "{}" reference'.format(rel_name))

    for line in doc.raw_doc.splitlines():
        if re.match("^ *\t", line):
Contributor

'\t' in line is good enough, yes? We don't allow any tabs.

Contributor Author

If you see the comments above, @TomAugspurger requested "starts with a tab", as there might be tabs used to format a table or the like, I think.

Contributor

OK, but I'm not sure this is actually strict enough. As I said, we don't allow any tabs.

Contributor Author

I'm happy to change it back to allow no tabs at all. Let me know what you think @TomAugspurger

Member

@jreback There are cases where a \t is included in the text as a literal character (for the explanation).
So I think checking only the start of a line is good.

Member

Unrelated to where \t is included, we could use re.search(r'^ *\t', self.raw_doc, flags=re.MULTILINE) and avoid the loop

Member

Oops, sorry, I see now that you want to report the lines with the tabs...
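
For comparison, a rough sketch of the two approaches on a made-up string. `raw_doc` here is a hypothetical stand-in for the docstring text, and the way offending lines are collected is an assumption for illustration, not the code merged in this PR.

```python
import re

# Hypothetical docstring text: one tab-indented line, one mid-line tab,
# and one line with spaces followed by a tab.
raw_doc = (
    "Summary line.\n"
    "\n"
    "\tTab-indented line.\n"
    "    sep : str, default '\t'\n"
    "  \tMixed indentation.\n"
)

# Single search with re.MULTILINE: tells only whether some line starts
# with optional spaces followed by a tab.
found = re.search(r"^ *\t", raw_doc, flags=re.MULTILINE) is not None

# Per-line loop: also yields which lines are affected, for the error message.
offending = [
    (lineno, line.strip())
    for lineno, line in enumerate(raw_doc.splitlines(), start=1)
    if re.match(r" *\t", line)
]

print(found)
print(offending)
```

The single search is shorter, but the loop keeps enough information to tell the contributor which lines to fix.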

@jreback jreback added the Docs label Mar 10, 2018
@jorisvandenbossche jorisvandenbossche merged commit c857b4f into pandas-dev:master Mar 14, 2018
@jorisvandenbossche
Member

@mariocj89 thanks!

@jorisvandenbossche jorisvandenbossche added this to the 0.23.0 milestone Mar 14, 2018
@mariocj89
Contributor Author

Thank you!


6 participants