Suggestion: documentation examples should use meaningful data where possible #16709

Closed
colinmorris opened this issue Jun 15, 2017 · 3 comments
@colinmorris

Something like this is a pretty good example. I know something about animals, cats, dogs, and hair, so I can sort of keep the structure of the data in my head and follow along with the transformations without having to scroll up and down to check against the original dataframe.

But a lot of examples just use meaningless column names like A, B, C, D... or foo, bar, baz..., which makes it a lot harder to gain an intuition about what's going on. For example, if you don't know what groupby does, this example:

In [13]: df2 = pd.DataFrame({'X' : ['B', 'B', 'A', 'A'], 'Y' : [1, 2, 3, 4]})

In [14]: df2.groupby(['X']).sum()
Out[14]: 
   Y
X   
A  7
B  3

might be less useful than this version:

In [13]: pets = pd.DataFrame({'animal' : ['dog', 'dog', 'cat', 'cat'], 'weight' : [10, 20, 8, 9]})

In [14]: pets.groupby(['animal']).mean()
Out[14]: 
        weight
animal        
cat        8.5
dog       15.0
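
For anyone who wants to try the pets version outside an IPython session, here is a minimal standalone sketch. It assumes the grouping key is the `animal` column (as the index in the output above implies) and that pandas is installed; the variable name `mean_weight` is only illustrative.

```python
import pandas as pd

# Illustrative data: two dogs and two cats with their weights.
pets = pd.DataFrame({
    "animal": ["dog", "dog", "cat", "cat"],
    "weight": [10, 20, 8, 9],
})

# Group the rows by animal, then average the weights within each group.
# groupby sorts the group keys by default, so "cat" comes before "dog".
mean_weight = pets.groupby("animal")["weight"].mean()

print(mean_weight)
```

Because the column names describe the data, the result can be sanity-checked by eye: the two dogs weigh 10 and 20, so their mean is 15, and the two cats weigh 8 and 9, so their mean is 8.5.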

I realize re-doing all the examples like this would be a significant amount of work, but if there's agreement that this is a desirable thing, I'd be happy to kick things off with a small P.R.

Also, I think it would be good to add this as a guideline to the documentation section of the contributing doc. (Again, if people agree this is worthwhile and not misguided.)

@jreback
Contributor

jreback commented Jun 15, 2017

I am -0 on this. I don't think this adds much value, it would be a lot of work to change everything until it's consistent, and the current examples are a bit shorter.

@jreback jreback added the Docs label Jun 15, 2017
@TomAugspurger
Contributor

#16520 is starting on something like this.

@jorisvandenbossche
Member

jorisvandenbossche commented Mar 5, 2018

Closing this in favor of #19710.
@colinmorris I think this is a good idea for certain cases like the groupby example above (not for all docstrings), feel free to comment on #19710.
