DOC: greater consistency and spell-check for intro docs #18948
Codecov Report
@@            Coverage Diff             @@
##           master   #18948      +/-   ##
==========================================
- Coverage   91.59%   91.57%   -0.03%
==========================================
  Files         150      150
  Lines       48959    48964       +5
==========================================
- Hits        44845    44838       -7
- Misses       4114     4126      +12
Looks good, thanks! Some comments: mainly, I would love to update the docs to use references where we have not already, e.g. :meth:`DataFrame.apply` rather than ``apply``.
doc/source/basics.rst
@@ -764,7 +764,7 @@ For example, we can fit a regression using statsmodels. Their API expects a form
 The pipe method is inspired by unix pipes and more recently dplyr_ and magrittr_, which
 have introduced the popular ``(%>%)`` (read pipe) operator for R_.
 The implementation of ``pipe`` here is quite clean and feels right at home in python.
-We encourage you to view the source code (``pd.DataFrame.pipe??`` in IPython).
+We encourage you to view the source code of ``pd.DataFrame.pipe``.
can you make this a func reference
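For readers unfamiliar with the method discussed in this hunk, here is a minimal sketch of how ``DataFrame.pipe`` chains functions left to right. The helpers ``add_constant`` and ``scale`` and the sample data are hypothetical, made up for illustration only; they are not part of pandas or of this PR.

```python
import pandas as pd


def add_constant(df, value):
    """Hypothetical helper: add `value` to every cell."""
    return df + value


def scale(df, factor):
    """Hypothetical helper: multiply every cell by `factor`."""
    return df * factor


# Illustrative data, not taken from the docs under review.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Nested calls read inside-out ...
nested = scale(add_constant(df, 1), factor=10)

# ... while pipe reads left to right, like a unix pipeline.
# Extra positional and keyword arguments are forwarded to the function.
piped = df.pipe(add_constant, 1).pipe(scale, factor=10)
```

Both spellings should produce the same frame; ``pipe`` only changes how the chain reads.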
doc/source/basics.rst
@@ -786,7 +786,7 @@ statistics methods, take an optional ``axis`` argument:
 df.apply(np.cumsum)
 df.apply(np.exp)

-``.apply()`` will also dispatch on a string method name.
+The ``.apply()`` method will also dispatch on a string method name.
can you make this a func reference (use DataFrame.apply is fine)
like on line 866
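As a rough illustration of the sentence being edited in this hunk, the sketch below shows ``DataFrame.apply`` dispatching on a string method name. The sample frame is an assumption made up for this example; only ``apply`` itself comes from the docs under review.

```python
import pandas as pd

# Illustrative data, not taken from the docs under review.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Passing a string dispatches to the method of that name.
by_name = df.apply("mean")

# Spelling the same reduction out as a callable applied to each column.
by_func = df.apply(lambda col: col.mean())
```

Under these assumptions both calls return a Series of column means indexed by column label.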
doc/source/basics.rst
@@ -1008,7 +1009,7 @@ function name or a user defined function.
 tsdf.transform('abs')
 tsdf.transform(lambda x: x.abs())

-Here ``.transform()`` received a single function; this is equivalent to a ufunc application
+Here ``.transform()`` received a single function; this is equivalent to a ufunc application.
can you add the reference here
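A small sketch of the equivalence the edited sentence claims, namely that ``transform`` with a single function matches a ufunc application. The contents of ``tsdf`` here are made up for illustration; the docs build their own ``tsdf``.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the `tsdf` frame used in the docs.
tsdf = pd.DataFrame({"A": [-1.0, 2.0, -3.0], "B": [0.5, -0.5, 1.5]})

via_name = tsdf.transform("abs")                  # string method name
via_lambda = tsdf.transform(lambda x: x.abs())    # user-defined function
via_ufunc = np.abs(tsdf)                          # direct ufunc application
```

All three results should have the same shape and index as ``tsdf``, with every value replaced by its absolute value.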
doc/source/basics.rst
@@ -1515,7 +1516,7 @@ To iterate over the rows of a DataFrame, you can use the following methods:
 over the values. See the docs on :ref:`function application <basics.apply>`.

 * If you need to do iterative manipulations on the values but performance is
-important, consider writing the inner loop using e.g. cython or numba.
+important, consider writing the inner loop using for instance cython or numba.
a comma before for instance
doc/source/basics.rst
@@ -1594,7 +1595,7 @@ index value along with a Series containing the data in each row:

 To preserve dtypes while iterating over the rows, it is better
 to use :meth:`~DataFrame.itertuples` which returns namedtuples of the values
-and which is generally much faster as ``iterrows``.
+and which is generally much faster than ``iterrows``.
can you add the reference
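To make the dtype point in this hunk concrete, here is a minimal sketch comparing ``iterrows`` and ``itertuples`` on a frame with one integer and one float column. The column names and values are assumptions chosen for illustration.

```python
import pandas as pd

# Illustrative data: one integer column and one float column.
df = pd.DataFrame({"int_col": [1, 2], "float_col": [1.5, 2.5]})

# iterrows yields each row as a Series, so the mixed columns are
# upcast to a common dtype and the integer comes back as a float.
_, first_row = next(df.iterrows())
row_value = first_row["int_col"]

# itertuples yields namedtuples, so each field keeps its own type.
first_tuple = next(df.itertuples())
tuple_value = first_tuple.int_col
```

This sketch covers only the dtype behaviour; the speed claim in the docs sentence would need a separate timing comparison.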

 * A set of labeled array data structures, the primary of which are
-Series and DataFrame
+Series and DataFrame.
This same text exists in pandas/__init__.py; can you update it there as well?
doc/source/10min.rst
@@ -425,7 +426,7 @@ String Methods
 Series is equipped with a set of string processing methods in the `str`
 attribute that make it easy to operate on each element of the array, as in the
 code snippet below. Note that pattern-matching in `str` generally uses `regular
-expressions <https://docs.python.org/2/library/re.html>`__ by default (and in
+expressions <https://docs.python.org/3.5/library/re.html>`__ by default (and in
Can you use /3/ instead of /3.5/? That way it should always link to the most recent version of Python 3.
@jreback and @jschendel: I have made the requested changes.
Thanks @tommyod. The built docs are at http://pandas-docs.github.io/pandas-docs-travis/ (they will likely take a few hours to update). Please make sure the changes render OK. Thanks!
@jreback - Great, will do. I hope to find the time to read more of the documentation and make similar changes if I see room for improvement, e.g. punctuation, function references, small changes to sentences, and so forth. If so, what is a sensible size for a PR? Should I create a PR for every file, for every 3-4 files in the documentation, or wait and submit a PR when I have read and made changes to most of the docs? Any preference?
@tommyod: I don't think there's really a preference in terms of the size of a PR. Whatever works best for you should be fine. My advice is to lean towards submitting PRs more frequently as opposed to waiting to submit a huge number of changes at once. Waiting could lead to merge conflicts if other people are updating the same sections of the docs, which could mean extra work resolving them. Thanks for improving the docs!
I read through introductory docs, and made the following changes:
}
, which I removed.