Implement check for groupby slicing and aggregation patterns. #54

Open
wants to merge 5 commits into
base: main

simchuck
Collaborator

Pending a decision on which uses of this pattern are acceptable, this PR checks for explicit slicing syntax applied to the result of the groupby() method, in the following two forms:

df.groupby(A)[B]          # for ast.Subscript nodes
df.groupby(A)[B].agg(C)   # for ast.Call nodes

Note that this requires two separate check functions, because the two patterns appear as different node types in the AST. The PR also includes a check to distinguish a method call (node.func.attr) from a function call (node.func.id).
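To make the AST distinction concrete, here is a minimal sketch using only the standard-library `ast` module. The helper name `top_node` and the bare-function snippet `groupby(df, A)` are illustrative assumptions, not code taken from this PR.

```python
import ast


def top_node(source):
    """Return the outermost expression node of a one-line snippet (hypothetical helper)."""
    return ast.parse(source, mode="eval").body


# Pattern 1: the outermost node is the subscript itself.
sliced = top_node("df.groupby(A)[B]")
# Pattern 2: the outermost node is the call to .agg(); the subscript is nested inside it.
chained = top_node("df.groupby(A)[B].agg(C)")

print(type(sliced).__name__)
print(type(chained).__name__)

# Method call: the callee is an ast.Attribute, so the name lives in node.func.attr.
method_call = top_node("df.groupby(A)")
# Function call: the callee is an ast.Name, so the name lives in node.func.id.
function_call = top_node("groupby(df, A)")

print(type(method_call.func).__name__, method_call.func.attr)
print(type(function_call.func).__name__, function_call.func.id)
```

Because `ast.Name` has no `attr` and `ast.Attribute` has no `id`, a check has to test the type of `node.func` before reading either field.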

The implementation does not check the function-call syntax, although commented-out tests are included in test_PD014.py in case that syntax is later determined to be in scope for the check.
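As a rough illustration of the behaviour described above, the sketch below flags both patterns for the method form and ignores the function form. It is not the code in this PR: the class `GroupbySliceVisitor`, the helper `find_groupby_slices`, and the message strings are hypothetical, and it assumes any method chained onto the sliced groupby should be reported, not only `.agg`.

```python
import ast


class GroupbySliceVisitor(ast.NodeVisitor):
    """Collect (line, message) pairs for sliced groupby patterns (hypothetical sketch)."""

    def __init__(self):
        self.hits = []

    @staticmethod
    def _is_groupby_method_call(node):
        # True for `<expr>.groupby(...)`; False for a bare `groupby(...)` function call.
        return (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "groupby"
        )

    def visit_Subscript(self, node):
        # Matches df.groupby(A)[B]
        if self._is_groupby_method_call(node.value):
            self.hits.append((node.lineno, "sliced groupby"))
        self.generic_visit(node)

    def visit_Call(self, node):
        # Matches df.groupby(A)[B].<method>(...)
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Subscript)
            and self._is_groupby_method_call(func.value.value)
        ):
            self.hits.append((node.lineno, "method call on sliced groupby"))
        self.generic_visit(node)


def find_groupby_slices(source):
    """Parse source and return all hits found by the visitor."""
    visitor = GroupbySliceVisitor()
    visitor.visit(ast.parse(source))
    return visitor.hits


print(find_groupby_slices("df.groupby(A)[B]\ngroupby(df, A)[B]\n"))
```

In this sketch the chained form is reported twice, once by each handler, because the `ast.Call` node contains the `ast.Subscript` node. A real check would need to decide whether that double report is wanted.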

Closes #24 (pending)

Add check to distinguish method vs. function.

Also, due to structure of AST, need to treat chained aggregation method
separately from sliced method, i.e.,

    df.groupby()[]          # visitor node is ast.Subscript
    df.groupby()[].agg()    # visitor node is ast.Call
@deppen8
Owner

deppen8 commented Aug 4, 2019

@simchuck, any interest in revisiting this? Once the off-by-default framework (#69) is merged, I think this would be a good PD902. It is probably too opinionated for most everyday pandas applications.

Let me know if you want to take it on. If not, I'll give it a shot.

@simchuck
Collaborator Author

simchuck commented Aug 7, 2019 via email

Development

Successfully merging this pull request may close these issues.

Check for .groupby aggregation patterns
2 participants