Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TYP: Narrow down types of arguments (DataFrame) #52752

Merged

Conversation

benedikt-mangold
Copy link
Contributor

closes #8 (noatamir/pyladies-workshop#8)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Preview discussion:
We prefer using Literal["a"] | Literal["b"] instead of str. Look into the documentation what is allowed and the narrow the types down appropriately. Generally, we probably already have type definitions in pandas/_typing.py like NaPosition for the first item in the list below.

You can locate the appropriate file through searching for class DataFrame in the repository.

DataFrame methods:

  • Specifykind and na_position more precisely in sort_values (can look at sort_index, it is already done there)
  • Specify keep for nsmallest and nlargest more precisely
  • Specify join and errors in update more precisely
  • Specify na_action in applymap more precisely
  • Specify validate in merge and join more precisely
  • Specify how in to_timestamp more precisely
  • Specify orient in from_dict more precisely
  • Specify if_exists in to_gbq more precisely
  • Specify byteorder in to_stata more precisely
  • Specify engine in to_parquet more precisely
  • Specify engine in to_orc more precisely
  • Specify parser in to_xml more precisely
  • Specify method in reindex more precisely

#Reviewers
@phofl @noatamir @jorisvandenbossche and @MarcoGorelli

pschleiter and others added 3 commits April 18, 2023 14:33
Specify parser in to_xml of class dataframe

Update doc string to_orc in class dataframeUpdate doc string to_orc in class dataframe

Specify engine in to_parquet in class dataframe
…e argument of join and merge method

Change byteorder argument typing for to_stata method to literal, added definition in pandas/_typing.py

Change if_exists argument typing for to_gbq method to literal, added definition in pandas/_typing.py

Change orient argument typing for from_dict method to literal, added definition in pandas/_typing.py

Change how argument typing for to_timestamp method to literal, added definition in pandas/_typing.py

Change validate argument typing for merge and join methods to literal, added definition in pandas/_typing.py

Change na_action arguments typing for applymap method to literal, added definition in pandas/_typing.py

Change join and errors arguments typing for update method to litaral, added definition in pandas/_typing.py

Change keep argument typing for nlargest and nsallest to litaera, added definition in pandas/_typing.py

Specify the kind and na_position more precisely in sort_values, reusing type definitions in pandas/_typing.py
Comment on lines 425 to 426
# applymap
NaAction = Literal["None", "ignore"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your pr

this isn't quite right, you can't pass "None"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for that, resolved in 7e1c7c9

@noatamir noatamir added the Sprints Sprint Pull Requests label Apr 18, 2023
@benedikt-mangold
Copy link
Contributor Author

Help, @phofl - i keep getting a mypy error

pandas/core/frame.py:1738: error: Incompatible types in assignment (expression has type "str", variable has type "Literal['columns', 'index', 'tight']")  [assignment]
Found 1 error in 1 file (checked 1418 source files)

this is completely new to me, so I do not have any idea how to fix that. Could you help out, please?

@pschleiter
Copy link
Contributor

@benedikt-mangold: Since we changed orient from str to Literal orient must be of type Literal. Maybe we could remove the line 1738 in frame.py orient = orient.lower() at all.
@phofl Would this be ok in terms of backward compatibility?

@phofl
Copy link
Member

phofl commented Apr 18, 2023

This is an annoying case…

You can add „ type: ignore[assignment]“ after the lower in the same line.

no we would have to deprecate first, which is something I’d be open to do. But not in this pr (we try to keep things separated)

@benedikt-mangold
Copy link
Contributor Author

@pschleiter and @phofl -> worked, build is green 🎉

Copy link
Member

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you merge main to resolve conflicts? otherwise lgtm

@benedikt-mangold
Copy link
Contributor Author

@pschleiter and @phofl - halp -> why on earth would it be red now ??

@benedikt-mangold
Copy link
Contributor Author

@pschleiter and @phofl - halp -> why on earth would it be red now ??

merging did not contain all changes in .typing.py

@benedikt-mangold
Copy link
Contributor Author

@phofl ready to merge 🎉

@mroeschke mroeschke merged commit e09a193 into pandas-dev:main Apr 20, 2023
@mroeschke mroeschke added this to the 2.1 milestone Apr 20, 2023
@mroeschke
Copy link
Member

Thanks @benedikt-mangold

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sprints Sprint Pull Requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

np.fix doesn't work
6 participants