Supporting Pandas v1.0 #1

Open
smmaurer opened this issue Jun 18, 2020 · 1 comment

smmaurer commented Jun 18, 2020

These notes are from March – June 2020, originally compiled in Notion.

Background

Pandas v1.0 was released in January 2020. It removes a wide range of functionality that has been deprecated over the years, and will require minor updates across many of our codebases.

What's been removed: https://pandas.pydata.org/docs/whatsnew/v1.0.0.html#whatsnew-100-prior-deprecations

Strategies for restoring compatibility

If code raises errors in Pandas v1.0, try switching to Pandas v0.25. This should restore earlier functionality and also provide deprecation warnings describing what needs to be changed. (Versions prior to v0.25 may not include all the deprecation warnings.)
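
To make sure none of those warnings are missed, it can help to collect them programmatically rather than reading console output. The following is a minimal sketch using only the standard library; `collect_deprecations` and `legacy_step` are hypothetical names, and `legacy_step` stands in for whatever project code you are checking under Pandas v0.25.

```python
import warnings


def collect_deprecations(func, *args, **kwargs):
    """Run func and return (result, deprecation-style warnings it raised)."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" defeats the once-per-location filter so nothing is hidden.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    flagged = [
        w for w in caught
        if issubclass(w.category, (FutureWarning, DeprecationWarning))
    ]
    return result, flagged


def legacy_step():
    # Hypothetical stand-in for project code that calls a deprecated API.
    warnings.warn("as_matrix is deprecated; use .values", FutureWarning)
    return 42


result, flagged = collect_deprecations(legacy_step)
for w in flagged:
    print(f"{w.filename}:{w.lineno}: {w.message}")
```

A blunter alternative is to run a script or test suite with `python -W error::FutureWarning`, which turns each such warning into an exception at the line that triggers it.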

As a temporary fix, you can require pandas < 1.0 in the setup files of a library or project.

How risky are these changes?

Some of the compatibility fixes require judgment about what exactly the code is doing, but as far as I can tell, if the updated code runs, it's very likely to do the same thing as the old code.

So I don't expect any of these changes to affect software logic -- but for codebases without unit tests we should be extra careful and try to do whatever testing is feasible.
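
For a codebase without unit tests, one feasible check is to save a function's output under Pandas v0.25 and compare it with the output after the update. This is only a sketch of that idea: `units_by_zone`, `matches_baseline`, the column names, and the inlined CSV baseline are all hypothetical.

```python
import io

import pandas as pd
from pandas.testing import assert_frame_equal


def units_by_zone(df):
    # Hypothetical function being migrated; stands in for real project code.
    return df.groupby("zone")[["units"]].sum()


# Baseline saved while running under Pandas v0.25 (inlined here as CSV text;
# a real project would more likely read it from a file).
BASELINE_CSV = "zone,units\na,3\nb,3\n"


def matches_baseline(result, baseline_csv):
    """Raise AssertionError if result differs from the saved baseline."""
    expected = pd.read_csv(io.StringIO(baseline_csv), index_col="zone")
    # dtypes are not compared, since a CSV round trip may not preserve them.
    assert_frame_equal(result, expected, check_dtype=False)
    return True


households = pd.DataFrame({"zone": ["a", "b", "a"], "units": [1, 3, 2]})
ok = matches_baseline(units_by_zone(households), BASELINE_CSV)
```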

What to update

Here are things that have come up for us so far:

  1. DataFrame.as_matrix() and Series.as_matrix() are removed

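    These can be directly replaced with DataFrame.values and Series.values. (Pandas v0.24 and later also provide .to_numpy(), which the Pandas documentation recommends over .values.)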

  2. DataFrame.ix[] and Series.ix[] are removed

    In most cases you can use .loc[] in its place, with identical arguments.

    • One exception is if .ix[] was being used for implicit positional indexing, which happens if the index of the DataFrame or Series is not integer-based but you pass integers to .ix[]. In this case, replacing it with .loc[] will raise an error and you should use .iloc[] instead.
    • Another exception is if the list of labels passed to .ix[] includes values that are missing from the index, as described in the next section. Because .loc[] no longer supports this usage, you'll need to replace df.ix[list] with df.reindex(list).
    • Documentation of old .ix[] behavior
    • Discussion of removal
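  3. DataFrame.loc[] and Series.loc[] no longer accept a list of labels that includes values missing from the index

    For example, if id 40 is not in the index, df.loc[[30,50,40]] now raises a KeyError instead of returning a row of NaNs for the missing id. (A list whose labels are all present still works.) Instead, replace this with df.reindex([30,50,40]). The behavior should be identical to the old one, as long as the index has no duplicate values.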

    (But I feel like this makes the code less readable, because it's not intuitive that "reindex" is going to yield a reordered subset of the DataFrame. If you have better ideas for what to replace this with, let me know!)

  4. DataFrame.get_value() and DataFrame.set_value() are removed

    The same applies to Series.get_value() and Series.set_value().

    You can replace df.get_value(row, col) with df.at[row, col], and df.set_value(row, col, val) with df.at[row, col] = val.
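
The four replacements above can be sketched together as follows. The data and variable names are hypothetical, and the old calls appear only in comments because they no longer run in Pandas v1.0. The `index.intersection()` line is one possible alternative to reindex, not something taken from the notes above: it drops missing ids instead of filling them with NaN, and it may not keep the order of the list.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: integer ids as the index.
df = pd.DataFrame(
    {"units": [10, 20, 30, 40], "price": [1.5, 2.5, 3.5, 4.5]},
    index=pd.Index([30, 40, 50, 60], name="id"),
)

# 1. as_matrix() -> .values
matrix = df.values                       # was df.as_matrix()

# 2. .ix[] -> .loc[] for labels, .iloc[] for positions
units_for_id_30 = df.loc[30, "units"]    # was df.ix[30, "units"]
named = pd.DataFrame({"units": [1, 2]}, index=["a", "b"])
first_row = named.iloc[0]                # was named.ix[0] (implicit position)

# 3. Label lists that may include ids missing from the index
ids = [30, 50, 99]                       # 99 is not in df.index
with_gaps = df.reindex(ids)              # row 99 is filled with NaN
present_only = df.loc[df.index.intersection(ids)]  # drops 99 instead

# 4. get_value() / set_value() -> .at[]
df.at[40, "units"] = 25                  # was df.set_value(40, "units", 25)
units_for_id_40 = df.at[40, "units"]     # was df.get_value(40, "units")
```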

Support status

Which UDST libraries are currently compatible with Pandas v1.0?

Last updated Sep 30, 2020.

  • Choicemodels v0.2.2: compatible
  • Orca v1.5.3: compatible
  • Orca Test v0.1: compatible
  • OSMNet v0.1.5: compatible
  • Pandana v0.4.4: compatible
  • Spandex v0.1dev: not sure, tests are failing for other reasons
  • Synthpop v0.1.1: not sure, tests are failing
  • UrbanAccess v0.2: compatible
  • UrbanSim v3.2: compatible, but earlier versions are not (see "Compatibility with Pandas v1.0", urbansim#222)
  • UrbanSim Defaults v0.2: not sure, still looking into it (no unit tests)
  • UrbanSim Templates v0.1.3: compatible
@smmaurer

Another deprecation relevant to us in Pandas 1.2+: pandas.Index.to_native_types()

UDST/urbansim#230
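
As far as I know, the deprecation message that Pandas emits for this method suggests astype(str) as the replacement. A minimal sketch, with hypothetical index values; note that to_native_types() returned a NumPy array while astype(str) returns an Index, so add .values if an array is needed, and check any code that relied on its formatting options.

```python
import pandas as pd

zone_ids = pd.Index([1, 2, 3], name="zone_id")

# was: zone_ids.to_native_types()
as_text = zone_ids.astype(str)    # Index of strings
as_array = as_text.values         # array form, closer to the old return type
```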
