Missing compatibility with pandas v2 #915

Closed
ml-evs opened this issue Nov 10, 2023 · 11 comments
Labels
dependencies (Issues or PRs that regard dependencies)

Comments

@ml-evs
Collaborator

ml-evs commented Nov 10, 2023

In the current state of master we are allowing people to install matminer with pandas v2, yet many featurizers do not work in this state. I made as much progress as I could in #912 but cannot justify spending any more time on this myself right now. This should be addressed before the next release.

@ml-evs ml-evs changed the title from "Compatibility with pandas v2" to "Missing compatibility with pandas v2" Nov 10, 2023
@ml-evs ml-evs added the critical (ISSUE which makes large part of package unusable) label Dec 8, 2023
@bkmi

bkmi commented Jan 20, 2024

If only to add a public voice that this is an issue...

I have also found that this dependency makes matminer difficult to use alongside other modern packages.

@ml-evs
Collaborator Author

ml-evs commented Feb 6, 2024

Proposed solution: deprecate the featurizers that are in some sense "broken" by this pandas upgrade and be more strict in our pinning of dependencies (in order to match the current sustainable release cadence of this package without dedicated maintainers) in preparation for a v0.10.0 release.

In the future, when things break due to upstream packages, people are more than welcome to submit fixes (and I am happy to review/merge/release) but I cannot sign up for the ongoing maintenance of keeping deps up to date.

@JaGeo
Contributor

JaGeo commented Feb 6, 2024

Any opinions? @ardunn @tschaume ? I guess it could break subsequent code.
I would love to keep on using matminer but in its current state that's probably hard to do.

@tschaume
Collaborator

tschaume commented Feb 8, 2024

@JaGeo matminer used to have a pin on pandas~=1.5, which is keeping MP from staying up-to-date with pandas (this would be true of any upper pins that matminer enforces on its dependencies). I use present tense here because there hasn't been a release of matminer since the pandas requirement was removed in 237603c. As @ml-evs mentioned, we're waiting for 0.10.0 to be released. We're also looking into how matminer enters MP's dependency stack as a required dependency and how we could make that dependency optional.
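
For context, `~=1.5` is a compatible-release specifier: it accepts any version that is at least 1.5 and still in the 1.x series, which is why it rules out pandas 2. The sketch below approximates that rule for plain numeric versions only. `matches_compatible_release` is a hypothetical helper written for illustration, not part of matminer or pip, and real tooling handles pre-releases and other cases that this ignores.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Split a plain numeric version such as '1.5.3' into (1, 5, 3)."""
    return tuple(int(part) for part in version.split("."))


def matches_compatible_release(version: str, spec: str) -> bool:
    """Approximate '~=spec': version >= spec and equal on all but the last spec segment.

    Hypothetical helper; only handles purely numeric versions.
    """
    v = parse_release(version)
    s = parse_release(spec)
    if len(s) < 2:
        raise ValueError("a compatible-release spec needs at least two segments")

    # Pad with zeros so that '1.5' and '1.5.0' compare as equal.
    width = max(len(v), len(s))
    v_padded = v + (0,) * (width - len(v))
    s_padded = s + (0,) * (width - len(s))

    # '~=1.5' keeps the prefix (1,) fixed; '~=1.5.2' would keep (1, 5) fixed.
    prefix = s[:-1]
    return v_padded >= s_padded and v_padded[: len(prefix)] == prefix


if __name__ == "__main__":
    for candidate in ("1.4.4", "1.5.3", "2.0.0"):
        print(candidate, matches_compatible_release(candidate, "1.5"))
```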

I agree with @ml-evs that featurizers that don't support recent versions of pandas should probably be deprecated. Alternatively, they could run if a compatible version of pandas happens to be installed and throw a warning otherwise.
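
A minimal sketch of the "run but warn" option, assuming a featurizer is known to work only up to some pandas major version. Everything here is hypothetical and for illustration: `warn_if_pandas_at_least`, `installed_pandas_version` and `featurize_legacy` are made-up names, not matminer APIs, and the major-version check is deliberately crude.

```python
import warnings
from functools import wraps
from importlib import metadata
from typing import Callable, Optional


def installed_pandas_version() -> Optional[str]:
    """Return the installed pandas version string, or None if pandas is absent."""
    try:
        return metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return None


def warn_if_pandas_at_least(
    first_untested_major: int,
    get_version: Callable[[], Optional[str]] = installed_pandas_version,
):
    """Decorator: still run the function, but warn on an untested pandas major version."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            version = get_version()
            # Compare only the leading integer, e.g. '2.1.4' -> 2.
            if version is not None and int(version.split(".")[0]) >= first_untested_major:
                warnings.warn(
                    f"{func.__name__} has not been validated against pandas {version}.",
                    UserWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@warn_if_pandas_at_least(2)
def featurize_legacy(values):
    """Hypothetical stand-in for a featurizer only tested with pandas 1.x."""
    return [v * 2 for v in values]
```

Passing `get_version` explicitly keeps the check testable without having to install a particular pandas version.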

HTH

@ml-evs
Collaborator Author

ml-evs commented Feb 8, 2024

If it's not that urgent then can we please just reintroduce the pandas pin so that this package actually works for the majority of its userbase? i.e. people who don't mind having a separate virtualenv for matminer to do its featurising. I can't see what MP are using it for in any of the open repos but I'm sure it would be less overhead for everyone if you could just vendor the bit you need (or contribute back pandas v2 support yourselves, as it seems like no-one needs it enough to implement it).

We'll have exactly the same problem with pandas 3 soon too.

@ml-evs
Collaborator Author

ml-evs commented Feb 8, 2024

As an aside, it's likely that it's actually numpy that's causing most of the problems, as a side effect of bumping pandas.

@tschaume
Collaborator

tschaume commented Feb 8, 2024

@ml-evs We were able to remove the matminer dependency from the MP stack entirely (other than our builders that depend on robocrys). Feel free to manage matminer's pandas dependency as you, @ardunn and @computron think is best for your user community.

@ml-evs ml-evs added the dependencies (Issues or PRs that regard dependencies) label and removed the critical (ISSUE which makes large part of package unusable) label Feb 9, 2024
@ml-evs
Collaborator Author

ml-evs commented Mar 26, 2024

I think I've resolved the remaining issues with numpy 1.24+ support in #925. After merging, I plan to do a v0.9.1 release unless there are any objections. Hopefully it will then be easier to upgrade pandas etc. in the future.

@ml-evs
Collaborator Author

ml-evs commented Mar 26, 2024

Release is made -- if people run into issues then we can reopen this and consider steps to support pandas v2 (it may not be as difficult as described above, as numpy compatibility has now been fixed [at least as far as it is tested]).

@ml-evs
Collaborator Author

ml-evs commented Mar 27, 2024

Right, #929 seemed pretty painless so I've just done an immediate follow-up release that enables pandas v2 for those that want it. Hopefully this ends the saga (until the next one ;)).

@JaGeo
Contributor

JaGeo commented Mar 27, 2024

@ml-evs awesome work. Thank you!
