When using numpy 2.1.0, certain methods in pandas (e.g. first_valid_index() and .at[] access) return numpy.int64 instead of plain Python integers as seen when using numpy 1.26.4. This creates inconsistencies in behavior.
…ing Spark branches
### What changes were proposed in this pull request?
Upgrade numpy to 2.1.0 for building and testing Spark branches.
Failed tests are categorized into the following groups:
- Most of the test failures fixed are related to pandas-dev/pandas#59838 (comment).
- Replaced np.mat with np.asmatrix.
- TODO: SPARK-49793
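
For illustration, a minimal sketch of the `np.mat` to `np.asmatrix` replacement mentioned above. The sample data and variable names are hypothetical and not taken from the Spark code base; `np.asmatrix` is available in both NumPy 1.x and 2.x, so it can be used on either side of the upgrade.

```python
import numpy as np

# Hypothetical sample data; not from the Spark test suite.
data = [[1, 2], [3, 4]]

# Before the change, code of this shape called: m = np.mat(data)
# np.asmatrix interprets the input as a np.matrix, like the old alias did.
m = np.asmatrix(data)

# np.matrix keeps matrix semantics: `*` is matrix multiplication.
product = m * m
print(type(m).__name__, product.tolist())
```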
### Why are the changes needed?
Ensure compatibility with newer NumPy, which is utilized by Pandas (on Spark).
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing tests.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #48180 from xinrong-meng/np_upgrade.
Authored-by: Xinrong Meng <xinrong@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
To reproduce
Issue
Environment 1 (numpy 1.26.4)
Environment 2 (numpy 2.1.0)
Discussion
Is this intended behavior, or is it a compatibility issue between pandas 2.2.2 and numpy 2.1.0?
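
A minimal sketch of how the returned types could be checked in a given environment, and how they could be normalized if plain Python integers are required. The Series, DataFrame, and the helper `to_python_scalar` are hypothetical and are not the original reproduction code. Note that NumPy 2 prints scalars as `np.int64(1)` rather than `1`, so comparing `type(...)` across environments is more reliable than comparing printed output.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen only to exercise the two calls named above.
s = pd.Series([np.nan, 1.0, 2.0], index=[10, 20, 30])
df = pd.DataFrame({"a": [1, 2, 3]})

idx = s.first_valid_index()   # label of the first non-NaN entry
val = df.at[0, "a"]           # scalar access by row label and column

# Print the versions and the actual types instead of relying on repr alone.
print("numpy", np.__version__, "pandas", pd.__version__)
print(repr(idx), type(idx))
print(repr(val), type(val))


def to_python_scalar(x):
    """Return the built-in Python equivalent of a NumPy scalar.

    Non-NumPy objects are returned unchanged (hypothetical helper).
    """
    return x.item() if isinstance(x, np.generic) else x


idx_py = to_python_scalar(idx)
val_py = to_python_scalar(val)
print(repr(idx_py), type(idx_py))
print(repr(val_py), type(val_py))
```

Normalizing at the boundary this way gives the same result whichever scalar type the installed pandas and numpy versions return.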