[py312] Fix test_tensor.py (#46669)
To upgrade to py312 we need to upgrade to `pandas>=2.0.0`. That pandas
release made a breaking behavior change that affects some of our code and tests:
> Changed behavior in setting values with `df.loc[:, foo] = bar` or
> `df.iloc[:, foo] = bar`, these now always attempt to set values inplace
> before falling back to casting (GH 45333, pandas-dev/pandas#45333)

As a result, we needed to replace these uses of `loc` with direct column
assignment (`df[col] = value`).
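
The difference between the two assignment forms can be sketched on a toy frame. This is a minimal illustration, not code from this commit: the helper names `assign_with_loc` and `assign_directly` and the sample column `a` are hypothetical, and the dtype that `.loc` leaves behind depends on the installed pandas version, so the script prints it rather than assuming it.

```python
import numpy as np
import pandas as pd


def assign_with_loc(df: pd.DataFrame, col: str, values) -> pd.DataFrame:
    """Set a whole column through .loc (hypothetical helper for illustration)."""
    out = df.copy()
    # On pandas>=2.0 this first tries to write into the existing column in
    # place, so the column's existing dtype may be kept.
    out.loc[:, col] = values
    return out


def assign_directly(df: pd.DataFrame, col: str, values) -> pd.DataFrame:
    """Replace a whole column by direct assignment (hypothetical helper)."""
    out = df.copy()
    # Direct assignment swaps in a new column, which takes the dtype of `values`.
    out[col] = values
    return out


df = pd.DataFrame({"a": [1, 2, 3]})      # int64 column
new_values = np.array([1.0, 2.0, 3.0])   # float64 values

via_loc = assign_with_loc(df, "a", new_values)
via_direct = assign_directly(df, "a", new_values)

print("pandas version:", pd.__version__)
print(".loc dtype:    ", via_loc["a"].dtype)
print("direct dtype:  ", via_direct["a"].dtype)
```

Direct assignment is the form to use when the goal is to change a column's type, as with the `TensorArray` casts touched by this change.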


- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: Matthew Owen <mowen@anyscale.com>
Signed-off-by: can <can@anyscale.com>
omatthew98 authored and can-anyscale committed Jul 18, 2024
1 parent 193315b commit 77d02bd
Showing 3 changed files with 464 additions and 130 deletions.
4 changes: 2 additions & 2 deletions python/ray/air/util/data_batch_conversion.py
@@ -319,7 +319,7 @@ def _cast_ndarray_columns_to_tensor_extension(df: "pd.DataFrame") -> "pd.DataFrame":
                 with warnings.catch_warnings():
                     warnings.simplefilter("ignore", category=FutureWarning)
                     warnings.simplefilter("ignore", category=SettingWithCopyWarning)
-                    df.loc[:, col_name] = TensorArray(col)
+                    df[col_name] = TensorArray(col)
             except Exception as e:
                 raise ValueError(
                     f"Tried to cast column {col_name} to the TensorArray tensor "
@@ -354,5 +354,5 @@ def _cast_tensor_columns_to_ndarrays(df: "pd.DataFrame") -> "pd.DataFrame":
             with warnings.catch_warnings():
                 warnings.simplefilter("ignore", category=FutureWarning)
                 warnings.simplefilter("ignore", category=SettingWithCopyWarning)
-                df.loc[:, col_name] = pd.Series(list(col.to_numpy()))
+                df[col_name] = list(col.to_numpy())
     return df
8 changes: 4 additions & 4 deletions python/ray/data/tests/test_tensor.py
@@ -564,7 +564,7 @@ def test_tensors_in_tables_pandas_roundtrip(
     ds_df = ds.to_pandas()
     expected_df = df + 1
     if enable_automatic_tensor_extension_cast:
-        expected_df.loc[:, "two"] = list(expected_df["two"].to_numpy())
+        expected_df["two"] = list(expected_df["two"].to_numpy())
     pd.testing.assert_frame_equal(ds_df, expected_df)


@@ -585,7 +585,7 @@ def test_tensors_in_tables_pandas_roundtrip_variable_shaped(
     ds_df = ds.to_pandas()
     expected_df = df + 1
     if enable_automatic_tensor_extension_cast:
-        expected_df.loc[:, "two"] = _create_possibly_ragged_ndarray(
+        expected_df["two"] = _create_possibly_ragged_ndarray(
             expected_df["two"].to_numpy()
         )
     pd.testing.assert_frame_equal(ds_df, expected_df)
@@ -873,8 +873,8 @@ def test_tensors_in_tables_iter_batches(
     )
     df = pd.concat([df1, df2], ignore_index=True)
     if enable_automatic_tensor_extension_cast:
-        df.loc[:, "one"] = list(df["one"].to_numpy())
-        df.loc[:, "two"] = list(df["two"].to_numpy())
+        df["one"] = list(df["one"].to_numpy())
+        df["two"] = list(df["two"].to_numpy())
     ds = ray.data.from_pandas([df1, df2])
     batches = list(ds.iter_batches(batch_size=2, batch_format="pandas"))
     assert len(batches) == 3