Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support polars 1.13 #17299

Merged
merged 5 commits into from
Nov 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 1 addition & 22 deletions ci/test_cudf_polars_polars_tests.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,22 +3,6 @@

set -eou pipefail

# We will only fail these tests if the PR touches code in pylibcudf
# or cudf_polars itself.
# Note, the three dots mean we are doing diff between the merge-base
# of upstream and HEAD. So this is asking, "does _this branch_ touch
# files in cudf_polars/pylibcudf", rather than "are there changes
# between upstream and this branch which touch cudf_polars/pylibcudf"
# TODO: is the target branch exposed anywhere in an environment variable?
if [ -n "$(git diff --name-only origin/branch-24.12...HEAD -- python/cudf_polars/ python/cudf/cudf/_lib/pylibcudf/)" ];
then
HAS_CHANGES=1
rapids-logger "PR has changes in cudf-polars/pylibcudf, test fails treated as failure"
else
HAS_CHANGES=0
rapids-logger "PR does not have changes in cudf-polars/pylibcudf, test fails NOT treated as failure"
fi

rapids-logger "Download wheels"

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
Expand Down Expand Up @@ -63,9 +47,4 @@ if [ ${EXITCODE} != 0 ]; then
else
rapids-logger "Running polars test suite PASSED"
fi

if [ ${HAS_CHANGES} == 1 ]; then
exit ${EXITCODE}
else
exit 0
fi
exit ${EXITCODE}
23 changes: 1 addition & 22 deletions ci/test_wheel_cudf_polars.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,22 +3,6 @@

set -eou pipefail

# We will only fail these tests if the PR touches code in pylibcudf
# or cudf_polars itself.
# Note, the three dots mean we are doing diff between the merge-base
# of upstream and HEAD. So this is asking, "does _this branch_ touch
# files in cudf_polars/pylibcudf", rather than "are there changes
# between upstream and this branch which touch cudf_polars/pylibcudf"
# TODO: is the target branch exposed anywhere in an environment variable?
if [ -n "$(git diff --name-only origin/branch-24.12...HEAD -- python/cudf_polars/ python/pylibcudf/)" ];
then
HAS_CHANGES=1
rapids-logger "PR has changes in cudf-polars/pylibcudf, test fails treated as failure"
else
HAS_CHANGES=0
rapids-logger "PR does not have changes in cudf-polars/pylibcudf, test fails NOT treated as failure"
fi

rapids-logger "Download wheels"

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
Expand Down Expand Up @@ -65,9 +49,4 @@ if [ ${EXITCODE} != 0 ]; then
else
rapids-logger "Testing PASSED"
fi

if [ ${HAS_CHANGES} == 1 ]; then
exit ${EXITCODE}
else
exit 0
fi
exit ${EXITCODE}
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-118_arch-x86_64.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ dependencies:
- pandas
- pandas>=2.0,<2.2.4dev0
- pandoc
- polars>=1.11,<1.13
- polars>=1.11,<1.14
- pre-commit
- ptxcompiler
- pyarrow>=14.0.0,<19.0.0a0
Expand Down
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-125_arch-x86_64.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ dependencies:
- pandas
- pandas>=2.0,<2.2.4dev0
- pandoc
- polars>=1.11,<1.13
- polars>=1.11,<1.14
- pre-commit
- pyarrow>=14.0.0,<19.0.0a0
- pydata-sphinx-theme!=0.14.2
Expand Down
2 changes: 1 addition & 1 deletion conda/recipes/cudf-polars/meta.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ requirements:
run:
- python
- pylibcudf ={{ version }}
- polars >=1.11,<1.12
- polars >=1.11,<1.14
- {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }}

test:
Expand Down
2 changes: 1 addition & 1 deletion dependencies.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -734,7 +734,7 @@ dependencies:
common:
- output_types: [conda, requirements, pyproject]
packages:
- polars>=1.11,<1.13
- polars>=1.11,<1.14
run_dask_cudf:
common:
- output_types: [conda, requirements, pyproject]
Expand Down
75 changes: 32 additions & 43 deletions python/cudf_polars/cudf_polars/callback.py
Original file line number Diff line number Diff line change
Expand Up @@ -148,12 +148,7 @@ def _callback(
return ir.evaluate(cache={}).to_polars()


def execute_with_cudf(
nt: NodeTraverser,
*,
config: GPUEngine,
exception: type[Exception] | tuple[type[Exception], ...] = Exception,
) -> None:
def execute_with_cudf(nt: NodeTraverser, *, config: GPUEngine) -> None:
"""
A post optimization callback that attempts to execute the plan with cudf.

Expand All @@ -165,10 +160,15 @@ def execute_with_cudf(
config
GPUEngine configuration object

exception
Optional exception, or tuple of exceptions, to catch during
translation. Defaults to ``Exception``.
Raises
------
ValueError
If the config contains unsupported keys.
NotImplementedError
If translation of the plan is unsupported.

Notes
-----
The NodeTraverser is mutated if the libcudf executor can handle the plan.
"""
device = config.device
Expand All @@ -178,38 +178,27 @@ def execute_with_cudf(
raise ValueError(
f"Engine configuration contains unsupported settings {unsupported}"
)
try:
with nvtx.annotate(message="ConvertIR", domain="cudf_polars"):
translator = Translator(nt)
ir = translator.translate_ir()
ir_translation_errors = translator.errors
if len(ir_translation_errors):
# TODO: Display these errors in user-friendly way.
# tracked in https://github.com/rapidsai/cudf/issues/17051
unique_errors = sorted(set(ir_translation_errors), key=str)
error_message = "Query contained unsupported operations"
verbose_error_message = (
f"{error_message}\nThe errors were:\n{unique_errors}"
)
unsupported_ops_exception = NotImplementedError(
error_message, unique_errors
)
if bool(int(os.environ.get("POLARS_VERBOSE", 0))):
warnings.warn(verbose_error_message, UserWarning, stacklevel=2)
if raise_on_fail:
raise unsupported_ops_exception
else:
nt.set_udf(
partial(
_callback, ir, device=device, memory_resource=memory_resource
)
)
except exception as e:
if bool(int(os.environ.get("POLARS_VERBOSE", 0))):
warnings.warn(
f"Query execution with GPU not supported, reason: {type(e)}: {e}",
PerformanceWarning,
stacklevel=2,
with nvtx.annotate(message="ConvertIR", domain="cudf_polars"):
translator = Translator(nt)
ir = translator.translate_ir()
ir_translation_errors = translator.errors
if len(ir_translation_errors):
# TODO: Display these errors in user-friendly way.
# tracked in https://github.com/rapidsai/cudf/issues/17051
unique_errors = sorted(set(ir_translation_errors), key=str)
formatted_errors = "\n".join(
f"- {e.__class__.__name__}: {e}" for e in unique_errors
)
error_message = (
"Query execution with GPU not possible: unsupported operations."
f"\nThe errors were:\n{formatted_errors}"
)
exception = NotImplementedError(error_message, unique_errors)
if bool(int(os.environ.get("POLARS_VERBOSE", 0))):
warnings.warn(error_message, PerformanceWarning, stacklevel=2)
if raise_on_fail:
raise exception
else:
nt.set_udf(
partial(_callback, ir, device=device, memory_resource=memory_resource)
)
if raise_on_fail:
raise
3 changes: 2 additions & 1 deletion python/cudf_polars/cudf_polars/dsl/ir.py
Original file line number Diff line number Diff line change
Expand Up @@ -227,6 +227,7 @@ class ErrorNode(IR):
def __init__(self, schema: Schema, error: str):
self.schema = schema
self.error = error
self.children = ()


class PythonScan(IR):
Expand Down Expand Up @@ -546,7 +547,7 @@ def do_evaluate(
# shifts the row index.
# But prior to 1.13, polars had this wrong, so we match behaviour
# https://github.com/pola-rs/polars/issues/19607
offset += skip_rows # pragma: no cover; polars 1.13 not yet released
offset += skip_rows
dtype = schema[name]
step = plc.interop.from_arrow(
pa.scalar(1, type=plc.interop.to_arrow(dtype))
Expand Down
4 changes: 1 addition & 3 deletions python/cudf_polars/cudf_polars/dsl/nodebase.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,9 +43,7 @@ class Node(Generic[T]):
def _ctor_arguments(self, children: Sequence[T]) -> Sequence[Any | T]:
return (*(getattr(self, attr) for attr in self._non_child), *children)

def reconstruct(
self, children: Sequence[T]
) -> Self: # pragma: no cover; not yet used
def reconstruct(self, children: Sequence[T]) -> Self:
"""
Rebuild this node with new children.

Expand Down
2 changes: 1 addition & 1 deletion python/cudf_polars/cudf_polars/testing/plugin.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ def pytest_configure(config: pytest.Config) -> None:
)
config.addinivalue_line(
"filterwarnings",
"ignore:.*Query execution with GPU not supported",
"ignore:.*Query execution with GPU not possible",
)


Expand Down
2 changes: 1 addition & 1 deletion python/cudf_polars/pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ authors = [
license = { text = "Apache 2.0" }
requires-python = ">=3.10"
dependencies = [
"polars>=1.11,<1.13",
"polars>=1.11,<1.14",
"pylibcudf==24.12.*,>=0.0.0a0",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
classifiers = [
Expand Down
2 changes: 1 addition & 1 deletion python/cudf_polars/tests/test_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ def raise_unimplemented(self, *args):
pytest.raises(pl.exceptions.ComputeError),
pytest.warns(
pl.exceptions.PerformanceWarning,
match="Query execution with GPU not supported",
match="Query execution with GPU not possible",
),
):
# And ensure that collecting issues the correct warning.
Expand Down
Loading