Install cuDF-py against python 3.10 on Databricks #11477
Fix on Databricks runtime for: NVIDIA#11394
Enable the udf_cudf_test test case for Databricks-13.3.
Rapids 24.10+ drops conda packages for python 3.9 and below. ref: https://docs.rapids.ai/notices/rsn0040/
Install cuDF-py packages against python 3.10 and above on the Databricks runtime to run the UDF cuDF tests, because Conda is not installed by default on DB-13.3.
Signed-off-by: timl <timl@nvidia.com>
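For illustration only, the python 3.10+ requirement described above could be guarded with a small version check before any install step. This is a minimal sketch, not code from this PR: the helper name python_at_least_310 is hypothetical, and it assumes the version string has the form "major.minor" or "major.minor.patch".

```shell
#!/bin/bash
# Hypothetical helper (not part of init_cudf_udf.sh): succeed only when the
# given version string ("major.minor" or "major.minor.patch") is >= 3.10.
python_at_least_310() {
    local major minor
    major=${1%%.*}        # text before the first dot
    minor=${1#*.}         # text after the first dot
    minor=${minor%%.*}    # drop a trailing ".patch" if present
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}

# Usage: only run when python3 is actually available in PATH.
if command -v python3 >/dev/null 2>&1; then
    py_ver=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if python_at_least_310 "$py_ver"; then
        echo "python ${py_ver} is new enough for cuDF-py 24.10+ packages"
    else
        echo "python ${py_ver} is older than 3.10; skipping cuDF-py install" >&2
    fi
fi
```

Comparing major and minor as integers avoids the string-comparison trap where "3.9" sorts after "3.10".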
jenkins/databricks/init_cudf_udf.sh
Outdated
@@ -61,9 +47,41 @@ REQUIRED_PACKAGES=(
requests
sre_yield
)
if base=$(conda info --base); then
Support installing cudf-py packages via either Conda or pip.
Q: if conda is not in PATH, would this error out? Combining the existence check and the variable assignment in the same expression is not recommended here.
Can we use command -v to check whether conda exists first? That would make the if expression and the else branches clearer:
if command -v conda >/dev/null 2>&1; then
Do any of the Databricks Runtime 13.3+ versions have conda? If not, I think we can drop support for conda installation.
> Q: if no conda in PATH this would error out?
The if expression will not error out; it evaluates to false instead.
> can we try to use the command to check if exist first? this would help make the if expression and else branches more clear
> if command -v conda >/dev/null 2>&1; then
Sounds good to me. Let me update it to make the if/else more readable.
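A minimal sketch of the check discussed in this thread, for illustration only: it separates "does the tool exist" from "assign a variable from its output". The function name detect_installer is hypothetical and is not taken from init_cudf_udf.sh, and the real install commands are left out.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): report which installer is available
# in PATH, preferring conda and falling back to pip.
detect_installer() {
    if command -v conda >/dev/null 2>&1; then
        echo "conda"
    elif command -v pip >/dev/null 2>&1; then
        echo "pip"
    else
        echo "none"
    fi
}

installer=$(detect_installer)
echo "Selected installer: ${installer}"
# With the existence check done up front, a later assignment such as
# base=$(conda info --base) only needs to run in the "conda" branch.
```

The original form, if base=$(conda info --base), also falls through to the else branch when conda is missing, but it prints a "command not found" message and mixes two concerns in one expression.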
> Does any of Databricks Runtime 13.3+ have conda?
DB-13.3+ does not install conda by default.
> If no, I think we can drop supporting conda installation.
I'd like to keep the conda installation path for some time, because:
- It maintains compatibility with DB13.3 or earlier conda installations.
- We also have the jenkins/databricks/cudf_udf_test.sh script, which is still compatible with conda installations.
Signed-off-by: timl <timl@nvidia.com>
LGTM
build
Failed unrelated cases: test_broadcast_join_with_conditionals in scala213; could be related to the latest cudf change.
build