[SPARK-49306][PYTHON][SQL] Create DataFrame API support for new 'zeroifnull' and 'nullifzero' SQL functions #47851
respond to code review comments
Thanks @MaxGekk @HyukjinKwon @allisonwang-db for your reviews. I have responded to your comments; please take another look.
+------+
|result|
+------+
|  None|
Why do you expect `None` if the function `nullifzero()` should return `NULL` for `0`?
Fixed this; it was indeed a typo and should say `NULL` instead of `None`.
+1, LGTM. Merging to master.
…l' and 'nullifzero'

### What changes were proposed in this pull request?

In apache#47817 we added new SQL functions `zeroifnull` and `nullifzero`.

In this PR we add Scala and Python DataFrame API endpoints for them.

For example, in Scala:

```
var df = Seq((0)).toDF("a")
df.selectExpr("nullifzero(0)").collect()
> null
df.select(nullifzero(lit(0))).collect()
> null
df.selectExpr("nullifzero(a)").collect()
> null
df.select(nullifzero(lit(5))).collect()
> 5

df = Seq[(Integer)]((null)).toDF("a")
df.selectExpr("zeroifnull(null)").collect()
> 0
df.select(zeroifnull(lit(null))).collect()
> 0
df.selectExpr("zeroifnull(a)").collect()
> 0
df.select(zeroifnull(lit(5))).collect()
> 5
```

### Why are the changes needed?

This improves DataFrame parity with the SQL API.

### Does this PR introduce _any_ user-facing change?

Yes, see above.

### How was this patch tested?

This PR adds unit test coverage.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#47851 from dtenedor/dataframe-zeroifnull.

Authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
What changes were proposed in this pull request?
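In #47817 we added new SQL functions `zeroifnull` and `nullifzero`. In this PR we add Scala and Python DataFrame API endpoints for them.
For example, in Scala, `df.select(nullifzero(lit(0)))` yields `null`, while `df.selectExpr("zeroifnull(a)")` on a column `a` holding `null` yields `0`.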
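As a rough model of these semantics outside Spark, the two functions can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names mirror the SQL functions, SQL `NULL` is represented by Python `None`, and the bodies are hypothetical stand-ins rather than Spark's implementation.

```python
from typing import Optional


def nullifzero(value: Optional[int]) -> Optional[int]:
    """Model of SQL nullifzero: 0 maps to NULL (None); anything else passes through."""
    if value is None:
        # NULL input stays NULL.
        return None
    return None if value == 0 else value


def zeroifnull(value: Optional[int]) -> int:
    """Model of SQL zeroifnull: NULL (None) maps to 0; anything else passes through."""
    return 0 if value is None else value


if __name__ == "__main__":
    # Print each input next to the result of both helpers.
    for x in (0, 5, None):
        print(x, nullifzero(x), zeroifnull(x))
```

The two helpers are inverses only on the values they rewrite: `zeroifnull(nullifzero(0))` gives back `0`, and non-zero, non-null inputs are left unchanged by both.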
Why are the changes needed?
This improves DataFrame parity with the SQL API.
Does this PR introduce any user-facing change?
Yes, see above.
How was this patch tested?
This PR adds unit test coverage.
Was this patch authored or co-authored using generative AI tooling?
No.