Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-45640][SQL][TESTS] Fix flaky ProtobufCatalystDataConversionSuite #43493

Closed
wants to merge 2 commits into from

Conversation

panbingkun
Copy link
Contributor

What changes were proposed in this pull request?

The pr aims to fix flaky ProtobufCatalystDataConversionSuite.

As can be seen from the GA test below
https://github.com/panbingkun/spark/actions/runs/6612141762/job/17982780917
image

When data.get(0) is null, data.get(0).asInstanceOf[Array[Byte]].isEmpty will be thrown java.lang.NullPointerException.

image

Why are the changes needed?

Fix flaky ProtobufCatalystDataConversionSuite.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA.
Manually test:

./build/sbt "protobuf/testOnly org.apache.spark.sql.protobuf.ProtobufCatalystDataConversionSuite"
image

Was this patch authored or co-authored using generative AI tooling?

No.

@panbingkun
Copy link
Contributor Author

cc @LuciferYang @HyukjinKwon

@MaxGekk
Copy link
Member

MaxGekk commented Oct 24, 2023

@SandishKumarHN Could you take a look at the PR, please. Seems similar changes as in #38515 . Also cc @rangadi

Copy link
Contributor

@SandishKumarHN SandishKumarHN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -138,6 +138,7 @@ class ProtobufCatalystDataConversionSuite
data != null &&
(data.get(0) == defaultValue ||
(dt.fields(0).dataType == BinaryType &&
data.get(0) != null &&
Copy link
Contributor

@rangadi rangadi Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should allow null value for the field. I.e. change this to:

  (data.get(0) == null || data.get(0).asInstanceOf[Array[Byte]].isEmpty)

I will test on my side to ensure this test stays stable with range of seed values.
cc: @HeartSaVioR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore this suggestion. data.get(0) != null is correct. It allows testing null.

@HeartSaVioR
Copy link
Contributor

Please run the test suite with all combination for available seeds and available testing types and make sure all of them are passing. Let's fix all the broken-but-not-caught cases.

@rangadi
Copy link
Contributor

rangadi commented Oct 24, 2023

@HeartSaVioR I have run this fix with see values from 0 to 10000. All were successful. Could you merge this?

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@HeartSaVioR
Copy link
Contributor

@rangadi
I just indicated that the test suite with seed existed from 3.4. I guess you've also fixed some of the test issue in other PR. Should we port back that PR as well?

@rangadi
Copy link
Contributor

rangadi commented Oct 24, 2023

@HeartSaVioR yes, backporting will be good.

@HyukjinKwon
Copy link
Member

Merged to master

@rangadi
Copy link
Contributor

rangadi commented Oct 24, 2023

Thanks @HyukjinKwon

@HeartSaVioR
Copy link
Contributor

HeartSaVioR commented Oct 24, 2023

This has to go to 3.5/3.4 as well.

@panbingkun Would you mind filing PRs for backporting this to 3.5 and 3.4? Separate PRs would be preferred.
Please also include the change in this PR #43424 - only need to incorporate the change in connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufCatalystDataConversionSuite.scala

Thanks in advance!

@panbingkun
Copy link
Contributor Author

Would you mind filing PRs for backporting this to 3.5 and 3.4? Separate PRs would be preferred. Please also include the change in this PR #43424 - only need to incorporate the change in connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufCatalystDataConversionSuite.scala

Thanks in advance!

Let me do it right away.
Thank for all reviewing and help testing @rangadi @HeartSaVioR @MaxGekk @HyukjinKwon @LuciferYang

@panbingkun
Copy link
Contributor Author

This has to go to 3.5/3.4 as well.

@panbingkun Would you mind filing PRs for backporting this to 3.5 and 3.4? Separate PRs would be preferred. Please also include the change in this PR #43424 - only need to incorporate the change in connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufCatalystDataConversionSuite.scala

Thanks in advance!

Done.
Branch-3.4: #43520
Branch-3.5: #43521

HeartSaVioR pushed a commit that referenced this pull request Oct 25, 2023
…tDataConversionSuite

### What changes were proposed in this pull request?
The pr aims to fix flaky ProtobufCatalystDataConversionSuite, include:
- Fix the type check (when the random value was empty array, we didn't skip it. Original intention is to skip default values for types.) [SPARK-45588]
- When data.get(0) is null, data.get(0).asInstanceOf[Array[Byte]].isEmpty will be thrown java.lang.NullPointerException. [SPARK-45640]

Backport above to branch 3.4.
Master branch pr: #43424 & #43493

### Why are the changes needed?
Fix flaky ProtobufCatalystDataConversionSuite.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Pass GA
- Manually test

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #43520 from panbingkun/branch-3.4_SPARK-45640.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
HyukjinKwon pushed a commit that referenced this pull request Oct 25, 2023
…tDataConversionSuite

### What changes were proposed in this pull request?
The pr aims to fix flaky ProtobufCatalystDataConversionSuite, include:
- Fix the type check (when the random value was empty array, we didn't skip it. Original intention is to skip default values for types.) [SPARK-45588]
- When data.get(0) is null, data.get(0).asInstanceOf[Array[Byte]].isEmpty will be thrown java.lang.NullPointerException. [SPARK-45640]

Backport above to branch 3.5.
Master branch pr: #43424 & #43493

### Why are the changes needed?
Fix flaky ProtobufCatalystDataConversionSuite.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Pass GA
- Manually test

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #43521 from panbingkun/branch-3.5_SPARK-45640.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
…tDataConversionSuite

### What changes were proposed in this pull request?
The pr aims to fix flaky ProtobufCatalystDataConversionSuite, include:
- Fix the type check (when the random value was empty array, we didn't skip it. Original intention is to skip default values for types.) [SPARK-45588]
- When data.get(0) is null, data.get(0).asInstanceOf[Array[Byte]].isEmpty will be thrown java.lang.NullPointerException. [SPARK-45640]

Backport above to branch 3.4.
Master branch pr: apache#43424 & apache#43493

### Why are the changes needed?
Fix flaky ProtobufCatalystDataConversionSuite.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Pass GA
- Manually test

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#43520 from panbingkun/branch-3.4_SPARK-45640.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants