[BUG] json_matrix_test.py::test_scan_json_string fails on Spark 4 #11154

Open
mythrocks opened this issue Jul 8, 2024 · 0 comments
Labels
bug (Something isn't working), Spark 4.0+ (Spark 4.0+ issues)

Comments

@mythrocks
Collaborator

json_matrix_test.py::test_scan_json_strings fails on Spark 4 because the CPU and GPU outputs differ in whitespace:

../../../../integration_tests/src/main/python/json_matrix_test.py::test_scan_json_strings[read_json_df-int_struct_formatted.json][DATAGEN_SEED=1720459618, TZ=UTC] FAILED [ 92%]
...
cpu = '{"A": 0, "B": 1}', gpu = '{"A":0,"B":1}'

We need a way for the tests to compare strings extracted from JSON inputs in a manner that isn't susceptible to immaterial whitespace differences. (Note that the whitespace differences are in neither the keys nor values, but in between.)
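One possible shape for such a comparison, as a rough sketch only: parse both strings as JSON and re-serialize them with fixed separators before comparing. The helper names `normalize_json_string` and `json_strings_equivalent` are hypothetical and are not part of the existing test framework. Note that round-tripping through Python's `json` module also normalizes number formatting and collapses duplicate keys, which may hide differences the tests should catch, so this is likely not sufficient as-is.

```python
import json


def normalize_json_string(value):
    """Hypothetical helper: re-serialize a JSON string with no optional whitespace.

    Strings that are not valid JSON (and None) are returned unchanged, so
    they still compare exactly.
    """
    if value is None:
        return None
    try:
        parsed = json.loads(value)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError.
        return value
    # Fixed separators remove the whitespace between tokens; key order is kept.
    return json.dumps(parsed, separators=(",", ":"))


def json_strings_equivalent(cpu, gpu):
    """Hypothetical helper: compare two JSON strings, ignoring whitespace between tokens."""
    return normalize_json_string(cpu) == normalize_json_string(gpu)


# The values from the failure above compare equal after normalization.
print(json_strings_equivalent('{"A": 0, "B": 1}', '{"A":0,"B":1}'))
```

Whitespace inside keys or values is preserved by the round trip, so a difference there would still fail the comparison.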

This test will be xfailed for the moment, as part of #11029.

@mythrocks added the bug, ? - Needs Triage, and Spark 4.0+ labels on Jul 8, 2024
mythrocks added a commit to mythrocks/spark-rapids that referenced this issue Jul 8, 2024
Record comparisons do not currently account for legitimate whitespace
differences.
See NVIDIA#11154.
@mattahrens removed the ? - Needs Triage label on Jul 9, 2024
@sameerz sameerz mentioned this issue Jul 18, 2024
49 tasks