[BUG] fastparquet_compatibility_test fails on dataproc #9603
Comments
Hmm. This is odd. I just tested the
I wonder if this is version-specific. I'll try again with an older version of Dataproc, to match Spark version 3.1.3, as in the reported failure. Edit: This appears to be version-specific. I have a repro on 3.1.3. More as it develops.
Fixes NVIDIA#9603. This commit changes the integration test setup to specifically install fastparquet-0.8.3. Prior to this change, when the fastparquet version is not specified, the pip install caused 0.5.0 to be installed on some nodes, e.g. on Dataproc 2.0 (with Spark 3.1.1). The older fastparquet versions do not support reading the contents of input directories recursively, causing the tests to fail. Note that this change doesn't bump the version all the way to 2023.8.0, so as to preserve compatibility with Dataproc 2.0. v0.8.3 seems to have the broadest support. Signed-off-by: MithunR <mythrocks@gmail.com>
The problem is the There is a fix posted in #9607.
…#9607)

* Integration tests: Install specific fastparquet version.

  Fixes #9603.

  This commit changes the integration test setup to specifically install fastparquet-0.8.3. Prior to this change, when the fastparquet version is not specified, the pip install caused 0.5.0 to be installed on some nodes, e.g. on Dataproc 2.0 (with Spark 3.1.1). The older fastparquet versions do not support reading the contents of input directories recursively, causing the tests to fail.

  Note that this change doesn't bump the version all the way to 2023.8.0, so as to preserve compatibility with Dataproc 2.0. v0.8.3 seems to have the broadest support.

  Signed-off-by: MithunR <mythrocks@gmail.com>

* Switching to one specific fastparquet version.

* Added requirements.txt to the IT tar.gz package.

  This should allow the installation of the right `fastparquet` version.

---------

Signed-off-by: MithunR <mythrocks@gmail.com>
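For anyone who wants to catch this kind of version drift before the tests run, here is a minimal sketch of a version guard. It is not part of the fix in #9607. The helper names (`version_tuple`, `is_supported`, `installed_fastparquet_version`) are hypothetical, and the `0.8.3` threshold is simply the version pinned by the commit above; the thread does not say which release first added recursive directory reads.

```python
from importlib import metadata

# Assumption: 0.8.3 is the version pinned by the fix. The thread does not say
# which fastparquet release first supported recursive directory reads.
MIN_FASTPARQUET = "0.8.3"


def version_tuple(version: str) -> tuple:
    """Convert a dotted release string such as '0.8.3' into (0, 8, 3).

    Parsing stops at the first component that has no leading digits, so
    suffixes such as 'rc1' are ignored rather than raising an error.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_supported(installed: str, minimum: str = MIN_FASTPARQUET) -> bool:
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_fastparquet_version():
    """Return the installed fastparquet version string, or None if absent."""
    try:
        return metadata.version("fastparquet")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_fastparquet_version()
    if found is None:
        print("fastparquet is not installed")
    elif is_supported(found):
        print(f"fastparquet {found} meets the minimum {MIN_FASTPARQUET}")
    else:
        print(f"fastparquet {found} is older than {MIN_FASTPARQUET}")
```

A check like this could back a skip condition in the test setup, so that a node with 0.5.0 reports an explicit version mismatch instead of failing inside the directory read.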
fastparquet_compatibility_test failed in a recent nightly Dataproc build:
Details for one of the test failures