fix(hadoop): Remove the schema for hdfs path when reading file #11963

Closed

JkSelf
Collaborator

@JkSelf JkSelf commented Dec 26, 2024

Although we support the JVM-based libhdfs, Gluten's internal benchmark still uses libhdfs3. We encountered a 'File Not Found' exception when reading an HDFS path with libhdfs3:

Reason: Unable to get file path info for file: hdfs://b49691a74b48.jf.intel.com:8020/tpch_sf3000/lineitem/part-00281-3761d71a-87c6-4341-8f1c-db804f904130-c000.snappy.parquet. got error: FileNotFoundException: Path hdfs://b49691a74b48.jf.intel.com:8020/tpch_sf3000/lineitem/part-00281-3761d71a-87c6-4341-8f1c-db804f904130-c000.snappy.parquet does not exist.
Retriable: False
Context: Split [Hive: hdfs://b49691a74b48.jf.intel.com:8020/tpch_sf3000/lineitem/part-00281-3761d71a-87c6-4341-8f1c-db804f904130-c000.snappy.parquet 0 - 1489456566] Task Gluten_Stage_8_TID_842_VTID_27
Additional Context: Operator: TableScan[0] 0
Function: Impl
File: /home/sparkuser/workspace/workspace/Gluten_TPCH_Spark32_test/ep/build-velox/build/velox_ep/velox/connectors/hive/storage_adapters/hdfs/HdfsReadFile.cpp
Line: 79

This PR reverts some changes from a previous PR so that Velox continues to support reading through libhdfs3.
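
For illustration only, here is a minimal sketch of what "removing the scheme" from an HDFS path could look like. It assumes the client expects a plain filesystem path rather than a full `hdfs://host:port/...` URI. The helper name `stripHdfsScheme` is hypothetical and is not taken from the Velox code base.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not Velox's actual implementation.
// Returns the path component of an HDFS URI, e.g.
// "hdfs://host:8020/dir/file" -> "/dir/file".
// Inputs that do not start with "hdfs://" are returned unchanged.
std::string stripHdfsScheme(std::string_view uri) {
  constexpr std::string_view kScheme = "hdfs://";
  if (uri.substr(0, kScheme.size()) != kScheme) {
    return std::string(uri);
  }
  // Skip the authority (host[:port]); the path starts at the next '/'.
  const auto pathStart = uri.find('/', kScheme.size());
  if (pathStart == std::string_view::npos) {
    return "/"; // The URI names only the filesystem root.
  }
  return std::string(uri.substr(pathStart));
}
```

With a helper like this, the failing split path from the log above would be reduced to `/tpch_sf3000/lineitem/part-...snappy.parquet` before it is handed to the HDFS client.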

@JkSelf JkSelf requested a review from majetideepak as a code owner December 26, 2024 07:37
@facebook-github-bot facebook-github-bot added the CLA Signed label Dec 26, 2024

netlify bot commented Dec 26, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 102f4c7
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/677cbb28a74a23000820ae1f

@majetideepak
Collaborator

@JkSelf did you figure out why velox_hdfs_file_test is failing?

@JkSelf JkSelf force-pushed the hdfs3-read-fix branch 2 times, most recently from 491adaa to 102f4c7 January 7, 2025 05:27
@JkSelf
Collaborator Author

JkSelf commented Jan 9, 2025

@majetideepak Could you please review this PR again? Thanks.

@majetideepak majetideepak added the ready-to-merge label Jan 9, 2025
@facebook-github-bot
Contributor

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@JkSelf
Collaborator Author

JkSelf commented Jan 10, 2025

@Yuhta Could you please help merge this? Thanks.

@facebook-github-bot
Contributor

@Yuhta merged this pull request in 923dcc8.
