Support files with locations ending with whitespace #18206

findepi
Member

@findepi findepi commented Jul 10, 2023

Whitespace, and especially trailing whitespace, may have confusing consequences when used in file locations, so it's not recommended. However, whitespace, including trailing whitespace, is a valid part of file locations, and Trino should be able to read such files (objects).
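
As a rough illustration of why this matters (a sketch, not Trino code): in an object store, a location with a trailing space and its trimmed form are two different keys. The `TrailingWhitespaceDemo` class, the in-memory `Map` standing in for an object store, and the sample location are all hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class TrailingWhitespaceDemo {
    // Hypothetical in-memory stand-in for an object store: location -> contents.
    static Map<String, String> store() {
        Map<String, String> objects = new HashMap<>();
        // Note the trailing space in the key.
        objects.put("s3://bucket/data/part-0 ", "contents of the trailing-space object");
        return objects;
    }

    // Looks up an object by its exact location, without any normalization.
    static Optional<String> read(Map<String, String> objects, String location) {
        return Optional.ofNullable(objects.get(location));
    }

    public static void main(String[] args) {
        Map<String, String> objects = store();
        String location = "s3://bucket/data/part-0 ";

        // The exact location resolves to the object.
        System.out.println(read(objects, location).isPresent());
        // A trimmed location is a different key and does not resolve.
        System.out.println(read(objects, location.trim()).isPresent());
    }
}
```

Any layer that trims or otherwise normalizes the location before the lookup would therefore miss the object.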

# Hive, Iceberg, Delta, Hudi
* Fix query failure when a table has a file with a location ending with whitespace.

@cla-bot cla-bot bot added the cla-signed label Jul 10, 2023
@findepi findepi force-pushed the findepi/support-files-with-locations-ending-with-whitespace-701307 branch from d36e8ec to a9c4f5a Compare July 10, 2023 08:31
@findepi findepi force-pushed the findepi/support-files-with-locations-ending-with-whitespace-701307 branch from a9c4f5a to e49af31 Compare July 10, 2023 10:22
@findepi findepi force-pushed the findepi/support-files-with-locations-ending-with-whitespace-701307 branch from e49af31 to 5d3d471 Compare July 10, 2023 10:54
@findepi findepi force-pushed the findepi/support-files-with-locations-ending-with-whitespace-701307 branch 2 times, most recently from bfd42c1 to abb0ece Compare July 10, 2023 13:08
@alexjo2144
Member

Were you able to register locations ending in whitespace previously with the Hadoop Path implementation, or was it normalized out?

If it was normalized before, maybe we don't need to support these types of locations.

@findepi
Member Author

findepi commented Jul 12, 2023

> Were you able to register locations ending in whitespace previously with the Hadoop Path implementation, or was it normalized out?

// prints "s3://bucket/whitespace x"
System.out.println(new org.apache.hadoop.fs.Path("s3://bucket/whitespace ") + "x");
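
For contrast, here is a small sketch using only the JDK's `java.net.URI`, which is stricter about raw spaces than the Hadoop `Path` call above. This is not a claim about how Trino handles locations internally; the `UriWhitespaceDemo` class and its helper methods are hypothetical names for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriWhitespaceDemo {
    // Returns true if the single-argument URI constructor accepts the location as-is.
    static boolean parsesAsUri(String location) {
        try {
            new URI(location);
            return true;
        } catch (URISyntaxException e) {
            // A raw space is an illegal character in a URI string.
            return false;
        }
    }

    // Builds a URI from components; this constructor percent-encodes illegal characters.
    static URI encodedUri(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The raw location with a trailing space is rejected by the parser.
        System.out.println(parsesAsUri("s3://bucket/whitespace "));

        URI uri = encodedUri("s3", "bucket", "/whitespace ");
        // The string form carries the space as %20 ...
        System.out.println(uri);
        // ... while getPath() returns the decoded path, trailing space included.
        System.out.println("[" + uri.getPath() + "]");
    }
}
```

The trailing space survives only if the location is built from components and read back with `getPath()`; parsing the raw string fails outright.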

@findepi findepi force-pushed the findepi/support-files-with-locations-ending-with-whitespace-701307 branch 2 times, most recently from 30a067f to 199d84e Compare July 12, 2023 08:40
findepi added 4 commits July 12, 2023 22:01
Use blob contents different than their locations.
Whitespace, and especially trailing whitespace, may have confusing
consequences when used in file locations so it's not recommended.
However, whitespace, including trailing whitespace, is a valid part of
file locations and Trino should be able to read such files (objects).

The removed assertions are mostly covered by the new test method being
added, except that this new test is run for S3-like filesystems only and
does not cover renames. Both aspects will be addressed in a following
commit.
@findepi findepi force-pushed the findepi/support-files-with-locations-ending-with-whitespace-701307 branch from 199d84e to c66d37d Compare July 12, 2023 20:04
@findepi
Member Author

findepi commented Jul 12, 2023

(rebased to resolve conflicts)

@findepi findepi merged commit e61d64b into master Jul 14, 2023
@findepi findepi deleted the findepi/support-files-with-locations-ending-with-whitespace-701307 branch July 14, 2023 13:59
@github-actions github-actions bot added this to the 423 milestone Jul 14, 2023