Change Iceberg to use the new ParquetFileWriter #4424
Do you have an example collision in mind?
With the code in my PR, I saw fairly frequent failures in TestHiveCreateExternalTable when accessing table "test_create_external" in CI automated testing. I was unable to reproduce the problem on my laptop. The failures haven't recurred in CI since the tables were renamed, and the renaming seems entirely harmless to me. It's not a very big statistical sample, so perhaps what I observed was just chance and the failures will resurface. @electrum also wondered how a naming conflict could have caused the test to fail.
@@ -54,22 +54,22 @@ public void testCreateExternalTableWithData()
         File tableLocation = new File(tempDir, "data");

         @Language("SQL") String createTableSql = format("" +
-                "CREATE TABLE test_create_external " +
+                "CREATE TABLE test_create_external_with_data " +
Rename test databases to prevent name collisions
-> Rename test tables ...
Doh, good point. It was the tables that were renamed.
I just rebased this PR on the tip of master, since there were conflicts, and removed the commit that renamed the tables. If the Hive test failures recur, I'll submit a separate PR with the renames.
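For illustration, the rename discussed above amounts to giving each test its own table name, so that tests sharing a metastore cannot touch each other's tables. A minimal sketch of one way to do that follows; the class `TestTableNames` and both helpers are hypothetical and are not part of this PR or of the Presto test utilities.

```java
import java.util.Locale;
import java.util.concurrent.ThreadLocalRandom;

public class TestTableNames
{
    // Hypothetical helper (not from the PR): derive the table name from the
    // test method name, so two test methods never share a table.
    public static String tableNameFor(String testMethodName)
    {
        // camelCase -> snake_case: insert "_" between a lowercase letter or
        // digit and the uppercase letter that follows it, then lowercase all.
        return testMethodName
                .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
                .toLowerCase(Locale.ENGLISH);
    }

    // Hypothetical helper: append a random base-36 suffix, for suites that may
    // run concurrently against the same metastore.
    public static String uniqueTableName(String prefix)
    {
        long suffix = ThreadLocalRandom.current().nextLong(Long.MAX_VALUE);
        return prefix + "_" + Long.toString(suffix, 36);
    }

    public static void main(String[] args)
    {
        System.out.println(tableNameFor("testCreateExternalTableWithData"));
        System.out.println(uniqueTableName("test_create_external"));
    }
}
```

Deriving the name from the method keeps failures easy to trace back to a test, while the random suffix only matters if the same test can run more than once at the same time.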
Force-pushed from 5a46032 to a3cfb02.
presto-hive/src/main/java/io/prestosql/plugin/hive/parquet/ParquetFileWriterFactory.java
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergFileWriterFactory.java
presto-parquet/src/main/java/io/prestosql/parquet/writer/HiveParquetPrimitiveTypeConverter.java
presto-parquet/src/main/java/io/prestosql/parquet/writer/ParquetSchemaConverter.java
Force-pushed from a3cfb02 to afe026c.
Thanks very much for the detailed review, @electrum! I just force-pushed an update to the PR that acts on all your comments.
presto-hive/src/main/java/io/prestosql/plugin/hive/HiveParquetPrimitiveTypeConverter.java
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergFileWriterFactory.java
Force-pushed from afe026c to 924874c.
I force-pushed fixes for your remaining comments, @electrum.
Force-pushed from 924874c to 9d8cd35.
...-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergParquetPrimitiveTypeConverter.java
Force-pushed from 4349e93 to d247c41.
A few minor comments, otherwise looks good
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergFileWriterFactory.java
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergPageSink.java
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergParquetFileWriter.java
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergRecordFileWriter.java
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/PrimitiveTypeMapBuilder.java
presto-parquet/src/main/java/io/prestosql/parquet/writer/ParquetWriter.java
presto-parquet/src/main/java/io/prestosql/parquet/writer/ParquetWriters.java
This commit changes the Iceberg connector to use the new ParquetFileWriter, and fixes bugs in the handling of decimal and timestamp columns for both ORC and Parquet. This commit removes DomainConverter and the test TestDomainConverter, superseded by fixes and conversions in ExpressionConverter, MessageTypeConverter and the new class TimestampValueWriter. This commit adds new TestIcebergSmoke tests for decimal and timestamp, and fixes the existing tests. With this commit, all TestIcebergSmoke tests pass.
Force-pushed from d247c41 to 81db7fe.
#1324