
Introduce IcebergPageSource and IcebergColumnHandle #1675

Merged
merged 2 commits into from
Oct 23, 2019

Conversation

lxynov
Member

@lxynov lxynov commented Oct 5, 2019

This PR is ready for review. Below are some explanations of it.

Read side

  • Partition key columns and regular columns
    • Partition key columns are those that are the source of identity partitioning. They are special in Iceberg because data files may not store values for them. In particular, for tables migrated from Hive, data files won't contain values for partitioning columns.
    • Regular columns are those that are not partition key columns.
  • Partition spec evolution in Iceberg
    • Iceberg tables support partition spec evolution, which means partition key columns may change over time, so different data files may have different partition key columns.
  • Logic on the read side
    • Partition key columns are determined at the split (file) level.
    • For each split, a list of partition keys is computed from metadata. Each partition key is a pair of an Iceberg column ID and a partition value.
    • IcebergPageSource prefills partition key columns and gets regular column values from a delegate.
    • Partition values are fetched from Iceberg metadata files, serialized into strings and passed to IcebergPageSource. IcebergPageSource deserializes them into Presto objects. Note that in Iceberg, only columns with primitive types can be used as the source of partitioning.
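
As a rough illustration of the prefilling step, here is a hypothetical, heavily simplified sketch. The real IcebergPageSource operates on Presto Pages, Blocks, and ColumnHandles; plain maps stand in for column data here, and the class name is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: partition key columns are prefilled from Iceberg
// metadata, while regular column values come from a delegate reader.
public class PrefilledRowAssembler {
    private final List<String> outputColumns;          // all columns, in output order
    private final Map<String, Object> partitionValues; // prefilled from Iceberg metadata

    public PrefilledRowAssembler(List<String> outputColumns, Map<String, Object> partitionValues) {
        this.outputColumns = outputColumns;
        this.partitionValues = partitionValues;
    }

    // Regular column values come from the delegate page source, keyed by name
    public List<Object> assembleRow(Map<String, Object> regularValues) {
        List<Object> row = new ArrayList<>();
        for (String column : outputColumns) {
            if (partitionValues.containsKey(column)) {
                row.add(partitionValues.get(column)); // partition key: from metadata
            } else {
                row.add(regularValues.get(column));   // regular: from delegate
            }
        }
        return row;
    }
}
```

Because partition key columns are determined per split, a new assembler (with a possibly different set of prefilled columns) would be built for each split.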

Write side

  • There aren't many changes on the write side.
  • Note that on the read side we no longer need to convert Presto types to Hive types, because we've gotten rid of HiveColumnHandle. But on the write side we still have to do this, since we use Hive's Parquet writer in Presto.
  • Neither row delete nor partition delete is supported yet, so I removed the testDelete test in TestIcebergDistributed. (Originally, Presto threw an exception saying "This connector only supports delete where one or more partitions are deleted entirely" for a delete operation on Iceberg tables. But actually, even partition delete is not supported.)
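
For context, the Presto-to-Hive type conversion on the write side amounts to mapping Presto type names onto Hive type names. A hypothetical, name-based sketch of that mapping for a few primitive types (the real conversion uses HiveTypeTranslator on Presto Type objects and also covers decimals, bounded varchars, and nested types):

```java
import java.util.Map;

// Hypothetical sketch of Presto-to-Hive type translation for primitive types.
// The class name and the name-based approach are illustrative only.
public class SimpleTypeTranslator {
    private static final Map<String, String> PRESTO_TO_HIVE = Map.of(
            "integer", "int",
            "bigint", "bigint",
            "real", "float",
            "double", "double",
            "varbinary", "binary",
            "varchar", "string");

    public static String toHiveTypeName(String prestoTypeName) {
        String hiveName = PRESTO_TO_HIVE.get(prestoTypeName);
        if (hiveName == null) {
            throw new IllegalArgumentException("Unsupported type: " + prestoTypeName);
        }
        return hiveName;
    }
}
```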

Issue: #1655
Umbrella issue: #1324
cc: @wagnermarkd @electrum @Parth-Brahmbhatt

@cla-bot cla-bot bot added the cla-signed label Oct 5, 2019
@lxynov lxynov added the WIP label Oct 5, 2019
@lxynov lxynov force-pushed the iceberg-simplification branch from 22f1062 to e314705 Compare October 7, 2019 20:55
@lxynov lxynov removed the WIP label Oct 7, 2019
@lxynov lxynov force-pushed the iceberg-simplification branch 2 times, most recently from 10c214a to ecaba2f Compare October 7, 2019 23:34
}
partitionKeys.add(new HivePartitionKey(name, partitionValue));
Member

How about storing a deserialized Presto type and value object here itself? It reduces the propagation of untyped values a bit.

Member Author


This is a good idea, but for now the value stored in IcebergPartitionKey is not technically a "Presto" value. It's more an "Iceberg" value. The deserializer (IcebergPageSource#deserializePartitionValue) not only deserializes but also converts values from the Iceberg representation to the Presto representation.

I think we should do the conversion first and then serialize/deserialize Presto values, but I'd prefer to do that in a separate future PR, because the value conversion also relates to DomainConverter and ExpressionConverter. We should do a uniform refactoring in the future.

@lxynov lxynov force-pushed the iceberg-simplification branch from ecaba2f to 46f8387 Compare October 21, 2019 22:19
@lxynov
Member Author

lxynov commented Oct 21, 2019

@phd3 Thank you for the review! I've updated this PR.

@electrum Could you help review this? I hope we can get this merged soon to avoid future conflicts with the master branch.

Member

@electrum electrum left a comment


A few minor comments, otherwise looks good. Thanks for cleaning this up and fixing partition handling.

Note that I merged #1561 but that should be a one line change in IcebergMetadata.applyFilter when you rebase.

@@ -35,7 +35,7 @@ protected boolean supportsViews()
@Override
public void testDelete()
{
assertQueryFails("DELETE FROM orders WHERE orderkey % 2 = 0", "This connector only supports delete where one or more partitions are deleted entirely");
Member

IcebergMetadata still implements delete. What happens if you run it after this PR?

Member Author

It throws an exception ("This connector does not support updates or deletes") from the default ConnectorMetadata#getUpdateRowIdColumnHandle.

@@ -93,6 +95,7 @@
public class IcebergPageSink
implements ConnectorPageSink
{
private static final TypeTranslator TYPE_TRANSLATOR = new HiveTypeTranslator();
Member

Let's fork this into Iceberg so that we can remove the TIMESTAMP_WITH_TIME_ZONE hack. It can be a static utility method. We should also be able to remove the binding in IcebergModule.

Member Author

Done. I've added a static utility method to TypeConverter.

}
}

private static Object deserializePartitionValue(Type type, String valueString, String name, TimeZoneKey timeZoneKey)
Member

I feel like there should be a better way to do this, especially the decimal part, but this seems like the best we can do here for now.
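
For context, the decimal case has to turn a string partition value back into Presto's internal representation, which for short decimals is an unscaled long. A hypothetical, self-contained sketch of just that piece (the class and method names are illustrative; the real code also handles long decimals, which Presto stores as Slices):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch: deserialize a short-decimal partition value string into
// the unscaled long that Presto uses internally for DECIMAL(p, s) with p <= 18.
public class DecimalPartitionValue {
    public static long deserializeShortDecimal(String value, int scale) {
        return new BigDecimal(value)
                .setScale(scale, RoundingMode.UNNECESSARY) // fail rather than silently round
                .unscaledValue()
                .longValueExact();
    }
}
```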

The parameter constraint in PartitionTable#cursor is targeted at
columns in the final partition table, rather than the original
data table.
@lxynov lxynov force-pushed the iceberg-simplification branch from 46f8387 to 7b29cd8 Compare October 22, 2019 22:06
@lxynov
Member Author

lxynov commented Oct 22, 2019

@electrum Thank you for the feedback. I've addressed the comments.

This commit simplifies logic and fixes bugs in the case of Iceberg
partition spec evolution.
@lxynov lxynov force-pushed the iceberg-simplification branch from 7b29cd8 to 2ea9bce Compare October 22, 2019 22:34
@lxynov lxynov changed the title Introduce IcebergPageSource, IcebergColumnHandle and IcebergPartitionKey Introduce IcebergPageSource and IcebergColumnHandle Oct 22, 2019
@electrum electrum merged commit f9e1dae into trinodb:master Oct 23, 2019
@electrum
Member

Merged, thanks!

@lxynov lxynov deleted the iceberg-simplification branch October 23, 2019 17:27