Iceberg Connector #1324
Comments
@linxingyuan1102 should there also be a TODO for: "Iceberg tables should also allow specifying a table location"?
@manishmalhotrawork sure, done. Just a note: partition pruning in Iceberg is tricky because of partition spec evolution. We need more thought and discussion on this.
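For background on why spec evolution matters here: in Trino's Iceberg connector the partition spec is declared as a table property using Iceberg transform expressions, not Hive-style `PARTITIONED BY`. A minimal sketch (the catalog, schema, and table names are made up):

```sql
-- Hypothetical table; the 'partitioning' property takes Iceberg
-- transform expressions such as day(), hour(), bucket(), truncate().
CREATE TABLE iceberg.analytics.events (
    event_id BIGINT,
    ts TIMESTAMP(6),
    payload VARCHAR
)
WITH (partitioning = ARRAY['day(ts)']);
```

Because the spec can later change (say, from `day(ts)` to `hour(ts)`), data files written under different specs coexist in one table, which is what makes pruning non-trivial.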
@lxynov Is it planned to add support for HDFS-only Iceberg tables (like in Spark, https://iceberg.apache.org/spark/ &
@lxynov This feature is a blocker to
Is there any plan for supporting
@pan3793 I think this is related work: https://github.com/trinodb/trino/pull/6977/files
Will you support table configuration properties?
Hey. I don't see support for the UPDATE or CHANGE statement for ALTER TABLE in the list. It would be very handy, since data evolves a lot. I can see that the functionality exists in the Iceberg API: https://iceberg.apache.org/javadoc/master/org/apache/iceberg/UpdateSchema.html I might be missing something.
@RomantsovArtur, for schema evolution in Trino you can use See my section on Schema Evolution in this blog: https://blog.starburst.io/trino-on-ice-ii-in-place-table-evolution-and-cloud-compatibility-with-iceberg If you're looking for updates on partition evolution, we are already tracking that in #7580. Feel free to reach out to me on Trino Slack if you're looking for something specific.
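For concreteness, the column-level evolution that Trino exposes looks roughly like this (a sketch; the table name is hypothetical, and `ALTER COLUMN ... SET DATA TYPE` arrived only in later Trino releases, so check your version):

```sql
-- Adds, renames, and drops are metadata-only operations in Iceberg:
-- no data files are rewritten.
ALTER TABLE iceberg.analytics.events ADD COLUMN country VARCHAR;
ALTER TABLE iceberg.analytics.events RENAME COLUMN payload TO body;
ALTER TABLE iceberg.analytics.events DROP COLUMN country;
-- Type widening (e.g. INTEGER -> BIGINT), in versions that support it:
ALTER TABLE iceberg.analytics.events ALTER COLUMN event_id SET DATA TYPE BIGINT;
```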
@bitsondatadev Thank you for your reply! We are looking for some logic like: As you can see from the link I provided above, the Iceberg API is available, but unfortunately Trino does not support this logic. I read the doc you attached. Thank you for the beautiful blog post. The use case we are trying to achieve is a table that is constantly written to and read by different clients, where we want an atomic type update rather than
Please note that I'm speaking about the case where we need to evolve many tables on a regular basis. Some are very large, 100B+ records.
Made a new issue for this. First step is to add the syntax. Then this should be easy to hook up to Iceberg. |
Thank you for the quick reply! Looks great 🚀 |
Posting here as this seems to be the central location for enabling full Iceberg support in Trino: is there already support for the rewrite_data_files procedure?
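For anyone landing here later: Trino did not adopt Spark's `rewrite_data_files` name directly; compaction was eventually exposed as a table procedure. A sketch, assuming a Trino version that has it (the table name is hypothetical):

```sql
-- Rewrites small data files into larger ones; the threshold
-- parameter and its syntax may vary between Trino versions.
ALTER TABLE iceberg.analytics.events
    EXECUTE optimize(file_size_threshold => '128MB');
```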
Row-level deletes were added to Iceberg, which means that DELETE/UPSERT/MERGE INTO are now unlocked.
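Once the connector supports Iceberg format-v2 row-level deletes, statements like the following become possible (a sketch with hypothetical table names):

```sql
-- Row-level delete: only matching rows are removed, not whole partitions.
DELETE FROM iceberg.analytics.events
WHERE ts < TIMESTAMP '2020-01-01 00:00:00';

-- MERGE for upsert-style workloads.
MERGE INTO iceberg.analytics.events AS t
USING iceberg.staging.event_updates AS s
ON t.event_id = s.event_id
WHEN MATCHED THEN UPDATE SET payload = s.payload
WHEN NOT MATCHED THEN INSERT (event_id, ts, payload)
    VALUES (s.event_id, s.ts, s.payload);
```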
We don't use this issue for tracking Iceberg work anymore, so let me close it.
That being said, I really appreciate all the effort that was put into maintaining this initial roadmap. Going forward, we should align on how we view larger efforts. Thanks all!
TODOs for the Iceberg Connector
- `ThriftHiveMetastore`.
- `getTableLayouts()` implementation needs to be updated for `applyFilter()`.
- `HdfsContext` calls that use `/tmp` need to be fixed.
- `HiveConfig` needs to be removed. We might need to split out separate config classes in the Hive connector for the components that are reused in Iceberg.
- `HiveColumnHandle`. This will require replacing or abstracting `HivePageSource`, which is currently used to handle schema evolution and prefilled column values (identity partitions).
- UUID type is not implemented and will be dropped from the Iceberg specification.
- `CREATE TABLE LIKE`.
- `NOT NULL` columns.
- `NOT NULL` enforcement
- `location` or `external_location` table property (Iceberg integration: allow to specify LOCATION property on CREATE TABLE #2501)
- `TestHiveBucketing`.
- `IN` predicates #9743
- `$partitions` when Iceberg table partitioned on timestamp with time zone #9703
- `$partitions` for varbinary non-partition column #9756
- `IcebergSplitSource` throws away `CombinedScanTask` combinations #8486
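The `location` table property item above (#2501) would look roughly like this once implemented (a sketch; the table name and path are hypothetical):

```sql
-- Create an Iceberg table at an explicit storage location
-- instead of the metastore's default warehouse path.
CREATE TABLE iceberg.analytics.events (
    event_id BIGINT,
    ts TIMESTAMP(6)
)
WITH (location = 's3://my-bucket/warehouse/events');
```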