Can I create a Hive table on top of Delta? #18

We have a data lake based on the lambda architecture to handle both real-time and batch data sinks, and we are using Hive as the datastore. So my question is: if I use Delta, is it possible to create a Hive table on top of it?
Hive metastore support is not yet available in 0.1.0, but we want to make Delta work well with the Hive metastore soon.
Thanks @tdas, looking forward to seeing this feature.
You can manually create an external Hive table; you would have to create the partitions manually as well, though. E.g.:

-- External table over the Delta table's data files, read as plain Parquet
CREATE EXTERNAL TABLE delta_table (
  col1 STRING,
  col2 INT)
PARTITIONED BY (`date` INT, hour INT)
STORED AS PARQUET
LOCATION '/path/to/delta/table/location';

-- Partitions are not discovered automatically; each one must be added by hand
ALTER TABLE delta_table ADD PARTITION (`date`=20190506, hour=10);
Hi @nonsleepr. Reading a Delta table by looking at its files directly is not guaranteed to return a consistent snapshot of the table. The only currently canonical way to read a Delta table is to go through the streaming or batch DataFrame reader APIs.
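For concreteness, here is a minimal Scala sketch of those two reader paths, assuming Delta Lake is on the classpath; the table path is the same placeholder used in the DDL above:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("delta-read-sketch")
  .getOrCreate()

// Batch read: returns a consistent snapshot of the Delta table.
val snapshot = spark.read
  .format("delta")
  .load("/path/to/delta/table/location")
snapshot.show()

// Streaming read: tails the table as new commits arrive.
val changes = spark.readStream
  .format("delta")
  .load("/path/to/delta/table/location")
changes.writeStream
  .format("console")
  .start()
```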
@tdas, can you give any indication of how soon "soon" is, please? Is this work happening in a branch somewhere that we could try out and possibly contribute to? Otherwise, is there a recommended intermediate path, such as Delta -> JDBC -> Hive? Cheers
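For what it's worth, the "Delta -> JDBC -> Hive" idea would look roughly like the sketch below; the JDBC URL, table name, and credentials are hypothetical placeholders, and this copies a one-off snapshot rather than keeping Hive in sync:

```scala
import java.util.Properties

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("delta-to-jdbc-sketch")
  .getOrCreate()

// Take a consistent snapshot of the Delta table through the batch reader...
val snapshot = spark.read
  .format("delta")
  .load("/path/to/delta/table/location")

// ...and copy it into a JDBC-accessible database that Hive can also reach.
// The URL, table name, and credentials below are placeholders.
val props = new Properties()
props.setProperty("user", "warehouse_user")
props.setProperty("password", "warehouse_password")
snapshot.write
  .mode("overwrite")
  .jdbc("jdbc:postgresql://db-host:5432/warehouse", "delta_table_snapshot", props)
```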
Any update on when this feature will be available? Also, now that Delta Lake supports vacuum, can we assume that if we run vacuum on a Delta table and then create an external Hive table on top of it, the result will be consistent?
We are working with the Spark community to add all the pluggable interfaces needed to add table support for Delta. We are hoping this will be available with Spark 3.0.0, which is targeted for Q4. You could use vacuum with 0 retention to clean off all data not needed by the latest version. After that, it is possible to treat the table as an external Parquet table in the Hive metastore. Just do not run vacuum while a concurrent write to the table is in progress, as yet-to-be-committed files may get deleted.
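As a sketch of that "vacuum with 0 retention" step, assuming a Delta Lake release that ships the Scala `DeltaTable` API: the safety check that normally rejects short retention periods has to be disabled explicitly.

```scala
import io.delta.tables.DeltaTable

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("vacuum-zero-retention-sketch")
  .getOrCreate()

// By default Delta refuses retention periods shorter than 7 days;
// 0-hour retention requires disabling that check. Only do this when no
// concurrent writes are running, or uncommitted files may be deleted.
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")

// Remove every file not referenced by the latest version of the table.
DeltaTable.forPath(spark, "/path/to/delta/table/location").vacuum(0)
```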
Hi, @tdas
It's hard to share anything before Spark 3.0 has been finalized in any form, since that is necessary for all the Hive metastore support to work. Spark 3.0 is still a few months away from being released. That said, once there is an RC for Spark 3.0, we will try to have a branch in this repo with all the changes needed to migrate Delta to Spark 3.0. Then there may be something for you to try out.
#85 is the issue tracking metastore support.
I am closing this issue in favor of #85. Please subscribe to that issue.