[Feature Request] Correct schema metadata to be synced to Hive Metastore for delta tables #1746

Open
sirsha-chatterjee opened this issue May 7, 2023 · 2 comments
Labels
enhancement New feature or request

sirsha-chatterjee commented May 7, 2023

Feature request

Overview

Delta table schemas are currently stored in the Hive Metastore (HMS) as a single placeholder array column instead of the table's real columns:

col array<string>

Motivation

Currently, when Delta tables are created through the Delta jar, their schemas are not correctly written to HMS. As a result, Hive users cannot discover the tables' columns through the metastore.

Steps to reproduce:

spark-sql --packages io.delta:delta-core_2.12:2.2.0 --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
CREATE TABLE IF NOT EXISTS delta_table_dwh.company_name (
  id INT,
  cname STRING
) USING DELTA;

HMS:

SELECT column_name, type_name
FROM COLUMNS_V2
WHERE CD_ID IN (
    SELECT CD_ID
    FROM SDS
    WHERE SD_ID = (
        SELECT SD_ID
        FROM TBLS
        WHERE tbl_name = 'company_name'
    )
)
ORDER BY column_name ASC;

Output:

+-------------+---------------+
| column_name | type_name     |
+-------------+---------------+
| col         | array<string> |
+-------------+---------------+

Expected Output:

+-------------+---------------+
| column_name | type_name     |
+-------------+---------------+
| cname       | string        |
| id          | int           |
+-------------+---------------+
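
To make the request concrete, here is a minimal sketch of the mapping that would have to happen: reading the schema JSON that Delta keeps in its transaction log (the `schemaString` field of the `metaData` action) and turning it into the `(column_name, type_name)` rows that HMS would store. The function name `hive_columns`, the `SPARK_TO_HIVE` table, and the sample schema string are illustrative assumptions and not part of Delta Lake's API; only primitive types are handled, and this is not how Delta itself would necessarily implement the sync.

```python
import json

# Spark's JSON type names mapped to Hive type names.
# Assumption: a hand-picked subset of primitive types, for illustration only.
SPARK_TO_HIVE = {
    "byte": "tinyint",
    "short": "smallint",
    "integer": "int",
    "long": "bigint",
    "float": "float",
    "double": "double",
    "string": "string",
    "boolean": "boolean",
    "binary": "binary",
    "date": "date",
    "timestamp": "timestamp",
}


def hive_columns(schema_string):
    """Return sorted (column_name, type_name) pairs for a Delta schemaString.

    Hypothetical helper: nested types (struct, array, map) and any type
    missing from SPARK_TO_HIVE are rejected rather than guessed at.
    """
    schema = json.loads(schema_string)
    columns = []
    for field in schema["fields"]:
        spark_type = field["type"]
        # Nested types are serialized as JSON objects, not plain strings.
        if isinstance(spark_type, dict) or spark_type not in SPARK_TO_HIVE:
            raise NotImplementedError(f"unsupported type for column {field['name']!r}")
        columns.append((field["name"], SPARK_TO_HIVE[spark_type]))
    return sorted(columns)


# Hypothetical schemaString for the table created in the steps to reproduce.
EXAMPLE_SCHEMA = json.dumps({
    "type": "struct",
    "fields": [
        {"name": "id", "type": "integer", "nullable": True, "metadata": {}},
        {"name": "cname", "type": "string", "nullable": True, "metadata": {}},
    ],
})

if __name__ == "__main__":
    for column_name, type_name in hive_columns(EXAMPLE_SCHEMA):
        print(column_name, type_name)
```

Sorting the pairs mirrors the `ORDER BY column_name ASC` in the HMS query above, so the result can be compared directly against the metastore rows.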

Further details

Willingness to contribute

The Delta Lake Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?

  • Yes. I can contribute this feature independently.
  • Yes. I would be willing to contribute this feature with guidance from the Delta Lake community.
  • No. I cannot contribute this feature at this time.

sirsha-chatterjee added the enhancement label on May 7, 2023

@allisonport-db
Collaborator

Linking this to #1478

hurcy commented May 18, 2023

@sirsha-chatterjee I want this feature too! I hope I can contribute this feature.
