Glue Iceberg Table: Table is broken after any update #1919

Open
padaszewski opened this issue Feb 2, 2024 · 25 comments
@padaszewski

Name of the resource

AWS::Glue::Table

Resource Name

No response

Issue Description

Hi there!
When I update anything on my Iceberg table, the update breaks the table and the table format disappears. It is effectively no longer an Iceberg table, and no operations on it are possible.

Expected Behavior

When I update the table, the update should not remove the table input, and I should be able to keep working with the Iceberg table as usual.

Observed Behavior

Before the update (after initial deployment):
[screenshot]

After any update:
[screenshot]

Notice that the table format property is gone. The table management property has also disappeared.

Athena before update:
[Screenshot 2024-02-2 at 14:23:06]

Athena after update:
[screenshot]
[screenshot]

Test Cases

Simple CDK Stack to reproduce this behavior (uncomment one column to update, or do any other update):

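import * as cdk from 'aws-cdk-lib';
import { Aws, RemovalPolicy } from 'aws-cdk-lib';
import { CfnDatabase, CfnTable } from 'aws-cdk-lib/aws-glue';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

export class CdkTestingStack extends cdk.Stack {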
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const myTestDatabase = new CfnDatabase(this, 'myTestDatabase', {
      catalogId: Aws.ACCOUNT_ID,
      databaseInput: {
        name: 'mytestdatabase'
      }
    })

    const myLocationBucket = new Bucket(this, 'myLocationBucket', {
      removalPolicy: RemovalPolicy.DESTROY,
      autoDeleteObjects: true
    })

    const myTestTable = new CfnTable(this, 'myTestTable', {
      databaseName: 'mytestdatabase',
      catalogId: Aws.ACCOUNT_ID,
      tableInput: {
        name: 'mytesttable',
        storageDescriptor: {
          columns: [
            {
              name: 'name',
              type: 'string'
            },
            // Uncomment this column (or make any other change) and redeploy to reproduce the issue.
            // {
            //   name: 'ts',
            //   type: 'timestamp'
            // }
          ],
          location: `s3://${myLocationBucket.bucketName}/mytesttable/`,
        },
        tableType: 'EXTERNAL_TABLE',
      },
      openTableFormatInput: {
        icebergInput: {
          metadataOperation: 'CREATE',
          version: '2'
        }
      }
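    })

    // The table refers to the database only by name, so declare the dependency explicitly.
    myTestTable.addDependency(myTestDatabase)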

  }
}

Other Details

No response

@padaszewski padaszewski added the bug label Feb 2, 2024
@padaszewski
Author

padaszewski commented Feb 2, 2024

@sfgarcia @oleksiiburov @dmschauer
Tagging you, as you were active on other Iceberg issues; I hope you don't mind. Maybe you have a workaround other than creating the table with an Athena query.

@dmschauer

@padaszewski My workaround would indeed be to use a custom resource with the Athena API (issuing queries via awswrangler in a Lambda function). A custom implementation for creating and deleting the table is straightforward; I have already implemented such a custom resource. Covering schema changes to the existing table via this custom resource could also be implemented, but it's more complex: it would work by comparing the existing columns and types to the newly supplied ones and issuing the corresponding ALTER TABLE statements. But I see you're looking for a solution that avoids Athena, so I think that won't help here.
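
The column comparison described above could be sketched roughly as follows. This is only an illustration, not the commenter's implementation: it is written in TypeScript to match the CDK stack in the issue (the comment mentions awswrangler, which is Python), the `Column` type and `diffColumns` helper are hypothetical names, and the statement shapes assume Athena's Iceberg DDL (`ADD COLUMNS`, `DROP COLUMN`, `CHANGE COLUMN`). Actually running the statements (for example from a Lambda-backed custom resource) is left out.

```typescript
// Hypothetical types and helper for illustration; not part of any AWS SDK.
interface Column {
  name: string;
  type: string;
}

/**
 * Compare the columns currently on the table with the desired columns and
 * return the ALTER TABLE statements needed to reconcile them.
 */
function diffColumns(table: string, existing: Column[], desired: Column[]): string[] {
  const statements: string[] = [];
  const existingTypes = new Map(existing.map((c) => [c.name, c.type] as [string, string]));
  const desiredNames = new Set(desired.map((c) => c.name));

  // Columns that are no longer declared get dropped.
  for (const col of existing) {
    if (!desiredNames.has(col.name)) {
      statements.push(`ALTER TABLE ${table} DROP COLUMN ${col.name}`);
    }
  }

  for (const col of desired) {
    const currentType = existingTypes.get(col.name);
    if (currentType === undefined) {
      // New column: add it.
      statements.push(`ALTER TABLE ${table} ADD COLUMNS (${col.name} ${col.type})`);
    } else if (currentType !== col.type) {
      // Same name, different type: keep the name and change the type.
      statements.push(`ALTER TABLE ${table} CHANGE COLUMN ${col.name} ${col.name} ${col.type}`);
    }
  }
  return statements;
}

// Usage with the table from the issue: adding the 'ts' column.
const statements = diffColumns(
  'mytestdatabase.mytesttable',
  [{ name: 'name', type: 'string' }],
  [{ name: 'name', type: 'string' }, { name: 'ts', type: 'timestamp' }],
);
console.log(statements);
```

Note that Iceberg only permits a limited set of type changes, so a real implementation would probably need to validate the `CHANGE COLUMN` case before issuing it.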

@padaszewski
Author

Thanks for the reply, @dmschauer. Unless AWS ships this along with the Iceberg table partitioning feature request, there is currently no way to achieve this other than using Athena with a custom resource (CR) on deployment. Iceberg tables are critical for our use case, and it's sad that such a great feature is not well supported via IaC.

@sfgarcia

Hi @padaszewski. I would also like AWS to fully support managing Iceberg tables (create/update) through IaC. On my team we don't manage our Iceberg tables as IaC (we create and update them with Athena queries) because of this limitation.

@padaszewski
Author

Hi @sfgarcia, thanks for the reply. We decided to do the same, but with custom resources as IaC.
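
For readers considering the same route, here is a minimal sketch of building the Athena DDL that such a custom resource might submit on create. It is not the code used by anyone in this thread: `IcebergTableSpec` and `buildCreateTableStatement` are hypothetical names, the bucket name and the `day(ts)` partition transform in the usage example are placeholders, and the statement shape assumes Athena's `CREATE TABLE ... TBLPROPERTIES ('table_type'='ICEBERG')` syntax. Submitting the string (for example through Athena's StartQueryExecution API from a Lambda function) is not shown.

```typescript
// Hypothetical spec type for illustration; field names are assumptions.
interface IcebergTableSpec {
  database: string;
  table: string;
  columns: { name: string; type: string }[];
  location: string;
  // Optional partition columns or transforms, e.g. 'day(ts)'.
  partitionedBy?: string[];
}

/** Build an Athena CREATE TABLE statement for an Iceberg table. */
function buildCreateTableStatement(spec: IcebergTableSpec): string {
  const columns = spec.columns.map((c) => `${c.name} ${c.type}`).join(', ');
  const partitionClause =
    spec.partitionedBy && spec.partitionedBy.length > 0
      ? ` PARTITIONED BY (${spec.partitionedBy.join(', ')})`
      : '';
  return (
    `CREATE TABLE ${spec.database}.${spec.table} (${columns})` +
    partitionClause +
    ` LOCATION '${spec.location}'` +
    ` TBLPROPERTIES ('table_type'='ICEBERG')`
  );
}

// Usage with the names from the issue; the bucket name is a placeholder.
const ddl = buildCreateTableStatement({
  database: 'mytestdatabase',
  table: 'mytesttable',
  columns: [
    { name: 'name', type: 'string' },
    { name: 'ts', type: 'timestamp' },
  ],
  location: 's3://my-location-bucket/mytesttable/',
  partitionedBy: ['day(ts)'],
});
console.log(ddl);
```

Keeping the statement builder as a pure function makes it easy to unit test separately from the custom resource handler that executes it.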

@svdgraaf

Just a +1 here, this is still an issue. In addition, when creating a resource with a reference to a schema version, the columns do not appear to be loaded into the metadata file.

@jhosmanfriasbravo

hey! +1
👀 👀 👀

@blaxx

blaxx commented Apr 23, 2024

Same here, would love to be able to create/update partitioned Iceberg tables using the CDK.

@cyberst

cyberst commented Apr 26, 2024

I would love to be able to create/update partitioned Iceberg tables using CloudFormation/CDK too.

@mehdimld

mehdimld commented May 3, 2024

+1

@ijtarano

ijtarano commented May 28, 2024

+1
big concern for Cepsa's team...

@emiliogarcia-cps

+1

@jmartinez-cps

+1

@armaseg

armaseg commented May 28, 2024

+1

@FAGUILERAM2022

+1

@aitormagan

+1

@JesusAndres2

+1

@etjess

etjess commented Jun 5, 2024

+1

@romancepsa

+1

@Rizxcviii

+1

@raycomh

raycomh commented Jul 8, 2024

+1

@Smotrov

Smotrov commented Aug 7, 2024

+1

@igor-p-boemm

+1

@Forfend

Forfend commented Nov 11, 2024

+1
Big issue

@LaurensVanAcker

+1, blocked by this

Status: Researching