From 8bddd454cbda2d7dd7f491a1c174f7e7f49bec57 Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Wed, 25 Oct 2023 16:40:18 -0700
Subject: [PATCH 1/8] commit

---
 PROTOCOL.md | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index d8a5830c149..ca66b7d0cfa 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -70,6 +70,7 @@
   - [Column Invariants](#column-invariants)
   - [CHECK Constraints](#check-constraints)
   - [Generated Columns](#generated-columns)
+  - [Default Columns](#default-columns)
   - [Identity Columns](#identity-columns)
   - [Writer Version Requirements](#writer-version-requirements)
   - [Requirements for Readers](#requirements-for-readers)
@@ -1343,6 +1344,17 @@ When enabled:
  - The value of `delta.generationExpression` SHOULD be parsed as a SQL expression.
  - Writers MUST enforce that any data writing to the table satisfy the condition `( <=> ) IS TRUE`. `<=>` is the NULL-safe equal operator which performs an equality comparison like the `=` operator but returns `TRUE` rather than NULL if both operands are `NULL`
 
+## Default Columns
+
+Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables. To enable Default Columns:
+- The table must be on Writer Version 4, or
+- The table must be on Writer Version 7, and a feature name `allowColumnDefaults` must exist in the table `protocol`'s `writerFeatures`.
+
+When enabled:
+- The `metadata` for the column in the table schema MAY contain the key `CURRENT_DEFAULT`.
+- The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression.
+- Writers MUST enforce that any data writing to the table satisfy the condition `( <=> ) IS TRUE` where the `default value` is the result of evaluating the default value expression at the time of writing each row. `<=>` is the NULL-safe equal operator which performs an equality comparison like the `=` operator but returns `TRUE` rather than NULL if both operands are `NULL`
+
 ## Identity Columns
 
 Delta supports defining Identity columns on Delta tables. Delta will generate unique values for Identity columns when users do not explicitly provide values for them when writing to such tables. To enable Identity Columns:
@@ -1403,6 +1415,7 @@ Feature | Name | Readers or Writers?
 [Column Invariants](#column-invariants) | `invariants` | Writers only
 [`CHECK` constraints](#check-constraints) | `checkConstraints` | Writers only
 [Generated Columns](#generated-columns) | `generatedColumns` | Writers only
+[Default Columns](#default-columns) | `allowColumnDefaults` | Writers only
 [Change Data Feed](#add-cdc-file) | `changeDataFeed` | Writers only
 [Column Mapping](#column-mapping) | `columnMapping` | Readers and writers
 [Identity Columns](#identity-columns) | `identityColumns` | Writers only
@@ -1600,7 +1613,7 @@ valueType| The type of element used for the key of this map, represented as a st
 
 ### Column Metadata
 A column metadata stores various information about the column.
-For example, this MAY contain some keys like [`delta.columnMapping`](#column-mapping) or [`delta.generationExpression`](#generated-columns).
+For example, this MAY contain some keys like [`delta.columnMapping`](#column-mapping) or [`delta.generationExpression`](#generated-columns) or [`CURRENT_DEFAULT`](#default-columns).
 Field Name | Description
 -|-
 delta.columnMapping.*| These keys are used to store information about the mapping between the logical column name to the physical name. See [Column Mapping](#column-mapping) for details.

From e2d9ca4772ee536be7877fa5630468b272554b0c Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Thu, 26 Oct 2023 13:36:08 -0700
Subject: [PATCH 2/8] respond to code review comments

---
 PROTOCOL.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index ca66b7d0cfa..62c4c0a2a79 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1347,13 +1347,12 @@ When enabled:
 ## Default Columns
 
 Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables. To enable Default Columns:
-- The table must be on Writer Version 4, or
 - The table must be on Writer Version 7, and a feature name `allowColumnDefaults` must exist in the table `protocol`'s `writerFeatures`.
 
 When enabled:
 - The `metadata` for the column in the table schema MAY contain the key `CURRENT_DEFAULT`.
 - The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression.
-- Writers MUST enforce that any data writing to the table satisfy the condition `( <=> ) IS TRUE` where the `default value` is the result of evaluating the default value expression at the time of writing each row. `<=>` is the NULL-safe equal operator which performs an equality comparison like the `=` operator but returns `TRUE` rather than NULL if both operands are `NULL`
+w- Writers MUST enforce that for before writing any rows to the table, for each such requested row that lacks any explicit value (including NULL) for columns with default values, the writing system will assign the result of evaluating the default value expression for each such column as the value for that column in the row.
 
 ## Identity Columns
 

From b202cc1b7e009fdf92133235272fbb75b754ac91 Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Thu, 26 Oct 2023 14:24:01 -0700
Subject: [PATCH 3/8] respond to code review comments

---
 PROTOCOL.md | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index 62c4c0a2a79..2f685ca2e2c 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1346,13 +1346,15 @@ When enabled:
 
 ## Default Columns
 
-Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables. To enable Default Columns:
-- The table must be on Writer Version 7, and a feature name `allowColumnDefaults` must exist in the table `protocol`'s `writerFeatures`.
+Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables.
+
+Enablement:
+ - The table must be on Writer Version 7, and a feature name `allowColumnDefaults` must exist in the table `protocol`'s `writerFeatures`.
 
 When enabled:
-- The `metadata` for the column in the table schema MAY contain the key `CURRENT_DEFAULT`.
-- The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression.
-w- Writers MUST enforce that for before writing any rows to the table, for each such requested row that lacks any explicit value (including NULL) for columns with default values, the writing system will assign the result of evaluating the default value expression for each such column as the value for that column in the row.
+ - The `metadata` for the column in the table schema MAY contain the key `CURRENT_DEFAULT`.
+ - The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression. Any engine that assigns this value can use its own SQL dialect of choice to represent the expression as a string, and use that same dialect to evaluate that expression later for future writes. If one engine writes the string metadata using its own SQL dialect and another engine then reads it back later when performing writes, the results are undefined.
+ - Writers MUST enforce that for before writing any rows to the table, for each such requested row that lacks any explicit value (including NULL) for columns with default values, the writing system will assign the result of evaluating the default value expression for each such column as the value for that column in the row. By the same token, if the engine specified the explicit `DEFAULT` SQL keyword for any column, the expression result must be substituted in the same way.
 
 ## Identity Columns
 

From 4226b05b5a339e722c8b1489e98e862ebf93f3ab Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Thu, 26 Oct 2023 14:25:16 -0700
Subject: [PATCH 4/8] respond to code review comments

---
 PROTOCOL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index 2f685ca2e2c..0242288f4db 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1346,7 +1346,7 @@ When enabled:
 
 ## Default Columns
 
-Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables.
+Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables, or when the user explicitly specifies the `DEFAULT` SQL keyword for the column.
 
 Enablement:
  - The table must be on Writer Version 7, and a feature name `allowColumnDefaults` must exist in the table `protocol`'s `writerFeatures`.

From 9641106319ca793e63f1bd83708865515728474b Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Thu, 26 Oct 2023 14:25:52 -0700
Subject: [PATCH 5/8] respond to code review comments

---
 PROTOCOL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index 0242288f4db..dc5f4ce0b0b 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1346,7 +1346,7 @@ When enabled:
 
 ## Default Columns
 
-Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables, or when the user explicitly specifies the `DEFAULT` SQL keyword for the column.
+Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables, or when the user explicitly specifies the `DEFAULT` SQL keyword for any such column.
 
 Enablement:
  - The table must be on Writer Version 7, and a feature name `allowColumnDefaults` must exist in the table `protocol`'s `writerFeatures`.

From bd5de08d16c78ae30988b5f3be036e6ab1b9ede8 Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Thu, 26 Oct 2023 14:26:25 -0700
Subject: [PATCH 6/8] respond to code review comments

---
 PROTOCOL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index dc5f4ce0b0b..940a60e1a21 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1353,7 +1353,7 @@ Enablement:
 
 When enabled:
  - The `metadata` for the column in the table schema MAY contain the key `CURRENT_DEFAULT`.
- - The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression. Any engine that assigns this value can use its own SQL dialect of choice to represent the expression as a string, and use that same dialect to evaluate that expression later for future writes. If one engine writes the string metadata using its own SQL dialect and another engine then reads it back later when performing writes, the results are undefined.
+ - The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression. Any engine that assigns this value can use its own SQL dialect of choice to represent the expression as a string, and use that same dialect to evaluate that expression later for future writes. If one engine writes the string metadata using its own SQL dialect and another engine then consumes it later when performing writes, the results are undefined.
  - Writers MUST enforce that for before writing any rows to the table, for each such requested row that lacks any explicit value (including NULL) for columns with default values, the writing system will assign the result of evaluating the default value expression for each such column as the value for that column in the row. By the same token, if the engine specified the explicit `DEFAULT` SQL keyword for any column, the expression result must be substituted in the same way.
 
 ## Identity Columns

From 86bad40069ca035d47e12650f5098a976faf2a29 Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Thu, 26 Oct 2023 14:26:47 -0700
Subject: [PATCH 7/8] respond to code review comments

---
 PROTOCOL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index 940a60e1a21..e8e8124efc0 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1354,7 +1354,7 @@ Enablement:
 When enabled:
  - The `metadata` for the column in the table schema MAY contain the key `CURRENT_DEFAULT`.
  - The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression. Any engine that assigns this value can use its own SQL dialect of choice to represent the expression as a string, and use that same dialect to evaluate that expression later for future writes. If one engine writes the string metadata using its own SQL dialect and another engine then consumes it later when performing writes, the results are undefined.
- - Writers MUST enforce that for before writing any rows to the table, for each such requested row that lacks any explicit value (including NULL) for columns with default values, the writing system will assign the result of evaluating the default value expression for each such column as the value for that column in the row. By the same token, if the engine specified the explicit `DEFAULT` SQL keyword for any column, the expression result must be substituted in the same way.
+ - Writers MUST enforce that before writing any rows to the table, for each such requested row that lacks any explicit value (including NULL) for columns with default values, the writing system will assign the result of evaluating the default value expression for each such column as the value for that column in the row. By the same token, if the engine specified the explicit `DEFAULT` SQL keyword for any column, the expression result must be substituted in the same way.
 
 ## Identity Columns
 

From e5d23e5616608be4f6d62ec2fa3d2ca1f86e8b9e Mon Sep 17 00:00:00 2001
From: Daniel Tenedorio
Date: Thu, 26 Oct 2023 16:13:02 -0700
Subject: [PATCH 8/8] respond to code review comments

---
 PROTOCOL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index e8e8124efc0..e2b402cf8a8 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -1353,7 +1353,7 @@ Enablement:
 
 When enabled:
  - The `metadata` for the column in the table schema MAY contain the key `CURRENT_DEFAULT`.
- - The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression. Any engine that assigns this value can use its own SQL dialect of choice to represent the expression as a string, and use that same dialect to evaluate that expression later for future writes. If one engine writes the string metadata using its own SQL dialect and another engine then consumes it later when performing writes, the results are undefined.
+ - The value of `CURRENT_DEFAULT` SHOULD be parsed as a SQL expression.
  - Writers MUST enforce that before writing any rows to the table, for each such requested row that lacks any explicit value (including NULL) for columns with default values, the writing system will assign the result of evaluating the default value expression for each such column as the value for that column in the row. By the same token, if the engine specified the explicit `DEFAULT` SQL keyword for any column, the expression result must be substituted in the same way.
 
 ## Identity Columns
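
Reviewer note: the writer behavior that the final wording of the Default Columns section describes could be sketched roughly as below. This is a minimal illustration and not code from any Delta implementation. The function names, the `DEFAULT` sentinel, the dict-shaped rows and schema fields, and the `evaluate` callback (standing in for an engine's SQL expression evaluator) are all assumptions made for the example; only `CURRENT_DEFAULT`, `allowColumnDefaults`, `writerFeatures`, and Writer Version 7 come from the patch text.

```python
from typing import Any, Callable, Dict, List

# Hypothetical sentinel standing in for the explicit SQL `DEFAULT` keyword.
DEFAULT = object()


def column_defaults_enabled(protocol: Dict[str, Any]) -> bool:
    """Enablement check: Writer Version 7 plus the `allowColumnDefaults` feature."""
    features = protocol.get("writerFeatures") or []
    return protocol.get("minWriterVersion") == 7 and "allowColumnDefaults" in features


def apply_column_defaults(
    row: Dict[str, Any],
    schema_fields: List[Dict[str, Any]],
    evaluate: Callable[[str], Any],
) -> Dict[str, Any]:
    """Return a copy of `row` with default values substituted before writing.

    A column receives its default when the row has no explicit value for it
    (the key is absent) or when the value is the DEFAULT sentinel. An explicit
    NULL (None) counts as an explicit value and is left untouched.
    """
    out = dict(row)
    for field in schema_fields:
        name = field["name"]
        expr = field.get("metadata", {}).get("CURRENT_DEFAULT")
        if expr is None:
            continue  # column has no default expression
        if name not in out or out[name] is DEFAULT:
            # `evaluate` is an assumed hook for the engine's SQL evaluator.
            out[name] = evaluate(expr)
    return out


# Toy usage: the "evaluator" only unquotes a SQL string literal.
fields = [
    {"name": "id", "metadata": {}},
    {"name": "country", "metadata": {"CURRENT_DEFAULT": "'US'"}},
]
unquote = lambda expr: expr.strip("'")
filled = apply_column_defaults({"id": 1}, fields, unquote)
```

Keeping the evaluator as a parameter reflects that the spec only says the value SHOULD be parsed as a SQL expression; how an engine evaluates it is left open here.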