-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49183][SQL] V2SessionCatalog.createTable should respect PROP_IS_MANAGED_LOCATION #47684
Conversation
Do we need to update the API doc to make it more precise?
|
@@ -170,7 +170,8 @@ class V2SessionCatalog(catalog: SessionCatalog) | |||
val storage = DataSource.buildStorageFormatFromOptions(toOptions(tableProperties)) | |||
.copy(locationUri = location.map(CatalogUtils.stringToURI)) | |||
val isExternal = properties.containsKey(TableCatalog.PROP_EXTERNAL) | |||
val tableType = if (isExternal || location.isDefined) { | |||
val isManagedLocation = properties.containsKey(TableCatalog.PROP_IS_MANAGED_LOCATION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to test if the value is true or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed how PROP_EXTERNAL
is handled.
@@ -120,7 +120,7 @@ case class ShowCreateTableExec( | |||
private def showTableLocation(table: Table, builder: StringBuilder): Unit = { | |||
val isManagedOption = Option(table.properties.get(TableCatalog.PROP_IS_MANAGED_LOCATION)) | |||
// Only generate LOCATION clause if it's not managed. | |||
if (isManagedOption.forall(_.equalsIgnoreCase("false"))) { | |||
if (isManagedOption.isEmpty || !isManagedOption.get.equalsIgnoreCase("true")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small fix to match the document. true
means effective, non true
means not effective.
@@ -52,7 +52,8 @@ public interface TableCatalog extends CatalogPlugin { | |||
|
|||
/** | |||
* A reserved property to indicate that the table location is managed, not user-specified. | |||
* If this property is "true", SHOW CREATE TABLE will not generate the LOCATION clause. | |||
* If this property is "true", it means it's a managed table even if it has a location. As an | |||
* example, SHOW CREATE TABLE will not generate the LOCATION clause. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to also document non true case?
Is non true value invalid, or it is valid but ignored, etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-true case is hard to document, as treating tables with location as external tables is a Hive-specific behavior.
sql(s"CREATE TABLE t (i INT) USING $v2Format") | ||
val table = catalog(SESSION_CATALOG_NAME).asTableCatalog | ||
.loadTable(Identifier.of(Array("default"), "t")) | ||
assert(!table.properties().containsKey(TableCatalog.PROP_EXTERNAL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to also test SHOW CREATE TABLE
or DESC TABLE
that this is a managed table without a location?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v2 table property is already the source of truth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
thanks for the review, merging to master/3.5! |
…S_MANAGED_LOCATION Even if the table definition has a location, the table should be a managed table if `PROP_IS_MANAGED_LOCATION` is specified, in `V2SessionCatalog.createTable`. It's a bug fix. A custom `spark_catalog` may generate custom location for managed table and delegate the actual table creation to `V2SessionCatalog`. The table should still be a managed table if `PROP_IS_MANAGED_LOCATION` is specified. Yes, now users who use custom `spark_catalog` that generates custom location for managed table, can correctly create managed tables. a new test no Closes #47684 from cloud-fan/table. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit ed04e16) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
it seems the branch-3.5 CI failed..
|
fixing at #47722 |
### What changes were proposed in this pull request? fix compilation error caused by #47684 . Seems scala 2.12 has some subtle differences in the import priority. ### Why are the changes needed? N/A ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? pass compilation ### Was this patch authored or co-authored using generative AI tooling? no Closes #47722 from cloud-fan/fix. Lead-authored-by: Wenchen Fan <wenchen@databricks.com> Co-authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…S_MANAGED_LOCATION ### What changes were proposed in this pull request? Even if the table definition has a location, the table should be a managed table if `PROP_IS_MANAGED_LOCATION` is specified, in `V2SessionCatalog.createTable`. ### Why are the changes needed? It's a bug fix. A custom `spark_catalog` may generate custom location for managed table and delegate the actual table creation to `V2SessionCatalog`. The table should still be a managed table if `PROP_IS_MANAGED_LOCATION` is specified. ### Does this PR introduce _any_ user-facing change? Yes, now users who use custom `spark_catalog` that generates custom location for managed table, can correctly create managed tables. ### How was this patch tested? a new test ### Was this patch authored or co-authored using generative AI tooling? no Closes apache#47684 from cloud-fan/table. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…S_MANAGED_LOCATION ### What changes were proposed in this pull request? Even if the table definition has a location, the table should be a managed table if `PROP_IS_MANAGED_LOCATION` is specified, in `V2SessionCatalog.createTable`. ### Why are the changes needed? It's a bug fix. A custom `spark_catalog` may generate custom location for managed table and delegate the actual table creation to `V2SessionCatalog`. The table should still be a managed table if `PROP_IS_MANAGED_LOCATION` is specified. ### Does this PR introduce _any_ user-facing change? Yes, now users who use custom `spark_catalog` that generates custom location for managed table, can correctly create managed tables. ### How was this patch tested? a new test ### Was this patch authored or co-authored using generative AI tooling? no Closes apache#47684 from cloud-fan/table. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…S_MANAGED_LOCATION ### What changes were proposed in this pull request? Even if the table definition has a location, the table should be a managed table if `PROP_IS_MANAGED_LOCATION` is specified, in `V2SessionCatalog.createTable`. ### Why are the changes needed? It's a bug fix. A custom `spark_catalog` may generate custom location for managed table and delegate the actual table creation to `V2SessionCatalog`. The table should still be a managed table if `PROP_IS_MANAGED_LOCATION` is specified. ### Does this PR introduce _any_ user-facing change? Yes, now users who use custom `spark_catalog` that generates custom location for managed table, can correctly create managed tables. ### How was this patch tested? a new test ### Was this patch authored or co-authored using generative AI tooling? no Closes apache#47684 from cloud-fan/table. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
Even if the table definition has a location, the table should be a managed table if
PROP_IS_MANAGED_LOCATION
is specified, inV2SessionCatalog.createTable
.Why are the changes needed?
It's a bug fix. A custom
spark_catalog
may generate custom location for managed table and delegate the actual table creation toV2SessionCatalog
. The table should still be a managed table ifPROP_IS_MANAGED_LOCATION
is specified.Does this PR introduce any user-facing change?
Yes, now users who use custom
spark_catalog
that generates custom location for managed table, can correctly create managed tables.How was this patch tested?
a new test
Was this patch authored or co-authored using generative AI tooling?
no