Use show table extended with table name list for get_catalog. #237

Merged: ueshin merged 4 commits into databricks:main on Dec 16, 2022

Collaborator

@ueshin ueshin commented Dec 12, 2022

Description

Uses `show table extended` with a list of table names for `get_catalog`.

  • Running `describe table extended` for every table can be slower than running `show table extended` with a list of table names.
  • The statistics that appear in the generated docs are not included in the output of `describe table extended`.
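
To illustrate the idea in the description, here is a minimal sketch of how a list of table names could be folded into one `show table extended` statement. It relies on the Spark SQL behaviour that the `LIKE` pattern of `SHOW TABLE EXTENDED` accepts `|` as a separator between alternatives. The function name `build_show_table_extended` and the schema and table names are hypothetical and are not taken from this pull request; the sketch also does no quoting or escaping of identifiers.

```python
from typing import Iterable


def build_show_table_extended(schema: str, table_names: Iterable[str]) -> str:
    """Build one SHOW TABLE EXTENDED statement that covers several tables.

    Hypothetical helper: the names are joined with `|`, which the LIKE
    pattern of SHOW TABLE EXTENDED treats as "match any of these".
    """
    # De-duplicate and sort so the generated statement is deterministic.
    names = sorted(set(table_names))
    if not names:
        raise ValueError("table_names must not be empty")
    pattern = "|".join(names)
    # No identifier quoting or escaping is attempted in this sketch.
    return f"show table extended in {schema} like '{pattern}'"


statement = build_show_table_extended("analytics", ["orders", "customers", "orders"])
print(statement)
```

Under these assumptions, one statement per schema replaces one `describe table extended` call per table.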

Collaborator

@allisonwang-db allisonwang-db left a comment

Looks good!

except dbt.exceptions.RuntimeException as e:
    errmsg = getattr(e, "msg", "")
    if (
        "[SCHEMA_NOT_FOUND]" in errmsg
Collaborator

If there is an error code in the error message, will we get the error message details as well ("Database ... not found")?

Collaborator Author

The error message was changed:

[SCHEMA_NOT_FOUND] The schema <schemaName> cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS.
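
As a rough sketch of the check discussed in this thread, the snippet below tests an exception's message for the `[SCHEMA_NOT_FOUND]` error code and, as a fallback, for the older wording. `FakeRuntimeException` and `is_schema_not_found` are hypothetical names used only for illustration. The assumption that the older message has the form "Database ... not found" comes from the reviewer's question above, not from the merged code.

```python
class FakeRuntimeException(Exception):
    """Hypothetical stand-in for dbt.exceptions.RuntimeException."""

    def __init__(self, msg: str) -> None:
        super().__init__(msg)
        self.msg = msg


def is_schema_not_found(exc: Exception) -> bool:
    """Return True if the exception looks like a missing-schema error."""
    # Mirrors the diff: read the message defensively, defaulting to "".
    errmsg = getattr(exc, "msg", "")
    has_error_code = "[SCHEMA_NOT_FOUND]" in errmsg
    # Assumption: the older wording looked like "Database ... not found".
    has_legacy_text = "Database" in errmsg and "not found" in errmsg
    return has_error_code or has_legacy_text


new_style = FakeRuntimeException(
    "[SCHEMA_NOT_FOUND] The schema <schemaName> cannot be found."
)
old_style = FakeRuntimeException("Database 'missing_db' not found")
unrelated = FakeRuntimeException("Permission denied")

for exc in (new_style, old_style, unrelated):
    print(is_schema_not_found(exc))
```

Matching on the bracketed error code avoids depending on the exact wording of the rest of the message, which is what changed here.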


from dbt.adapters.spark.column import SparkColumn


@dataclass
class DatabricksColumn(SparkColumn):
    TYPE_LABELS: ClassVar[Dict[str, str]] = {
        "LONG": "BIGINT",
Collaborator

Why do we need this?

Collaborator Author

This is a port of dbt-labs/dbt-spark#358.
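
For readers wondering what a `TYPE_LABELS` mapping does in practice, here is a standalone sketch. It does not import dbt; `ColumnSketch` is a hypothetical class, and the lookup in `translate_type` (case-insensitive, falling back to the input type) is an assumption about how such a mapping is typically consumed. Only the `"LONG": "BIGINT"` entry comes from the diff above.

```python
from typing import ClassVar, Dict


class ColumnSketch:
    """Hypothetical, dbt-free illustration of a TYPE_LABELS mapping."""

    # Only this entry is taken from the diff; the full mapping is not shown there.
    TYPE_LABELS: ClassVar[Dict[str, str]] = {
        "LONG": "BIGINT",
    }

    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # Assumed behaviour: look the type up case-insensitively and
        # return the input unchanged when there is no label for it.
        return cls.TYPE_LABELS.get(dtype.upper(), dtype)


print(ColumnSketch.translate_type("long"))
print(ColumnSketch.translate_type("string"))
```

With a mapping like this, a column reported as `long` would be labelled `BIGINT`, while unmapped types pass through unchanged.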

Collaborator Author

ueshin commented Dec 16, 2022

Thanks! merging.

@ueshin ueshin merged commit 5b85bfd into databricks:main Dec 16, 2022
@ueshin ueshin deleted the get_catalog branch December 16, 2022 20:11
ueshin added a commit that referenced this pull request Dec 16, 2022

2 participants