-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Snowflake column comments #2321
Snowflake column comments #2321
Conversation
Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA. In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above above. If you have questions about the CLA, or if you believe you've received this message in error, don't hesitate to ping @drewbanin. CLA has not been signed by users: @snowflakeseitz |
@cla-bot check |
The cla-bot has been summoned, and re-checked this pull request! |
Just out of curiosity, how does one do that? I can see how to set a comment for the whole view, but not for columns. Is the same syntax viable for tables? |
@beckjake exactly, you would have to inject it into the SQL statement itself: |
@drewbanin let me know what I need to get this over the finish line :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @snowflakeseitz - I just dropped some comments inline here. Let me know what you think :)
It would be really good to add some tests here! Can you think of a good way to test this functionality? I can imagine an integration test which:
- Creates a table/view with comments
- Runs some information schema queries to ensure the comments have been set successfully
|
||
{% endmacro %} | ||
|
||
{% macro snowflake__create_view_as(relation, sql) -%} | ||
{%- set secure = config.get('secure', default=false) -%} | ||
{%- set copy_grants = config.get('copy_grants', default=false) -%} | ||
{%- set sql_header = config.get('sql_header', none) -%} | ||
{%- set raw_persist_docs = config.get('persist_docs', {}) -%} | ||
{%- set relation_comment = get_relation_comment(raw_persist_docs, model) -%} | ||
{%- set column_comment = get_relation_column_comments(raw_persist_docs, model) -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it appears that we're not going to be able to set column comments on views, I do think we can set a view-level comment!
Can you add a call to snowflake__alter_relation_comment
at the bottom of this macro to render the SQL required to add a view comment?
@@ -156,3 +172,19 @@ | |||
alter table {{ relation }} alter {{ adapter.quote(column_name) }} set data type {{ new_column_type }}; | |||
{% endcall %} | |||
{% endmacro %} | |||
|
|||
{% macro snowflake__alter_relation_comment(relation, relation_comment) -%} | |||
comment on table {{ relation }} IS $${{ relation_comment | replace('$', '[$]') }}$$; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use relation.type
instead of table
here? That will let us use this macro for both tables and views :)
Same for the implementation of snowflake__alter_column_comment
below!
hey @snowflakeseitz - do you think you'll be able to revisit this PR? I think we can indeed proceed without support for column-level comments on views for now, and there are certainly ways we can add support for them in the future too. Let me know if you're able to respond to the comments I dropped inline in this PR :) |
@drewbanin I fell off the face of the earth last week with work :) I will make these changes and test them by tomorrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This passes local testing for me. I left one comment around an argument naming inconsistency, though it doesn't actually raise an error.
Discussion: where should this happen?
Does it make sense that the comment on
+ alter
statements should be called by snowflake__create_table_as
and snowflake__create_view_as
, as opposed to the Snowflake table
and view
materializations?
On BigQuery, we can embed the object description within the create
statement's options()
. If we think that BQ should set the standard, then it makes sense that we'd append comment on
+ alter
where there are here, since it's closest to parity.
If we instead say that the standard should be Postgres/Redshift/Snowflake, where we're effectively adding descriptions by post hooks, I think it would make more sense (and be more modular) to move config.get('persist_docs')
, the calls to get_relation_comment
/get_relation_column_comment
, and alter_relation_comment
/alter_column_comment
into the materialization instead, and treat them like hooks.
(I know that Snowflake clustering operates as a series of alter
statements within the tight bound of snowflake__create_table_as
, but I've always viewed that as a bit of a hack.)
I don't think that line of thinking should block this PR, which is just about good to go. Since an ulterior goal is scoping out future contributions that might add this functionality on Postgres and Redshift, I do think we're setting some precedent here.
Future work
@drewbanin We discussed yesterday two possible approaches for implementing column comments for the Snowflake view materialization. I imagine we'll scope that for a future PR, building off the work here.
It also looks like this branch is now 59 commits ahead of (and several thousand line diffs away from) dev/octavius-catto
. I just want to confirm that that's a result of those commits having been squashed in the base branch.
@@ -16,3 +16,19 @@ | |||
{% endif %} | |||
|
|||
{% endmacro %} | |||
|
|||
|
|||
{% macro get_relation_column_comments(persist_docs, model) %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the exception below would return an error, because the macro argument is named persist_docs
but the exception expects raw_persist_docs
.
As it is, I haven't been able to actually get this exception/error. It's preempted by a parsing error, now that model + project configs have tighter type enforcement:
Invalid config field: "persist_docs" must be a dict
All the same, the argument should be named raw_persist_docs
or persist_docs
consistently. I'm in favor of raw_persist_docs
here, since that's what you pass from create_table_as
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved, I stuck with persist_docs
because that what was used before my PR
Let me chime in with my $0.02 here, since I'm likely going to be doing the postgres/redshift one. When I first thought about this, I was firmly in favor of this happening in a special macro called by materializations, similar to how hooks work, and letting BigQuery be the weird special case. But then I realized that It also makes me think maybe we should rename these macros, but I would really prefer not to do that, plugins have just seen so much churn lately. |
That's a really good point. Insofar as I recall having discussed the desire for a future state in which only materializations pull directly from model config, and pass those values around to macros as keyword arguments. I don't know if that's still something we want, or if we want it enough to make meaningful steps in that direction—or, in this case, to not make meaningful steps away from that direction. |
You'd have to pass something suspiciously similar to the full generic model config to a lot of these things if you wanted to handle per-adapter differences nicely while allowing third parties to create their own custom adapter-generic materializations that pass along adapter-specific configs cleanly. I suppose you could have a generic adapter_macro that takes arbitrary keyword arguments and/or a model, returns some sort of object that you should then pass to
Or, instead of a macro maybe you'd just pass in the global config object and give materialization authors an easy way to make their own "config"s to pass to I'm not sure what that means for this PR, but I'm inclined to accept its answer of doing what bigquery already does and assuming that |
51fa43e
to
caef584
Compare
{{ return(adapter_macro('alter_column_comment', relation, column_dict)) }} | ||
{% endmacro %} | ||
|
||
{% macro default__alter_column_comment(relation, column_name, new_column_type) -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be relation, column_dict
, not relation, column_name, new_column_type
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, this is fixed now
@snowflakeseitz FYI these tests are failing! I can take a deeper look and follow up with some more info about what's going wrong here |
@snowflakeseitz I had the same issue you did when I was working on Here's the relevant commit |
@beckjake good catch, I just committed your suggestion! |
|
||
|
||
{% macro alter_column_comment(relation, column_dict) -%} | ||
{{ return(adapter_macro('alter_column_comment', relation, column_dict)) }} | ||
{% endmacro %} | ||
|
||
{% macro default__alter_column_comment(relation, column_dict) -%} | ||
{{ exceptions.raise_not_implemented( | ||
'alter_column_comment macro not implemented for adapter '+adapter.type()) }} | ||
{% endmacro %} | ||
|
||
{% macro alter_relation_comment(relation, relation_comment) -%} | ||
{{ return(adapter_macro('alter_relation_comment', relation, relation_comment)) }} | ||
{% endmacro %} | ||
|
||
{% macro default__alter_relation_comment(relation, relation_comment) -%} | ||
{{ exceptions.raise_not_implemented( | ||
'alter_relation_comment macro not implemented for adapter '+adapter.type()) }} | ||
{% endmacro %} | ||
|
||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snowflakeseitz Can you remove these, or the ones from the bottom of the file and the comment on the other ones? I merged the redshift/postgres PR and I must have messed it up, git didn't detect the duplicate changes as I expected it to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I just rebase to your changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! I think you'll have to remove them during the rebase anyway.
{{ return(none) }} | ||
{% endif %} | ||
|
||
{% endmacro %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as my above comment applies here - I think these two are the only overlap!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the discussion above @beckjake. I'm on board with shipping the implementation as-is for 0.17.0. Given your finding in #2378 on the redshift/pg version, I think we should reassess where these comment
statements best belong—within create_table_as
or as implicit hooks—as part of revisiting materializations more broadly for 0.18.0.
d47fb6f
to
ffaaacc
Compare
@drewbanin this has been rebased to @beckjake most recent changes! Hopefully, we can get this merged soon! |
I'll kick off the test suite now, assuming it passes I'll merge it! |
resolves:
persist_docs
config for columns for all core databases #1722Description
Adding in support for Persist Docs into the Snowflake adapter and laying out a general framework for other adapters to support this feature.
Some caveats:
Checklist
CHANGELOG.md
and added information about my change to the "dbt next" section.