-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
inject query comments (#1643) #1864
Conversation
82bd794
to
fdef2f3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! The core query comment feature works great, and I feel so good about the changes you made to how the context is built up. Super nice work!
I have a couple of comments here, and the big one is around moving this config from profiles.yml to dbt_project.yml. My instinct is that this isn't a huge change, but definitely let me know if it's more involved than I'm picturing.
from dbt.contracts.graph.compiled import CompileResultNode | ||
|
||
|
||
default_query_comment = ''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we push this into the global project? Or is that a bad idea?
Also, let's make the default super simple. Can we just include:
- app='dbt'
- dbt_version
- node_id=node.unique_id
- target_name=target.name
These other fields are going to be really helpful in specific use-cases, but I don't think they need to be provided in the default implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we push this into the global project? Or is that a bad idea?
I think we can realistically do this, or we can move it to dbt_project.yml
, but not really both.
self.query_comment: str = initial | ||
|
||
def add(self, sql: str) -> str: | ||
# Make sure there are no trailing newlines. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could equivalently place this text inside of a block comment (/* ... */
) to remove this logic. I'm not super concerned that someone would provide a closing block comment in one of these comment attributes. I think a block comment makes more sense here because the resulting json block in the SQL query will still be machine-readable.
|
||
class QueryStringSetter: | ||
def __init__(self, config: HasCredentials): | ||
if config.config.query_comment is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we grab the query comment string from a config called query_comment
defined in the dbt_project.yml
file instead of profiles.yml
? My thinking here is twofold:
- I imagine this is something users will want to version control for all dbt users (both human and machine).
- If a user has multiple profiles defined in their
dbt_project.yml
file, then they'd be unable to use differentquery_comment
configs for each profile.
@@ -255,7 +255,10 @@ def from_raw_profile_info(cls, raw_profile, profile_name, cli_vars, | |||
target could not be found | |||
:returns Profile: The new Profile object. | |||
""" | |||
# user_cfg is not rendered since it only contains booleans. | |||
# user_cfg is not rendered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't totally follow what's going on here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
- user_cfg doesn't just contain booleans, so I fixed the comment
- this method is called directly by some unit tests, so it needs to try to extract the config from the raw_profile to get valid results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this have changed when we moved the query comment from the profile into the project? If this is fine, then it does not offend me, just want to make sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it was wrong before! user_cfg still contains non-booleans (the printer width, for example).
run_started_at = None | ||
invocation_id = None | ||
|
||
if dbt.tracking.active_user is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could have sworn we made a change a while back to include invocation_id and run_started_at even if users have opted out of anonymous event tracking. Is it a substantial change to make that the case here? I imagine invocation_id
and run_started_at
will both be super useful in the context of query comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did, and we do. active_user
is only None early on during initialization.
|
||
def get_target(self) -> Dict[str, Any]: | ||
target = dict( | ||
self.config.credentials.connection_info(with_aliases=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super smart to use connection_info
here - nice call
@@ -45,7 +45,8 @@ def type(self): | |||
return 'bigquery' | |||
|
|||
def _connection_keys(self): | |||
return ('method', 'database', 'schema', 'location') | |||
return ('method', 'database', 'schema', 'location', 'priority', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
I definitely don’t want to do both - this config should only be provided in
the dbt_project.yml file
On Tue, Oct 29, 2019 at 11:35 AM Jacob Beck ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In core/dbt/adapters/base/query_headers.py
<#1864 (comment)>
:
> @@ -0,0 +1,101 @@
+from threading import local
+from typing import Optional, Callable
+
+from dbt.clients.jinja import QueryStringGenerator
+
+from dbt.contracts.connection import HasCredentials
+# this generates an import cycle, as usual
+from dbt.context.base import QueryHeaderContext
+from dbt.contracts.graph.compiled import CompileResultNode
+
+
+default_query_comment = '''
can we push this into the global project? Or is that a bad idea?
I think we can realistically do this, *or* we can move it to
dbt_project.yml, but not really both.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#1864>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AALYIEZY7AH32QNYZQZVTNTQRBJ5HANCNFSM4JE36O5Q>
.
--
Drew Banin
Fishtown Analytics
|
fdef2f3
to
ae58199
Compare
3c377b3
to
e6daba3
Compare
Make a fake "macro" that we parse specially with a single global context Macro takes an argument (the node, may be none) Users supply the text of the macro in their 'user_config' under a new 'query_comment' No macros available query generator is an attribute on the connection manager - has a thread-local comment str - when acquiring a connection, set the comment str new 'connection_for' context manager: like connection_named, except also use the node to set the query string Updated unit tests to account for query comments Added a hacky, brittle integration test - log to a custom stream and read that Trim down the "target" context value to use the opt-in connection_info - Make sure it contains a superset of the documented stuff - Make sure it does not contain any blacklisted items Change some asserts to raise InternalExceptions because assert error messages in threads are useless
Add special handling to 'dbt debug' for this behavior Rework the dependencies/requirements for adapters since they now require more of a config object tests...
Fix the tests
df68f85
to
ab9fcb4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ship it!
Fixes #1643
In a user's dbt_project.yml, a new string field is accepted:
query-comment
. The string is used as the body of a macro that is evaluated with the following context:dbt_version
env_var
modules
run_started_at
invocation_id
return
fromjson
tojson
log
var
target
It will be provided with two arguments:
connection_name
, a stringnode
, which will either benull
or a compiled/parsed node object.The parsed macro is not available in any other context, and has no access to any macros.
Cleaned up contexts so things have a sensible inheritance pattern
query generator is an attribute on the connection manager
new 'connection_for' context manager: like
connection_named
, except also use the node to set the query stringconnection_named
sets the query string using the connection name only.