Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-45] [Feature] Expose Snowflake query id in case of dbt test failure #82

Closed
1 task done
tommyh opened this issue Jan 15, 2022 · 6 comments
Closed
1 task done
Labels
Stale type:enhancement New feature or request

Comments

@tommyh
Copy link

tommyh commented Jan 15, 2022

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

When we get a dbt test failure, I often look through the debug logs to find the exact query which was run. Having the dbt_internal_test query with failures, should_warn, and should_error are an amazing first step, but the issue is that when I rerun the query, it often passes. If the debug logs (or the friendly logs) contained the sfqid, then I could easily modify the query to use snowflake's timetravel feature, so I can recreate the failing test and debug it further.

Right now, I see the following info is logged:

022-01-14 00:14:37.419997 (Thread-39): 00:14:37  On test.[redacted].c630bf19e2: /* {"app": "dbt", "dbt_version": "1.0.1", "profile_name": "user", "target_name": "default", "node_id": "[redacted]"} */

Related issue: dbt-labs/dbt-core#2358

Describe alternatives you've considered

Right now, I look through the snowflake query history to find the sfqid, which works but is a bit tedious.

Who will this benefit?

Analytics engineers running dbt tests being able to track down issues faster. Speed is critical because the standard edition of snowflake only allows for 1 day of timetravel, so you need to debug it quickly.

The easier it is to track down intermittent test issues the better, so that people are encouraged to write more.

Are you interested in contributing this feature?

Sure! I have a bit of experience with python (mostly ruby/javascript), but could figure it out if I know:

  • people think this is a good idea (so it's not a waste of everyone's time)
  • a point in the right direction

Anything else?

I'm loving dbt! Thanks for all of the hard work.

@github-actions github-actions bot changed the title Jan 15, 2022
@jtcohen6
Copy link
Contributor

jtcohen6 commented Jan 17, 2022

@tommyh Thanks for opening, and for linking the prior art in dbt-labs/dbt-core#2358!

I think there's some additional overlap with the proposal in #7, which would include the Snowflake query id in the result object / run_results.json. I think that would work well for test results, including test failures, since dbt's results object colocates a lot of useful contextual information for tracking test results over time. A log message makes more sense to me in the case where dbt encounters an exception that requires immediate intervention, and which might prevent it from writing metadata artifacts.

I'm going to transfer this issue over to the dbt-snowflake repo, since that's where any associated code change would need to happen. We might also decide that the proposal in #7 / implementation in #40 gets you enough of the way there.

@jtcohen6 jtcohen6 transferred this issue from dbt-labs/dbt-core Jan 17, 2022
@jtcohen6 jtcohen6 added the type:enhancement New feature or request label Jan 17, 2022
@github-actions
Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@smitthakkar1
Copy link

@jtcohen6 thank you for your answer above. I upgraded to dbt-snowflake 1.2.0 (same as dbt) and have access to query_id under adapter_response of result object but I get null query id when a query fails and it shows query id only for successful model run. Is that expected ?

@jtcohen6
Copy link
Contributor

jtcohen6 commented Nov 15, 2022

@smitthakkar1 you're right! That's because adapter_response is actually never populated for failing queries. We can take one of two approaches here:

I prefer the second approach if possible; I'll open a dbt-core issue about this.

Update: Just kidding: We already do this, in dbt-snowflake, just in the same way we do in dbt-bigquery! This is an official pattern core adapter interface allows an adapter to define a custom exception_handler. Even though the query_id doesn't appear in run_results.json, it does still appear in the debug-level logs (... though not in structured form ...):

@contextmanager
def exception_handler(self, sql):
try:
yield
except snowflake.connector.errors.ProgrammingError as e:
msg = str(e)
logger.debug("Snowflake query id: {}".format(e.sfqid))
logger.debug("Snowflake error: {}".format(msg))

@smitthakkar1
Copy link

@jtcohen6 Thanks for your reply.

We already do this, in dbt-snowflake. Even though the query_id doesn't appear in run_results.json, it does still appear in the debug-level logs

Any idea on how to access those debug level logs via dbt template and print it ?

@jaypeedevlin
Copy link
Contributor

Given this is already in the debug logs, would y'all be open to optionally appending the query ID to the error message based on some config or env var?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stale type:enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants