Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2735] [Bug] Compile query error doesn't include node information #7940

Closed
2 tasks done
ChenyuLInx opened this issue Jun 23, 2023 · 3 comments · Fixed by #7960
Closed
2 tasks done

[CT-2735] [Bug] Compile query error doesn't include node information #7940

ChenyuLInx opened this issue Jun 23, 2023 · 3 comments · Fixed by #7960
Assignees
Labels
bug Something isn't working logging
Milestone

Comments

@ChenyuLInx
Copy link
Contributor

ChenyuLInx commented Jun 23, 2023

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

This might be one symptom of a more general error handling problems in dbt-core.
The specific situation happened here is that when compiling a inline dbt sql, and run into an execption, dbt will fire an event that doesn't contain information about what node run into this issue. See Steps to Repro for command.

This is because uncatched exceptions are catched in requires.postflight linked below instead of in the place where we compile query.

except DbtException as e:

Expected Behavior

I would expect the fired event contains information about which node run into issue. for example, maybe the json log looks like

{
    "data": {
            "exc": "Compilation Error in sql operation inline_query (from remote system.sql)\n  'a' is undefined. This can happen when calling a macro that does not exist. Check for typos and/or install package dependencies with \"dbt deps\"."
            "unique_id": "sqloperation.jaffle_shop.inline_query",
        },
    "info": {
            "category": "",
            "code": "Z002",
            "extra": {},
            "invocation_id": "780f6edf-c3f2-43a9-a742-eeefd1619c80",
            "level": "error",
            "msg": "Compilation Error in sql operation inline_query (from remote system.sql)\n  'a' is undefined. This can happen when calling a macro that does not exist. Check for typos and/or install package dependencies with \"dbt deps\".",
            "name": "compile error",
            "pid": 464,
            "thread": "somethread",
            "ts": "2023-06-16T19:58:11.410810Z"
    }
}

Current issue is because we do not have much exception handling logic in various tasks in general. This is a dbt-core general problem.
But specifically for this case, we can make sure we handle exception and fire proper event then re raise another exception at

compiler = self.adapter.get_compiler()

and
block_parser = SqlBlockParser(

to fire event that contains node information in place.

Steps To Reproduce

  1. in a jaffle_shop project
  2. run
    dbt --log-format json compile  --inline "select * from {{ref(1)}}"
    or
    dbt --log-format json compile --inline "select * from {{ref("model_not_exisit")}}"
  3. The last json formatted log is
{
    "data": {
        "exc": "Compilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>"
    },
    "info": {
        "category": "",
        "code": "Z002",
        "extra": {},
        "invocation_id": "dcbb5b3d-79f5-4a6d-85c2-65e589c09ae5",
        "level": "error",
        "msg": "Encountered an error:\nCompilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>",
        "name": "MainEncounteredError",
        "pid": 68385,
        "thread": "MainThread",
        "ts": "2023-06-23T16:47:56.709209Z"
    }
}
@ChenyuLInx ChenyuLInx added bug Something isn't working triage labels Jun 23, 2023
@github-actions github-actions bot changed the title [Bug] Compile query error doesn't include node information [CT-2735] [Bug] Compile query error doesn't include node information Jun 23, 2023
@dbeatty10
Copy link
Contributor

Thanks for reporting this @ChenyuLInx ! Do you have both the interest and bandwidth to tackle this, or are you hoping another member of the team can dive into it?

@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 24, 2023

The exception itself has an optional node property:

self.node = node

Not every exception will be associated with a node, but many of them will.

Currently, MainEncounteredError is just including the stringified representation of the exception:

except FailFastError as e:
fire_event(MainEncounteredError(exc=str(e)))
raise ResultExit(e.result)
except DbtException as e:
fire_event(MainEncounteredError(exc=str(e)))
raise ExceptionExit(e)
except BaseException as e:
fire_event(MainEncounteredError(exc=str(e)))
fire_event(MainStackTrace(stack_trace=traceback.format_exc()))
raise ExceptionExit(e)

We could also include other structured attributes of the node, above all its unique_id:

        except DbtException as e:
            node_unique_id = None
            if hasattr(e, "node"):
                unique_id = e.node.unique_id
            fire_event(MainEncounteredError(exc=str(e), unique_id=unique_id))
            raise ExceptionExit(e)

This also requires updating the proto contract of MainEncounteredError to be:

// Z002
message MainEncounteredError {
    string exc = 1;
    string unique_id = 2;
}
{
  "data": {
    "exc": "Compilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>",
    "unique_id": "sqloperation.my_dbt_project.inline_query"
  },
  "info": {
    "category": "",
    "code": "Z002",
    "extra": {},
    "invocation_id": "00255fea-e009-4830-8b46-417bb54fda67",
    "level": "error",
    "msg": "Encountered an error:\nCompilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>",
    "name": "MainEncounteredError",
    "pid": 79540,
    "thread": "MainThread",
    "ts": "2023-06-24T14:48:44.520796Z"
  }
}

Better! node_info

        except DbtException as e:
            node_info = {}
            if hasattr(e, "node"):
                node_info = e.node.node_info
            fire_event(MainEncounteredError(exc=str(e), node_info=node_info))
            raise ExceptionExit(e)
// Z002
message MainEncounteredError {
    string exc = 1;
    string unique_id = 2;
}
{
  "data": {
    "exc": "Compilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>",
    "node_info": {
      "materialized": "view",
      "meta": {},
      "node_finished_at": "",
      "node_name": "inline_query",
      "node_path": "sql/inline_query",
      "node_relation": {
        "alias": "inline_query",
        "database": "jerco",
        "relation_name": "",
        "schema": "dbt_jcohen"
      },
      "node_started_at": "",
      "node_status": "None",
      "resource_type": "sqloperation",
      "unique_id": "sqloperation.my_dbt_project.inline_query"
    }
  },
  "info": {
    "category": "",
    "code": "Z002",
    "extra": {},
    "invocation_id": "6ff4c60a-0021-491d-8409-9b1610f56348",
    "level": "error",
    "msg": "Encountered an error:\nCompilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>",
    "name": "MainEncounteredError",
    "pid": 95528,
    "thread": "MainThread",
    "ts": "2023-06-26T09:14:10.393510Z"
  }
}

Out of scope(?): full exception object

We've talked before about surfacing all structured attributes of the exception within log messages for those exceptions. Then exc_dict could simply be a proto struct, and this works!

It does mean that we're surfacing the full internal contents of the exception, but we're already doing that within programmatic invocations by including the full exception (Python) object in dbtRunnerResult.

I had imagined this is a bigger lift, and it is much riskier for serialization, so we don't have to consider it in scope for the narrow ask right now.

        except DbtException as e:
            # this is some jank
            exc_dict = {k: v.to_dict() if hasattr(v, "to_dict") else v for k, v in e.__dict__.items()}
            fire_event(MainEncounteredError(exc=str(e), exc_type=e.type, exc_dict=exc_dict))
            raise ExceptionExit(e)
// Z002
message MainEncounteredError {
    string exc = 1;
    string exc_type = 2;
    google.protobuf.Struct exc_dict = 3;
}
{
  "data": {
    "exc": "Compilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>",
    "exc_dict": {
      "msg": "The name argument to ref() must be a string, got <class 'int'>",
      "node": {
        "alias": "inline_query",
        "build_path": null,
        "checksum": {
          "checksum": "2c66619d39bc71e1b1334606d9fc45797146841f91cf6ee47502d6828134ee0b",
          "name": "sha256"
        },
        "columns": {},
        "compiled_path": null,
        "config": {
          "alias": null,
          "column_types": {},
          "contract": {
            "enforced": false
          },
          "database": null,
          "docs": {
            "node_color": null,
            "show": true
          },
          "enabled": true,
          "full_refresh": null,
          "grants": {},
          "group": null,
          "incremental_strategy": null,
          "materialized": "view",
          "meta": {},
          "on_configuration_change": "apply",
          "on_schema_change": "ignore",
          "packages": [],
          "persist_docs": {},
          "post-hook": [],
          "pre-hook": [],
          "quoting": {},
          "schema": null,
          "tags": [],
          "unique_key": null
        },
        "config_call_dict": {},
        "contract": {
          "checksum": null,
          "enforced": false
        },
        "created_at": 1687771170.900467,
        "database": "jerco",
        "deferred": false,
        "depends_on": {
          "macros": [],
          "nodes": []
        },
        "description": "",
        "docs": {
          "node_color": null,
          "show": true
        },
        "fqn": [
          "my_dbt_project",
          "sql",
          "inline_query"
        ],
        "group": null,
        "language": "sql",
        "meta": {},
        "metrics": [],
        "name": "inline_query",
        "original_file_path": "from remote system.sql",
        "package_name": "my_dbt_project",
        "patch_path": null,
        "path": "sql/inline_query",
        "raw_code": "select * from {{ref(1)}}",
        "refs": [],
        "relation_name": null,
        "resource_type": "sqloperation",
        "schema": "dbt_jcohen",
        "sources": [],
        "tags": [],
        "unique_id": "sqloperation.my_dbt_project.inline_query",
        "unrendered_config": {}
      },
      "stack": []
    },
    "exc_type": "Compilation"
  },
  "info": {
    "category": "",
    "code": "Z002",
    "extra": {},
    "invocation_id": "98a18ff5-67c7-4310-821b-aa2be247f65a",
    "level": "error",
    "msg": "Encountered an error:\nCompilation Error in sqloperation inline_query (from remote system.sql)\n  The name argument to ref() must be a string, got <class 'int'>",
    "name": "MainEncounteredError",
    "pid": 95807,
    "thread": "MainThread",
    "ts": "2023-06-26T09:19:30.903318Z"
  }
}

@ChenyuLInx
Copy link
Contributor Author

ChenyuLInx commented Jun 27, 2023

I went with option 2 but opted for a even narrower fix to fire an event when error happens and re-raise a general error for the main process to handle later on.
As I don't think we should be having main thread firing events that is for specific node. The thread running that node should be responsible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working logging
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants