-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RPC: Error handling improvements #1341
RPC: Error handling improvements #1341
Conversation
b397868
to
e27a1d8
Compare
84dad1b
to
5316cd4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logging logistics here largely look alright to me, but it would be good to also get @cmcarthur's eyes on the queue/multiprocessing aspects of this.
I have some immediate thoughts after playing around with this, and would love to discuss feasibility before we make any of these changes. I had a hard time responding to this PR, since I:
- don't have a great handle on exactly what the requirements are yet (though this is very close, and I feel well equipped to solidify these reqs in short order)
- don't have great insight into how feasible the things i think we need are
I wrote the the rest of this comment as a series of demands, as it was a little more straightforward than hedging everything with "i think... if it's not infeasible..." :). So, let's definitely discuss what our options and needs are before proceeding with any of the things I've said below!
Thoughts:
- We should make
logs
a dict with three elements:debug
,info
, andmanual
- The existing
logs
array should becomelogs['debug']
- We should siphon off info-level logs into the
info
array - Only the logs generated by the
{{ log(..., info=true) }}
context function should be routed to themanual
array
The manual
logs will be useful for macros that spit out textual information (like a describe
macro which returns a string of descriptive statistics about a table).
Instead of re-routing the info-level {{ logs() }}
calls in the rpc server, we could alternatively make a print()
context function for this purpose. I don't feel strongly about that either way! These functions should work sensibly in CLI or RPC contexts.
Other random thoughts:
- We currently drop a backtrace into the debug logs which is good, but we'll want to tidy that up for info/manual level logs if possible
- Can we nix the time string for info/manual logs?
- What should we do about warnings? I know we shelved some warning conversations here.... let's possibly revisit. I can imagine wanting to see these in the "manual" output too...
The distinction between info and manual logs:
Compiling a SQL string that calls a macro or two results in info-logs like:
2019-03-07 21:08:17,991 (MainThread): Found 13 models, 2 tests, 1 archives, 0 analyses, 204 macros, 0 operations, 2 seed files, 1 sources, 1 None
There's limited utility to info-level logs like these being shown in an interactive dbt client session. A separate array of "manual" logs lets us preserve these logs for debugging use cases, but hide them in the typical compile/execute case.
Things for us to consider:
- Are there system-generated info-level logs besides warnings that we should return to dbt clients in interactive sessions? Is it easier to circumvent the one
Found 13 models, 2 tests....
log line in the RPC context and instead just return info-level logs to the client in general? Is this something we'd feel good about maintaining long-term? - Is
{{ log(...) }}
the way we want to surface user-generated information from macros in interactive sessions?
Does this mean we should duplicate logs, or that users have to manually reconcile ordering between the
What do you mean by this? What exactly output do you want from an exception stack trace and where do you want to see it? Do you want them in the
That sounds like it would be harder for no benefit? Currently I just added a log handler, so anything that would get logged goes there and gets sent over the queue. I assume
Yeah. In |
All dbt errors now have proper error codes/messages The raised message at runtime ends up in result.error.data.message The raised message type at runtime ends up in result.error.data.typename result.error.message is a plaintext name for result.error.code dbt.exceptions.Exception.data() becomes result.error.data Collect dbt logs and make them available to requests/responses
👍 @beckjake what does the typical INFO output from a |
@cmcarthur The only INFO line I see is: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good. make a MessageType class instead of using random strings :) also from reading this it looks to me like concurrent requests will work great. have you tested that?
core/dbt/task/rpc_server.py
Outdated
if error is not None: | ||
self.queue.put(['error', error.error]) | ||
else: | ||
self.queue.put(['result', result]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you enumerate the message types in a separate class?
exceeded, raise an RPCTimeoutException. | ||
""" | ||
while True: | ||
get_timeout = self._next_timeout() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the timeout solution here is great.
core/dbt/task/rpc_server.py
Outdated
self.process.join() | ||
|
||
self.process = None | ||
self.queue = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these two lines significant from a GC perspective? if so it should probably go into the finally
block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, they're not significant, I should just remove them.
5316cd4
to
3360aa3
Compare
3360aa3
to
7e18128
Compare
Fixes "called by <Unknown>"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM! The rpc CLI client was super helpful here. Ship it
Fixes #1309
Fixes #1310
All dbt errors now have proper error codes/messages
The raised message at runtime ends up in result.error.data.message
The raised message type at runtime ends up in result.error.data.typename
result.error.message is a plaintext name for result.error.code
dbt.exceptions.Exception.data() becomes result.error.data
Include logs in the RPC responses - for errors under
response.error.data.logs
and for results underresponse.result.logs
. The results are enqueued back to the main process, so even timeouts get logs.I refactored safe_run by just chunking everything up as part of this, as I wanted to override particular parts of the error handling chain (mostly to keep raising errors farther and father up the stack of
safe_*
)