Add performance profiling to dbt #1001

drewbanin · 2018-09-14T00:04:16Z

Feature

Feature description

Application performance is important, but it can be difficult to pin down the performance impact of code changes. Profiling will help us benchmark dbt runtimes, and then evaluate the performance impact of prospective code changes.

I'm sure there's plenty of low-hanging fruit, as we haven't dedicated a ton of brain cycles to performance recently. Adding profiling is the first step in any prospective performance optimization.

Who will this benefit?

Folks who use dbt; folks who have important things to do with their time; whiterose

cmcarthur · 2018-09-28T17:39:31Z

Test this on Windows.

drewbanin · 2018-11-14T18:55:52Z

@beckjake can you share a link to the profiler you mentioned earlier today? I'd be cool with either merging #1020 as-is, or potentially using a different profiler if it makes our lives easier.

beckjake · 2018-11-14T19:01:38Z

This is it: https://github.com/P403n1x87/austin. It's a stack profiler, which I think is probably ideal for our use cases. I'll do some testing. The big downside is that, due to the nature of it, we'd have to compile it ourselves or have users install it before they can profile. But at least it'll be correct in the presence of threads, which seems valuable.

beckjake · 2018-11-19T15:05:05Z

I've looked at it some more, and I think we should just merge #1020 with some minor modifications. In particular, I'd like to add a flag (--single-thread?, probably suppressed in --help output) that switches from a thread pool's map to the built-in map(), keeping all execution in the main thread. Obviously that'll be super slow, but at least we'll get actual timing information. We can then point people at austin if they want more detailed/useful threaded perf information, since it requires no actual Python imports, etc.

My reasoning here is that at some point (maybe even as soon as #1128 merges!) we'll have hit all the low-hanging fruit in parsing/setup and we'll want performance information from dbt execution, which #1020 can't currently give us.
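To make the single-thread idea above concrete, here is a minimal sketch of swapping a thread pool's map for the built-in map() under a flag. It is not dbt's actual code: `run_node`, `run_nodes`, `profile_run`, and the `single_thread` parameter are hypothetical names chosen for illustration, and cProfile is used only as an example profiler. The point is that with the built-in map() every node runs in the thread where the profiler was enabled, so its calls are reliably recorded.

```python
import cProfile
import pstats
import threading
from multiprocessing.pool import ThreadPool


def run_node(node):
    """Hypothetical stand-in for executing a single node."""
    # Return the thread name so callers can see where the work happened.
    return node, threading.current_thread().name


def run_nodes(nodes, single_thread=False, num_threads=4):
    """Run every node, in the calling thread or in a thread pool."""
    if single_thread:
        # Built-in map() is lazy; list() forces execution in this thread.
        return list(map(run_node, nodes))
    pool = ThreadPool(num_threads)
    try:
        # ThreadPool.map preserves input order but runs in worker threads.
        return pool.map(run_node, nodes)
    finally:
        pool.close()
        pool.join()


def profile_run(nodes, single_thread):
    """Profile run_nodes and return (results, names of profiled functions)."""
    profiler = cProfile.Profile()
    profiler.enable()
    results = run_nodes(nodes, single_thread=single_thread)
    profiler.disable()
    stats = pstats.Stats(profiler)
    # Keys of stats.stats are (filename, line number, function name) tuples.
    profiled_names = {key[2] for key in stats.stats}
    return results, profiled_names
```

Calling `profile_run(nodes, single_thread=True)` should show `run_node` among the profiled functions, at the cost of running everything serially. In the pooled mode the work happens in worker threads, where an in-process profiler enabled in the main thread may not attribute it correctly.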

drewbanin · 2018-11-19T15:14:20Z

Good call @beckjake. Do you feel well-equipped to pick up #1020?

There's some overlap in the --single-thread functionality you're describing with #813. I don't know that we have to do both of those simultaneously, but it could be useful to keep it in mind.

beckjake · 2018-11-19T15:16:22Z

Yeah, I'm doing it now (I think it will be very simple, I'll open a PR against the profiler branch soon).

I don't think #813 will interact with this, as what I'm planning should just preserve any existing order and not care about the node selector behavior at all (which is what I think matters here).

drewbanin mentioned this issue Sep 14, 2018

Slow performance when running dbt (in docker) #1002

Closed

cmcarthur self-assigned this Sep 20, 2018

cmcarthur added estimate: 4 and removed estimate: 4 labels Sep 28, 2018

cmcarthur added this to the Stephen Girard milestone Oct 17, 2018

drewbanin modified the milestones: Stephen Girard, Grace Kelly Nov 2, 2018

cmcarthur mentioned this issue Nov 14, 2018

dbt builtin profiler #1020

Merged

beckjake closed this as completed in #1020 Nov 20, 2018

drewbanin mentioned this issue Dec 8, 2018

Improve record keeping for resource timing #1179

Closed
