Skip to content
This repository has been archived by the owner on Dec 15, 2023. It is now read-only.

Rework dbt-adapter-tests to improve the experience of adapter development #25

Open
jtcohen6 opened this issue Sep 19, 2021 · 3 comments

Comments

@jtcohen6
Copy link
Contributor

While trying (unsuccessfully) to get the bottom of the hangup described in #24, @swanderz and colleagues provided some great feedback about how these tests could better aid with the experience of creating, publishing, and maintaining an adapter.

Problems

  • We really need better documentation at a bare-bones, tactical level about how to build an adapter, beyond the high-level overview that our current docs offer
  • Building an adapter today presumes deep practical knowledge of dbt-core. We should seek to (a) make that expectation explicit, and (b) ease that learning curve
  • Any testing suite needs to err on the side of more explicit code and more explicit logging. We shouldn't optimize for fewest lines of code in the adapter repo, we should optimize for maintainer understanding

UX

I believe it's helpful to separate the experience of creating a new adapter into four separate steps:

  1. Bootstrap working adapter: Implement just enough properties and methods on top of the base adapter to get something usable. While the create_adapter_plugins.py script is a nice shortcut to the right file structure, we're missing:

    • Clarity in the base adapter about abstract methods, properties, classes, such as could be achieved via docstrings. (Or: should we say explicitly that dbt-postgres is the model implementation, whether or not you subclass it?)
    • Clear pointers to where/how adapter code hooks into dbt-core "global" code, e.g. the macros in include/global_project
    • Fine-grained functional unit tests for each of the required methods (and macros). The first most obvious one: Is connections.py working, such that dbt and my database speak SQL to each other?
  2. Learning core concepts: Now that the adapter has been bootstrapped, and is usable at a basic level, use it alongside a toy project (e.g. jaffle_shop) to learn about basic dbt operations. What's a model, what's a test, what are the most common tasks (run, test)? What's the actual SQL that dbt is sending for each of these? An experienced dbt-core user can skip this step, but someone brand new to dbt should expect to spend some time here.

  3. Check for functional parity: dbt Labs should maintain a suite of integration tests that check for lowest-common-denominator functionality. Once an adapter passes all of those tests (or has a good reason not to), we can feel good about publishing it and kicking off a beta. (This step is closest to the aim of the current test suite, but the current suite makes it very hard to debug or introspect failures.)

  4. Extending the test framework: There are always going to be adapter-specific functional pieces that won't be covered by the generic suite. Additionally, once an adapter is published, there will be bugs and community contributions that warrant additional testing. It's crucial that the Labs-supported test suite exposes enough foundational constructs to make it easy to write totally new custom integration tests (similar to what we've done in dbt-spark). The interfaces for writing those tests should feel similar to the interfaces when implementing/overriding the suite's built-in tests. We should not optimize for fewest/subtlest lines of code. Instead, we should leverage well-established pytest constructs, and aim for clarity and transparency, even at the cost of verbosity.

@dataders
Copy link

dataders commented Sep 19, 2021

shout out to all our custom adapter developers & maintainers:
@JLDLaughlin @mikaelene @tglunde @techindicium @silentsokolov @findinpath @dbeatty10 @codeforkjeff @fabrice-etanchaud @kaysef @Tomme @aurany

If y'all have any particular pain points about the adapter creation process, this would be a good forum to do so!

@findinpath
Copy link

I had a rather easy job, because I built dbt-trino on top of dbt-presto adapter.

I just went over https://docs.getdbt.com/docs/contributing/building-a-new-adapter site.

I think it would be a nice addition to provide a small playground project with multiple branches where each of the concepts described in the site is implemented for a simple adapter.
Going from one step to the next (by switching the branches) gives a good clue about the actual sequence of steps to implement in order to provide the basic functionality for a new adapter.

Getting together all the available community adapters (with links to their project code) of dbt may be helpful.
If I have a problem that I don't know how to solve, maybe I can see in another available dbt adapter how they tackled eventually my problem.
https://github.com/Tomme/dbt-athena is the dbt adapter for AWS Athena, but it is not listed under the available adapters.

Another helpful thing for dbt

@codeforkjeff
Copy link

codeforkjeff commented Sep 20, 2021

Definitely agree with the need for improved steps 1, 3, 4. I don't know that 2 is really necessary: if you're writing an adapter, you probably already are familiar with the basics of dbt.

Some random feedback:

  • The pytest output for failures is very difficult to decipher. I usually have to guess.

  • Some documentation to clarify the relationship between python adapter methods and jinja macros would be helpful. I've spent a lot of time trying to understand this and I'm still not sure I have it 100% down. Many adapter methods simply call macros of the same name; in some cases, there's a bit of wrapper code around the macro invocation to do some other stuff.

    When writing an adapter, it may make sense to override the adapter method (bypassing the macro), instead of overriding the macro. This is true for stuff in dbt-sqlite, but it's definitely a weird case.

    EDIT: I see that the above has already been implemented on the docs website; it's been a while since I've looked at it and didn't even realize there's a whole new set of pages for how to build an adapter. Very nicely done!

  • I don't think there's currently a test to exercise "docs generate." dbt-sqlite was passing the test suite but a user reported a crash when running "docs generate" because there was a bug in my override of the _get_one_catalog method, which is probably unusual for an adapter. Still, I think it's a code path worth exercising.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants