Handle multi-steps Custom DBT Transformations #5590
Comments
Btw @zestyping, I think you mentioned something on custom dbt transformations to john recently, this issue might be of interest to you as an FYI?
We have the same use case for this issue: Our models have a dependency on the dbt package I've used the workaround as described (compiling the package locally and then persisting the Our desired outcome (however it is achieved) is to be able to easily use packages and dependencies that require the I work with cc: @zestyping
Adding a bash script input to your custom operation should leave you the freedom to run multiple dbt commands for the same operation and sequence a dbt deps with a dbt run. In the meantime, you can follow that trick or let us know on Slack if we can help you get that working.
Any news on this? If I understand correctly, there's no way of running a DBT project with dependencies, which makes Custom DBT Transformations almost useless. According to dbt-labs/dbt-core#4784 and its comment, even if we create our own Docker image with
Does the trick with
@ChristopheDuong it works but it means that locally we have to work with a What if we use other tools using the same git repo that are not compatible with this workaround? Side note: in DBT >= 1.0.0 the default config has changed from https://docs.getdbt.com/reference/project-configs/packages-install-path
Hey 👋 just wanted to add here as we've also recently run across this particular issue since updating to dbt v1.0.
FYI, for the latest airbyte version (and with dbt >= v1.0.0), override this instead
This is very annoying.
We want to stay out of the business of orchestrating complex dbt workflows. We recommend using airflow or dagster to do this. Docs
Is this still the suggested workaround? I've generated and exported the dbt files. And I get this error:
It doesn't appear that
Tell us about the problem you're trying to solve
One typical use case for multi-step custom dbt transformations is a project that requires additional dbt packages. The user can create two (or more) custom transformations, where the first step installs the dependencies and the following ones actually run the transformations (using those dependency packages). However, this usually fails: the `dbt run` transformation reports that the dependencies are not installed.
(Note that custom transformations are not currently working in Kube deployments, see #5091, and multi-step runs across multiple pods are even more difficult.)
Describe the solution you’d like
Allow the user in the UI to specify a bash script to execute in a single operation instead of a single dbt command.
The user would therefore be able to configure a sequence of dbt commands (multi-step) within the same operation run instead of splitting them over multiple operations.
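As a rough illustration of what such a script could look like, here is a minimal sketch. It is not an existing Airbyte feature: the function name `run_dbt_steps` and the `DBT_BIN` override are hypothetical, introduced only so the sequence can be dry-run without dbt installed. The only dbt commands assumed are `dbt deps` and `dbt run`, as discussed in this issue.

```shell
#!/usr/bin/env bash
# Sketch of a multi-step script for a single custom operation.
# DBT_BIN is a hypothetical override (not an Airbyte or dbt setting);
# it defaults to the real dbt executable.
DBT_BIN="${DBT_BIN:-dbt}"

# Run each argument as one dbt invocation, in order,
# and stop at the first step that fails.
run_dbt_steps() {
  local step
  for step in "$@"; do
    echo "running: dbt ${step}"
    # ${step} is left unquoted on purpose so that a step such as
    # "run --select my_model" is split into separate arguments.
    "${DBT_BIN}" ${step} || {
      echo "step failed: dbt ${step}" >&2
      return 1
    }
  done
}
```

Calling `run_dbt_steps "deps" "run"` would then install the packages and run the models within the same operation, so the installed packages do not need to survive between two separate operations.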
Describe the alternative you’ve considered or used
A current trick (when running through docker, not kube) is to specify the following variable in the `dbt_project.yml` file of the user's custom step: `modules-path: "../dbt_modules"`
This makes sure that the first `dbt deps` step persists the cloned package in the workspace folder of the sync (outside of the `git_repo` folder), where it is accessible to a second `dbt run` custom transformation step.
Additional context
When building the normalization docker image:
`dbt_utils`: airbyte/airbyte-integrations/bases/base-normalization/dbt-project-template/packages.yml (Line 4 in c2aebd6)
`/tmp/dbt_modules` as specified here: airbyte/airbyte-integrations/bases/base-normalization/dbt-project-template/dbt_project.yml (Line 24 in c2aebd6)
airbyte/airbyte-integrations/bases/base-normalization/Dockerfile (Line 21 in c2aebd6)
As a result, the `dbt deps` command performs a `git clone` of the package (once) when building the docker image, as part of the Airbyte CI process when releasing a new docker image for normalization.
If the user re-uses the generated normalization project by exporting it and including it back as a custom step of the sync, this becomes confusing because the `/tmp/dbt_modules` directory is not persisted between two custom dbt transformation steps.
The solution in this scenario is for the user to tweak the exported project by editing the `dbt_project.yml` file to reflect the following change: `modules-path: "../dbt_modules"`.
This makes sure that the first `dbt deps` step persists the cloned package in the workspace folder of the sync (outside of the `git_repo` folder), where it is accessible to a second `dbt run` custom transformation step.
(Note that this change can't be made in the airbyte repository, or it would require running `dbt deps` every time a sync runs normalization to download the package into the sync workspace, instead of doing it only once at "compile" time of the docker image.)
Documentation should probably be updated to reflect these in #4351
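For anyone applying this tweak to an exported project, the edit could be scripted along these lines. This is only a sketch: the helper name `set_modules_path` is hypothetical, and it assumes the key sits at the top level of `dbt_project.yml` at the start of a line. The optional third argument is there because the comments note that the config changed in dbt >= 1.0.0 (see the `packages-install-path` docs linked in the thread).

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the package install directory in a dbt_project.yml.
#   $1: path to dbt_project.yml
#   $2: directory to install packages into, e.g. ../dbt_modules
#   $3: config key to set (defaults to modules-path)
set_modules_path() {
  local project_file="$1"
  local modules_path="$2"
  local key="${3:-modules-path}"
  if grep -q "^${key}:" "$project_file"; then
    # The key already exists: rewrite its value in place.
    # -i.bak is accepted by both GNU and BSD sed; the backup is removed afterwards.
    sed -i.bak "s|^${key}:.*|${key}: \"${modules_path}\"|" "$project_file"
    rm -f "${project_file}.bak"
  else
    # The key is missing: append it as a new top-level entry.
    printf '%s: "%s"\n' "$key" "$modules_path" >> "$project_file"
  fi
}
```

For example, `set_modules_path ./dbt_project.yml "../dbt_modules"` applies the change described above to the exported normalization project before it is committed to the git repository used by the custom transformation.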