[CT-664] [Feature] copy local dbt deps deeply, without symlinking #5271
Comments
@alexrosenfeld10 Thanks for the really clear write-up, as always! We had a chance to discuss this a bit yesterday. dbt already creates "deep" links on Windows, which doesn't support real symlinking: see dbt-core/core/dbt/deps/local.py, line 50 and lines 58 to 64, at commit 75f3e8c.
We are a bit hesitant about mixing symlinks and "deep" links, since these could lead to situations that are confusing to debug, both for users and for us. The "deep" links on Windows are not preferable IMO, as they can cause other related errors (file permissions) when trying to clean/replace installed dependencies: #4372 (comment)
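To make the symlink vs. "deep" link distinction concrete, here is a rough sketch of the pattern being discussed: try a symlink first, and fall back to a full copy when the platform refuses. This is a simplified illustration and not the actual code in dbt-core; the function name `install_local_package` and its return values are hypothetical.

```python
import os
import shutil
from pathlib import Path


def install_local_package(src: Path, dest: Path) -> str:
    """Hypothetical helper: link `src` at `dest`, or deep-copy it if linking fails.

    Returns "symlink" or "copy" so the caller can see which path was taken.
    """
    # Clear whatever a previous install left behind. Check is_symlink() first,
    # because is_dir() follows symlinks and rmtree refuses to remove one.
    if dest.is_symlink() or dest.is_file():
        dest.unlink()
    elif dest.is_dir():
        shutil.rmtree(dest)

    try:
        os.symlink(src.resolve(), dest, target_is_directory=True)
        return "symlink"
    except OSError:
        # e.g. symlink creation not permitted for this user or filesystem
        shutil.copytree(src, dest)
        return "copy"
```

The two outcomes behave differently afterwards: a symlink always reflects the current state of the source project, while a copy is a snapshot that has to be removed and recreated on the next install, which is where the cleanup and permission issues mentioned above can come in.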
This feels reasonable to me... alternatively, could all of the projects exist at the same hierarchy level, rather than the one reused package living at the top level?
Out of curiosity, is the top-level project here a project of shared sources / upstream models (pointers to data warehouse objects)? Or shared macros (dbt source code) used by all of those sub-packages? If it's the former, I've got some bigger ideas of how we could better support the pattern of multiple projects, owned by different teams, that roll up to one mono-DAG / monorepo: #5244

The parallels to #4538 here are instructive, insofar as both of these issues are finding rough edges around
Thanks @jtcohen6, and same to you!
Yes, that's possible, but we have a project per domain, and each one is coupled with other per-domain entities, including some bespoke per-domain configurations (for example, which events we should stream into our data warehouse for this domain). I'd prefer to keep them all co-located in nested folders, since that works better for the various domain teams for things like PR review codeowners, etc.
I definitely agree with the ideas in #5244, but it's a big lift and there's a lot to do! I think adding support for "deep copy this package because I explicitly asked you to" type functionality could soften the rough edges in a presumably much shorter time frame, without ruling out the more compelling (but longer-term) answers. (I realize there are issues with Windows, but that seems to me like something to be debugged separately from this issue.)
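Until something like that exists, one way to approximate it is a post-processing step that runs after `dbt deps` and before publishing: replace each symlinked package with a real copy of its target. The sketch below is only an illustration under assumptions, not a dbt feature. The function name `materialize_symlinked_packages` is made up, and it assumes packages are installed into a single directory (for example `dbt_packages`, or whatever install path the project configures).

```python
import shutil
from pathlib import Path


def materialize_symlinked_packages(packages_dir: Path) -> list[str]:
    """Hypothetical helper: swap every symlink in `packages_dir` for a deep copy.

    Returns the names of the entries that were replaced.
    """
    replaced = []
    for entry in sorted(packages_dir.iterdir()):
        if not entry.is_symlink():
            continue  # git / hub packages are already real directories
        # strict=True raises FileNotFoundError on a dangling link instead of
        # silently producing an empty copy.
        target = entry.resolve(strict=True)
        entry.unlink()
        shutil.copytree(target, entry)
        replaced.append(entry.name)
    return replaced
```

Run against the sub-project's package directory in CI, this leaves an artifact that contains the dependency's files rather than a link pointing outside the published directory.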
btw, I decided to go with this as a workaround:

    packages:
      - git: "https://{{ env_var('DBT_ENV_SECRET_GITHUB_TOKEN') }}@github.com/my-company/my-data-mesh-project.git"
        warn-unpinned: false
        subdirectory: "my-reporting-team-internal/dbt/common_utils"

This means:

    # dbt will require this to be set when it parses the project because it appears in the packages.yml file
    # however, we don't need a real value because the dependencies are resolved in the CI
    # at time and are already part of the artifact instead of being resolved just prior to runtime.
    - name: DBT_ENV_SECRET_GITHUB_TOKEN
      value: "dummy-value"
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Is this your first time opening an issue?
Describe the Feature
Currently dbt deps does a few different things depending on the dependency type. For local dependencies, it creates symlinks pointing to the local dbt project. I have a monorepo use case where I'd like to publish each sub-project separately, but they all reference a top-level project. As a result, when I run dbt deps and then publish the directory, the deps aren't actually included in the artifact; instead, it's just a symlink. When I pull the artifact down and run it in my cron job, the symlink is (obviously) broken.

There should be an option on local type dependencies that overrides this behavior and actually copies the dependency in, similar to how the git / hub based packages work.

Describe alternatives you've considered
My workaround here is to publish the entire monorepo and then have my execution job navigate to the sub-project directory, OR to use GitHub PAT type auth for cloning. The PAT option is (slightly) less preferred because it means any of the developers in my org using this monorepo will have to set up their own PAT and env var in order to run the project, when in reality the dependency is already on their machine, in the repo itself 😆
Who will this benefit?
No response
Are you interested in contributing this feature?
Sure! If time allows.
Anything else?
No response