[CT-1772] [Feature] Unexpected DuplicateProjectDependencyError in git packages with common transitive dependency
#6552
Comments
@GiorgioBaldelli Thanks for opening! There have been a handful of issues similar to this one, although none that's exactly the scenario you're outlining here:
That's more or less correct. The simple fact is, there are more user-friendly things we can do when you specify a Hub
As @dbeatty10 mentioned in #6502 (comment), you can optionally host the dbt package hub yourself, using these instructions, and set the
In the meantime, since you know that Project A depends on Project B which depends on
# project_a/packages.yml
packages:
  # this will be installed via project_b
  # - git: "https://gitlab.com/example/example_package.git"
  #   revision: main
  - git: "https://gitlab.com/example/project_b.git"
    revision: main
# project_b/packages.yml
packages:
  - git: "https://gitlab.com/example/example_package.git"
    revision: main
I'm going to reclassify this from a |
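The workaround above amounts to dropping any direct entry that a dependency already pulls in. Here is a minimal Python sketch of that check, assuming the packages.yml contents have already been loaded into plain dicts. The helper names `normalize` and `redundant_git_entries` are illustrative and not part of dbt, and treating URLs with and without a trailing `.git` as equal is an assumption of this sketch.

```python
from __future__ import annotations


def normalize(url: str) -> str:
    """Treat 'repo' and 'repo.git' as the same package (an assumption of this sketch)."""
    url = url.rstrip("/")
    return url[:-4] if url.endswith(".git") else url


def redundant_git_entries(root: dict, dependencies: list[dict]) -> list[str]:
    """Return git URLs listed in the root project that a dependency also lists."""
    transitive = {
        normalize(pkg["git"])
        for dep in dependencies
        for pkg in dep.get("packages", [])
        if "git" in pkg
    }
    return [
        pkg["git"]
        for pkg in root.get("packages", [])
        if "git" in pkg and normalize(pkg["git"]) in transitive
    ]


# Hypothetical in-memory versions of the two packages.yml files from the thread.
project_a = {
    "packages": [
        {"git": "https://gitlab.com/example/example_package.git", "revision": "main"},
        {"git": "https://gitlab.com/example/project_b.git", "revision": "main"},
    ]
}
project_b = {
    "packages": [
        {"git": "https://gitlab.com/example/example_package.git", "revision": "main"},
    ]
}

# Entries in project A that could be commented out because project B provides them.
print(redundant_git_entries(project_a, [project_b]))
```

A check like this only looks one level deep; deeper dependency chains would need the packages.yml of every transitive project as input.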
Thanks for the response, jtcohen6. Knowing that it’s possible to self-host dbt package hub is useful information.
That would be a straightforward solution, I agree. In our case, it’s not as straightforward unfortunately: as a part of our dbt project deployment process, we apply some generic, strict linting rules to check if |
Without access to the Hub API in a production or CI environment, it is essential to have this issue sorted, or else we are forced to use only a single package. |
It would be good to understand why this has been marked as wontfix, as the ability to use git repos inside the packages.yml file seems to be a replacement for the Hub API or local packages. |
Hi Jeremy!
I am going to try installing the hub locally. Implementing git package deduplication would really be a game changer, because for now this breaks dbt's incredible modularity. How is life in Marseille? Candlemas is coming soon, the time to walk up the rue Sainte, pray to the Black Madonna and enjoy the still-warm "navettes" of Saint-Victor! Greetings! (@jtcohen6 ) |
We closed this as But I was mistaken! dbt does (tries its best to do) exactly this: Lines 60 to 85 in a8abc49
Namely, dbt does use dbt-core/core/dbt/deps/resolver.py Lines 123 to 133 in a8abc49
So then the question is: Why are you seeing the duplicate project name error, which we explicitly check for & raise there? When I try to reproduce the actual issue locally, to pinpoint where we'd need to make the change, I can't. It's working fine for me:
packages:
  - git: "https://github.com/fivetran/dbt_hubspot_source"
    revision: v0.3.0
  # this is also a dependency of fivetran/dbt_hubspot_source@v0.3.0
  - git: "https://github.com/fivetran/dbt_fivetran_utils.git"
    warn-unpinned: false
If I drop a breakpoint between these two lines, all looks good:
@GiorgioBaldelli @xesf @fabrice-etanchaud Any chance you're doing something fun, like using Windows? (Fabrice, it's been a long time! All is well in Marseille; I hope the same goes for Niort) |
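To make the deduplication idea concrete, here is a minimal Python sketch of merging git packages that point at the same repository. It is not dbt's resolver code: the `GitSpec` class, the `dedupe` function, and the rule of treating URLs with and without a trailing `.git` as the same package are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GitSpec:
    """Hypothetical stand-in for one `git:` entry in packages.yml."""
    git: str
    revision: Optional[str] = None

    @property
    def key(self) -> str:
        # Assumption: 'repo' and 'repo.git' identify the same package.
        url = self.git.rstrip("/")
        return url[:-4] if url.endswith(".git") else url


def dedupe(specs: list[GitSpec]) -> list[GitSpec]:
    """Merge entries for the same repository; fail on conflicting revisions."""
    merged: dict[str, GitSpec] = {}
    for spec in specs:
        seen = merged.get(spec.key)
        if seen is None:
            merged[spec.key] = spec
        elif seen.revision is None:
            # An unpinned entry adopts the revision of a later pinned one.
            merged[spec.key] = GitSpec(seen.git, spec.revision)
        elif spec.revision not in (None, seen.revision):
            raise ValueError(
                f"{spec.key}: conflicting revisions "
                f"{seen.revision!r} and {spec.revision!r}"
            )
    return list(merged.values())


specs = [
    GitSpec("https://gitlab.com/example/project_b.git", "main"),
    GitSpec("https://gitlab.com/example/example_package.git"),      # listed directly, unpinned
    GitSpec("https://gitlab.com/example/example_package", "main"),  # pulled in via project_b
]
result = dedupe(specs)
for spec in result:
    print(spec.git, spec.revision)
```

Two entries with different pinned revisions raise an error here because a git revision, unlike a Hub version range, gives nothing to resolve against.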
Ahah, fun on Windows! Shame on my company, because I have to use Windows there, although all my family's PCs are running Linux Q4OS! I'll try it on Linux and let you know... |
@jtcohen6 thanks for looking into this! Is it safe to assume then that this part of the documentation is not entirely accurate: https://docs.getdbt.com/docs/build/packages#hub-packages-recommended
|
Yes! It works. Thank you, Jérémy. |
It does sound like it's worth updating that documentation—although it remains true that only Hub packages allow you to specify more-complex version requirements, with version resolution. With
That sounds like confirmation this issue is Windows-only? My guess would be, something to do with lacking the right file permissions to remove/overwrite the already-installed package? |
@jtcohen6 I was using mac m1. |
Same here. In my attempts, I was running my dbt commands on a container application running on a mac m1. I was mounting a local directory that contained the dbt project definition, models, package config, etc. |
Hi all, I am experiencing the same issue using the "git" reference in packages.yml. However, I actually have a different repository design that I want to implement, which seems to hit a limitation outside of GitHub (my team is using Azure DevOps Git repositories, which is why we rely on the "git" package tag instead of "hub" or "package").
Project A (common repo):
Project B (application specific):
Project C (application specific):
Project D (documentation for all applications):
As of right now, this type of implementation throws an error stating "Found a dependency with the same name as the root project <>. Package names must be unique in a project. Please rename one of these packages."
My second idea was to combine the common repo (Project A) and the documentation repo (Project D) into one, but the underlying issue remains: Projects B and C need to import from this new combined repo, while the combined repo also needs to import from B and C (duplicate dependency), which throws the same error.
Reading the comments above, it seems that GitHub resolves this issue due to its inherent ability to handle duplicate dependencies. However, my enterprise is not able to use GitHub; we have to use Azure DevOps, hence the need for the "git" package tag and "revision" qualifier. If the "git" qualifier could behave the way the "hub" or "package" qualifier does, or if there were a way to instruct dbt how to handle the dependencies and/or ignore ones that already exist, I think this approach could work. Very unfortunate as of right now :/
Last thing to note: there is very little documentation on investigating the OpenSSL logs that the "dbt deps" command uses, which makes it hard to troubleshoot why dbt loops on authentication, repeatedly asking for your username/personal access token for Azure DevOps, only to fail (seemingly due to a certificate/proxy issue). You simply get a generic SSL error stating it was "unable to checkout spec=None", while sometimes stating "unable to checkout spec=" and still failing. Brute-force retries eventually succeed without any apparent change, which makes the failure nearly impossible to reproduce reliably. |
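The combined-repo layout described above, where the combined repo imports from B and C while B and C import from it, reads as a circular dependency rather than a plain duplicate. Here is a minimal Python sketch for spotting such cycles before running dbt deps, assuming a hand-written map from each project name to the projects it imports. The project names and the `find_cycle` helper are hypothetical, and this is not how dbt itself detects or reports the error.

```python
from __future__ import annotations


def find_cycle(graph: dict[str, list[str]]) -> list[str] | None:
    """Return one import cycle as a list of project names, or None if there is none."""
    path: list[str] = []    # projects on the current depth-first path
    done: set[str] = set()  # projects fully explored without finding a cycle

    def visit(node: str) -> list[str] | None:
        if node in done:
            return None
        if node in path:
            # Reached a project already on the current path: that closes a cycle.
            return path[path.index(node):] + [node]
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for node in graph:
        cycle = visit(node)
        if cycle:
            return cycle
    return None


# Hypothetical layout: the combined repo and the application repos import each other.
layout = {
    "combined": ["project_b", "project_c"],
    "project_b": ["combined"],
    "project_c": ["combined"],
}
print(find_cycle(layout))
```

If a cycle is reported, one way out is to move the shared pieces into a project that imports nothing from the application repos, so that every import points in one direction.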
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days. |
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers. |
Is this a new bug in dbt-core?
Current Behavior
Let's say we have two projects, project A & project B.
Both projects A and B import the same, custom dbt package from a git repository, let's call it example_package.
If we now modify project A's packages.yml to import project B, the package configurations would look like this:
Project A:
Project B:
If we now attempt to run any dbt command on project A, for example dbt deps, we get the following error:
Found duplicate project "example_package". This occurs when a dependency has the same project name as some other dependency.
Is there a way to handle duplicate packages when using the git syntax to import custom, non-standard packages?
It looks like dbt may be able to handle duplicates only if we attempt to import standard packages from dbt hub? The recommended solution in this issue does not address situations in which the package is custom and does not therefore exist on dbt's package hub.
Expected Behavior
dbt knows how to handle duplicate package import using the git syntax.
Steps To Reproduce
Use the example package definition & attempt to run dbt deps.
Relevant log output
No response
Environment
No response
Which database adapter are you using with dbt?
No response
Additional Context
No response