Clickhouse profile mapping #353
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #353      +/-   ##
==========================================
+ Coverage   88.91%   89.07%   +0.15%
==========================================
  Files          39       41       +2
  Lines        1299     1318      +19
==========================================
+ Hits         1155     1174      +19
  Misses        144      144
|
This is looking great! Question for you - it looks like there’s a recommendation to use a SQLite connection for Clickhouse with Airflow (https://github.com/bryzgaloff/airflow-clickhouse-plugin). Do you think this should use generic or SQLite connection type? |
Wow! Clickhouse should definitely be supported! I'm voting up for this! |
Sorry for the intervention, but I don't think that using airflow-clickhouse-plugin for dbt-clickhouse is actually a good idea. Hi @silentsokolov. Could somebody from your company make a decision about the future of this feature? All of us would be thrilled to have dbt-clickhouse in airflow-cosmos :) |
I don't claim to fully understand this project :), but I think this PR correctly uses |
@jlaneve since Clickhouse has no official support in Airflow, it makes more sense from my perspective to use a generic HTTP connector instead of SQLite, which is somewhat hacky. This implementation looks very solid to me as of now. |
Hey folks, curious if we still want this profile mapping now that we have support for user-provided profiles. Also worth noting that the Airflow Clickhouse integration recommends using a SQLite connection:
from cosmos import ProfileConfig
from cosmos.profiles import ClickhouseUserPasswordProfileMapping

profile_config = ProfileConfig(
    profile_name="my_profile_name",
    target_name="my_target_name",
    profile_mapping=ClickhouseUserPasswordProfileMapping(
        conn_id="my_sqlite_connection",  # use sqlite because that's what airflow-clickhouse-plugin uses
        profile_args={
            "additional_arg": "my_value",
        },
    ),
) |
The sqlite conn and HTTP generic conn both have the same args if I am not wrong, therefore it is not a big deal to use one or another. In my current company we use the http conn for clickhouse to avoid mixing it up. In the end it is just a naming convention. |
    **self.profile_args,
}

return self.filter_null(profile)
Please, could you add the mock_profile function & a test for it?
https://github.com/astronomer/astronomer-cosmos/blob/3786703609e69c1e8f4b2db1475fe8b6ea00a117/cosmos/profiles/base.py#L97C7-L97C7
This was introduced after this PR was created and is now required.
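As a rough illustration of what such a property could look like, here is a minimal, self-contained sketch. It is not the Cosmos implementation: the class name `SketchProfileMapping`, the list of required fields, and the `"mock_value"` placeholder are all assumptions made for this example. The idea it shows is a profile built without a live connection, where each required field takes the user-supplied value if present and a placeholder otherwise.

```python
from typing import Any, Dict, List, Optional


class SketchProfileMapping:
    """Hypothetical stand-in for a profile mapping; not the real Cosmos base class."""

    dbt_profile_type: str = "clickhouse"
    # Assumed required fields, chosen for illustration only.
    required_fields: List[str] = ["host", "user", "schema", "clickhouse"]

    def __init__(self, profile_args: Optional[Dict[str, Any]] = None) -> None:
        self.profile_args = profile_args or {}

    @property
    def mock_profile(self) -> Dict[str, Any]:
        """Build a profile with placeholder values, without touching any connection."""
        profile: Dict[str, Any] = {"type": self.dbt_profile_type}
        for field in self.required_fields:
            # Prefer a value the user passed explicitly; otherwise use a placeholder.
            profile[field] = self.profile_args.get(field) or "mock_value"
        return profile


mapping = SketchProfileMapping(profile_args={"schema": "analytics"})
print(mapping.mock_profile)
```

A test for the real property would likely follow the same shape: instantiate the mapping with a few `profile_args` and compare `mock_profile` against the expected dictionary.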
airflow_connection_type: str = "generic"
default_port = 9000
Please, could you add the attribute:
is_community = True
@roadan, sorry for the delay. Could you please rebase this branch and address the comments?
@jlaneve, at least three community members would like this feature; what do you think about approving it once the feedback has been addressed and tests passed? It seems valuable, it is well documented and tested.
Would be very nice to see this PR merged. 👍 |
@roadan could I collaborate on the unresolved conversations? |
Hey @roadan, It looks like we're very close to merging! Could you please rebase and add the mock_profile property? Let me know if there's anything I can do to help. thanks! |
Hi @roadan, I was wondering if you had a chance to read my previous comment. It seems like this feature is very important for the community. If it's alright with you, can I take it forward from here? Thanks! |
@roadan @vargacypher @alexisvannier @CorsettiS @genzgd @epikhinm @Gaploid Since this work is still relevant and the PR has gone stale, @pankajastro is rebasing the original implementation in #1016, so we can get this work merged and released as part of Cosmos 1.5. If we release an alpha, would one of you be able to help us test this feature? |
I could help with testing, @tatiana |
This PR adds Clickhouse profile mapping using a `generic` connection type. To prevent cosmos from attaching all generic connections, it uses a required field named `clickhouse` mapped to `extra.clickhouse`. To ensure the profile is claimed, users must add the following JSON to the extra field in the connection:

```JSON
{ "clickhouse": "True" }
```

Co-authored-by: Yaniv Rodenski <roadan@gmail.com>
Original PR by @roadan: #353
Closes #95
@vargacypher, thanks a lot; we just merged #1016 and will let you know once we have an alpha with this feature! |
New Features
* Speed up ``LoadMode.DBT_LS`` by caching dbt ls output in Airflow Variable by @tatiana in #1014
* Support to cache profiles created via ``ProfileMapping`` by @pankajastro in #1046
* Support for running dbt tasks in AWS EKS in #944 by @VolkerSchiewe
* Add Clickhouse profile mapping by @roadan and @pankajastro in #353 and #1016
* Add node config to TaskInstance Context by @linchun3 in #1044

Bug fixes
* Support partial parsing when cache is disabled by @tatiana in #1070
* Fix disk permission error in restricted env by @pankajastro in #1051
* Add CSP header to iframe contents by @dwreeves in #1055
* Stop attaching log adaptors to root logger to reduce logging costs by @glebkrapivin in #1047

Enhancements
* Support ``static_index.html`` docs by @dwreeves in #999
* Support deep linking dbt docs via Airflow UI by @dwreeves in #1038
* Add ability to specify host/port for Snowflake connection by @whummer in #1063

Docs
* Fix rendering for env ``enable_cache_dbt_ls`` by @pankajastro in #1069

Others
* Update documentation for DbtDocs generator by @arjunanan6 in #1043
* Use uv in CI by @dwreeves in #1013
* Cache hatch folder in the CI by @tatiana in #1056
* Change example DAGs to use ``example_conn`` as opposed to ``airflow_db`` by @tatiana in #1054
* Mark plugin integration tests as integration by @tatiana in #1057
* Ensure compliance with linting rule D300 by using triple quotes for docstrings by @pankajastro in #1049
* Pre-commit hook updates in #1039, #1050, #1064
* Remove duplicates in changelog by @jedcunningham in #1068
(cherry picked from commit 18d2c90)
Description
This PR adds Clickhouse profile mapping using a `generic` connection type. To prevent cosmos from attaching all generic connections, it uses a required field named `clickhouse` mapped to `extra.clickhouse`.
To ensure the profile is claimed, users must add the following JSON to the extra field in the connection: `{ "clickhouse": "True" }`
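One possible way to define such a connection is through an `AIRFLOW_CONN_<CONN_ID>` environment variable in Airflow's JSON connection format. The sketch below only builds and prints that value with the standard library. The connection id `clickhouse_default`, the host, login, password, and schema are placeholder assumptions; only the `generic` connection type, the default port 9000, and the `clickhouse` extra key come from this PR.

```python
import json

# Placeholder connection details (assumptions for illustration).
connection = {
    "conn_type": "generic",   # connection type used by this profile mapping
    "host": "localhost",      # assumed host
    "login": "default",       # assumed user
    "password": "",           # assumed empty password
    "port": 9000,             # default_port from the PR
    "schema": "default",      # assumed schema
    # The extra field carries the marker that lets the mapping claim the connection.
    "extra": json.dumps({"clickhouse": "True"}),
}

# Hypothetical connection id "clickhouse_default".
env_name = "AIRFLOW_CONN_CLICKHOUSE_DEFAULT"
env_value = json.dumps(connection)

print(f"export {env_name}='{env_value}'")
```

The same extra JSON can instead be pasted into the Extra field of the connection form in the Airflow UI.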
Related Issue(s)
closes #95
Breaking Change?
Checklist
[x] I have made corresponding changes to the documentation (if required)
[x] I have added tests that prove my fix is effective or that my feature works