Add support for Delta Sharing protocol #22692

Closed
21 commits
a564b0c
Initial support for Delta Sharing protocol
alexott Mar 27, 2022
ae9343f
additional manual testing, started to add documentations, etc.
alexott Mar 28, 2022
41c41ec
Started DeltaSharingOperator
alexott Mar 31, 2022
4b4b7b3
Implemented download of data, added docs
alexott Apr 1, 2022
f6980ec
add tests + some refactoring
alexott Apr 1, 2022
b29f360
rename Delta Sharing operator
alexott Apr 1, 2022
38e2190
Refactor the structure of Delta Sharing provider
alexott Apr 1, 2022
72af477
Add possibility to specify Delta Sharing profile file as an argument
alexott Apr 2, 2022
f6efb58
Add a system test for Delta Sharing provider
alexott Apr 2, 2022
db7777d
Use system test as an example for documentation
alexott Apr 3, 2022
e5d4e4e
Initial support for Delta Sharing protocol
alexott Mar 27, 2022
9b8ca93
additional manual testing, started to add documentations, etc.
alexott Mar 28, 2022
7bc4b66
Started DeltaSharingOperator
alexott Mar 31, 2022
d980bad
Implemented download of data, added docs
alexott Apr 1, 2022
1ec73a5
add tests + some refactoring
alexott Apr 1, 2022
75dcba4
rename Delta Sharing operator
alexott Apr 1, 2022
2d4eaea
Refactor the structure of Delta Sharing provider
alexott Apr 1, 2022
6658e96
Add possibility to specify Delta Sharing profile file as an argument
alexott Apr 2, 2022
1231ff7
Delta Sharing: remove example DAGs in favor of system tests
alexott Oct 16, 2022
22f1e51
updating code to match latest main
alexott Oct 16, 2022
7f8ff12
more fixes by pre-commit hook
alexott Oct 16, 2022
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/airflow_providers_bug_report.yml
@@ -48,6 +48,7 @@ body:
- databricks
- datadog
- dbt-cloud
- delta-sharing
- dingding
- discord
- docker
17 changes: 9 additions & 8 deletions CONTRIBUTING.rst
@@ -614,14 +614,15 @@ airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassan
apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, jira,
-kerberos, kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp,
-microsoft.winrm, mongo, mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas,
-papermill, password, pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce,
-samba, segment, sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd,
-tableau, tabular, telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
+datadog, dbt.cloud, delta.sharing, deprecated_api, devel, devel_all, devel_ci, devel_hadoop,
+dingding, discord, doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github,
+github_enterprise, google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc,
+jenkins, jira, kerberos, kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql,
+microsoft.psrp, microsoft.winrm, mongo, mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle,
+pagerduty, pandas, papermill, password, pinot, plexus, postgres, presto, qds, qubole, rabbitmq,
+redis, s3, salesforce, samba, segment, sendgrid, sentry, sftp, singularity, slack, snowflake, spark,
+sqlite, ssh, statsd, tableau, tabular, telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex,
+zendesk
.. END EXTRAS HERE

Provider packages
17 changes: 9 additions & 8 deletions INSTALL
@@ -98,14 +98,15 @@ airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassan
apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, jira,
-kerberos, kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp,
-microsoft.winrm, mongo, mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas,
-papermill, password, pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce,
-samba, segment, sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd,
-tableau, tabular, telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
+datadog, dbt.cloud, delta.sharing, deprecated_api, devel, devel_all, devel_ci, devel_hadoop,
+dingding, discord, doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github,
+github_enterprise, google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc,
+jenkins, jira, kerberos, kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql,
+microsoft.psrp, microsoft.winrm, mongo, mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle,
+pagerduty, pandas, papermill, password, pinot, plexus, postgres, presto, qds, qubole, rabbitmq,
+redis, s3, salesforce, samba, segment, sendgrid, sentry, sftp, singularity, slack, snowflake, spark,
+sqlite, ssh, statsd, tableau, tabular, telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex,
+zendesk
# END EXTRAS HERE

# For installing Airflow in development environments - see CONTRIBUTING.rst
8 changes: 4 additions & 4 deletions airflow/providers/databricks/operators/databricks.py
@@ -267,8 +267,8 @@ class DatabricksSubmitRunOperator(BaseOperator):
This field will be templated.
:param databricks_conn_id: Reference to the :ref:`Databricks connection <howto/connection:databricks>`.
By default and in the common case this will be ``databricks_default``. To use
-token based authentication, provide the key ``token`` in the extra field for the
-connection and create the key ``host`` and leave the ``host`` field empty. (templated)
+token based authentication, put the personal access token in the password field for the
+connection and put the URL of the Databricks workspace in the ``host`` field.
:param polling_period_seconds: Controls the rate at which we poll for the result of
this run. By default, the operator will poll every 30 seconds.
:param databricks_retry_limit: Number of times to retry if the Databricks backend is
@@ -549,8 +549,8 @@ class DatabricksRunNowOperator(BaseOperator):
returns the ID of the existing run instead. This token must have at most 64 characters.
:param databricks_conn_id: Reference to the :ref:`Databricks connection <howto/connection:databricks>`.
By default and in the common case this will be ``databricks_default``. To use
-token based authentication, provide the key ``token`` in the extra field for the
-connection and create the key ``host`` and leave the ``host`` field empty. (templated)
+token based authentication, put the personal access token in the password field for the
+connection and put the URL of the Databricks workspace in the ``host`` field.
:param polling_period_seconds: Controls the rate at which we poll for the result of
this run. By default, the operator will poll every 30 seconds.
:param databricks_retry_limit: Number of times to retry if the Databricks backend is
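The revised ``databricks_conn_id`` docstring above describes token-based authentication as: personal access token in the password field, workspace URL in the ``host`` field. A minimal sketch of that layout follows. It assumes Airflow 2.3 or later, which accepts JSON-serialized connections in ``AIRFLOW_CONN_*`` environment variables. The helper name, workspace URL, and token are hypothetical placeholders and are not part of this PR.

```python
import json
import os


def databricks_token_connection(host: str, token: str) -> str:
    """Serialize a Databricks connection that uses token-based authentication.

    Hypothetical helper for illustration; it only builds the JSON document.
    """
    if not token:
        raise ValueError("a personal access token is required")
    return json.dumps(
        {
            "conn_type": "databricks",
            "host": host,  # URL of the Databricks workspace
            "password": token,  # personal access token goes in the password field
        }
    )


# Placeholder workspace URL and token, for illustration only.
conn_json = databricks_token_connection(
    "https://example-workspace.cloud.databricks.com", "dapi-example-token"
)

# Airflow looks up the connection id ``databricks_default`` under this variable name.
os.environ["AIRFLOW_CONN_DATABRICKS_DEFAULT"] = conn_json
```

With this layout nothing is stored in the connection's extra field, which matches the wording of the updated docstring.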
16 changes: 16 additions & 0 deletions airflow/providers/delta/__init__.py
@@ -0,0 +1,16 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
25 changes: 25 additions & 0 deletions airflow/providers/delta/sharing/CHANGELOG.rst
@@ -0,0 +1,25 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.


Changelog
---------

1.0.0
.....

Initial version of the provider.
16 changes: 16 additions & 0 deletions airflow/providers/delta/sharing/__init__.py
@@ -0,0 +1,16 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
16 changes: 16 additions & 0 deletions airflow/providers/delta/sharing/hooks/__init__.py
@@ -0,0 +1,16 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.