🎉 New Destination: TiDB #15592

Merged
merged 34 commits into from
Aug 31, 2022

Conversation

Daemonxiao
Contributor

@Daemonxiao Daemonxiao commented Aug 12, 2022

What

Add a new TiDB destination with normalization support.

How

Describe the solution

  1. Destination: uses a JDBC connector to transform row data.
  2. Normalization: uses dbt-tidb to normalize JSON data.
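
As a rough illustration of the JDBC side: TiDB is MySQL-compatible and listens on port 4000 by default, so a JDBC-based destination would typically connect with a MySQL-style URL. The sketch below only builds such a URL; the function name, the `jdbc:mysql://` scheme, and the parameters are assumptions for illustration and are not taken from TiDBDestination.java.

```python
def tidb_jdbc_url(host: str, port: int = 4000, database: str = "") -> str:
    """Build a MySQL-style JDBC URL for a TiDB server.

    Hypothetical helper: the real connector may assemble its URL differently.
    Port 4000 is TiDB's default SQL port.
    """
    if not host:
        raise ValueError("host must not be empty")
    return f"jdbc:mysql://{host}:{port}/{database}"


print(tidb_jdbc_url("127.0.0.1", database="test"))
```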

Recommended reading order

  1. TiDBDestination.java
  2. TiDBSQLNameTransformer.java
  3. TiDBSqlOperations.java
  4. transform.py
  5. stream_processor.py
  6. the rest

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Tests

Integration

Destination Integration Test

[Screenshot: destination integration test results]

Normalization Integration Test

airbyte$ NORMALIZATION_TEST_TARGET=tidb ./gradlew :airbyte-integrations:bases:base-normalization:integrationTest
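
The `NORMALIZATION_TEST_TARGET` variable in the command above restricts which destinations the normalization tests run against. A minimal sketch of that kind of filtering follows; the comma-separated format, the case-insensitive matching, and the "unset means run everything" rule are assumptions, and the helper names are hypothetical rather than taken from the test suite.

```python
import os
from typing import Mapping, Optional, Set


def selected_targets(env: Optional[Mapping[str, str]] = None) -> Set[str]:
    """Parse NORMALIZATION_TEST_TARGET into a set of upper-case names.

    Assumes a comma-separated value such as "tidb,postgres".
    """
    env = os.environ if env is None else env
    raw = env.get("NORMALIZATION_TEST_TARGET", "")
    return {name.strip().upper() for name in raw.split(",") if name.strip()}


def should_run(destination: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if tests for this destination should run.

    Assumption: an unset or empty variable selects every destination.
    """
    targets = selected_targets(env)
    return not targets or destination.upper() in targets


print(should_run("tidb", {"NORMALIZATION_TEST_TARGET": "tidb"}))
print(should_run("mysql", {"NORMALIZATION_TEST_TARGET": "tidb"}))
```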

[Screenshot: normalization integration test results]

@Daemonxiao Daemonxiao requested review from a team as code owners August 12, 2022 09:06
@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation area/platform issues related to the platform area/worker Related to worker normalization labels Aug 12, 2022
@marcosmarxm
Member

Awesome contribution, @Daemonxiao! I'll ask the team to review next week.

@sajarin sajarin added internal and removed bounty labels Aug 12, 2022
@marcosmarxm
Member

marcosmarxm commented Aug 16, 2022

/test connector=bases/base-normalization

🕑 bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/2868383644
❌ bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/2868383644
🐛 https://gradle.com/s/zm6v25pfm27zg

Build Failed

Test summary info:

Could not find result summary

@marcosmarxm
Member

marcosmarxm commented Aug 16, 2022

/test connector=connectors/destination-tidb

🕑 connectors/destination-tidb https://github.com/airbytehq/airbyte/actions/runs/2868389139
❌ connectors/destination-tidb https://github.com/airbytehq/airbyte/actions/runs/2868389139
🐛 https://gradle.com/s/fmp4m3mcw3oks

Build Failed

Test summary info:

Could not find result summary

@Daemonxiao
Contributor Author

@marcosmarxm It seems something is wrong with pflake8; see csachs/pyproject-flake8#13.
Because of this, I have not been able to run pflake8 on my changes.
How can I fix this error?

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 147, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pflake8/__init__.py", line 54, in <module>
    class ModifiedConfigFileFinder(flake8.options.config.ConfigFileFinder):
AttributeError: module 'flake8.options.config' has no attribute 'ConfigFileFinder'
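
The traceback shows pflake8 subclassing `flake8.options.config.ConfigFileFinder` at import time, so the error suggests that the installed flake8 does not expose that class. A minimal sketch for checking this locally is below; the function names are hypothetical, and the check only reports whether the attribute exists.

```python
import importlib
from types import SimpleNamespace


def has_config_file_finder(config_module) -> bool:
    """Return True if the given module object exposes ConfigFileFinder."""
    return hasattr(config_module, "ConfigFileFinder")


def check_installed_flake8() -> str:
    """Describe whether the installed flake8 matches what this pflake8 expects."""
    try:
        config_module = importlib.import_module("flake8.options.config")
    except ImportError:
        return "flake8 is not installed in this environment"
    if has_config_file_finder(config_module):
        return "ConfigFileFinder is present; the pflake8 import should not fail this way"
    return "ConfigFileFinder is missing; pflake8 would fail as in the traceback"


# Stand-in module objects illustrate both outcomes without needing flake8.
print(has_config_file_finder(SimpleNamespace(ConfigFileFinder=object)))
print(has_config_file_finder(SimpleNamespace()))
print(check_installed_flake8())
```

If the attribute is missing, aligning the flake8 and pyproject-flake8 versions (for example, pinning flake8 to a release that still ships the class) is one possible workaround; the exact versions are not stated in this thread.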

@marcosmarxm
Member

marcosmarxm commented Aug 29, 2022

/test connector=bases/base-normalization

🕑 bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/2951445892
❌ bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/2951445892
🐛 https://gradle.com/s/6mkmba6dcf47o

Build Failed

Test summary info:

	 =========================== short test summary info ============================
	 SKIPPED [1] integration_tests/test_ephemeral.py:99: ephemeral materialization isn't supported in ClickHouse yet
	 SKIPPED [1] integration_tests/test_ephemeral.py:61: Destinations DestinationType.MYSQL is not in NORMALIZATION_TEST_TARGET env variable (MYSQL is also skipped)
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.CLICKHOUSE is disabled as it doesnt support schema change in incremental yet (column type changes)
	 SKIPPED [1] integration_tests/test_normalization.py:81: Destinations DestinationType.CLICKHOUSE does not support nested streams
	 SKIPPED [2] integration_tests/test_normalization.py:134: DestinationType.MYSQL does not support incremental yet
	 SKIPPED [1] integration_tests/test_normalization.py:134: DestinationType.ORACLE does not support incremental yet
	 SKIPPED [1] integration_tests/test_normalization.py:81: Destinations DestinationType.ORACLE does not support nested streams
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.SNOWFLAKE is disabled as it doesnt support schema change in incremental yet (column type changes)
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.TIDB is disabled as it doesnt support schema change in incremental yet (column type changes)
	 FAILED integration_tests/test_normalization.py::test_normalization[DestinationType.MSSQL-test_simple_streams]
	 ============ 1 failed, 26 passed, 10 skipped in 3428.77s (0:57:08) =============

@Daemonxiao
Contributor Author

@marcosmarxm The integration tests report that test_normalization[DestinationType.MSSQL-test_simple_streams] failed. Is something wrong with the MSSQL normalization Docker image? Please take a look.

@marcosmarxm
Member

@Daemonxiao Yep, I'll take a look. It looks like your contribution is close to being merged!

@marcosmarxm
Member

marcosmarxm commented Aug 30, 2022

/test connector=bases/base-normalization

🕑 bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/2957987374
✅ bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/2957987374
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 14      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    14      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     166      8    95%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_config/transform.py                         191     49    74%
normalization/transform_catalog/utils.py                             51     14    73%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 589    394    33%
-------------------------------------------------------------------------------------
TOTAL                                                              1437    624    57%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       121     25    79%   21-23, 29-31, 36-43, 48-61, 208-216
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       110     48    56%   23-26, 32, 36, 39-64, 67-69, 72-74, 77-79, 82-84, 87-89, 92-110, 144-146
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1322    463    65%
Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 14      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    14      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/utils.py                             51      1    98%
normalization/transform_catalog/destination_name_transformer.py     166      5    97%
normalization/transform_catalog/stream_processor.py                 589     35    94%
normalization/transform_catalog/catalog_processor.py                147     12    92%
normalization/transform_catalog/dbt_macro.py                         22      3    86%
normalization/transform_catalog/table_name_registry.py              174     51    71%
normalization/transform_catalog/transform.py                         61     22    64%
normalization/transform_config/transform.py                         191     77    60%
-------------------------------------------------------------------------------------
TOTAL                                                              1437    206    86%
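
The Cover column in these tables appears to follow the usual formula, (Stmts - Miss) / Stmts, rounded to a whole percent. The short sketch below reproduces the TOTAL rows from the reported Stmts and Miss values; the helper name is illustrative only.

```python
def coverage_percent(stmts: int, miss: int) -> int:
    """Percentage of statements executed, rounded to the nearest integer."""
    if stmts <= 0:
        raise ValueError("stmts must be positive")
    return round(100 * (stmts - miss) / stmts)


# (Stmts, Miss) pairs taken from the TOTAL rows reported above.
totals = {
    "normalization, 624 missed": (1437, 624),
    "source_acceptance_test": (1322, 463),
    "normalization, 206 missed": (1437, 206),
}

for label, (stmts, miss) in totals.items():
    print(f"{label}: {coverage_percent(stmts, miss)}%")
```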

Build Passed

Test summary info:

	 =========================== short test summary info ============================
	 SKIPPED [1] integration_tests/test_ephemeral.py:99: ephemeral materialization isn't supported in ClickHouse yet
	 SKIPPED [1] integration_tests/test_ephemeral.py:61: Destinations DestinationType.MYSQL is not in NORMALIZATION_TEST_TARGET env variable (MYSQL is also skipped)
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.CLICKHOUSE is disabled as it doesnt support schema change in incremental yet (column type changes)
	 SKIPPED [1] integration_tests/test_normalization.py:81: Destinations DestinationType.CLICKHOUSE does not support nested streams
	 SKIPPED [1] integration_tests/test_normalization.py:146: DestinationType.MSSQL is disabled as it doesnt fully support schema change in incremental yet
	 SKIPPED [2] integration_tests/test_normalization.py:134: DestinationType.MYSQL does not support incremental yet
	 SKIPPED [1] integration_tests/test_normalization.py:134: DestinationType.ORACLE does not support incremental yet
	 SKIPPED [1] integration_tests/test_normalization.py:81: Destinations DestinationType.ORACLE does not support nested streams
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.SNOWFLAKE is disabled as it doesnt support schema change in incremental yet (column type changes)
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.TIDB is disabled as it doesnt support schema change in incremental yet (column type changes)
	 ================= 26 passed, 11 skipped in 3419.01s (0:56:59) ==================
	 =========================== short test summary info ============================
	 SKIPPED [1] integration_tests/test_ephemeral.py:99: ephemeral materialization isn't supported in ClickHouse yet
	 SKIPPED [1] integration_tests/test_ephemeral.py:61: Destinations DestinationType.MYSQL is not in NORMALIZATION_TEST_TARGET env variable (MYSQL is also skipped)
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.CLICKHOUSE is disabled as it doesnt support schema change in incremental yet (column type changes)
	 SKIPPED [1] integration_tests/test_normalization.py:81: Destinations DestinationType.CLICKHOUSE does not support nested streams
	 SKIPPED [1] integration_tests/test_normalization.py:146: DestinationType.MSSQL is disabled as it doesnt fully support schema change in incremental yet
	 SKIPPED [2] integration_tests/test_normalization.py:134: DestinationType.MYSQL does not support incremental yet
	 SKIPPED [1] integration_tests/test_normalization.py:134: DestinationType.ORACLE does not support incremental yet
	 SKIPPED [1] integration_tests/test_normalization.py:81: Destinations DestinationType.ORACLE does not support nested streams
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.SNOWFLAKE is disabled as it doesnt support schema change in incremental yet (column type changes)
	 SKIPPED [1] integration_tests/test_normalization.py:143: DestinationType.TIDB is disabled as it doesnt support schema change in incremental yet (column type changes)
	 ================= 26 passed, 11 skipped in 3269.37s (0:54:29) ==================
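
Comparing the totals across runs: the failing run on Aug 29 reported 1 failed, 26 passed, and 10 skipped, while the passing runs report 26 passed and 11 skipped. The overall count is the same, which is consistent with the new MSSQL skip entry replacing the earlier failure, although the thread does not state this explicitly. A small sketch that parses the summary lines and checks the totals (the regex and helper name are illustrative):

```python
import re

# Matches "<count> <outcome>" pairs in a pytest summary line.
SUMMARY_COUNT = re.compile(r"(\d+) (failed|passed|skipped)")


def parse_summary(line: str) -> dict:
    """Extract outcome counts from a pytest short summary line."""
    return {outcome: int(count) for count, outcome in SUMMARY_COUNT.findall(line)}


first_run = parse_summary("1 failed, 26 passed, 10 skipped in 3428.77s (0:57:08)")
later_run = parse_summary("26 passed, 11 skipped in 3419.01s (0:56:59)")

# Both runs cover the same number of test cases.
print(sum(first_run.values()), sum(later_run.values()))
```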

Member

@marcosmarxm marcosmarxm left a comment

Thanks, @Daemonxiao, for this amazing contribution.

@marcosmarxm
Member

/publish connector=bases/base-normalization

@marcosmarxm
Member

/publish connector=connectors/destination-tidb

@marcosmarxm
Member

/publish connector=connectors/destination-tidb

@marcosmarxm
Member

marcosmarxm commented Aug 31, 2022

/publish connector=connectors/destination-tidb

🕑 Publishing the following connectors:
connectors/destination-tidb
https://github.com/airbytehq/airbyte/actions/runs/2965803292


Connector Did it publish? Were definitions generated?
connectors/destination-tidb

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@marcosmarxm
Member

marcosmarxm commented Aug 31, 2022

/publish connector=bases/base-normalization

🕑 Publishing the following connectors:
bases/base-normalization
https://github.com/airbytehq/airbyte/actions/runs/2965804497


Connector Did it publish? Were definitions generated?
bases/base-normalization

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@marcosmarxm marcosmarxm merged commit d452403 into airbytehq:master Aug 31, 2022
robbinhan pushed a commit to robbinhan/airbyte that referenced this pull request Sep 29, 2022
* Add new destination-tidb

* support sync

* Add normalization-tidb

* fix failed tests

* Add unnest marco

* fmt

* Add new destination-tidb

* support sync

* Add normalization-tidb

* fix failed tests

* Add unnest marco

* fmt

* fmt

* fix integration test

* Update docs/integrations/destinations/tidb.md

Co-authored-by: Xiang Zhang <angwerzx@126.com>

* Update doc

* Update doc

* Update doc

* bump normalization version

* update normalization changelog

* run format

* add dest def

* generat spec

Co-authored-by: Xiang Zhang <angwerzx@126.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
@sspaeti sspaeti mentioned this pull request Nov 2, 2022
23 tasks
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation area/platform issues related to the platform area/worker Related to worker community connectors/destination/tidb connectors/source/tidb internal normalization
5 participants