Implement Debezium heartbeats for source-postgres #19004

Merged: 28 commits, Nov 17, 2022

Conversation

rodireich
Contributor

@rodireich rodireich commented Nov 5, 2022

What

This PR adds heartbeat functionality to Debezium and implements it for source-postgres CDC.

How

We process incoming heartbeat events, which make our sync more predictable and prevent us from wrapping up too soon in some cases. See discussion in #15040.
Existing functionality is kept for the other CDC connectors, MySQL and MSSQL, which will get heartbeats later on.
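
To make the idea concrete, here is a minimal sketch of a heartbeat-based stop condition. It is not the code in this PR: the class `HeartbeatStopCondition`, its `Event` type, and the use of plain `long` positions instead of real Postgres LSNs are all assumptions made for illustration. It stops when an event reaches the target position, or when heartbeats keep reporting the same position for a configurable wait time.

```java
import java.time.Duration;

/** Hypothetical sketch only; names and types are not taken from the PR. */
final class HeartbeatStopCondition {

  /** Stand-in for a Debezium event. The position is a plain long, not a real LSN. */
  static final class Event {
    final boolean heartbeat;
    final long position;
    final long receivedAtMillis;

    private Event(boolean heartbeat, long position, long receivedAtMillis) {
      this.heartbeat = heartbeat;
      this.position = position;
      this.receivedAtMillis = receivedAtMillis;
    }

    static Event heartbeat(long position, long receivedAtMillis) {
      return new Event(true, position, receivedAtMillis);
    }

    static Event change(long position, long receivedAtMillis) {
      return new Event(false, position, receivedAtMillis);
    }
  }

  private final long targetPosition;
  private final long maxWaitMillis;
  private long lastHeartbeatPosition = -1;
  private long lastPositionChangeMillis = -1;

  HeartbeatStopCondition(long targetPosition, Duration maxWaitForPositionChange) {
    this.targetPosition = targetPosition;
    this.maxWaitMillis = maxWaitForPositionChange.toMillis();
  }

  /** Returns true when the sync can wrap up after seeing this event. */
  boolean shouldStop(Event event) {
    // Any event at or past the target means we have caught up.
    if (event.position >= targetPosition) {
      return true;
    }
    // Change records below the target never end the sync in this sketch.
    if (!event.heartbeat) {
      return false;
    }
    // A heartbeat with a new position shows the log is still advancing: restart the timer.
    if (event.position != lastHeartbeatPosition) {
      lastHeartbeatPosition = event.position;
      lastPositionChangeMillis = event.receivedAtMillis;
      return false;
    }
    // Same position as the previous heartbeat: stop once it has been unchanged long enough.
    return event.receivedAtMillis - lastPositionChangeMillis >= maxWaitMillis;
  }
}
```

In this sketch a sync over a busy database with few tracked tables still sees heartbeats, so it can tell "no relevant changes" apart from "connector not yet caught up" instead of relying only on a record timeout.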

Recommended reading order

  1. DebeziumRecordIterator.java
  2. PostgresCdcTargetPosition.java

🚨 User Impact 🚨

The new logic should improve syncing in some of the harder cases, e.g. a large and busy database from which we only sync a small number of tables.

@github-actions
Contributor

github-actions bot commented Nov 5, 2022

Affected Connector Report

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to do the following as needed:

  • Run integration tests
  • Bump connector version
  • Add changelog
  • Publish the new version

⚠ Sources (7)

Connector Version Changelog Publish
source-alloydb 1.0.17
source-alloydb-strict-encrypt 1.0.17 (not in seed)
source-mssql 0.4.25
source-mysql 1.0.13
source-mysql-strict-encrypt 1.0.13 (not in seed)
source-postgres 1.0.25
source-postgres-strict-encrypt 1.0.25 (not in seed)
  • See "Actionable Items" below for how to resolve warnings and errors.

✅ Destinations (0)

Connector Version Changelog Publish
  • See "Actionable Items" below for how to resolve warnings and errors.

Actionable Items


Category Status Actionable Item
Version
mismatch
The version of the connector is different from its normal variant. Please bump the version of the connector.

doc not found
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.
Changelog
doc not found
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.

changelog missing
There is no changelog for the current version of the connector. If you are the author of the current version, please add a changelog.
Publish
not in seed
The connector is not in the seed file (e.g. source_definitions.yaml), so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific, and only listed in the cloud seed file). Please double-check to make sure that it is not a bug.

diff seed version
The connector exists in the seed file, but the latest version is not listed there. This usually means that the latest version is not published. Please use the /publish command to publish the latest version.

@rodireich
Contributor Author

rodireich commented Nov 5, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3398677111
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3398677111
No Python unittests run

Build Passed

Test summary info:

All Passed

@rodireich
Contributor Author

rodireich commented Nov 7, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3408077843
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3408077843
No Python unittests run

Build Passed

Test summary info:

All Passed

@rodireich
Contributor Author

rodireich commented Nov 7, 2022

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/3408078606
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/3408078606
No Python unittests run

Build Passed

Test summary info:

All Passed

@rodireich
Contributor Author

rodireich commented Nov 7, 2022

/test connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/3408078869
✅ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/3408078869
No Python unittests run

Build Passed

Test summary info:

All Passed

@rodireich rodireich temporarily deployed to more-secrets November 7, 2022 06:30 Inactive
@rodireich rodireich temporarily deployed to more-secrets November 8, 2022 05:37 Inactive
@rodireich
Contributor Author

rodireich commented Nov 8, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3416728030
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3416728030
No Python unittests run

Build Passed

Test summary info:

All Passed

@rodireich rodireich temporarily deployed to more-secrets November 8, 2022 05:46 Inactive
@rodireich rodireich changed the title from "Initial working commit" to "Implement Debezium heartbeats for source-postgres" Nov 8, 2022
@rodireich rodireich temporarily deployed to more-secrets November 8, 2022 23:40 Inactive
@rodireich rodireich marked this pull request as ready for review November 9, 2022 06:48
@rodireich rodireich requested a review from a team as a code owner November 9, 2022 06:48
@rodireich rodireich temporarily deployed to more-secrets November 9, 2022 16:49 Inactive
@rodireich rodireich temporarily deployed to more-secrets November 9, 2022 18:11 Inactive
@alafanechere
Contributor

alafanechere commented Nov 16, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3480058606
❌ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3480058606
🐛 https://gradle.com/s/4xgpmcrlnhuic

Build Failed

Test summary info:

Could not find result summary

@alafanechere alafanechere temporarily deployed to more-secrets November 16, 2022 14:08 Inactive
@alafanechere
Contributor

@rodireich the acceptance-test-config.yml format did not match the newest one (it was a mix between the legacy format, which is still supported, and the new one 😄). You'll find more details here.

@alafanechere
Contributor

I'm under the impression that the previous /test failed before SAT. Re-running in case it's flaky.

@alafanechere
Contributor

alafanechere commented Nov 16, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3481159996
❌ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3481159996
🐛 https://gradle.com/s/3axcxossriv5o

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestSpec::test_enum_usage[inputs0] - TypeError: list ind...
FAILED test_core.py::TestSpec::test_backward_compatibility[inputs0] - hypothe...
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestConnection.test_check: Pending.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestDiscovery.test_discover: Pending.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestBasicRead.test_read: Pending.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestFullRefresh.test_sequential_reads: Pending.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: Pending.
============= 2 failed, 12 passed, 5 skipped in 175.07s (0:02:55) ==============

@alafanechere alafanechere temporarily deployed to more-secrets November 16, 2022 16:24 Inactive
@rodireich rodireich assigned rodireich and unassigned subodh1810 Nov 16, 2022
@rodireich
Contributor Author

rodireich commented Nov 16, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3483393867
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/3483393867
Python tests coverage:

	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          12      4    67%   16-19
	 source_acceptance_test/config.py                       139      5    96%   87, 93, 235, 239-240
	 source_acceptance_test/conftest.py                     196     92    53%   35, 41-43, 48, 54, 60, 66, 72-74, 93, 98-100, 106-108, 114-115, 120-121, 126, 132, 141-150, 156-161, 176, 200, 231, 237, 243-248, 256-261, 269-282, 287-293, 300-311, 318-334
	 source_acceptance_test/plugin.py                        69     25    64%   22-23, 31, 36, 120-140, 144-148
	 source_acceptance_test/tests/test_core.py              398    111    72%   53, 58, 87-95, 100-107, 111-112, 116-117, 299, 337-354, 363-371, 375-380, 386, 419-424, 462-469, 512-514, 517, 582-590, 602-605, 610, 666-667, 673, 676, 712-722, 735-760
	 source_acceptance_test/tests/test_incremental.py       158     14    91%   52-59, 64-77, 240
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  94     10    89%   16-17, 32-38, 72, 75
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       112     50    55%   23-26, 32, 36, 39-68, 71-73, 76-78, 81-83, 86-88, 91-93, 96-114, 148-150
	 source_acceptance_test/utils/json_schema_helper.py     107     13    88%   30-31, 38, 41, 65-68, 96, 120, 192-194
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1563    349    78%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestConnection.test_check: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestDiscovery.test_discover: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestBasicRead.test_read: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestFullRefresh.test_sequential_reads: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: not found in the config.
================= 14 passed, 5 skipped, 21 warnings in 27.25s ==================

@rodireich rodireich temporarily deployed to more-secrets November 16, 2022 21:51 Inactive
@rodireich rodireich temporarily deployed to more-secrets November 16, 2022 23:22 Inactive
@rodireich
Contributor Author

rodireich commented Nov 16, 2022

/publish connector=connectors/source-postgres

🕑 Publishing the following connectors:
connectors/source-postgres
https://github.com/airbytehq/airbyte/actions/runs/3483948735


Connector Did it publish? Were definitions generated?
connectors/source-postgres

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@rodireich rodireich temporarily deployed to more-secrets November 16, 2022 23:26 Inactive
@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets November 17, 2022 00:10 Inactive
@rodireich rodireich merged commit c739c4c into master Nov 17, 2022
@rodireich rodireich deleted the 18987-implement-heartbeats-for-source-postgres-cdc branch November 17, 2022 00:20
@rodireich
Contributor Author

rodireich commented Nov 17, 2022

/publish connector=connectors/source-postgres-strict-encrypt auto-bump-version=false

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@alafanechere
Contributor

@rodireich I'm under the impression that the cause of this failure is not the config format change but rather flaky behavior in the backward compatibility checks. The test strictness level does not change the behavior of spec testing, and the same tests were run in your latest successful attempt.
I'm sorry for this flaky behavior. I opened an issue to try to fix it if it occurs more often.

akashkulk pushed a commit that referenced this pull request Dec 2, 2022
* Initial working commit

* Code sanity. Provide no-op implementation to mysql, MSSql to allow compilation.

* Update test

* sanity

* sanity

* sanity

* sanity

* sanity

* changes per review comments

* Make heartbeat change waittime configurable.

* Trying to bypass test strictness test

* Trying to bypass test strictness test

* Trying to bypass test strictness test

* fix acceptance test config format

* add missing SAT test in config

* revert back changes in acceptance-test-config.yml

* Version and notes

* auto-bump connector version

Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Implement heartbeats for source-postgres CDC
7 participants