Explicitly exit to stop hung background threads in MySQL source connector #11910
What
I have been working with Airbyte for around a month now and encountered a problem when loading larger (50M+ record) tables from MySQL to Snowflake. For these loads the source connector would hang after the log message "completed source: class io.airbyte.integrations.source.mysql.MySqlSource".
The problem I had matches this issue exactly: #4322. It also matches this issue: #5754. It is possibly related to this issue, though that one is for a different connector (db2): #8218.
In the end I discovered that for certain MySQL loads (seemingly larger ones, though smaller loads failed for me at times too) the Debezium engine and executor both fail to shut down and leave some threads hanging in the background. These threads then prevent the JVM from exiting when the MySQL source connector's main() method returns. I am not a Java dev, but to test this I added the following debugging code:
MySqlSource.java:
DebeziumRecordPublisher.java:
I found that failed runs never closed the Debezium engine or executor properly (likewise, the message "Debezium engine shutdown.", which comes from the completion callback passed to the Debezium engine above, is never logged in this situation). I also found that extra threads were still alive whenever the sync failed; these were printed by the changes to MySqlSource.java.
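For readers who want to reproduce this kind of check, here is a minimal sketch of a thread listing. It is not the debugging code from this PR; the class name LiveThreadReport and the method nonDaemonThreadNames are hypothetical and it uses only the standard java.lang.Thread API. It prints every live non-daemon thread, which are the threads that can keep a JVM running after main() returns.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the Airbyte code base.
public final class LiveThreadReport {

    private LiveThreadReport() {}

    // Returns the names of all threads that are alive and not daemon threads.
    // The calling thread (for example "main") is included in the result.
    public static List<String> nonDaemonThreadNames() {
        return Thread.getAllStackTraces().keySet().stream()
            .filter(t -> t.isAlive() && !t.isDaemon())
            .map(Thread::getName)
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Called at the very end of a main method, anything listed here other
        // than "main" is a candidate for blocking JVM exit.
        for (String name : nonDaemonThreadNames()) {
            System.out.println("Still alive (non-daemon): " + name);
        }
    }
}
```

Calling nonDaemonThreadNames() as the last statement of a connector's main method would show whether any executor or engine threads outlive the sync.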
How
I considered three options for a fix:
3. Add System.exit(0) to the end of the main method.
I have taken option 3 and am submitting it in this PR. My reasoning is that I believe Debezium should ultimately be upgraded to 1.8, which is much more work than I am able to contribute right now. Adding a System.exit(0) to the end of the main method forces the JVM to terminate even if background threads are still running, which allows the source container to exit without hanging. To my understanding, Airbyte only closes the engine once it has received all records of interest, so there is very little risk that a background thread still has useful work left to do at that point.
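The effect of an explicit exit can be illustrated with a small standalone sketch. This is not the connector code: the class ExplicitExitSketch and the method startLingeringWorker are hypothetical stand-ins for the hung Debezium threads. Without the final System.exit(0) call, this program would never terminate, because the non-daemon worker thread is still alive when main() returns.

```java
// Hypothetical illustration, not part of the Airbyte code base.
public final class ExplicitExitSketch {

    private ExplicitExitSketch() {}

    // Starts a non-daemon thread that blocks until it is interrupted,
    // standing in for a background thread that failed to shut down.
    static Thread startLingeringWorker() {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "lingering-worker");
        worker.setDaemon(false);
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        startLingeringWorker();
        System.out.println("main work finished");
        // Without this call the JVM would wait for "lingering-worker" forever.
        // System.exit terminates the JVM regardless of running threads.
        System.exit(0);
    }
}
```

The trade-off is that System.exit(0) also cuts off any thread that still has legitimate work in progress, so it is only safe once the main thread has everything it needs.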
Sadly I am not able to replicate this in a unit test, but if any additional supporting info is required I would be happy to provide.
Recommended reading order
MySqlSource.java
🚨 User Impact 🚨
Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.
No breaking changes.