connectors-ci: fix postgres integration testing #25942

Merged
merged 19 commits into from
May 11, 2023

@alafanechere alafanechere commented May 10, 2023

What

Closes #25381
This PR attempts to make the source-postgres Java integration tests pass in the Dagger pipeline context while keeping them working for the original /test run and for local runs.

Original problems:

  • The container under test (source-postgres) could not reach the Testcontainers databases spawned by the Java tests.
  • Flakiness in the CDC tests.

How

Fixing networking issues

There were inconsistencies in the way the tests set up the source-postgres configuration to connect to the test database.
The Java tests need access to the test databases to seed them with test data. In this context, the Java tests should resolve the database host/port by targeting the Docker host and the ports mapped on it.
The source-postgres container needs to connect to the test databases to READ. In this context, the container's internal IP address and exposed ports must be used to resolve the database host/port.

flowchart TD
    subgraph master ["Dagger engine"]
        itj["Gradle integrationTestJava"]
        subgraph dockerHost ["Docker host"]
            cut["source-postgres"]
            testcontainer["Test postgres database"]
            cut --"Resolve with docker host internal ip addresses and mapped ports"-->testcontainer
        end
        itj --"Resolve with docker host hostname and exposed ports for INSERT"--> testcontainer
    end

In practice:

To fill the source-postgres config, we get the container's internal IP address and exposed port using the HostPortResolver utility class:

    config = Jsons.jsonNode(ImmutableMap.builder()
        .put(JdbcUtils.HOST_KEY, HostPortResolver.resolveHost(container))
        .put(JdbcUtils.PORT_KEY, HostPortResolver.resolvePort(container))
        .put(JdbcUtils.DATABASE_KEY, container.getDatabaseName())
        .put(JdbcUtils.SCHEMAS_KEY, List.of(NAMESPACE))
        .put(JdbcUtils.USERNAME_KEY, container.getUsername())
        .put(JdbcUtils.PASSWORD_KEY, container.getPassword())
        .put("replication_method", replicationMethod)
        .put(JdbcUtils.SSL_KEY, false)
        .put("is_test", true)
        .build());

To seed the test database, we use the Docker host and its mapped port by calling .getHost() / .getFirstMappedPort() on the container object:

    try (final DSLContext dslContext = DSLContextFactory.create(
        config.get(JdbcUtils.USERNAME_KEY).asText(),
        config.get(JdbcUtils.PASSWORD_KEY).asText(),
        DatabaseDriver.POSTGRESQL.getDriverClassName(),
        String.format(DatabaseDriver.POSTGRESQL.getUrlFormatString(),
            container.getHost(),
            container.getFirstMappedPort(),
            config.get(JdbcUtils.DATABASE_KEY).asText()),
        SQLDialect.POSTGRES)) {
      final Database database = new Database(dslContext);

      database.query(ctx -> {
        ctx.execute("CREATE TABLE id_and_name(id INTEGER  primary key, name VARCHAR(200));");
        ctx.execute("INSERT INTO id_and_name (id, name) VALUES (1,'picard'),  (2, 'crusher'), (3, 'vash');");
        ctx.execute("CREATE TABLE starships(id INTEGER primary key, name VARCHAR(200));");
        ctx.execute("INSERT INTO starships (id, name) VALUES (1,'enterprise-d'),  (2, 'defiant'), (3, 'yamato');");
        ctx.execute("SELECT pg_create_logical_replication_slot('" + SLOT_NAME_BASE + "', 'pgoutput');");
        ctx.execute("CREATE PUBLICATION " + PUBLICATION + " FOR ALL TABLES;");
        return null;
      });
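
The two snippets above can be condensed into a self-contained sketch of the addressing rule. The `ContainerInfo` class and its field names are hypothetical stand-ins for what the Testcontainers object and HostPortResolver provide, the sample addresses are made up, and the URL layout is assumed to follow the usual PostgreSQL JDBC form. This is an illustration only, not the code from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TestDatabaseAddressing {

  // Hypothetical stand-in for the information a test container exposes.
  static final class ContainerInfo {
    final String dockerHost;   // how the test JVM reaches the Docker host
    final int mappedPort;      // host port mapped to the database port
    final String internalIp;   // container IP on the Docker network
    final int exposedPort;     // port the database listens on inside the container
    final String databaseName;

    ContainerInfo(String dockerHost, int mappedPort, String internalIp,
                  int exposedPort, String databaseName) {
      this.dockerHost = dockerHost;
      this.mappedPort = mappedPort;
      this.internalIp = internalIp;
      this.exposedPort = exposedPort;
      this.databaseName = databaseName;
    }
  }

  // Address used by the Java test process to seed data (Docker host + mapped port).
  static String seedingJdbcUrl(ContainerInfo c) {
    return "jdbc:postgresql://" + c.dockerHost + ":" + c.mappedPort + "/" + c.databaseName;
  }

  // Host/port placed in the connector config (internal IP + exposed port).
  static Map<String, Object> connectorConfig(ContainerInfo c) {
    Map<String, Object> config = new LinkedHashMap<>();
    config.put("host", c.internalIp);
    config.put("port", c.exposedPort);
    config.put("database", c.databaseName);
    return config;
  }

  public static void main(String[] args) {
    ContainerInfo c = new ContainerInfo("localhost", 49153, "172.17.0.3", 5432, "test");
    System.out.println("seed via:      " + seedingJdbcUrl(c));
    System.out.println("connector via: " + connectorConfig(c));
  }
}
```

The point of keeping the two functions separate is that the same database is reached through two different address pairs depending on which process is connecting.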

Fixing CDC test flakiness

Increasing the value of INITIAL_WAITING_SECONDS to 30 seconds in the connector config made the tests more stable. I observed the flakiness both locally and in Dagger pipelines.
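
As a rough illustration of where such a value might live, the sketch below builds a CDC replication method map with plain java.util collections. The key names (`method`, `replication_slot`, `publication`, `initial_waiting_seconds`) are assumptions about the connector's config shape and are not quoted from this PR; only the constant names and their values come from the test class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CdcReplicationMethodSketch {

  // Constants as they appear in the CDC datatype test class.
  static final String SLOT_NAME_BASE = "debezium_slot";
  static final String PUBLICATION = "publication";
  static final int INITIAL_WAITING_SECONDS = 30; // was 5 before this change

  // Assumed config shape: the key names below are illustrative.
  static Map<String, Object> replicationMethod() {
    Map<String, Object> method = new LinkedHashMap<>();
    method.put("method", "CDC");
    method.put("replication_slot", SLOT_NAME_BASE);
    method.put("publication", PUBLICATION);
    method.put("initial_waiting_seconds", INITIAL_WAITING_SECONDS);
    return method;
  }

  public static void main(String[] args) {
    System.out.println(replicationMethod());
  }
}
```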

Improving Gradle performance in Dagger Pipeline

I introduced the Gradle S3 cache in the Dagger pipeline to benefit from cross-build caching in CI.
Even though caching is enabled by default in our main gradle.properties file, the cache is not used for local runs according to settings.gradle. To enable the Gradle remote cache, I added the required env var to the Gradle container.

🚨 User Impact 🚨

According to my manual tests, these changes make source-postgres:integrationTestJava pass locally, in Dagger pipelines, and with /test.
Please note that these inconsistencies in how we connect to test databases are also present in other Java connectors. I focused on source-postgres because it's GA. I might proceed with changing other connectors if this PR is 👍 and I spot other /test vs Dagger pipeline inconsistencies.

@octavia-squidington-iii octavia-squidington-iii added the area/connectors Connector related issues label May 10, 2023
@github-actions
Contributor

github-actions bot commented May 10, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan and you've followed all steps in the Breaking Changes Checklist
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • You, or an Airbyter, have run /test successfully on this PR - or on a non-forked branch
  • You, or an Airbyter, have run /publish successfully on this PR - or on a non-forked branch
  • You've updated the connector's metadata.yaml file (new!)
  • The Octavia bot updated the source_definitions.yaml or destination_definitions.yaml, or you ran processResources manually (deprecated)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

@alafanechere
Contributor Author

alafanechere commented May 10, 2023

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/4932259119
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/4932259119
No Python unittests run

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: not found in the config.
SKIPPED [2] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/tests/test_core.py:100: The previous and actual specifications are identical.
SKIPPED [2] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/tests/test_core.py:578: The previous and actual discovered catalogs are identical.
=================== 68 passed, 5 skipped in 79.72s (0:01:19) ===================

@alafanechere
Contributor Author

🎉Achieved both a successful /test above and a successful Dagger pipeline run here. Some CDC tests about WAL and data types look flaky locally and on the Dagger pipeline. I'm going to re-run the Dagger pipeline a couple of times for sanity.

@alafanechere
Contributor Author

Confirmed the flakiness of CdcInitialSnapshotPostgresSourceDatatypeTest > testDataContent() (failing in this run)

@airbytehq airbytehq deleted a comment from github-actions bot May 10, 2023
@airbytehq airbytehq deleted a comment from github-actions bot May 10, 2023
@alafanechere
Contributor Author

alafanechere commented May 10, 2023

@airbytehq airbytehq deleted a comment from github-actions bot May 10, 2023
@alafanechere
Contributor Author

/test connector=connectors/source-postgres

@alafanechere alafanechere marked this pull request as ready for review May 10, 2023 13:22
@alafanechere alafanechere requested review from rodireich, evantahler and a team May 10, 2023 13:24
@alafanechere
Contributor Author

/test connector=connectors/source-postgres

Contributor

@evantahler evantahler left a comment

I'm going to punt on this review... I don't really know enough about what's going on in Java to be useful

@evantahler evantahler requested a review from a team May 10, 2023 19:53
Contributor

@bnchrch bnchrch left a comment

One comment about a comment.

Outside of that I have no objections.

.with_exec(["mkdir", "/airbyte"])
.with_mounted_directory("/airbyte", context.get_repo_dir(".", include=include))
.with_mounted_cache("/airbyte/.gradle", airbyte_gradle_cache, sharing=CacheSharingMode.LOCKED)
.with_workdir("/airbyte")
)
if context.is_ci:
Contributor

@alafanechere Can we leave a comment around this that explains

  1. What we are doing
  2. Why
  3. When we should remove it and who's responsible

- return System.getProperty("os.name").toLowerCase().startsWith("mac")
-     ? getIpAddress(container)
-     : container.getHost();
+ return getIpAddress(container);
Contributor

This is going to break the clickhouse integration tests on Mac I think.
The original change was #14701 back in July last year

Contributor

source-clickhouse that is
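
To make the behavior change under discussion concrete, here is a minimal sketch of the host resolution before and after the diff. `ContainerView` is a hypothetical interface standing in for the container object, and the OS name is passed in as a parameter (rather than read from System.getProperty) so both branches can be exercised; it is not the actual HostPortResolver code.

```java
import java.util.Locale;

public class HostResolutionSketch {

  // Hypothetical view of the two addresses a test container can report.
  interface ContainerView {
    String getHost();       // Docker host as seen from the test JVM
    String getIpAddress();  // container IP on the Docker network
  }

  // Previous behavior: only macOS used the container's internal IP.
  static String resolveHostBefore(ContainerView c, String osName) {
    return osName.toLowerCase(Locale.ROOT).startsWith("mac")
        ? c.getIpAddress()
        : c.getHost();
  }

  // New behavior: the internal IP is used regardless of the OS.
  static String resolveHostAfter(ContainerView c) {
    return c.getIpAddress();
  }

  public static void main(String[] args) {
    ContainerView c = new ContainerView() {
      public String getHost() { return "localhost"; }
      public String getIpAddress() { return "172.17.0.3"; }
    };
    System.out.println("before (Mac):   " + resolveHostBefore(c, "Mac OS X"));
    System.out.println("before (Linux): " + resolveHostBefore(c, "Linux"));
    System.out.println("after:          " + resolveHostAfter(c));
  }
}
```

Under this sketch the macOS result is identical before and after; only the non-Mac branch changes.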

  config = Jsons.jsonNode(ImmutableMap.builder()
-     .put("host", containerInnerAddress.left)
-     .put("port", containerInnerAddress.right)
+     .put("host", HostPortResolver.resolveHost(container))
Contributor

Is this going to return the inner address and port of a container? This was the reason tests were originally failing when run on Mac.
Can you validate that the integration tests still pass when run on your Mac?
For example:

./gradlew --info :airbyte-integrations:connectors:source-postgres:integrationTestJava --tests "io.airbyte.integrations.io.airbyte.integration_tests.sources.PostgresSourceSSLCaCertificateAcceptanceTest"

Contributor Author

Yes it passed, this is why I made the change...
Screen Shot 2023-05-11 at 19 34 13

Contributor

thanks
I'm just making sure as I don't know dagger 😅

@@ -23,7 +23,7 @@ public class CdcInitialSnapshotPostgresSourceDatatypeTest extends AbstractPostgr
private static final String SCHEMA_NAME = "test";
private static final String SLOT_NAME_BASE = "debezium_slot";
private static final String PUBLICATION = "publication";
- private static final int INITIAL_WAITING_SECONDS = 5;
+ private static final int INITIAL_WAITING_SECONDS = 30;
Contributor

Is the difference on dagger expected?

Contributor Author

@alafanechere alafanechere May 11, 2023

I faced flaky results on my local machine too, and increasing the waiting time mitigated these problems both locally and on Dagger. I found a PR from Sergio on another connector that did it for the same reasons.

@rodireich
Contributor

rodireich commented May 11, 2023

/test connector=connectors/source-clickhouse

🕑 connectors/source-clickhouse https://github.com/airbytehq/airbyte/actions/runs/4951414309
✅ connectors/source-clickhouse https://github.com/airbytehq/airbyte/actions/runs/4951414309
No Python unittests run

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestConnection.test_check: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestDiscovery.test_discover: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestBasicRead.test_read: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestFullRefresh.test_sequential_reads: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: not found in the config.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/tests/test_core.py:100: The previous and actual specifications are identical.
================= 22 passed, 6 skipped, 30 warnings in 16.44s ==================

@rodireich
Contributor

@alafanechere I see source-clickhouse failing locally on my Mac with and without your change (I wonder if this is happening to all or just on my setup).
I'm ok with merging this change since it doesn't seem to break anything that wasn't already broken.

@alafanechere
Contributor Author

> @alafanechere I see source-clickhouse failing locally on my Mac with and without your change (I wonder if this is happening to all or just on my setup).
> I'm ok with merging this change since it doesn't seem to break anything that wasn't already broken.

🙏

Development

Successfully merging this pull request may close these issues.

connectors-ci: source-postgres integrationTestJava with Dagger are failing
5 participants