
Live tests: GHA to run both validation & regression tests #38816

Merged: 8 commits merged into master on Jun 20, 2024

Conversation

Contributor

@clnoll commented May 31, 2024

What

Add the ability to run all live tests in CI.

How

Updates airbyte-ci to have a new command for running all validation + regression tests. Ultimately we may want to remove the regression tests altogether.

Also adds a GHA for running validation + regression tests.

Closes https://github.com/airbytehq/airbyte-internal-issues/issues/7854.

@clnoll requested review from a team as code owners, May 31, 2024 15:28

Contributor

@bnchrch left a comment

Just a few questions!

Comment on lines 585 to 593
run_pytest_with_proxy = dedent(
f"""
{run_proxy} &
proxy_pid=$!
{run_pytest}
pytest_exit=$?
kill $proxy_pid
wait $proxy_pid
exit $pytest_exit
Contributor

📚 The explanation is a bit far from the weird stuff we're doing. We should move this to another small function and discuss what we're doing, and why, for future maintainers.

Contributor Author

Good call-out. I've moved it into its own method and enhanced the docstring.
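
For illustration, a rough sketch of what such a helper could look like (the function name and docstring wording are illustrative; the proxy and pytest command strings are taken from the snippet above, not from the PR's final code):

from textwrap import dedent


def run_pytest_with_cloud_sql_proxy(run_pytest: str) -> str:
    """Build a shell script that runs pytest alongside the Cloud SQL proxy.

    The proxy must stay up for the whole test session, so the script:
      1. starts the proxy in the background and records its PID,
      2. runs pytest and captures its exit code,
      3. stops the proxy, then exits with pytest's exit code so the step
         reports the test result rather than the proxy shutdown.
    """
    run_proxy = (
        "./cloud-sql-proxy prod-ab-cloud-proj:us-west3:prod-pgsql-replica "
        "--credentials-file /tmp/credentials.json"
    )
    return dedent(
        f"""
        {run_proxy} &
        proxy_pid=$!
        {run_pytest}
        pytest_exit=$?
        kill $proxy_pid
        wait $proxy_pid
        exit $pytest_exit
        """
    )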

@@ -292,5 +292,11 @@ def get_test_steps(context: ConnectorTestContext) -> STEP_TREE:
args=lambda results: {"connector_under_test_container": results[CONNECTOR_TEST_STEP_ID.BUILD].output[LOCAL_BUILD_PLATFORM]},
depends_on=[CONNECTOR_TEST_STEP_ID.BUILD],
),
StepToRun(
Contributor

❓ Is there any reason not to run these tests that we need to plan for?

Contributor Author

By default, they'll always be skipped unless specifically selected. Whenever we're ready for broader automation we'll want to talk about the frequency.

@clnoll requested a review from bnchrch, May 31, 2024 18:12
Contributor

@natikgadzhi left a comment

I'm not 100% certain, but I would love us to pick this up and merge this without waiting for 2 weeks for Catherine to get back.

I'd love to get to deterministic output of these tests (pass/fail) for validation tests.

.github/workflows/validation_tests.yml (outdated)
run: |
./tools/bin/find_non_rate_limited_PAT \
${{ secrets.GH_PAT_BUILD_RUNNER_OSS }} \
${{ secrets.GH_PAT_BUILD_RUNNER_BACKUP }}
Contributor

Same thing as with check_version_increment — this is going to fail for community connectors. Should we add GITHUB_TOKEN to the list?

Contributor

In its current state this workflow can't run on a fork because:

  • It is triggered by a workflow dispatch event, so the checked-out repo is the airbyte one
  • There's no input to make this workflow run on a different repo

I think it's fair to port it to the community_ci.yml workflow once we validate it can automatically run these test suites on our own PRs. That is not in the scope of this change.

connector_name:
description: Connector name (e.g. source-faker)
required: true
connection_id:
Contributor

Should connection_id be optional if we're supporting live validation tests too?

Contributor

We could set auto as a default. I think @clnoll's original intent was to make users very intentional about which connection is used for testing.

Contributor Author

I'd prefer to keep it required until we make these part of automated CI.

Contributor

@alafanechere left a comment

I don't think I get how LiveTests is different from RegressionTests.
I'd have expected two separate concrete steps (RegressionTests and ValidationTests) which would inherit from LiveTests if we want to share some execution logic between the two.
Having two different Step classes would help in differentiating the two test suites, their purpose, and in conditionally controlling their execution.
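
For illustration, a minimal sketch of the hierarchy being suggested (class names follow the review discussion; the Step stand-in and the directory attributes are assumptions, not the PR's final code):

from abc import ABC


class Step:  # stand-in for airbyte-ci's pipeline Step base class
    ...


class LiveTests(Step, ABC):
    """Shared setup and execution logic for suites that run against live connections."""

    test_dir = "src/live_tests"  # overridden by the concrete suites below


class ValidationTests(LiveTests):
    title = "Validation tests"
    test_dir = "src/live_tests/validation_tests"  # hypothetical path


class RegressionTests(LiveTests):
    title = "Regression tests"
    test_dir = "src/live_tests/regression_tests"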

skipped_exit_code = 5
accept_extra_params = True
regression_tests_artifacts_dir = Path("/tmp/regression_tests_artifacts")
tests_artifacts_dir = Path("/tmp/regression_tests_artifacts")
Contributor

nit: I'd move this attribute to the RegressionTests subclass if the validation test suite is not producing artifacts.

Contributor Author

The artifacts are test suite agnostic (I've updated the path and some variable names to reflect this). Currently the validation tests themselves aren't producing any test failure-specific artifacts (e.g. the catalog diff produced by regression tests) but we still want the command execution artifacts.

(See https://docs.dagger.io/manuals/developer/python/328492/services/ and https://cloud.google.com/sql/docs/postgres/sql-proxy#cloud-sql-auth-proxy-docker-image)
"""
run_proxy = "./cloud-sql-proxy prod-ab-cloud-proj:us-west3:prod-pgsql-replica --credentials-file /tmp/credentials.json"
def _test_command(self) -> str:
Contributor

As the should_read_with_state and test_evaluation_mode attributes are not set in __init__, I'd suggest making this function static and having the caller pass these parameters.
We have no guarantee that self.test_evaluation_mode is set...

Contributor Author

It's set in __init__ here, as is should_read_with_state.

@@ -305,14 +305,14 @@ async def _build_connector_acceptance_test(self, connector_under_test_container:
return cat_container.with_unix_socket("/var/run/docker.sock", self.context.dagger_client.host().unix_socket("/var/run/docker.sock"))


class RegressionTests(Step):
"""A step to run regression tests for a connector."""
class LiveTests(Step):
Contributor

As LiveTests is a "macro" test suite composed of ValidationTests and RegressionTests, I would suggest making it "abstract" and implementing a specific ValidationTests step...

In its current form I can't see how this step executes anything different from RegressionTests, as the test input directory remains src/live_tests/regression_tests.

Contributor Author

As LiveTests is a "macro" test suite composed of ValidationTests and RegressionTests, I would suggest making it "abstract" and implementing a specific ValidationTests step...

Done.

In its current form I can't see how this step executes anything different from RegressionTests, as the test input directory remains src/live_tests/regression_tests.

Oops, that was an oversight on my part. Updated the input directory to src/live_tests.

Contributor Author

@clnoll left a comment

@alafanechere thanks for the comments. I've added a new ValidationTest step, in addition to the LiveTest step that runs validation + regression.

For now I'm holding off on a GHA specifically for validation tests, but will add one if it feels like it will be needed. It feels likely that we'll want to deprecate the regression tests GHA instead, and opt to always run regression + validation together.

@clnoll requested a review from alafanechere, June 18, 2024 18:29
Contributor

@alafanechere left a comment

Just requesting changes because I believe we always want to run Validation and Regression separately.

Comment on lines 295 to 306
StepToRun(
id=CONNECTOR_TEST_STEP_ID.CONNECTOR_VALIDATION_TESTS,
step=ValidationTests(context),
args=lambda results: {"connector_under_test_container": results[CONNECTOR_TEST_STEP_ID.BUILD].output[LOCAL_BUILD_PLATFORM]},
depends_on=[CONNECTOR_TEST_STEP_ID.BUILD],
),
StepToRun(
id=CONNECTOR_TEST_STEP_ID.CONNECTOR_LIVE_TESTS,
step=LiveTests(context),
args=lambda results: {"connector_under_test_container": results[CONNECTOR_TEST_STEP_ID.BUILD].output[LOCAL_BUILD_PLATFORM]},
depends_on=[CONNECTOR_TEST_STEP_ID.BUILD],
),
Contributor

We have a test step granularity decision to make.
My opinion is that we should have two distinct steps: ValidationTests and RegressionTests.
They are both sub-suites of the live tests, but we might orchestrate these test suites differently according to the connector state (we might allow CI to pass if validation fails for the same reason as on master, but fail it if regression tests fail, etc.).

In other words: let's not make airbyte-ci aware that the live test suite can be used to run both regression and validation.

Suggested change
StepToRun(
id=CONNECTOR_TEST_STEP_ID.CONNECTOR_VALIDATION_TESTS,
step=ValidationTests(context),
args=lambda results: {"connector_under_test_container": results[CONNECTOR_TEST_STEP_ID.BUILD].output[LOCAL_BUILD_PLATFORM]},
depends_on=[CONNECTOR_TEST_STEP_ID.BUILD],
),
StepToRun(
id=CONNECTOR_TEST_STEP_ID.CONNECTOR_LIVE_TESTS,
step=LiveTests(context),
args=lambda results: {"connector_under_test_container": results[CONNECTOR_TEST_STEP_ID.BUILD].output[LOCAL_BUILD_PLATFORM]},
depends_on=[CONNECTOR_TEST_STEP_ID.BUILD],
),
StepToRun(
id=CONNECTOR_TEST_STEP_ID.CONNECTOR_VALIDATION_TESTS,
step=ValidationTests(context),
args=lambda results: {"connector_under_test_container": results[CONNECTOR_TEST_STEP_ID.BUILD].output[LOCAL_BUILD_PLATFORM]},
depends_on=[CONNECTOR_TEST_STEP_ID.BUILD],
),

Contributor Author

I hear you that we may want to have a flow where we only run validation tests if regression fails, but the simplest way to reuse the test artifacts is to invoke them both under the same pytest command. As currently written, running regression tests followed by validation tests in a separate step will rerun the airbyte commands so we can't reuse anything.

I believe we can pass the artifacts from one step to another without much problem, but I think we'll still have to modify the test suite to avoid pytest rerunning the commands. Correct me if I'm wrong, but if that sounds right to you I think I'd prefer to make that change separately, if we feel we need to make it.

Contributor

@alafanechere commented Jun 20, 2024

but the simplest way to reuse the test artifacts

We could create a dagger cache volume on the artifact path which would make sure artifacts are shared between steps.

avoid pytest rerunning the commands

Do you mean the connector commands?
The connector runner uses dagger under the hood, so if the command inputs did not change between the steps, their output would be cached. (The behavior has to be tested.)

But I hear you 😄 Then let's make a single LiveTests step and control which test suites run in it (validation/regression) via step arguments.

This delays the complexity of decoupling the test suites into separate steps if we ever want to make our CI logic more granular. But I'm fine with it.
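
For reference, a minimal sketch of the cache-volume idea using the dagger Python SDK (the volume key and mount path here are illustrative):

import dagger


def with_shared_live_tests_artifacts(client: dagger.Client, container: dagger.Container) -> dagger.Container:
    # Mount a named cache volume at the artifacts path so that a later step
    # (e.g. validation) can read what an earlier step (e.g. regression) wrote,
    # instead of re-running the connector commands.
    artifacts_volume = client.cache_volume("live-tests-artifacts")
    return container.with_mounted_cache("/tmp/live_tests_artifacts", artifacts_volume)

Any later step that mounts the same volume key at the same path would then see the earlier step's artifacts.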

skipped_exit_code = 5
accept_extra_params = True
regression_tests_artifacts_dir = Path("/tmp/regression_tests_artifacts")
tests_artifacts_dir = Path("/tmp/live_tests_artifacts")
Contributor

nit: I'm not sure if this path refers to a local one or an in-container one. A comment or a rename might help.

Contributor Author

👍 updated to in_container_tests_artifacts_dir.

@@ -115,7 +115,7 @@ async def test(
raise click.UsageError("Cannot use both --only-step and --skip-step at the same time.")
if not only_steps:
skip_steps = list(skip_steps)
skip_steps += [CONNECTOR_TEST_STEP_ID.CONNECTOR_REGRESSION_TESTS]
skip_steps += [CONNECTOR_TEST_STEP_ID.CONNECTOR_REGRESSION_TESTS, CONNECTOR_TEST_STEP_ID.CONNECTOR_VALIDATION_TESTS, CONNECTOR_TEST_STEP_ID.CONNECTOR_LIVE_TESTS]
Contributor

nit: might be helpful to keep this list in a constant named something like STEPS_SKIPPED_BY_DEFAULT

Contributor Author

👍 done.
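
Roughly what that could look like, based on the diff above (the constant name follows the review suggestion and is not verified against the final code):

# Live test suites are opt-in: they hit live connections and need credentials,
# so they are skipped unless explicitly selected with --only-step.
STEPS_SKIPPED_BY_DEFAULT = [
    CONNECTOR_TEST_STEP_ID.CONNECTOR_REGRESSION_TESTS,
    CONNECTOR_TEST_STEP_ID.CONNECTOR_VALIDATION_TESTS,
    CONNECTOR_TEST_STEP_ID.CONNECTOR_LIVE_TESTS,
]

# ...and later, in the test command:
if not only_steps:
    skip_steps = list(skip_steps)
    skip_steps += STEPS_SKIPPED_BY_DEFAULT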


title = "Regression tests"

def _test_command(self) -> str:
Contributor

nit: I'd recommend returning a list and joining it at call time, since the usual with_exec input is a list. (The fact that we join it is an implementation detail of the container execution logic.)
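
A sketch of the list-returning variant (the pytest invocation, test path, and parameter are assumptions for illustration, not the PR's actual command):

def test_command(connection_id: str) -> list[str]:
    # Build the command as a list of arguments, mirroring what with_exec expects;
    # a caller that needs a single shell string can " ".join(...) it instead.
    return [
        "poetry",
        "run",
        "pytest",
        "src/live_tests",
        "--connection-id",
        connection_id,
    ]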

Comment on lines 714 to 722
self.connection_id or "",
"--control-version",
self.control_version or "",
"--target-version",
self.target_version or "",
"--pr-url",
self.pr_url or "",
"--run-id",
self.run_id or "",
Contributor

These or "" fallbacks look strange to me; shall we omit the option name from the list when the attribute is falsy?

Contributor Author

Yep, done. This allowed me to do a bit more cleanup of the repeated test command code.
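
One way that cleanup could look, as a sketch (the option names come from the snippet above; the helper itself is hypothetical):

def optional_args(connection_id=None, control_version=None, target_version=None, pr_url=None, run_id=None) -> list[str]:
    # Emit "--option value" pairs only when a value is actually set,
    # so empty strings never end up on the pytest command line.
    options = [
        ("--connection-id", connection_id),
        ("--control-version", control_version),
        ("--target-version", target_version),
        ("--pr-url", pr_url),
        ("--run-id", run_id),
    ]
    return [token for name, value in options if value for token in (name, str(value))]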

Contributor

Can you share what motivated these changes? They do not look related to the scope of the PR.

Contributor Author

In a previous commit I was importing this stuff from the file-based CDK, which meant that to run this test suite we needed the CDK and all the file-based dependencies (some of which are pretty large). That felt overly burdensome since we only really need this part of it, for the test that checks that records conform to the schema.

This and the related tests are all copied from here and here.

echo "USE_LOCAL_CDK_FLAG=" >> $GITHUB_ENV
fi

- name: Run Live Tests [WORKFLOW DISPATCH]
Contributor

Suggested change
- name: Run Live Tests [WORKFLOW DISPATCH]
- name: Run Validation Tests [WORKFLOW DISPATCH]

github_token: ${{ secrets.GH_PAT_MAINTENANCE_OSS }}
s3_build_cache_access_key_id: ${{ secrets.SELF_RUNNER_AWS_ACCESS_KEY_ID }}
s3_build_cache_secret_key: ${{ secrets.SELF_RUNNER_AWS_SECRET_ACCESS_KEY }}
subcommand: connectors ${{ env.USE_LOCAL_CDK_FLAG }} --name ${{ github.event.inputs.connector_name }} test --only-step connector_live_tests --connector_live_tests.connection-id=${{ github.event.inputs.connection_id }} --connector_live_tests.pr-url=${{ github.event.inputs.pr_url }} ${{ env.STREAM_PARAMS }} --connector_live_tests.test-evaluation-mode=diagnostic
Contributor

Suggested change
subcommand: connectors ${{ env.USE_LOCAL_CDK_FLAG }} --name ${{ github.event.inputs.connector_name }} test --only-step connector_live_tests --connector_live_tests.connection-id=${{ github.event.inputs.connection_id }} --connector_live_tests.pr-url=${{ github.event.inputs.pr_url }} ${{ env.STREAM_PARAMS }} --connector_live_tests.test-evaluation-mode=diagnostic
subcommand: connectors ${{ env.USE_LOCAL_CDK_FLAG }} --name ${{ github.event.inputs.connector_name }} test --only-step connector_validation_tests --connector_validation_tests.connection-id=${{ github.event.inputs.connection_id }} --connector_validation_tests.pr-url=${{ github.event.inputs.pr_url }} ${{ env.STREAM_PARAMS }} --connector_validation_tests.test-evaluation-mode=diagnostic

clnoll and others added 2 commits June 20, 2024 08:04
…s/test/steps/common.py

Co-authored-by: Augustin <augustin@airbyte.io>
@clnoll force-pushed the catherine/live-tests-gha branch 2 times, most recently from a6b2737 to 8f416b0, June 20, 2024 16:54
@clnoll merged commit c7ecc41 into master on Jun 20, 2024
29 checks passed
@clnoll deleted the catherine/live-tests-gha branch on June 20, 2024 17:50