-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standard Tests: Implement v2 test suites #2181 #2792
Standard Tests: Implement v2 test suites #2181 #2792
Conversation
records_3 = filter_output(output, type_=Type.RECORD) | ||
states_3 = filter_output(output, type_=Type.STATE) | ||
|
||
assert not records_3, "There should be no records after second incremental read" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, here should be a better check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this verifies the incremental sync invariant described here
airbyte-integrations/bases/standard-test/standard_test/tests/test_incremental.py
Show resolved
Hide resolved
airbyte-integrations/bases/standard-test/standard_test/tests/test_incremental.py
Show resolved
Hide resolved
records_3 = filter_output(output, type_=Type.RECORD) | ||
states_3 = filter_output(output, type_=Type.STATE) | ||
|
||
assert not records_3, "There should be no records after second incremental read" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this verifies the incremental sync invariant described here
airbyte-integrations/bases/standard-test/standard_test/tests/test_incremental.py
Outdated
Show resolved
Hide resolved
@@ -51,6 +50,13 @@ def test_check(self, connector_config, docker_runner: ConnectorRunner): | |||
assert len(con_messages) == 1, "Connection status message should be emitted exactly once" | |||
assert con_messages[0].connectionStatus.status == Status.SUCCEEDED | |||
|
|||
def test_check_without_network(self, connector_config, docker_runner: ConnectorRunner): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my guess is that this check will work well for most sources but sources which don't always require internet like source-file might not need it. I see a couple of ways around this:
- add a flag to the config
requires_internet
that can be used to disable this check - don't use this check and rely on the other ones which malform the configs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for №2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a few comments but feel free to merge once they are addressed
@@ -1,6 +1,6 @@ | |||
# Running Standard Source Tests | |||
```bash | |||
PYTHONPATH=. pytest standard_test.py --standard_test_config=/connectors/source-hubspot/ -vvv | |||
python -m pytest standard_test/tests --standard_test_config=../../connectors/source-hubspot/ -vvv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the hubspot bit be removed?
@@ -51,6 +52,13 @@ def test_check(self, connector_config, docker_runner: ConnectorRunner): | |||
assert len(con_messages) == 1, "Connection status message should be emitted exactly once" | |||
assert con_messages[0].connectionStatus.status == Status.SUCCEEDED | |||
|
|||
# def test_check_without_network(self, connector_config, docker_runner: ConnectorRunner): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we delete instead of comment out?
|
||
|
||
@pytest.mark.timeout(10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only problem with this is if the docker image takes a long time to download. Maybe we should have a fixture that downloads the docker image first so downstream tests can set reasonable timeouts like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed with pull_docker_image
airbyte-integrations/bases/standard-test/standard_test/tests/test_core.py
Show resolved
Hide resolved
@@ -69,9 +78,9 @@ def test_discover(self, connector_config, catalog, docker_runner: ConnectorRunne | |||
assert catalog_messages[0].catalog == catalog, "Catalog should match the one that was provided" | |||
|
|||
|
|||
@pytest.mark.timeout(300) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a very short timeout given that some sources might have lots of streams or be reading lots of data. WDYT? Maybe this method shouldn't have a timeout? or maybe the timeout should be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not forcing timeout defeats the purpose of it. Another aspect is that we want to keep tests fast and scalable. They shouldn't take more than 20 minutes. If a connector can't guarantee this time either it is defective or it's test account configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree in principle, but pragmatically speaking I worry this may be difficult/full of friction to do in practice, so my suggestion is borne out of practicality rather than philosophy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make it configurable? 300 second timeout sounds fine in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as agreed with @sherifnada I will create a ticket for this, for now we will move on
|
||
|
||
@pytest.fixture(name="cursor_paths") | ||
def cursor_paths_fixture(inputs, configured_catalog_for_incremental): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we put return type hints for all these fixtures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
result = {} | ||
|
||
for stream in configured_catalog_for_incremental.streams: | ||
path = cursor_paths.get(stream.stream.name, [stream.cursor_field[-1]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the cursor field could be nested so we should return the full cursor field instead of just the last element
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, this is intentional, the default behavior in states don't use nested cursors, wherein records it is often the case.
I think the chance of events is the following:
(flat cursor in records) > (nested cursor in records) => (flat cursor in state) > (nested cursor in states)
so default behaviour here should appear more frequently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I see 👍🏼
records = filter_output(output, type_=Type.RECORD) | ||
states = filter_output(output, type_=Type.STATE) | ||
|
||
assert states, "Should produce at least one state" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this test be merged with the one below? I understand that ideally it's not, but we have to be pragmatic because these tests can take a long time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
return catalog | ||
|
||
|
||
@pytest.mark.timeout(300) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as basic read about timeout -- this will be too short for some sources
helper = JsonSchemaHelper(schema=stream.stream.json_schema) | ||
record_value = helper.get_cursor_value(record=record.record.data, cursor_path=stream.cursor_field) | ||
state_value = helper.get_state_value(state=latest_state[stream_name], cursor_path=cursor_paths[stream_name]) | ||
assert record_value <= state_value, "First incremental sync should produce records younger or equal to cursor value from the state" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea with this check!
dc82433
to
a825b25
Compare
result = {} | ||
|
||
for stream in configured_catalog_for_incremental.streams: | ||
path = cursor_paths.get(stream.stream.name, [stream.cursor_field[-1]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I see 👍🏼
@pytest.fixture(name="configured_catalog_for_incremental") | ||
def configured_catalog_for_incremental_fixture(configured_catalog) -> ConfiguredAirbyteCatalog: | ||
catalog = incremental_only_catalog(configured_catalog) | ||
for stream in catalog.streams: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, we already do this filtering elsewhere? I must have missed it
closes #2181
What
Describe what the change is solving
It helps to add screenshots if it affects the frontend.
How
Describe the solution
Pre-merge Checklist
Recommended reading order
test.java
component.ts